From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <law@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [216.205.24.124])
 by sourceware.org (Postfix) with ESMTP id AD4693857804
 for <gcc-patches@gcc.gnu.org>; Mon, 30 Nov 2020 06:26:09 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org AD4693857804
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
 [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
 us-mta-567-ikE1fGVIO1Os9YATejKOsQ-1; Mon, 30 Nov 2020 01:26:05 -0500
X-MC-Unique: ikE1fGVIO1Os9YATejKOsQ-1
Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com
 [10.5.11.23])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 10864802B48;
 Mon, 30 Nov 2020 06:26:04 +0000 (UTC)
Received: from localhost.localdomain (ovpn-112-145.phx2.redhat.com
 [10.3.112.145])
 by smtp.corp.redhat.com (Postfix) with ESMTP id 7C08A19C59;
 Mon, 30 Nov 2020 06:26:03 +0000 (UTC)
Subject: Re: [21/23] doc: Add documentation for rtl-ssa
To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com
References: <mpth7ptad81.fsf@arm.com> <mptsg9d6503.fsf@arm.com>
From: Jeff Law <law@redhat.com>
Message-ID: <b73d8e73-713b-147b-3864-fb5c18855139@redhat.com>
Date: Sun, 29 Nov 2020 23:26:02 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
 Thunderbird/78.4.0
MIME-Version: 1.0
In-Reply-To: <mptsg9d6503.fsf@arm.com>
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Content-Language: en-US
X-Spam-Status: No, score=-6.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Mon, 30 Nov 2020 06:26:12 -0000


On 11/13/20 1:22 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds some documentation to rtl.texi about the SSA form.
> It only really describes the high-level structure -- I think for
> API-level stuff it's better to rely on function comments instead.
>
> gcc/
> 	* doc/rtl.texi (RTL SSA): New node.
I suspect we'll need to twiddle things as we have more consumers and get
feedback from folks looking to try and wire this into existing RTL
optimizers and I like that the docs focus on how to add the SSA
on-the-side info to an existing pass.

> +
> +Since resource numbers so closely match register numbers, it is somtimes
NIT, typo "somtimes" -> "sometimes"

I'm still struggling to see how we avoid the lost copy problem as
changes are made to the SSA graph.  I suspect we're going to end up with
some additional documentation around that.  But I think we can have a
distinct patch for that.

Jeff

> +convenient to refer to them simply as register numbers, or ``regnos''
> +for short.  However, the RTL SSA form also provides an abstraction
> +of resources in the form of @code{rtl_ssa::resource_info}.
> +This is a lightweight class that records both the regno of a resource
> +and the @code{machine_mode} that the resource has (@pxref{Machine Modes}).
> +It has functions for testing whether a resource is a register or memory.
> +In principle it could be extended to other kinds of resource in future.
> +
> +@node RTL SSA Accesses
> +@subsection RTL SSA Register and Memory Accesses
> +
> +In the RTL SSA form, most reads or writes of a resource are
> +represented as a @code{rtl_ssa::access_info}@footnote{The exceptions
> +are call clobbers, which are generally represented separately.
> +See the comment above @code{rtl_ssa::insn_info} for details.}.
> +These @code{rtl_ssa::access_info}s are organized into the following
> +class hierarchy:
> +
> +@findex rtl_ssa::access_info
> +@findex rtl_ssa::use_info
> +@findex rtl_ssa::def_info
> +@findex rtl_ssa::clobber_info
> +@findex rtl_ssa::set_info
> +@findex rtl_ssa::phi_info
> +@smallexample
> +rtl_ssa::access_info
> +  |
> +  +-- rtl_ssa::use_info
> +  |
> +  +-- rtl_ssa::def_info
> +        |
> +        +-- rtl_ssa::clobber_info
> +        |
> +        +-- rtl_ssa::set_info
> +              |
> +              +-- rtl_ssa::phi_info
> +@end smallexample
> +
> +A @code{rtl_ssa::use_info} represents a read or use of a resource and
> +a @code{rtl_ssa::def_info} represents a write or definition of a resource.
> +As in the main RTL representation, there are two basic types of
> +definition: clobbers and sets.  The difference is that a clobber
> +leaves the register with an unspecified value that cannot be used
> +or relied on by later instructions, while a set leaves the register
> +with a known value that later instructions could use if they wanted to.
> +A @code{rtl_ssa::clobber_info} represents a clobber and
> +a @code{rtl_ssa::set_info} represent a set.
> +
> +Each @code{rtl_ssa::use_info} records which single @code{rtl_ssa::set_info}
> +provides the value of the resource; this is null if the resource is
> +completely undefined at the point of use.  Each @code{rtl_ssa::set_info}
> +in turn records all the @code{rtl_ssa::use_info}s that use its value.
> +
> +If a value of a resource can come from multiple sources,
> +a @code{rtl_ssa::phi_info} brings those multiple sources together
> +into a single definition (@pxref{RTL SSA Phi Nodes}).
> +
> +@node RTL SSA Phi Nodes
> +@subsection RTL SSA Phi Nodes
> +
> +@cindex phi nodes, RTL SSA
> +@findex rtl_ssa::phi_info
> +If a resource is live on entry to an extended basic block and if the
> +resource's value can come from multiple sources, the extended basic block
> +has a ``phi node'' that collects together these multiple sources.
> +The phi node conceptually has one input for each incoming edge of
> +the extended basic block, with the input specifying the value of
> +the resource on that edge.  For example, suppose a function contains
> +the following RTL:
> +
> +@smallexample
> +;; Basic block bb3
> +@dots{}
> +(set (reg:SI R1) (const_int 0))  ;; A
> +(set (pc) (label_ref bb5))
> +
> +;; Basic block bb4
> +@dots{}
> +(set (reg:SI R1) (const_int 1))  ;; B
> +;; Fall through
> +
> +;; Basic block bb5
> +;; preds: bb3, bb4
> +;; live in: R1 @dots{}
> +(code_label bb5)
> +@dots{}
> +(set (reg:SI @var{R2})
> +     (plus:SI (reg:SI R1) @dots{}))  ;; C
> +@end smallexample
> +
> +The value of R1 on entry to block 5 can come from either A or B@.
> +The extended basic block that contains block 5 would therefore have a
> +phi node with two inputs: the first input would have the value of
> +R1 defined by A and the second input would have the value of
> +R1 defined by B@.  This phi node would then provide the value of
> +R1 for C (assuming that R1 does not change again between
> +the start of block 5 and C).
> +
> +Since RTL is not a ``native'' SSA representation, these phi nodes
> +simply collect together definitions that already exist.  Each input
> +to a phi node for a resource @var{R} is itself a definition of
> +resource @var{R} (or is null if the resource is completely
> +undefined for a particular incoming edge).  This is in contrast
> +to a native SSA representation like GIMPLE, where the phi inputs
> +can be arbitrary expressions.  As a result, RTL SSA phi nodes
> +never involve ``hidden'' moves: all moves are instead explicit.
> +
> +Phi nodes are represented as a @code{rtl_ssa::phi_node}.
> +Each input to a phi node is represented as an @code{rtl_ssa::use_info}.
> +
> +@node RTL SSA Access Lists
> +@subsection RTL SSA Access Lists
> +
> +All the definitions of a resource are chained together in reverse postorder.
> +In general, this list can contain an arbitrary mix of both sets
> +(@code{rtl_ssa::set_info}) and clobbers (@code{rtl_ssa::clobber_info}).
> +However, it is often useful to skip over all intervening clobbers
> +of a resource in order to find the next set.  The list is constructed
> +in such a way that this can be done in amortized constant time.
> +
> +All uses (@code{rtl_ssa::use_info}) of a given set are also chained
> +together into a list.  This list of uses is divided into three parts:
> +
> +@enumerate
> +@item
> +uses by ``real'' nondebug instructions (@pxref{real RTL SSA insns})
> +
> +@item
> +uses by real debug instructions
> +
> +@item
> +uses by phi nodes (@pxref{RTL SSA Phi Nodes})
> +@end enumerate
> +
> +The first and second parts individually follow reverse postorder.
> +The third part has no particular order.
> +
> +@cindex degenerate phi node, RTL SSA
> +The last use by a real nondebug instruction always comes earlier in
> +the reverse postorder than the next definition of the resource (if any).
> +This means that the accesses follow a linear sequence of the form:
> +
> +@itemize @bullet
> +@item
> +first definition of resource R
> +
> +@itemize @bullet
> +@item
> +first use by a real nondebug instruction of the first definition of resource R
> +
> +@item
> +@dots{}
> +
> +@item
> +last use by a real nondebug instruction of the first definition of resource R
> +@end itemize
> +
> +@item
> +second definition of resource R
> +
> +@itemize @bullet
> +@item
> +first use by a real nondebug instruction of the second definition of resource R
> +
> +@item
> +@dots{}
> +
> +@item
> +last use by a real nondebug instruction of the second definition of resource R
> +@end itemize
> +
> +@item
> +@dots{}
> +
> +@item
> +last definition of resource R
> +
> +@itemize @bullet
> +@item
> +first use by a real nondebug instruction of the last definition of resource R
> +
> +@item
> +@dots{}
> +
> +@item
> +last use by a real nondebug instruction of the last definition of resource R
> +@end itemize
> +@end itemize
> +
> +(Note that clobbers never have uses; only sets do.)
> +
> +This linear view is easy to achieve when there is only a single definition
> +of a resource, which is commonly true for pseudo registers.  However,
> +things are more complex  if code has a structure like the following:
> +
> +@smallexample
> +// ebb2, bb2
> +R = @var{va};        // A
> +if (@dots{})
> +  @{
> +    // ebb2, bb3
> +    use1 (R);  // B
> +    @dots{}
> +    R = @var{vc};    // C
> +  @}
> +else
> +  @{
> +    // ebb4, bb4
> +    use2 (R);  // D
> +  @}
> +@end smallexample
> +
> +The list of accesses would begin as follows:
> +
> +@itemize @bullet
> +@item
> +definition of R by A
> +
> +@itemize @bullet
> +@item
> +use of A's definition of R by B
> +@end itemize
> +
> +@item
> +definition of R by C
> +@end itemize
> +
> +The next access to R is in D, but the value of R that D uses comes from
> +A rather than C@.
> +
> +This is resolved by adding a phi node for @code{ebb4}.  All inputs to this
> +phi node have the same value, which in the example above is A's definition
> +of R@.  In other circumstances, it would not be necessary to create a phi
> +node when all inputs are equal, so these phi nodes are referred to as
> +``degenerate'' phi nodes.
> +
> +The full list of accesses to R is therefore:
> +
> +@itemize @bullet
> +@item
> +definition of R by A
> +
> +@itemize @bullet
> +@item
> +use of A's definition of R by B
> +@end itemize
> +
> +@item
> +definition of R by C
> +
> +@item
> +definition of R by ebb4's phi instruction, with the input coming from A
> +
> +@itemize @bullet
> +@item
> +use of the ebb4's R phi definition of R by B
> +@end itemize
> +@end itemize
> +
> +Note that A's definition is also used by ebb4's phi node, but this
> +use belongs to the third part of the use list described above and
> +so does not form part of the linear sequence.
> +
> +It is possible to ``look through'' any degenerate phi to the ultimate
> +definition using the function @code{look_through_degenerate_phi}.
> +Note that the input to a degenerate phi is never itself provided
> +by a degenerate phi.
> +
> +At present, the SSA form takes this principle one step further
> +and guarantees that, for any given resource @var{res}, one of the
> +following is true:
> +
> +@itemize
> +@item
> +The resource has a single definition @var{def}, which is not a phi node.
> +Excluding uses of undefined registers, all uses of @var{res} by real
> +nondebug instructions use the value provided by @var{def}.
> +
> +@item
> +Excluding uses of undefined registers, all uses of @var{res} use
> +values provided by definitions that occur earlier in the same
> +extended basic block.  These definitions might come from phi nodes
> +or from real instructions.
> +@end itemize
> +
> +@node Changing RTL Instructions
> +@subsection Using the RTL SSA framework to change instructions
> +
> +@findex rtl_ssa::insn_change
> +There are various routines that help to change a single RTL instruction
> +or a group of RTL instructions while keeping the RTL SSA form up-to-date.
> +This section first describes the process for changing a single instruction,
> +then goes on to describe the differences when changing multiple instructions.
> +
> +@menu
> +* Changing One RTL SSA Instruction::
> +* Changing Multiple RTL SSA Instructions::
> +@end menu
> +
> +@node Changing One RTL SSA Instruction
> +@subsubsection Changing One RTL SSA Instruction
> +
> +Before making a change, passes should first use a statement like the
> +following:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +@end smallexample
> +
> +Here, @code{attempt} is an RAII object that should remain in scope
> +for the entire change attempt.  It automatically frees temporary
> +memory related to the changes when it goes out of scope.
> +
> +Next, the pass should create an @code{rtl_ssa::insn_change} object
> +for the instruction that it wants to change.  This object specifies
> +several things:
> +
> +@itemize @bullet
> +@item
> +what the instruction's new list of uses should be (@code{new_uses}).
> +By default this is the same as the instruction's current list of uses.
> +
> +@item
> +what the instruction's new list of definitions should be (@code{new_defs}).
> +By default this is the same as the instruction's current list of
> +definitions.
> +
> +@item
> +where the instruction should be located (@code{move_range}).
> +This is a range of instructions after which the instruction could
> +be placed, represented as an @code{rtl_ssa::insn_range}.
> +By default the instruction must remain at its current position.
> +@end itemize
> +
> +If a pass was attempting to change all these properties of an instruction
> +@code{insn}, it might do something like this:
> +
> +@smallexample
> +rtl_ssa::insn_change change (insn);
> +change.new_defs = @dots{};
> +change.new_uses = @dots{};
> +change.move_range = @dots{};
> +@end smallexample
> +
> +This @code{rtl_ssa::insn_change} only describes something that the
> +pass @emph{might} do; at this stage, nothing has actually changed.
> +
> +As noted above, the default @code{move_range} requires the instruction
> +to remain where it is.  At the other extreme, it is possible to allow
> +the instruction to move anywhere within its extended basic block,
> +provided that all the new uses and definitions can be performed
> +at the new location.  The way to do this is:
> +
> +@smallexample
> +change.move_range = insn->ebb ()->insn_range ();
> +@end smallexample
> +
> +In either case, the next step is to make sure that move range is
> +consistent with the new uses and definitions.  The way to do this is:
> +
> +@smallexample
> +if (!rtl_ssa::restrict_movement (change))
> +  return false;
> +@end smallexample
> +
> +This function tries to limit @code{move_range} to a range of instructions
> +at which @code{new_uses} and @code{new_defs} can be correctly performed.
> +It returns true on success or false if no suitable location exists.
> +
> +The pass should also tentatively change the pattern of the instruction
> +to whatever form the pass wants the instruction to have.  This should use
> +the facilities provided by @file{recog.c}.  For example:
> +
> +@smallexample
> +rtl_insn *rtl = insn->rtl ();
> +insn_change_watermark watermark;
> +validate_change (rtl, &PATTERN (rtl), new_pat, 1);
> +@end smallexample
> +
> +will tentatively replace @code{insn}'s pattern with @code{new_pat}.
> +
> +These changes and the construction of the @code{rtl_ssa::insn_change}
> +can happen in either order or be interleaved.
> +
> +After the tentative changes to the instruction are complete,
> +the pass should check whether the new pattern matches a target
> +instruction or satisfies the requirements of an inline asm:
> +
> +@smallexample
> +if (!rtl_ssa::recog (change))
> +  return false;
> +@end smallexample
> +
> +This step might change the instruction pattern further in order to
> +make it match.  It might also add new definitions or restrict the range
> +of the move.  For example, if the new pattern did not match in its original
> +form, but could be made to match by adding a clobber of the flags
> +register, @code{rtl_ssa::recog} will check whether the flags register
> +is free at an appropriate point.  If so, it will add a clobber of the
> +flags register to @code{new_defs} and restrict @code{move_range} to
> +the locations at which the flags register can be safely clobbered.
> +
> +Even if the proposed new instruction is valid according to
> +@code{rtl_ssa::recog}, the change might not be worthwhile.
> +For example, when optimizing for speed, the new instruction might
> +turn out to be slower than the original one.  When optimizing for
> +size, the new instruction might turn out to be bigger than the
> +original one.
> +
> +Passes should check for this case using @code{change_is_worthwhile}.
> +For example:
> +
> +@smallexample
> +if (!rtl_ssa::change_is_worthwhile (change))
> +  return false;
> +@end smallexample
> +
> +If the change passes this test too then the pass can perform the change using:
> +
> +@smallexample
> +confirm_change_group ();
> +crtl->ssa->change_insn (change);
> +@end smallexample
> +
> +Putting all this together, the change has the following form:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +
> +rtl_ssa::insn_change change (insn);
> +change.new_defs = @dots{};
> +change.new_uses = @dots{};
> +change.move_range = @dots{};
> +
> +if (!rtl_ssa::restrict_movement (change))
> +  return false;
> +
> +insn_change_watermark watermark;
> +// Use validate_change etc. to change INSN's pattern.
> +@dots{}
> +if (!rtl_ssa::recog (change)
> +    || !rtl_ssa::change_is_worthwhile (change))
> +  return false;
> +
> +confirm_change_group ();
> +crtl->ssa->change_insn (change);
> +@end smallexample
> +
> +@node Changing Multiple RTL SSA Instructions
> +@subsubsection Changing Multiple RTL SSA Instructions
> +
> +The process for changing multiple instructions is similar
> +to the process for changing single instructions
> +(@pxref{Changing One RTL SSA Instruction}).  The pass should
> +again start the change attempt with:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +@end smallexample
> +
> +and keep @code{attempt} in scope for the duration of the change
> +attempt.  It should then construct an @code{rtl_ssa::insn_change}
> +for each change that it wants to make.
> +
> +After this, it should combine the changes into a sequence of
> +@code{rtl_ssa::insn_change} pointers.  This sequence must be in
> +reverse postorder; the instructions will remain strictly in the
> +order that the sequence specifies.
> +
> +For example, if a pass is changing exactly two instructions,
> +it might do:
> +
> +@smallexample
> +rtl_ssa::insn_change *changes[] = @{ &change1, change2 @};
> +@end smallexample
> +
> +where @code{change1}'s instruction must come before @code{change2}'s.
> +Alternatively, if the pass is changing a variable number of
> +instructions, it might build up the sequence in a
> +@code{vec<rtl_ssa::insn_change *>}.
> +
> +By default, @code{rtl_ssa::restrict_movement} assumes that all
> +instructions other than the one passed to it will remain in their
> +current positions and will retain their current uses and definitions.
> +When changing multiple instructions, it is usually more effective
> +to ignore the other instructions that are changing.  The sequencing
> +described above ensures that the changing instructions remain
> +in the correct order with respect to each other.
> +The way to do this is:
> +
> +@smallexample
> +if (!rtl_ssa::restrict_movement (change, insn_is_changing (changes)))
> +  return false;
> +@end smallexample
> +
> +Similarly, when @code{rtl_ssa::restrict_movement} is detecting
> +whether a register can be clobbered, it by default assumes that
> +all other instructions will remain in their current positions and
> +retain their current form.  It is again more effective to ignore
> +changing instructions (which might, for example, no longer need
> +to clobber the flags register).  The way to do this is:
> +
> +@smallexample
> +if (!rtl_ssa::recog (change, insn_is_changing (changes)))
> +  return false;
> +@end smallexample
> +
> +When changing multiple instructions, the important question is usually
> +not whether each individual change is worthwhile, but whether the changes
> +as a whole are worthwhile.  The way to test this is:
> +
> +@smallexample
> +if (!rtl_ssa::changes_are_worthwhile (changes))
> +  return false;
> +@end smallexample
> +
> +The process for changing single instructions makes sure that one
> +@code{rtl_ssa::insn_change} in isolation is valid.  But when changing
> +multiple instructions, it is also necessary to test whether the
> +sequence as a whole is valid.  For example, it might be impossible
> +to satisfy all of the @code{move_range}s at once.
> +
> +Therefore, once the pass has a sequence of changes that are
> +individually correct, it should use:
> +
> +@smallexample
> +if (!crtl->ssa->verify_insn_changes (changes))
> +  return false;
> +@end smallexample
> +
> +to check whether the sequence as a whole is valid.  If all checks pass,
> +the final step is:
> +
> +@smallexample
> +confirm_change_group ();
> +crtl->ssa->change_insns (changes);
> +@end smallexample
> +
> +Putting all this together, the process for a two-instruction change is:
> +
> +@smallexample
> +auto attempt = crtl->ssa->new_change_attempt ();
> +
> +rtl_ssa::insn_change change (insn1);
> +change1.new_defs = @dots{};
> +change1.new_uses = @dots{};
> +change1.move_range = @dots{};
> +
> +rtl_ssa::insn_change change (insn2);
> +change2.new_defs = @dots{};
> +change2.new_uses = @dots{};
> +change2.move_range = @dots{};
> +
> +rtl_ssa::insn_change *changes[] = @{ &change1, change2 @};
> +
> +auto is_changing = insn_is_changing (changes);
> +if (!rtl_ssa::restrict_movement (change1, is_changing)
> +    || !rtl_ssa::restrict_movement (change2, is_changing))
> +  return false;
> +
> +insn_change_watermark watermark;
> +// Use validate_change etc. to change INSN1's and INSN2's patterns.
> +@dots{}
> +if (!rtl_ssa::recog (change1, is_changing)
> +    || !rtl_ssa::recog (change2, is_changing)
> +    || !rtl_ssa::changes_are_worthwhile (changes)
> +    || !crtl->ssa->verify_insn_changes (changes))
> +  return false;
> +
> +confirm_change_group ();
> +crtl->ssa->change_insns (changes);
> +@end smallexample
> +
>  @node Sharing
>  @section Structure Sharing Assumptions
>  @cindex sharing of RTL components