From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org (Postfix) with ESMTP id AD4693857804 for ; Mon, 30 Nov 2020 06:26:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org AD4693857804 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-567-ikE1fGVIO1Os9YATejKOsQ-1; Mon, 30 Nov 2020 01:26:05 -0500 X-MC-Unique: ikE1fGVIO1Os9YATejKOsQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 10864802B48; Mon, 30 Nov 2020 06:26:04 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-145.phx2.redhat.com [10.3.112.145]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7C08A19C59; Mon, 30 Nov 2020 06:26:03 +0000 (UTC) Subject: Re: [21/23] doc: Add documentation for rtl-ssa To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com References: From: Jeff Law Message-ID: Date: Sun, 29 Nov 2020 23:26:02 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Content-Language: en-US X-Spam-Status: No, score=-6.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Nov 2020 06:26:12 -0000 On 11/13/20 1:22 AM, Richard Sandiford via Gcc-patches wrote: > This patch adds some documentation to rtl.texi about the SSA form. > It only really describes the high-level structure -- I think for > API-level stuff it's better to rely on function comments instead. > > gcc/ > * doc/rtl.texi (RTL SSA): New node. I suspect we'll need to twiddle things as we have more consumers and get feedback from folks looking to try and wire this into existing RTL optimizers and I like that the docs focus on how to add the SSA on-the-side info to an existing pass. > + > +Since resource numbers so closely match register numbers, it is somtimes NIT, typo "somtimes" -> "sometimes" I'm still struggling to see how we avoid the lost copy problem as changes are made to the SSA graph.  I suspect we're going to end up with some additional documentation around that.  But I think we can have a distinct patch for that. Jeff > +convenient to refer to them simply as register numbers, or ``regnos'' > +for short. However, the RTL SSA form also provides an abstraction > +of resources in the form of @code{rtl_ssa::resource_info}. > +This is a lightweight class that records both the regno of a resource > +and the @code{machine_mode} that the resource has (@pxref{Machine Modes}). > +It has functions for testing whether a resource is a register or memory. > +In principle it could be extended to other kinds of resource in future. > + > +@node RTL SSA Accesses > +@subsection RTL SSA Register and Memory Accesses > + > +In the RTL SSA form, most reads or writes of a resource are > +represented as a @code{rtl_ssa::access_info}@footnote{The exceptions > +are call clobbers, which are generally represented separately. > +See the comment above @code{rtl_ssa::insn_info} for details.}. > +These @code{rtl_ssa::access_info}s are organized into the following > +class hierarchy: > + > +@findex rtl_ssa::access_info > +@findex rtl_ssa::use_info > +@findex rtl_ssa::def_info > +@findex rtl_ssa::clobber_info > +@findex rtl_ssa::set_info > +@findex rtl_ssa::phi_info > +@smallexample > +rtl_ssa::access_info > + | > + +-- rtl_ssa::use_info > + | > + +-- rtl_ssa::def_info > + | > + +-- rtl_ssa::clobber_info > + | > + +-- rtl_ssa::set_info > + | > + +-- rtl_ssa::phi_info > +@end smallexample > + > +A @code{rtl_ssa::use_info} represents a read or use of a resource and > +a @code{rtl_ssa::def_info} represents a write or definition of a resource. > +As in the main RTL representation, there are two basic types of > +definition: clobbers and sets. The difference is that a clobber > +leaves the register with an unspecified value that cannot be used > +or relied on by later instructions, while a set leaves the register > +with a known value that later instructions could use if they wanted to. > +A @code{rtl_ssa::clobber_info} represents a clobber and > +a @code{rtl_ssa::set_info} represent a set. > + > +Each @code{rtl_ssa::use_info} records which single @code{rtl_ssa::set_info} > +provides the value of the resource; this is null if the resource is > +completely undefined at the point of use. Each @code{rtl_ssa::set_info} > +in turn records all the @code{rtl_ssa::use_info}s that use its value. > + > +If a value of a resource can come from multiple sources, > +a @code{rtl_ssa::phi_info} brings those multiple sources together > +into a single definition (@pxref{RTL SSA Phi Nodes}). > + > +@node RTL SSA Phi Nodes > +@subsection RTL SSA Phi Nodes > + > +@cindex phi nodes, RTL SSA > +@findex rtl_ssa::phi_info > +If a resource is live on entry to an extended basic block and if the > +resource's value can come from multiple sources, the extended basic block > +has a ``phi node'' that collects together these multiple sources. > +The phi node conceptually has one input for each incoming edge of > +the extended basic block, with the input specifying the value of > +the resource on that edge. For example, suppose a function contains > +the following RTL: > + > +@smallexample > +;; Basic block bb3 > +@dots{} > +(set (reg:SI R1) (const_int 0)) ;; A > +(set (pc) (label_ref bb5)) > + > +;; Basic block bb4 > +@dots{} > +(set (reg:SI R1) (const_int 1)) ;; B > +;; Fall through > + > +;; Basic block bb5 > +;; preds: bb3, bb4 > +;; live in: R1 @dots{} > +(code_label bb5) > +@dots{} > +(set (reg:SI @var{R2}) > + (plus:SI (reg:SI R1) @dots{})) ;; C > +@end smallexample > + > +The value of R1 on entry to block 5 can come from either A or B@. > +The extended basic block that contains block 5 would therefore have a > +phi node with two inputs: the first input would have the value of > +R1 defined by A and the second input would have the value of > +R1 defined by B@. This phi node would then provide the value of > +R1 for C (assuming that R1 does not change again between > +the start of block 5 and C). > + > +Since RTL is not a ``native'' SSA representation, these phi nodes > +simply collect together definitions that already exist. Each input > +to a phi node for a resource @var{R} is itself a definition of > +resource @var{R} (or is null if the resource is completely > +undefined for a particular incoming edge). This is in contrast > +to a native SSA representation like GIMPLE, where the phi inputs > +can be arbitrary expressions. As a result, RTL SSA phi nodes > +never involve ``hidden'' moves: all moves are instead explicit. > + > +Phi nodes are represented as a @code{rtl_ssa::phi_node}. > +Each input to a phi node is represented as an @code{rtl_ssa::use_info}. > + > +@node RTL SSA Access Lists > +@subsection RTL SSA Access Lists > + > +All the definitions of a resource are chained together in reverse postorder. > +In general, this list can contain an arbitrary mix of both sets > +(@code{rtl_ssa::set_info}) and clobbers (@code{rtl_ssa::clobber_info}). > +However, it is often useful to skip over all intervening clobbers > +of a resource in order to find the next set. The list is constructed > +in such a way that this can be done in amortized constant time. > + > +All uses (@code{rtl_ssa::use_info}) of a given set are also chained > +together into a list. This list of uses is divided into three parts: > + > +@enumerate > +@item > +uses by ``real'' nondebug instructions (@pxref{real RTL SSA insns}) > + > +@item > +uses by real debug instructions > + > +@item > +uses by phi nodes (@pxref{RTL SSA Phi Nodes}) > +@end enumerate > + > +The first and second parts individually follow reverse postorder. > +The third part has no particular order. > + > +@cindex degenerate phi node, RTL SSA > +The last use by a real nondebug instruction always comes earlier in > +the reverse postorder than the next definition of the resource (if any). > +This means that the accesses follow a linear sequence of the form: > + > +@itemize @bullet > +@item > +first definition of resource R > + > +@itemize @bullet > +@item > +first use by a real nondebug instruction of the first definition of resource R > + > +@item > +@dots{} > + > +@item > +last use by a real nondebug instruction of the first definition of resource R > +@end itemize > + > +@item > +second definition of resource R > + > +@itemize @bullet > +@item > +first use by a real nondebug instruction of the second definition of resource R > + > +@item > +@dots{} > + > +@item > +last use by a real nondebug instruction of the second definition of resource R > +@end itemize > + > +@item > +@dots{} > + > +@item > +last definition of resource R > + > +@itemize @bullet > +@item > +first use by a real nondebug instruction of the last definition of resource R > + > +@item > +@dots{} > + > +@item > +last use by a real nondebug instruction of the last definition of resource R > +@end itemize > +@end itemize > + > +(Note that clobbers never have uses; only sets do.) > + > +This linear view is easy to achieve when there is only a single definition > +of a resource, which is commonly true for pseudo registers. However, > +things are more complex if code has a structure like the following: > + > +@smallexample > +// ebb2, bb2 > +R = @var{va}; // A > +if (@dots{}) > + @{ > + // ebb2, bb3 > + use1 (R); // B > + @dots{} > + R = @var{vc}; // C > + @} > +else > + @{ > + // ebb4, bb4 > + use2 (R); // D > + @} > +@end smallexample > + > +The list of accesses would begin as follows: > + > +@itemize @bullet > +@item > +definition of R by A > + > +@itemize @bullet > +@item > +use of A's definition of R by B > +@end itemize > + > +@item > +definition of R by C > +@end itemize > + > +The next access to R is in D, but the value of R that D uses comes from > +A rather than C@. > + > +This is resolved by adding a phi node for @code{ebb4}. All inputs to this > +phi node have the same value, which in the example above is A's definition > +of R@. In other circumstances, it would not be necessary to create a phi > +node when all inputs are equal, so these phi nodes are referred to as > +``degenerate'' phi nodes. > + > +The full list of accesses to R is therefore: > + > +@itemize @bullet > +@item > +definition of R by A > + > +@itemize @bullet > +@item > +use of A's definition of R by B > +@end itemize > + > +@item > +definition of R by C > + > +@item > +definition of R by ebb4's phi instruction, with the input coming from A > + > +@itemize @bullet > +@item > +use of the ebb4's R phi definition of R by B > +@end itemize > +@end itemize > + > +Note that A's definition is also used by ebb4's phi node, but this > +use belongs to the third part of the use list described above and > +so does not form part of the linear sequence. > + > +It is possible to ``look through'' any degenerate phi to the ultimate > +definition using the function @code{look_through_degenerate_phi}. > +Note that the input to a degenerate phi is never itself provided > +by a degenerate phi. > + > +At present, the SSA form takes this principle one step further > +and guarantees that, for any given resource @var{res}, one of the > +following is true: > + > +@itemize > +@item > +The resource has a single definition @var{def}, which is not a phi node. > +Excluding uses of undefined registers, all uses of @var{res} by real > +nondebug instructions use the value provided by @var{def}. > + > +@item > +Excluding uses of undefined registers, all uses of @var{res} use > +values provided by definitions that occur earlier in the same > +extended basic block. These definitions might come from phi nodes > +or from real instructions. > +@end itemize > + > +@node Changing RTL Instructions > +@subsection Using the RTL SSA framework to change instructions > + > +@findex rtl_ssa::insn_change > +There are various routines that help to change a single RTL instruction > +or a group of RTL instructions while keeping the RTL SSA form up-to-date. > +This section first describes the process for changing a single instruction, > +then goes on to describe the differences when changing multiple instructions. > + > +@menu > +* Changing One RTL SSA Instruction:: > +* Changing Multiple RTL SSA Instructions:: > +@end menu > + > +@node Changing One RTL SSA Instruction > +@subsubsection Changing One RTL SSA Instruction > + > +Before making a change, passes should first use a statement like the > +following: > + > +@smallexample > +auto attempt = crtl->ssa->new_change_attempt (); > +@end smallexample > + > +Here, @code{attempt} is an RAII object that should remain in scope > +for the entire change attempt. It automatically frees temporary > +memory related to the changes when it goes out of scope. > + > +Next, the pass should create an @code{rtl_ssa::insn_change} object > +for the instruction that it wants to change. This object specifies > +several things: > + > +@itemize @bullet > +@item > +what the instruction's new list of uses should be (@code{new_uses}). > +By default this is the same as the instruction's current list of uses. > + > +@item > +what the instruction's new list of definitions should be (@code{new_defs}). > +By default this is the same as the instruction's current list of > +definitions. > + > +@item > +where the instruction should be located (@code{move_range}). > +This is a range of instructions after which the instruction could > +be placed, represented as an @code{rtl_ssa::insn_range}. > +By default the instruction must remain at its current position. > +@end itemize > + > +If a pass was attempting to change all these properties of an instruction > +@code{insn}, it might do something like this: > + > +@smallexample > +rtl_ssa::insn_change change (insn); > +change.new_defs = @dots{}; > +change.new_uses = @dots{}; > +change.move_range = @dots{}; > +@end smallexample > + > +This @code{rtl_ssa::insn_change} only describes something that the > +pass @emph{might} do; at this stage, nothing has actually changed. > + > +As noted above, the default @code{move_range} requires the instruction > +to remain where it is. At the other extreme, it is possible to allow > +the instruction to move anywhere within its extended basic block, > +provided that all the new uses and definitions can be performed > +at the new location. The way to do this is: > + > +@smallexample > +change.move_range = insn->ebb ()->insn_range (); > +@end smallexample > + > +In either case, the next step is to make sure that move range is > +consistent with the new uses and definitions. The way to do this is: > + > +@smallexample > +if (!rtl_ssa::restrict_movement (change)) > + return false; > +@end smallexample > + > +This function tries to limit @code{move_range} to a range of instructions > +at which @code{new_uses} and @code{new_defs} can be correctly performed. > +It returns true on success or false if no suitable location exists. > + > +The pass should also tentatively change the pattern of the instruction > +to whatever form the pass wants the instruction to have. This should use > +the facilities provided by @file{recog.c}. For example: > + > +@smallexample > +rtl_insn *rtl = insn->rtl (); > +insn_change_watermark watermark; > +validate_change (rtl, &PATTERN (rtl), new_pat, 1); > +@end smallexample > + > +will tentatively replace @code{insn}'s pattern with @code{new_pat}. > + > +These changes and the construction of the @code{rtl_ssa::insn_change} > +can happen in either order or be interleaved. > + > +After the tentative changes to the instruction are complete, > +the pass should check whether the new pattern matches a target > +instruction or satisfies the requirements of an inline asm: > + > +@smallexample > +if (!rtl_ssa::recog (change)) > + return false; > +@end smallexample > + > +This step might change the instruction pattern further in order to > +make it match. It might also add new definitions or restrict the range > +of the move. For example, if the new pattern did not match in its original > +form, but could be made to match by adding a clobber of the flags > +register, @code{rtl_ssa::recog} will check whether the flags register > +is free at an appropriate point. If so, it will add a clobber of the > +flags register to @code{new_defs} and restrict @code{move_range} to > +the locations at which the flags register can be safely clobbered. > + > +Even if the proposed new instruction is valid according to > +@code{rtl_ssa::recog}, the change might not be worthwhile. > +For example, when optimizing for speed, the new instruction might > +turn out to be slower than the original one. When optimizing for > +size, the new instruction might turn out to be bigger than the > +original one. > + > +Passes should check for this case using @code{change_is_worthwhile}. > +For example: > + > +@smallexample > +if (!rtl_ssa::change_is_worthwhile (change)) > + return false; > +@end smallexample > + > +If the change passes this test too then the pass can perform the change using: > + > +@smallexample > +confirm_change_group (); > +crtl->ssa->change_insn (change); > +@end smallexample > + > +Putting all this together, the change has the following form: > + > +@smallexample > +auto attempt = crtl->ssa->new_change_attempt (); > + > +rtl_ssa::insn_change change (insn); > +change.new_defs = @dots{}; > +change.new_uses = @dots{}; > +change.move_range = @dots{}; > + > +if (!rtl_ssa::restrict_movement (change)) > + return false; > + > +insn_change_watermark watermark; > +// Use validate_change etc. to change INSN's pattern. > +@dots{} > +if (!rtl_ssa::recog (change) > + || !rtl_ssa::change_is_worthwhile (change)) > + return false; > + > +confirm_change_group (); > +crtl->ssa->change_insn (change); > +@end smallexample > + > +@node Changing Multiple RTL SSA Instructions > +@subsubsection Changing Multiple RTL SSA Instructions > + > +The process for changing multiple instructions is similar > +to the process for changing single instructions > +(@pxref{Changing One RTL SSA Instruction}). The pass should > +again start the change attempt with: > + > +@smallexample > +auto attempt = crtl->ssa->new_change_attempt (); > +@end smallexample > + > +and keep @code{attempt} in scope for the duration of the change > +attempt. It should then construct an @code{rtl_ssa::insn_change} > +for each change that it wants to make. > + > +After this, it should combine the changes into a sequence of > +@code{rtl_ssa::insn_change} pointers. This sequence must be in > +reverse postorder; the instructions will remain strictly in the > +order that the sequence specifies. > + > +For example, if a pass is changing exactly two instructions, > +it might do: > + > +@smallexample > +rtl_ssa::insn_change *changes[] = @{ &change1, change2 @}; > +@end smallexample > + > +where @code{change1}'s instruction must come before @code{change2}'s. > +Alternatively, if the pass is changing a variable number of > +instructions, it might build up the sequence in a > +@code{vec}. > + > +By default, @code{rtl_ssa::restrict_movement} assumes that all > +instructions other than the one passed to it will remain in their > +current positions and will retain their current uses and definitions. > +When changing multiple instructions, it is usually more effective > +to ignore the other instructions that are changing. The sequencing > +described above ensures that the changing instructions remain > +in the correct order with respect to each other. > +The way to do this is: > + > +@smallexample > +if (!rtl_ssa::restrict_movement (change, insn_is_changing (changes))) > + return false; > +@end smallexample > + > +Similarly, when @code{rtl_ssa::restrict_movement} is detecting > +whether a register can be clobbered, it by default assumes that > +all other instructions will remain in their current positions and > +retain their current form. It is again more effective to ignore > +changing instructions (which might, for example, no longer need > +to clobber the flags register). The way to do this is: > + > +@smallexample > +if (!rtl_ssa::recog (change, insn_is_changing (changes))) > + return false; > +@end smallexample > + > +When changing multiple instructions, the important question is usually > +not whether each individual change is worthwhile, but whether the changes > +as a whole are worthwhile. The way to test this is: > + > +@smallexample > +if (!rtl_ssa::changes_are_worthwhile (changes)) > + return false; > +@end smallexample > + > +The process for changing single instructions makes sure that one > +@code{rtl_ssa::insn_change} in isolation is valid. But when changing > +multiple instructions, it is also necessary to test whether the > +sequence as a whole is valid. For example, it might be impossible > +to satisfy all of the @code{move_range}s at once. > + > +Therefore, once the pass has a sequence of changes that are > +individually correct, it should use: > + > +@smallexample > +if (!crtl->ssa->verify_insn_changes (changes)) > + return false; > +@end smallexample > + > +to check whether the sequence as a whole is valid. If all checks pass, > +the final step is: > + > +@smallexample > +confirm_change_group (); > +crtl->ssa->change_insns (changes); > +@end smallexample > + > +Putting all this together, the process for a two-instruction change is: > + > +@smallexample > +auto attempt = crtl->ssa->new_change_attempt (); > + > +rtl_ssa::insn_change change (insn1); > +change1.new_defs = @dots{}; > +change1.new_uses = @dots{}; > +change1.move_range = @dots{}; > + > +rtl_ssa::insn_change change (insn2); > +change2.new_defs = @dots{}; > +change2.new_uses = @dots{}; > +change2.move_range = @dots{}; > + > +rtl_ssa::insn_change *changes[] = @{ &change1, change2 @}; > + > +auto is_changing = insn_is_changing (changes); > +if (!rtl_ssa::restrict_movement (change1, is_changing) > + || !rtl_ssa::restrict_movement (change2, is_changing)) > + return false; > + > +insn_change_watermark watermark; > +// Use validate_change etc. to change INSN1's and INSN2's patterns. > +@dots{} > +if (!rtl_ssa::recog (change1, is_changing) > + || !rtl_ssa::recog (change2, is_changing) > + || !rtl_ssa::changes_are_worthwhile (changes) > + || !crtl->ssa->verify_insn_changes (changes)) > + return false; > + > +confirm_change_group (); > +crtl->ssa->change_insns (changes); > +@end smallexample > + > @node Sharing > @section Structure Sharing Assumptions > @cindex sharing of RTL components