On 2018-06-20 04:23 AM, Richard Biener wrote: > On Wed, Jun 20, 2018 at 7:31 AM Jeff Law wrote: >> >> On 06/19/2018 12:30 PM, Michael Ploujnikov wrote: >>> Hi everyone, >>> >>> (I hope this is the right place to ask, if not my apologies; please >>> point me in the right direction) >>> >>> I'm trying to get a better understanding of the following part in >>> tree_swap_operands_p(): >>> >>> /* It is preferable to swap two SSA_NAME to ensure a canonical form >>> for commutative and comparison operators. Ensuring a canonical >>> form allows the optimizers to find additional redundancies without >>> having to explicitly check for both orderings. */ >>> if (TREE_CODE (arg0) == SSA_NAME >>> && TREE_CODE (arg1) == SSA_NAME >>> && SSA_NAME_VERSION (arg0) > SSA_NAME_VERSION (arg1)) >>> return 1; >>> >>> My questions in no particular order: It looks like this was added in >>> 2004. I couldn't find any info other than what's in the corresponding >>> commit (cc0bdf913) so I'm wondering if the canonical form/order still >>> relevant/needed today? Does the ordering have to be done based on the >>> name versions specifically? Or can it be based on something more >>> intrinsic to the input source code rather than a GCC internal value, eg: >>> would alphabetic sort order of the variable names be a reasonable >>> replacement? >> Canonicalization is still important and useful. > > Indeed. > >> However, canonicalizing on SSA_NAMEs is problematical due to the way we >> recycle nodes and re-pack them. > > In the past we made sure to not disrupt order - hopefully that didn't change > so the re-packing shoudln't invaidate previous canonicalization: > > static void > release_free_names_and_compact_live_names (function *fun) > { > ... > /* And compact the SSA number space. We make sure to not change the > relative order of SSA versions. */ > for (i = 1, j = 1; i < fun->gimple_df->ssa_names->length (); ++i) > { > > >> I think defining additional rules for canonicalization prior to using >> SSA_NAME_VERSION as the fallback would be looked upon favorably. > > I don't see a good reason to do that, it will be harder to spot canonicalization > issues and it will take extra compile-time. > >> Note however, that many of the _DECL nodes referenced by SSA_NAMEs are >> temporaries generated by the compiler and do not correspond to any >> declared/defined object in the original source. So you'll still need >> the SSA_NAME_VERSION (or something as stable or better) canonicalization >> to handle those cases. > > And not all SSA_NAMEs have underlying _DECL nodes (or IDENTIFIER_NODE names). > > Richard. > >> Jeff After a bit more digging I found that insert_phi_nodes inserts PHIs in the order of UIDs, which indirectly affects the order of vars in old_ssa_names, which in turn affects the order in which make_ssa_name_fn is called to pick SSA versions from FREE_SSANAMES so in the end ordering by SSA_NAME_VERSION's is more or less equivalent to ordering by UIDs. I'm trying to figure out if there's a way to avoid depending on UIDs being ordered in a certain way. So if tree_swap_operands_p stays the same I'm wondering if there's some other info available at the point of insert_phi_nodes that would be a good replacement for UID. From my very limited experience with a very small source input, and if I understand things correctly, all of the var_infos have a var which is DECL_P and thus should have an IDENTIFIER_NODE. Is that true in the general case? I don't fully understand what are all the things that insert_phi_nodes iterates over. - Michael