From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24049 invoked by alias); 10 May 2009 11:48:24 -0000 Received: (qmail 23832 invoked by uid 22791); 10 May 2009 11:48:16 -0000 X-SWARE-Spam-Status: No, hits=-2.9 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_35,J_CHICKENPOX_51 X-Spam-Check-By: sourceware.org Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Sun, 10 May 2009 11:48:05 +0000 Received: from relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 9B1BA79727; Sun, 10 May 2009 13:48:01 +0200 (CEST) Date: Sun, 10 May 2009 11:48:00 -0000 From: Richard Guenther To: Martin Jambor Cc: GCC Patches , Jan Hubicka Subject: Re: [PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates. In-Reply-To: <20090510103233.GA28857@alvy.suse.cz> Message-ID: References: <20090428100429.051912011@virgil.suse.cz> <20090428100449.910685488@virgil.suse.cz> <20090428101423.GA23578@virgil.suse.cz> <20090510103233.GA28857@alvy.suse.cz> User-Agent: Alpine 2.00 (LNX 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2009-05/txt/msg00508.txt.bz2 On Sun, 10 May 2009, Martin Jambor wrote: > > > expr = TREE_OPERAND (expr, 0); > > > bit_ref = true; > > > } > > > else > > > bit_ref = false; > > > > > > while (TREE_CODE (expr) == NOP_EXPR > > > > CONVERT_EXPR_P (expr) > > OK... but at another place in the email you said it might not even > appear in a valid gimple statement? Should I remove it altogether? Indeed. If you not build trees from tuple stmts then a NOP_EXPR cannot appear as a rhs1 or rhs2 of an assignment (instead it is always the subcode in gimple stmt and the rhs1 is simply sth valid for is_gimple_val). > > > || TREE_CODE (expr) == VIEW_CONVERT_EXPR > > > || TREE_CODE (expr) == REALPART_EXPR > > > || TREE_CODE (expr) == IMAGPART_EXPR) > > > expr = TREE_OPERAND (expr, 0); > > > > Why do this here btw, and not just lump ... > > > > > switch (TREE_CODE (expr)) > > > { > > > case ADDR_EXPR: > > > case SSA_NAME: > > > case INDIRECT_REF: > > > break; > > > > > > case VAR_DECL: > > > case PARM_DECL: > > > case RESULT_DECL: > > > case COMPONENT_REF: > > > case ARRAY_REF: > > > ret = create_access (expr, write); > > > break; > > > > ... this ... > > > > > case REALPART_EXPR: > > > case IMAGPART_EXPR: > > > expr = TREE_OPERAND (expr, 0); > > > ret = create_access (expr, write); > > > > ... and this together? Won't you create bogus accesses if you > > strip for example IMAGPART_EXPR (which has non-zero offset)? > > That would break the complex number into its components. I thought > that they are meant to stay together for some reason, otherwise they > would not be represented explicitly in gimple... do you think it does > not matter? What about vectors then? > > The access is not bogus because modification functions take care of > these statements in a special way. However, if it is indeed OK to > split complex numbers into their components, I will gladly simplify > this as you suggested. Yes, it is valid to split them (and complex lowering indeed does that). It _might_ be useful to keep a complex together in a single SSA_NAME for optimization purposes, but I guess you detect that anyway if there is a read of the whole complex element into a register and keep it that way. I would favor simplifying SRA in this case and just split them if that is valid. > > > break; > > > > > > case ARRAY_RANGE_REF: > > > > it should just be handled fine I think. > > OK, I will try that at some later stage. > > > > default: > > > walk_tree (&safe_expr, disqualify_all, NULL, NULL); > > > > and if not, this should just disqualify the base of the access, like > > get_base_address (safe_expr) (save_expr you mean?) and then if that > > is a DECL, disqualify that decl. > > I'll test just doing nothing, things handled by get_base_address are > either fine or already accounted for. Fine with me. > > > return SRA_SA_NONE; > > > > > > lhs_ptr = gimple_assign_lhs_ptr (stmt); > > > rhs_ptr = gimple_assign_rhs1_ptr (stmt); > > > > you probably don't need to pass pointers to trees everywhere as you > > are not changing them. > > Well, this function is a callback called by scan_function which can > also call sra_modify_expr in the last stage of the pass when > statements are modified. I have considered splitting the function > into two but in the end I thought they would be too similar and the > overhead is hopefully manageable. Yeah, I noticed this later. It is somewhat confusing at first sight, so maybe just amending the comment before this function could clarify things. > > > if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr)) > > > return SRA_SA_NONE; > > > > > > racc = build_access_from_expr_1 (rhs_ptr, gsi, false); > > > lacc = build_access_from_expr_1 (lhs_ptr, gsi, true); > > > > just avoid calling into build_access_from_expr_1 for SSA_NAMEs > > or is_gimple_min_invariant lhs/rhs, that should make that > > function more regular. > > In what sense? build_access_from_expr_1 looks at TREE_CODE anyway and > can discard the two cases, without for example looking into ADR_EXPRs > like is_gimple_min_invariant(). > > But if you really think it is indeed beneficial, I can do that, sure - > to me it just looks ugly). Ok, just keep it as is. > > > if (lacc && racc > > > && !lacc->grp_unscalarizable_region > > > && !racc->grp_unscalarizable_region > > > && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr)) > > > && lacc->size <= racc->size > > > && useless_type_conversion_p (lacc->type, racc->type)) > > > > useless_type_conversion_p should be always true here. > > I don't think so, build_access_from_expr_1 can look through V_C_Es and > the types of accesses are the type of the operand in such cases.. Ok, but what is the point of looking through V_C_Es there if it makes this test fail? Hmm, IIRC this was only to track struct copies, right? I guess it's ok then. > > That would just be useless information. I guess you copied this > > from old SRA? > > Yes. All this fancy naming stuff is quite useless but I find it very > handy when debugging SRA issues. Yeah, sort of. Still using no name in that case will do exactly the same thing ;) > > > static tree > > > create_access_replacement (struct access *access) > > > { > > > tree repl; > > > > > > repl = make_rename_temp (access->type, "SR"); > > > get_var_ann (repl); > > > add_referenced_var (repl); > > > > > > DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base); > > > DECL_ARTIFICIAL (repl) = 1; > > > > > > if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base)) > > > > at least && !DECL_ARTIFICIAL (access->base) I think. > > This part is also largely copied from the old SRA. So far it seems to > work nicely, replacements of artificial declarations get SRsome_number > fancy names and that makes them easy to distinguish. Nevertheless, I > can change the condition if it is somehow wrong. Or do you expect any > other problems beside not-so-fancy fancy names? No, it merely uses up memory. Not that other passes do not do this ... Thus, I probably do not care too much. > > > if (access->grp_bfr_lhs) > > > DECL_GIMPLE_REG_P (repl) = 0; > > > > But you never set it (see update_address_taken for more cases, > > most notably VIEW_CONVERT_EXPR on the lhs which need to be taken > > care of). You should set it for COMPLEX_TYPE and VECTOR_TYPE > > replacements. > > This function is the only place where I still use make_rename_temp > which sets it exactly in these two cases. I did not really know why > it is required in these two cases and only in these two cases so I > left it there, at least for now. I guess I understand that now after > seeing update_address_taken. > > I can replace this with calling create_tmp_var() and doing all the > rest that make_rename_temp does - I believe that you intend to to > remove it - I have just not found out why it is so bad. The bad thing about it is that it supports using the SSA renamer to write a single variable into SSA. That is usually more costly than just manually allocating SSA_NAMEs and updating SSA form, which is usally very easy. It's not used much, in which case the easiest thing might be to fix all remaining uses to manually update SSA form. But yes, I now see why that zeroing is necessary. > > > > CONVERT_EXPR_P (expr) > > > > > || TREE_CODE (expr) == VIEW_CONVERT_EXPR) > > > > VIEW_CONVERT_EXPR is also a handled_component_p. > > > > Note that NOP_EXPR should never occur here - that would be invalid > > gimple. So I think you can (and should) just delete the above. > > I haven't seen a NOP_EXPR for a while, do they still exist in lower > gimple? Thus I have removed their handling. > > Removing diving through V_C_E breaks ADA, though. The reason is that > we get a different size (and max_size) when calling > get_ref_base_and_extent on the V_C_E and on its argument. However, I > believe both should be represented by a single access representative. Yeah, I remember this :/ It is technically invalid GIMPLE that the Ada FE generates though. The size of the V_C_E result has to match that of the operand. Please add a FIXME before this stripping refering to the Ada problem. > > > tree repl = get_access_replacement (access); > > > if (!TREE_ADDRESSABLE (type)) > > > { > > > tree tmp = create_tmp_var (type, "SRvce"); > > > > > > add_referenced_var (tmp); > > > if (is_gimple_reg_type (type)) > > > tmp = make_ssa_name (tmp, NULL); > > > > Should be always is_gimple_reg_type () if it is a type suitable for > > a SRA scalar replacement. > > No, it is the type suitable for the statement, it can be a union type > or a record with only one field. But see the more thorough explanation > below... I think it should be always a register type, but see below... ;) > > But you should set DECL_GIMPLE_REG_P for > > VECTOR and COMPLEX types here. > > > > > if (write) > > > { > > > gimple stmt; > > > tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp); > > > > This needs to either always fold to plain 'tmp' or tmp has to be a > > non-register. Otherwise you will create invalid gimple. > > > > > *expr = tmp; > > > if (is_gimple_reg_type (type)) > > > SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi); > > > > See above. > > > > > stmt = gimple_build_assign (repl, conv); > > > gsi_insert_after (gsi, stmt, GSI_NEW_STMT); > > > update_stmt (stmt); > > > } > > > else > > > { > > > gimple stmt; > > > tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl); > > > > > > stmt = gimple_build_assign (tmp, conv); > > > gsi_insert_before (gsi, stmt, GSI_SAME_STMT); > > > if (is_gimple_reg_type (type)) > > > SSA_NAME_DEF_STMT (tmp) = stmt; > > > > See above. (I wonder if the patch still passes bootstrap & regtest > > after the typecking patch) > > > > > *expr = tmp; > > > update_stmt (stmt); > > > } > > > } > > > else > > > { > > > if (write) > > > { > > > gimple stmt; > > > > > > stmt = gimple_build_assign (repl, unshare_expr (access->expr)); > > > gsi_insert_after (gsi, stmt, GSI_NEW_STMT); > > > update_stmt (stmt); > > > } > > > else > > > { > > > gimple stmt; > > > > > > stmt = gimple_build_assign (unshare_expr (access->expr), repl); > > > gsi_insert_before (gsi, stmt, GSI_SAME_STMT); > > > update_stmt (stmt); > > > } > > > > I don't understand this path. Are the types here always compatible? > > And I don't really understand the comments. The function is called by > sra_modify_expr (the function doing the replacements in all non-assign > statements) when it needs to replace a reference by a scalar but the > types don't match. This can happen when replacing a V_C_E, a union > access when we picked a different type that the one used in the > statement or (and this case can be remarkably irritating) an access to > a records with only one (scalar) field. > > My original idea was to simply put a V_C_E in the place. However, I > believe there are places where this is not possible - or at least one > case, a LHS of a call statement because V_C_Es of gimple_registers > (ssa_names) are not allowed on LHSs. My initial idea to handle these > cases were to create a new temporary with a matching and a V_C_E > assign statement (with the V_C_E always on the RHS - I believe that > works even with gimple registers) that would do the conversion and > load/store it to the replacement variable (this is what the > !TREE_ADDRESSABLE branch does). > > The problem with this idea are TREE_ADDRESSABLE types. These types > need to be constructed and thus we cannot create temporary variables > of these types. On the other hand they absolutely need to be SRAed, > not doing so slows down tramp3d by a factor of two (and the current > SRA also breaks them up). And quite a few C++ classes are such types > that are "non-addressable" and have only one scalar field. > Identifying such records is possible, I soon realized that I can > simply leave the statement as it is and produce a new statement to do > load/store from the original field (that's what the outermost else > branch does). > > Does this make sense or is there some fundamental flaw in my reasoning > about gimple again? Does this explain what the function does? Ok, so the case in question is struct X { int i; } x; x = foo (); where you want to scalarize x. Indeed the obvious scalarization would be x = foo (); SR_1 = x.i; For all other LHS cases (not calls) you can move the V_C_E to the RHS and should be fine. I don't understand the TREE_ADDRESSABLE thingy yet. Especially why the type should be a non-register type. My understanding was that in reads the replacement LHS will be a register, for mismatched types you simply put a V_C_E around the RHS. For writes the replacement RHS will be a register, possibly surrounded by a V_C_E. The special case is calls where you cannot put a V_C_E around the "RHS". So, can you explain with an example, how it comes that the scalar replacement type is a non-register type? (Whatever our conclusion will be, the function should have a big comment enumerating the cases we have to deal with) > It certainly passes bootstrap and testing, I use --enable-checking=yes. That's good. > > > { > > > /* I have never seen this code path trigger but if it can happen the > > > following should handle it gracefully. */ > > > > It can trigger for vector constants. > > OK, I'll remove the comment. Apparently there are none in the > testsuite, I believe I tested with a gcc_unreachable here. Err, for vector constants we have VECTOR_CST, so it triggers for non-constant vector constructors like vector int x = { a, b, c, d }; > > > update_stmt (aux_stmt); > > > new_stmt = gimple_build_assign (get_access_replacement (access), > > > fold_build2 (COMPLEX_EXPR, access->type, > > > rp, ip)); > > > gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT); > > > update_stmt (new_stmt); > > > > Hm. So you do what complex lowering does here. Note that this may > > create loads from uninitialized memory with all its problems. > > Yes, but I have not had any such problems with complex types (as > opposed to simple loads from half-initialized records, for example). > OTOH, I have also contemplated setting DECL_GIMPLE_REG_P to zero of > complex replacement which appear in IMAG_PART or REAL_PART on a LHS of > a statement. Yes, that's necessary. I still think SRA should not bother about this at all ;) > > WRT the complex stuff. If you would do scalarization and analysis > > just on the components (not special case REAL/IMAGPART_EXPR everywhere) > > it should work better, correct? You still could handle group > > scalarization for the case of for example passing a complex argument > > to a function. > > Well, my reasoning was that if complex types were first-class citizens > in gimple (as opposed to a record), there was a reason to keep them > together and so I attempted that. But again, if that is a > misconception of mine and there is no point in keeping them together, > I will gladly remove this. It's not clear. Complex lowering decomposes all complex variables to components if possible. Again, simplifying SRA is probably better. > > void bar(_Complex float); > > void foo(float x, float y) > > { > > _Complex float z = x; > > __imag z = y; > > bar(z); > > } > > > > The same applies for vectors - the REAL/IMAGPART_EXPRs equivalent > > there is BIT_FIELD_REF. > > These are handled by setting DECL_GIMPLE_REG_P to zero if a B_F_R is > on a LHS. I believe the current SRA does the same. It works fine and > there's a lot less fuss about them. > > > > return SRA_SA_PROCESSED; > > > } > > > > > > /* Return true iff T has a VIEW_CONVERT_EXPR among its handled components. */ > > > > > > static bool > > > contains_view_convert_expr_p (tree t) > > > { > > > while (1) > > > { > > > if (TREE_CODE (t) == VIEW_CONVERT_EXPR) > > > return true; > > > if (!handled_component_p (t)) > > > return false; > > > t = TREE_OPERAND (t, 0); > > > } > > > } > > > > Place this in tree-flow-inline.h next to ref_contains_array_ref, also > > structure the loop in the same way. > > OK, but I'd like the function to work if passed declarations too. > Thus I cannot really use a do-while loop. I'll send it in a separate > patch. > > > > /* Change STMT to assign compatible types by means of adding component or array > > > references or VIEW_CONVERT_EXPRs. All parameters have the same meaning as > > > variable with the same names in sra_modify_assign. This is done in a > > > such a complicated way in order to make > > > testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy and so it helps in at least some > > > cases. */ > > > > > > static void > > > fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt, > > > struct access *lacc, struct access *racc, > > > tree lhs, tree *rhs, tree ltype, tree rtype) > > > { > > > if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype) > > > && !access_has_children_p (lacc)) > > > { > > > tree expr = unshare_expr (lhs); > > > bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype, > > > false); > > > if (found) > > > { > > > gimple_assign_set_lhs (*stmt, expr); > > > return; > > > } > > > } > > > > > > if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype) > > > && !access_has_children_p (racc)) > > > { > > > tree expr = unshare_expr (*rhs); > > > bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype, > > > false); > > > if (found) > > > { > > > gimple_assign_set_rhs1 (*stmt, expr); > > > return; > > > } > > > } > > > > > > *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs); > > > gimple_assign_set_rhs_from_tree (gsi, *rhs); > > > *stmt = gsi_stmt (*gsi); > > > > Reading this I have a deja-vu - isn't there another function in this > > file doing the same thing? You are doing much unsharing even though > > you re-build the access tree from scratch? > > This function has a similar purpose as fix_incompatible_types_for_expr > but this time only for assign statements. That is easier because we > can always put the V_C_E on the RHS and be safe and so no additional > statements need to be generated. > > However, the V_C_Es rather than COMPONENT_REFs and ARRAY_REFs feel > unnatural for accessing fields from single field records and unions > and single element arrays. According to the comment I used to have > problems of some sort with that in the ssa-sra-2.C testcase but I can > no longer reproduce them (and don't remember them). > > I call unshare_expr in this context only when one side of an > assignment statement is a scalar replacement and the other one is an > aggregate (but not necessarily a declaration) which can happen only in > the cases listed above. That is not very many calls and chances are > good that build_ref_for_offset succeeds. > > Does that explain what is going on here? Yes. I guess merging the two functions would make sense to me (or putting them next to each other at least). The function signatures also look seemingly weird (that you store into *rhs but still set the stmt rhs, etc.). I have another look here with the new version. > > > } > > > > > > /* Callback of scan_function to process assign statements. It examines both > > > sides of the statement, replaces them with a scalare replacement if there is > > > one and generating copying of replacements if scalarized aggregates have been > > > used in the assignment. STMT is a pointer to the assign statement, GSI is > > > used to hold generated statements for type conversions and subtree > > > copying. */ > > > > > > static enum scan_assign_result > > > sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi, > > > void *data ATTRIBUTE_UNUSED) > > > { > > > struct access *lacc, *racc; > > > tree ltype, rtype; > > > tree lhs, rhs; > > > bool modify_this_stmt; > > > > > > if (gimple_assign_rhs2 (*stmt)) > > > > !gimple_assign_single_p (*stmt) > > > > (the only gimple assign that may access memory) > > OK > > > > > > return SRA_SA_NONE; > > > lhs = gimple_assign_lhs (*stmt); > > > rhs = gimple_assign_rhs1 (*stmt); > > > > > > if (TREE_CODE (rhs) == CONSTRUCTOR) > > > return sra_modify_constructor_assign (stmt, gsi); > > > > > > if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR) > > > return sra_modify_partially_complex_lhs (*stmt, gsi); > > > > > > if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR > > > || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF) > > > { > > > modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt), > > > gsi, false, data); > > > modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt), > > > gsi, true, data); > > > return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE; > > > } > > > > > > lacc = get_access_for_expr (lhs); > > > racc = get_access_for_expr (rhs); > > > if (!lacc && !racc) > > > return SRA_SA_NONE; > > > > > > modify_this_stmt = ((lacc && lacc->grp_to_be_replaced) > > > || (racc && racc->grp_to_be_replaced)); > > > > > > if (lacc && lacc->grp_to_be_replaced) > > > { > > > lhs = get_access_replacement (lacc); > > > gimple_assign_set_lhs (*stmt, lhs); > > > ltype = lacc->type; > > > } > > > else > > > ltype = TREE_TYPE (lhs); > > > > > > if (racc && racc->grp_to_be_replaced) > > > { > > > rhs = get_access_replacement (racc); > > > gimple_assign_set_rhs1 (*stmt, rhs); > > > rtype = racc->type; > > > } > > > else > > > rtype = TREE_TYPE (rhs); > > > > > > /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate > > > the statement makes the position of this pop_stmt_changes() a bit awkward > > > but hopefully make some sense. */ > > > > I don't see pop_stmt_changes(). > > Yeah, the comment is outdated. I've removed it. > > > > if (modify_this_stmt) > > > { > > > if (!useless_type_conversion_p (ltype, rtype)) > > > fix_modified_assign_compatibility (gsi, stmt, lacc, racc, > > > lhs, &rhs, ltype, rtype); > > > } > > > > > > if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs) > > > || (access_has_children_p (racc) > > > && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset)) > > > || (access_has_children_p (lacc) > > > && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset))) > > > > ? A comment is missing what this case is about ... > > > > (this smells like fixup that could be avoided by doing things correct > > in the first place) > > From this point on, the function deals with assignments in between > aggregates when at least one has scalar reductions of some of its > components. There are three possible scenarios: Both the LHS and RHS > have to-be-scalarized components, 2) only the RHS has or 3) only the > LHS has. > > In the first case, we would like to load the LHS components from RHS > components whenever possible. If that is not possible, we would like > to read it directly from the RHS (after updating it by storing in it > its own components). If there are some necessary unscalarized data in > the LHS, those will be loaded by the original assignment too. If > neither of these cases happen, the original statement can be removed. > Most of this is done by load_assign_lhs_subreplacements. > > In the second case, we would like to store all RHS scalarized > components directly into LHS and if they cover the aggregate > completely, remove the statement too. In the third case, we want the > LHS components to be loaded directly from the RHS (DSE will remove the > original statement if it becomes redundant). > > This is a bit complex but manageable when types match and when unions > do not cause confusion in a way that we cannot really load a component > of LHS from the RHS or vice versa (the access representing this level > can have subaccesses that are accessible only through a different > union field at a higher level - different from the one used in the > examined expression). Unions are fun. > > Therefore, I specially handle a fourth case, happening when there is a > specific type cast or it is impossible to locate a scalarized > subaccess on the other side of the expression. If that happens, I > simply "refresh" the RHS by storing in it is scalarized components > leave the original statement there to do the copying and then load the > scalar replacements of the LHS. This is what the first branch does. > > Is it clearer now? Perhaps I should put these five paragraphs as a > comment into the function? Yes - that would be nice. Thanks, Richard.