Thanks Richard for your comments. I changes algorithm to remove dead scalar statements as you proposed. Bootstrap and regression testing did not show any new failures on x86-64. Is it OK for trunk? Changelog: 2016-02-10 Yuri Rumyantsev PR tree-optimization/69652 * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 to nested loop, did source re-formatting, skip debug statements, add check on statement with volatile operand, remove dead scalar statements. gcc/testsuite/ChangeLog: * gcc.dg/torture/pr69652.c: New test. 2016-02-09 15:33 GMT+03:00 Richard Biener : > On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is updated patch - I came back to move call statements also since >> masked loads are presented by internal call. I also assume that for >> the following simple loop >> for (i = 0; i < n; i++) >> if (b1[i]) >> a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]); >> motion must be done for all vector statements in semi-hammock including SQRT. >> >> Bootstrap and regression testing did not show any new failures. >> Is it OK for trunk? > > The patch is incredibly hard to parse due to the re-indenting. Please > consider sending > diffs with -b. > > This issue exposes that you are moving (masked) stores across loads without > checking aliasing. In the specific case those loads are dead and thus > this is safe > but in general I thought we were checking that we are using the same VUSE > during the sinking operation. > > Thus, I'd rather have > > + /* Check that LHS does not have uses outside of STORE_BB. */ > + res = true; > + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) > + { > + gimple *use_stmt; > + use_stmt = USE_STMT (use_p); > + if (is_gimple_debug (use_stmt)) > + continue; > + if (gimple_bb (use_stmt) != store_bb) > + { > + res = false; > + break; > + } > + } > > also check for the dead code case and DCE those stmts here. Like so: > > if (has_zero_uses (lhs)) > { > gsi_remove (&gsi_from, true); > continue; > } > > before the above loop. > > Richard. > >> ChangeLog: >> >> 2016-02-05 Yuri Rumyantsev >> >> PR tree-optimization/69652 >> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >> skipped scalar statements, introduce variable LAST_VUSE to keep >> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >> begining of current masked store processing, did source re-formatting, >> skip parsing of debug gimples, stop processing if a gimple with >> volatile operand has been encountered, save scalar statement >> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >> iterator, change vuse of all saved scalar statements to LAST_VUSE if >> it makes sence. >> >> gcc/testsuite/ChangeLog: >> * gcc.dg/torture/pr69652.c: New test. >> >> 2016-02-04 19:40 GMT+03:00 Jakub Jelinek : >>> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote: >>>> Here is a patch that cures the issues with non-correct vuse for scalar >>>> statements during code motion, i.e. if vuse of scalar statement is >>>> vdef of masked store which has been sunk to new basic block, we must >>>> fix it up. The patch also fixed almost all remarks pointed out by >>>> Jacub. >>>> >>>> Bootstrapping and regression testing on v86-64 did not show any new failures. >>>> Is it OK for trunk? >>>> >>>> ChangeLog: >>>> 2016-02-04 Yuri Rumyantsev >>>> >>>> PR tree-optimization/69652 >>>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1 >>>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all >>>> skipped scalar statements, introduce variable LAST_VUSE that has >>>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the >>>> begining of current masked store processing, did source re-formatting, >>>> skip parsing of debug gimples, stop processing when call or gimple >>>> with volatile operand habe been encountered, save scalar statement >>>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE >>>> iterator, change vuse of all saved scalar statements to LAST_VUSE if >>>> it makes sence. >>>> >>>> gcc/testsuite/ChangeLog: >>>> * gcc.dg/torture/pr69652.c: New test. >>> >>> Your mailer breaks ChangeLog formatting, so it is hard to check the >>> formatting of the ChangeLog entry. >>> >>> diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c b/gcc/testsuite/gcc.dg/torture/pr69652.c >>> new file mode 100644 >>> index 0000000..91f30cf >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c >>> @@ -0,0 +1,14 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2 -ffast-math -ftree-vectorize " } */ >>> +/* { dg-additional-options "-mavx" { target { i?86-*-* x86_64-*-* } } } */ >>> + >>> +void fn1(double **matrix, int column, int row, int n) >>> +{ >>> + int k; >>> + for (k = 0; k < n; k++) >>> + if (matrix[row][k] != matrix[column][k]) >>> + { >>> + matrix[column][k] = -matrix[column][k]; >>> + matrix[row][k] = matrix[row][k] - matrix[column][k]; >>> + } >>> +} >>> \ No newline at end of file >>> >>> Please make sure the last line of the test is a new-line. >>> >>> @@ -6971,6 +6972,8 @@ optimize_mask_stores (struct loop *loop) >>> gsi_next (&gsi)) >>> { >>> stmt = gsi_stmt (gsi); >>> + if (is_gimple_debug (stmt)) >>> + continue; >>> if (is_gimple_call (stmt) >>> && gimple_call_internal_p (stmt) >>> && gimple_call_internal_fn (stmt) == IFN_MASK_STORE) >>> >>> This is not needed, you do something only for is_gimple_call, >>> which is never true if is_gimple_debug, so the code used to be fine as is. >>> >>> + /* Skip debug sstatements. */ >>> >>> s/ss/s/ >>> >>> + if (is_gimple_debug (gsi_stmt (gsi))) >>> + continue; >>> + stmt1 = gsi_stmt (gsi); >>> + /* Do not consider writing to memory,volatile and call >>> >>> Missing space after , >>> >>> + /* Skip scalar statements. */ >>> + if (!VECTOR_TYPE_P (TREE_TYPE (lhs))) >>> + { >>> + /* If scalar statement has vuse we need to modify it >>> + when another masked store will be sunk. */ >>> + if (gimple_vuse (stmt1)) >>> + scalar_vuse.safe_push (stmt1); >>> continue; >>> + } >>> >>> I don't think it is safe to take for granted that the scalar stmts are all >>> going to be DCEd, but I could be wrong. >>> >>> + /* Check that LHS does not have uses outside of STORE_BB. */ >>> + res = true; >>> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) >>> + { >>> + gimple *use_stmt; >>> + use_stmt = USE_STMT (use_p); >>> + if (is_gimple_debug (use_stmt)) >>> + continue; >>> >>> Ignoring debug stmts to make decision whether you move or not is >>> of course the right thing to do. But IMHO you should remember if >>> you saw any is_gimple_debug stmts in some bool var. >>> >>> + if (gimple_bb (use_stmt) != store_bb) >>> + { >>> + res = false; >>> + break; >>> + } >>> + } >>> + if (!res) >>> + break; >>> >>> - if (gimple_vuse (stmt1) >>> - && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>> - break; >>> + if (gimple_vuse (stmt1) >>> + && gimple_vuse (stmt1) != gimple_vuse (last_store)) >>> + break; >>> >>> + /* Can move STMT1 to STORE_BB. */ >>> + if (dump_enabled_p ()) >>> + { >>> + dump_printf_loc (MSG_NOTE, vect_location, >>> + "Move stmt to created bb\n"); >>> + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, stmt1, 0); >>> + } >>> >>> And if yes, invalidate them here, because the move would otherwise >>> generate invalid IL. >>> >>> + gsi_move_before (&gsi_from, &gsi_to); >>> + /* Shift GSI_TO for further insertion. */ >>> + gsi_prev (&gsi_to); >>> + } >>> + /* Put other masked stores with the same mask to STORE_BB. */ >>> + if (worklist.is_empty () >>> + || gimple_call_arg (worklist.last (), 2) != mask >>> + || worklist.last () != stmt1) >>> + break; >>> + last = worklist.pop (); >>> } >>> add_phi_arg (phi, gimple_vuse (last_store), e, UNKNOWN_LOCATION); >>> + /* Mask stores motion could crossing scalar statements with vuse >>> + which should be corrected. */ >>> >>> s/crossing/cross/ >>> That said, I'm not really sure if without some verification if such >>> reads are really dead it is safe to skip them and update this way. >>> Richard? >>> >>> + last_vuse = gimple_vuse (last_store); >>> + while (!scalar_vuse.is_empty ()) >>> + { >>> + stmt = scalar_vuse.pop (); >>> + if (gimple_vuse (stmt) != last_vuse) >>> + { >>> + gimple_set_vuse (stmt, last_vuse); >>> + update_stmt (stmt); >>> + } >>> + } >>> } >>> } >>> >>> Jakub