Richard, Here is updated patch which does not include your proposal related to the target hook deletion. You wrote: > I still don't understand why you need the new target hook. If we have a masked > load/store then the mask is computed by an assignment with a VEC_COND_EXPR > (in your example) and thus a test for a zero mask is expressible as just > > if (vect__ifc__41.17_167 == { 0, 0, 0, 0... }) > > or am I missing something? Such vector compare produces vector and does not set up cc flags required for conditional branch (GIMPLE_COND). If we use such comparison for GIMPLE_COND we got error message, so I used target hook which does set up cc flags aka ptest instruction and I left this part. I moved new routines to loop-vectorizer.c file and both they are static. I re-wrote is_valid_sink function to use def-use chain as you proposed. I also add new parameter to control such transformation. Few redundant tests have also been deleted. Any comments will be appreciated. Thanks. Yuri. 2015-06-18 Yuri Rumyantsev * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h. (ix86_vectorize_build_zero_vector_test): New function. (TARGET_VECTORIZE_BUILD_ZERO_VECTOR_TEST): New target macro * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_BUILD_ZERO_VECTOR_TEST. * doc/tm.texi: Updated. * params.def (PARAM_ZERO_TEST_FOR_STORE_MASK): New DEFPARAM. * params.h (ENABLE_ZERO_TEST_FOR_STORE_MASK): new macros. * target.def (build_zero_vector_test): New DEFHOOK. * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize has_mask_store field of vect_info. * tree-vectorizer.c: Include files stringpool.h and tree-ssanames.h. (is_valid_sink): New function. (optimize_mask_stores): New function. (vectorize_loops): Invoke optimaze_mask_stores for loops having masked stores. * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and correspondent macros. gcc/testsuite/ChangeLog: * gcc.target/i386/avx2-vect-mask-store-move1.c: New test. 2015-06-09 15:13 GMT+03:00 Richard Biener : > On Wed, May 20, 2015 at 4:00 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is updated patch to optimize mask stores. The main goal of it is >> to avoid execution of mask store if its mask is zero vector since >> loads that follow it can be blocked. >> The following changes were done: >> 1. A test on sink legality was added - it simply prohibits to cross >> statements with non-null vdef or vuse. >> 2. New phi node is created in join block for moved MASK_STORE statements. >> 3. Test was changed to check that 2 MASK_STORE statements are not >> moved to the same block. >> 4. New field was added to loop_vec_info structure to mark loops >> having MASK_STORE's. >> >> Any comments will be appreciated. >> Yuri. > > I still don't understand why you need the new target hook. If we have a masked > load/store then the mask is computed by an assignment with a VEC_COND_EXPR > (in your example) and thus a test for a zero mask is expressible as just > > if (vect__ifc__41.17_167 == { 0, 0, 0, 0... }) > > or am I missing something? As we dont' want this transform on all targets > (or always) we can control it by either a --param or a new flag which default > targets can adjust. [Is the hazard still present with Skylake or other > AVX512 implementations for example?] > > +/* Helper for optimize_mask_stores: returns true if STMT sinking to end > + of BB is valid and false otherwise. */ > + > +static bool > +is_valid_sink (gimple stmt) > +{ > > so STMT is either a masked store or a masked load? You shouldn't > walk all stmts here but instead rely on the factored use-def def-use > chains of virtual operands. > > +void > +optimize_mask_stores (struct loop *loop) > +{ > + basic_block bb = loop->header; > + gimple_stmt_iterator gsi; > + gimple stmt; > + auto_vec worklist; > + > + if (loop->dont_vectorize > + || loop->num_nodes > 2) > + return; > > looks like no longer needed given the place this is called from now > (or does it guard against outer loop vectorization as well?) > Also put it into tree-vect-loop.c and make it static. > > + /* Loop was not vectorized if mask does not have vector type. */ > + if (!VECTOR_TYPE_P (TREE_TYPE (mask))) > + return; > > this should be always false > > + store_bb->frequency = bb->frequency / 2; > + efalse->probability = REG_BR_PROB_BASE / 2; > > I think the == 0 case should end up as unlikely. > > + if (dom_info_available_p (CDI_POST_DOMINATORS)) > + set_immediate_dominator (CDI_POST_DOMINATORS, bb, store_bb); > > post dominators are not available in the vectorizer. > > + /* Put definition statement of stored value in STORE_BB > + if possible. */ > + arg3 = gimple_call_arg (last, 3); > + if (TREE_CODE (arg3) == SSA_NAME && has_single_use (arg3)) > + { > + def_stmt = SSA_NAME_DEF_STMT (arg3); > + /* Move def_stmt to STORE_BB if it is in the same bb and > + it is legal. */ > + if (gimple_bb (def_stmt) == bb > + && is_valid_sink (def_stmt)) > > ok, so you move arbitrary statements. You can always move non-memory > statements and if you keep track of the last store you moved you > can check if gimple_vuse () is the same as on that last store and be > done with that, or if you sink another store (under the same condition) > then just update the last store. > > Otherwise this looks better now. > > Thus - get rid of the target hook and of is_valid_sink. > > Thanks, > Richard. > >> 2015-05-20 Yuri Rumyantsev >> >> * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h. >> (ix86_vectorize_is_zero_vector): New function. >> (TARGET_VECTORIZE_IS_ZERO_VECTOR): New target macro >> * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_IS_ZERO_VECTOR. >> * doc/tm.texi: Updated. >> * target.def (is_zero_vector): New DEFHOOK. >> * tree-vect-stmts.c : Include tree-into-ssa.h. >> (vectorizable_mask_load_store): Initialize has_mask_store field. >> (is_valid_sink): New function. >> (optimize_mask_stores): New function. >> * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for >> loops having masked stores. >> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and >> correspondent macros. >> (optimize_mask_stores): Update prototype. >> >> gcc/testsuite/ChangeLog: >> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.