Hi Richi, Thanks for the comments! on 2021/5/7 下午5:43, Richard Biener wrote: > On Fri, May 7, 2021 at 5:30 AM Kewen.Lin via Gcc-patches > wrote: >> >> Hi, >> >> When I was investigating density_test heuristics, I noticed that >> the current rs6000_density_test could be used for single scalar >> iteration cost calculation, through the call trace: >> vect_compute_single_scalar_iteration_cost >> -> rs6000_finish_cost >> -> rs6000_density_test >> >> It looks unexpected as its desriptive function comments and Bill >> helped to confirm this needs to be fixed (thanks!). >> >> So this patch is to check the passed data, if it's the same as >> the one in loop_vinfo, it indicates it's working on vector version >> cost calculation, otherwise just early return. >> >> Bootstrapped/regtested on powerpc64le-linux-gnu P9. >> >> Nothing remarkable was observed with SPEC2017 Power9 full run. >> >> Is it ok for trunk? > > + /* Only care about cost of vector version, so exclude scalar > version here. */ > + if (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) != (void *) data) > + return; > > Hmm, looks like a quite "random" test to me. What about adding a > parameter to finish_cost () (or init_cost?) indicating the cost kind? > I originally wanted to change the hook interface, but noticed that the finish_cost in function vect_estimate_min_profitable_iters is the only invocation with LOOP_VINFO_TARGET_COST_DATA (loop_vinfo), it looks enough to differentiate the scalar version costing or vector version costing for loop. Do you mean this observation/ assumption easy to be broken sometime later? The attached patch to add one new parameter to indicate the costing kind explicitly as you suggested. Does it look better? gcc/ChangeLog: * doc/tm.texi: Regenerated. * target.def (init_cost): Add new parameter costing_for_scalar. * targhooks.c (default_init_cost): Adjust for new parameter. * targhooks.h (default_init_cost): Likewise. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Likewise. (vect_compute_single_scalar_iteration_cost): Likewise. (vect_analyze_loop_2): Likewise. * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise. (vect_bb_vectorization_profitable_p): Likewise. * tree-vectorizer.h (init_cost): Likewise. * config/aarch64/aarch64.c (aarch64_init_cost): Likewise. * config/i386/i386.c (ix86_init_cost): Likewise. * config/rs6000/rs6000.c (rs6000_init_cost): Likewise. > OTOH we already pass scalar_stmt to individual add_stmt_cost, > so not sure whether the context really matters. That said, > the density test looks "interesting" ... the intent was that finish_cost > might look at gathered data from add_stmt, not that it looks at > the GIMPLE IL ... so why are you not counting vector_stmt vs. > scalar_stmt entries in vect_body and using that for this metric? > Good to know the intention behind finish_cost, thanks! I'm afraid that the check on vector_stmt and scalar_stmt entries from add_stmt_cost doesn't work for the density test here. The density test focuses on the vector version itself, there are some stmts whose relevants are marked as vect_unused_in_scope, IIUC they won't be passed down when costing for both versions. But the existing density check would like to know the cost for the non-vectorized part. The current implementation does: vec_cost = data->cost[vect_body] if (!STMT_VINFO_RELEVANT_P (stmt_info) && !STMT_VINFO_IN_PATTERN_P (stmt_info)) not_vec_cost++ density_pct = (vec_cost * 100) / (vec_cost + not_vec_cost); it takes those unrelevant stmts into account, and then has both costs from the non-vectorized part (not_vec_cost) and vectorized part (cost[vect_body]), it can calculate the vectorization code density ratio. BR, Kewen