On Fri, May 22, 2015 at 7:45 PM, Richard Biener wrote: > On Wed, May 20, 2015 at 11:41 AM, Bin Cheng wrote: >> Hi, >> As we know, GCC is too conservative when checking overflow behavior in SCEV >> and loop related optimizers. Result is some variable can't be recognized as >> scalar evolution and thus optimizations are missed. To be specific, >> optimizers like ivopts and vectorizer are affected. >> This issue is more severe on 64 bit platforms, for example, PR62173 is >> failed on aarch64; scev-3.c and scev-4.c were marked as XFAIL on lp64 >> platforms. >> >> As the first part to improve overflow checking in GCC, this patch does below >> improvements: >> 1) Ideally, chrec_convert should be responsible to convert scev like >> "(type){base, step}" to scev like "{(type)base, (type)step} when the result >> scev doesn't overflow; chrec_convert_aggressive should do the conversion if >> the result scev could overflow/wrap. Unfortunately, current implementation >> may use chrec_convert_aggressive to return a scev that won't overflow. This >> is because of a) the static parameter "fold_conversions" for >> instantiate_scev_convert can only tracks whether chrec_convert_aggressive >> may be called, rather than if it does some overflow conversion or not; b) >> the implementation of instantiate_scev_convert sometimes shortcuts the call >> to chrec_convert and misses conversion opportunities. This patch improves >> this. >> 2) iv->no_overflow computed in simple_iv is too conservative. With 1) >> fixed, iv->no_overflow should reflects whether chrec_convert_aggressive does >> return an overflow scev. This patch improves this. >> 3) chrec_convert should be able to prove the resulting scev won't overflow >> with loop niter information. This patch doesn't finish this, but it >> factored a new interface out of scev_probably_wraps_p for future >> improvement. And that will be the part II patch. >> >> With the improvements in SCEV, this patch also improves optimizer(IVOPT) >> that uses scev information like below: >> For array reference in the form of arr[IV], GCC tries to derive new >> address iv {arr+iv.base, iv.step*elem_size} from IV. If IV overflow wrto a >> type that is narrower than address space, this derivation is not true >> because &arr[IV] isn't a scev. Root cause why scev-*.c are failed now is >> the overflow information of IV is too conservative. IVOPT has to be >> conservative to reject &arr[IV] as a scev. With more accurate overflow >> information, IVOPT can be improved too. So this patch fixes the mentioned >> long standing issues. >> >> Bootstrap and test on x86_64, x86 and aarch64. >> BTW, test gcc.target/i386/pr49781-1.c failed on x86_64, but I can confirmed >> it's not this patch's fault. >> >> So what's your opinion on this?. > > I maybe mixing things up but does > > +chrec_convert_aggressive (tree type, tree chrec, bool *fold_conversions) > { > ... > + if (evolution_function_is_affine_p (chrec)) > + { > + tree base, step; > + struct loop *loop; > + > + loop = get_chrec_loop (chrec); > + base = CHREC_LEFT (chrec); > + step = CHREC_RIGHT (chrec); > + if (convert_affine_scev (loop, type, &base, &step, NULL, true)) > + return build_polynomial_chrec (loop->num, base, step); > > ^^^ not forget to set *fold_conversions to true? Or we need to use > convert_affine_scev (..., false)? Nice catch. It's supposed to be called only if source scev has no overflow behavior introduced by previous call to chrec_convert_aggressive. In other words, it should be guarded by "!*fold_conversions" like below: + + if (!*fold_conversions && evolution_function_is_affine_p (chrec)) + { + tree base, step; + struct loop *loop; + + loop = get_chrec_loop (chrec); + base = CHREC_LEFT (chrec); + step = CHREC_RIGHT (chrec); + if (convert_affine_scev (loop, type, &base, &step, NULL, true)) + return build_polynomial_chrec (loop->num, base, step); + } The scenario is rare that didn't exposed in either bootstrap or reg-test. Here is the updated patch without any other difference. Bootstrap and test on x86_64 & AArch64. Thanks, bin > > + } > > (bah, and the diff somehow messes up -p context :/ which is why I like > context diffs more) > > Other from the above the patch looks good to me. > > Thanks, > Richard. > >> Thanks, >> bin >> >> 2015-05-20 Bin Cheng >> >> PR tree-optimization/62173 >> * tree-ssa-loop-ivopts.c (struct iv): New field. Reorder fields. >> (alloc_iv, set_iv): New parameter. >> (determine_biv_step): Delete. >> (find_bivs): Inline original determine_biv_step. Pass new >> argument to set_iv. >> (idx_find_step): Use no_overflow information for conversion. >> * tree-scalar-evolution.c (analyze_scalar_evolution_in_loop): Let >> resolve_mixers handle folded_casts. >> (instantiate_scev_name): Change bool parameter to bool pointer. >> (instantiate_scev_poly, instantiate_scev_binary): Ditto. >> (instantiate_array_ref, instantiate_scev_not): Ditto. >> (instantiate_scev_3, instantiate_scev_2): Ditto. >> (instantiate_scev_1, instantiate_scev_r): Ditto. >> (instantiate_scev_convert, ): Change parameter. Pass argument >> to chrec_convert_aggressive. >> (instantiate_scev): Change argument. >> (resolve_mixers): New parameter and set it. >> (scev_const_prop): New argument. >> * tree-scalar-evolution.h (resolve_mixers): New parameter. >> * tree-chrec.c (convert_affine_scev): Call chrec_convert instead >> of chrec_conert_1. >> (chrec_convert): New parameter. Move definition below. >> (chrec_convert_aggressive): New parameter and set it. Call >> convert_affine_scev. >> * tree-chrec.h (chrec_convert): New parameter. >> (chrec_convert_aggressive): Ditto. >> * tree-ssa-loop-niter.c (loop_exits_before_overflow): New function. >> (scev_probably_wraps_p): Factor loop niter related code into >> loop_exits_before_overflow. >> >> gcc/testsuite/ChangeLog >> 2015-05-20 Bin Cheng >> >> PR tree-optimization/62173 >> * gcc.dg/tree-ssa/scev-3.c: Remove xfail. >> * gcc.dg/tree-ssa/scev-4.c: Ditto. >> * gcc.dg/tree-ssa/scev-8.c: New.