From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 29109 invoked by alias); 26 Jan 2015 10:30:31 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 28942 invoked by uid 48); 26 Jan 2015 10:30:10 -0000
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/62173] [5.0 regression] 64bit Arch can't ivopt while 32bit Arch can
Date: Mon, 26 Jan 2015 10:30:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Priority: P1
X-Bugzilla-Assigned-To: jiwang at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 5.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-01/txt/msg02820.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

--- Comment #17 from Richard Biener ---
I really wonder why IVOPTs calls convert_affine_scev with
!use_overflow_semantics.

Note that for the original testcase 'i' may be negative or zero and thus 'd'
may be zero. We do a bad analysis here because IVOPTs follows complete
peeling immediately...
but at least we have range information that looks useful:

  :
  # RANGE [0, 10] NONZERO 15
  # d_26 = PHI
  # RANGE [0, 9] NONZERO 15
  d_13 = d_26 + -1;
  _14 = A[d_26];
  # RANGE [0, 255] NONZERO 255
  _15 = (int) _14;
  # USE = nonlocal
  # CLB = nonlocal
  foo (_15);
  if (d_13 != 0)
    goto ;
  else
    goto ;

  :
  goto ;

but unfortunately we expand the initial value of the IV for d all the
way to i_6(D) so we don't see that i_6(D) is constrained by the range
for d_26.

So when we are in idx_find_step before we replace *idx with iv->base
we could check range-information on whether it wrapped. Hmm, I think
we can't really compute this. But we can transfer range information
(temporarily) from d_26 to iv->base i_6(D) and make use of that in
scev_probably_wraps_p. There we currently compute whether
(unsigned) i_6(D) + 2147483648 (??) > 9 using fold_binary but with
range information [0, 10] it would compute as false (huh, so what is it
actually testing?!). I think the computation of 'delta' should instead be
adjusted to use range information - max for negative step and min for
positive step. Like the following:

Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c   (revision 220038)
+++ gcc/tree-ssa-loop-niter.c   (working copy)
@@ -3863,12 +3863,17 @@ scev_probably_wraps_p (tree base, tree s
      bound of the type, and verify that the loop is exited before this
      occurs.  */
   unsigned_type = unsigned_type_for (type);
-  base = fold_convert (unsigned_type, base);
-
   if (tree_int_cst_sign_bit (step))
     {
       tree extreme = fold_convert (unsigned_type,
                                    lower_bound_in_type (type, type));
+      wide_int min, max;
+      if (TREE_CODE (base) == SSA_NAME
+          && INTEGRAL_TYPE_P (TREE_TYPE (base))
+          && get_range_info (base, &min, &max) == VR_RANGE)
+        base = wide_int_to_tree (unsigned_type, max);
+      else
+        base = fold_convert (unsigned_type, base);
       delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
       step_abs = fold_build1 (NEGATE_EXPR, unsigned_type,
                               fold_convert (unsigned_type, step));
@@ -3877,6 +3882,13 @@ scev_probably_wraps_p (tree base, tree s
     {
       tree extreme = fold_convert (unsigned_type,
                                    upper_bound_in_type (type, type));
+      wide_int min, max;
+      if (TREE_CODE (base) == SSA_NAME
+          && INTEGRAL_TYPE_P (TREE_TYPE (base))
+          && get_range_info (base, &min, &max) == VR_RANGE)
+        base = wide_int_to_tree (unsigned_type, min);
+      else
+        base = fold_convert (unsigned_type, base);
       delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
       step_abs = fold_convert (unsigned_type, step);
     }

This doesn't really help this case unless i_6(D) gets range-information
transferred temporarily as I said above, of course.
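
For readers who want to see the numbers, the 'delta' arithmetic from the
patch can be modelled with plain 32-bit integers. This is only a rough
sketch and not GCC code: struct range, delta_to_bound, valid_niter and
probably_wraps are hypothetical names, struct range stands in for what
get_range_info would return, and max_executions stands in for the
iteration bound that the real function obtains from the loop. The sketch
follows the patch in picking max for a negative step and min for a
positive step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a value range attached to the IV base. */
struct range { int32_t min, max; };

/* Distance from the chosen end of the range to the bound of the type that
   the IV moves towards, computed in unsigned arithmetic as in the patch:
   max - lower bound for a negative step, upper bound - min otherwise. */
static uint32_t delta_to_bound(struct range r, int32_t step)
{
    if (step < 0)
        return (uint32_t) r.max - (uint32_t) INT32_MIN;
    return (uint32_t) INT32_MAX - (uint32_t) r.min;
}

/* Number of steps that fit into that distance (delta / |step|). */
static uint32_t valid_niter(struct range r, int32_t step)
{
    assert(step != 0);
    uint32_t step_abs = step < 0 ? 0u - (uint32_t) step : (uint32_t) step;
    return delta_to_bound(r, step) / step_abs;
}

/* The IV is treated as possibly wrapping when the statement may execute
   more often than the number of steps that fit (assumed loop bound). */
static bool probably_wraps(struct range r, int32_t step,
                           uint32_t max_executions)
{
    return max_executions > valid_niter(r, step);
}
```

With the range [0, 10] from the dump above and a step of -1, delta is
10 + 2147483648 = 2147483658, so a loop that runs at most 10 times is
far from the bound and probably_wraps returns false.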