*[PATCH] Tree-level fix for PR 69526@ 2016-07-21 10:42 Robin Dapp2016-07-21 11:28 ` Richard Biener 0 siblings, 1 reply; 47+ messages in thread From: Robin Dapp @ 2016-07-21 10:42 UTC (permalink / raw) To: gcc-patches [-- Attachment #1: Type: text/plain, Size: 958 bytes --] As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526, we currently fail to simplify cases like (unsigned long)(a - 1) + 1 to (unsigned long)a when VRP knows that (a - 1) does not overflow. This patch introduces a match.pd pattern as well as a helper function that checks for overflow in a binary operation using VRP information and simplifies when no overflow is present. Some effort was put in to stay in the inner type in cases like this (unsigned long)(a + CST1) - CST2 -> (unsigned long)(a + CST3) rather than (unsigned long)a + CST3 where abs(CST3) = abs(CST1 - CST) <= abs(CST1). I wonder if this is warranted, i.e. if it is always advantageous or if the evaluation should rather involve a cost estimation - e.g. distinguish between costs for operations in int vs. in long. Absence of signed overflow is also exploited: (long)(a + 2) - 1 -> (long)(a + 1) Bootstrapped and regression tested on s390x and x86_64. Regards Robin [-- Attachment #2: gcc-indvar-v5-p1.diff --] [-- Type: text/x-patch, Size: 12220 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index b24bfb4..87925e0 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1119,6 +1119,124 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have to + check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. This fixes + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526 */ + + bool inner_ovf = false; + bool combined_ovf = false; + bool combined_ovf2 = false; + bool need_outer = false; + tree cst = NULL_TREE; + tree cst_inner = NULL_TREE; + bool combine_constants = false; + + tree inner_type = TREE_TYPE (@3); + // check overflow in wrapped binop? + bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type); + // check overflow when combining constants + bool check_combine_ovf = TYPE_OVERFLOW_UNDEFINED (inner_type) + || TYPE_OVERFLOW_UNDEFINED (type); + + if (check_inner_ovf) + { + // check if the inner binop does not overflow i.e. we have VRP + // information and can prove prove it + inner_ovf = binop_overflow (inner_op, @0, @1, inner_type); + } + + // check for overflow in combined operation + wide_int combined_cst = wi::add (@1, @2, + TYPE_SIGN (type), + &combined_ovf); + wide_int w1 = @2; + unsigned int prec = w1.get_precision(); + unsigned HOST_WIDE_INT tm = 0; + if (tree_fits_uhwi_p (TYPE_MAX_VALUE (inner_type))) + tree_to_uhwi (TYPE_MAX_VALUE (inner_type)); + wide_int w2 = wi::uhwi (tm, prec); + if (wi::gtu_p (wi::abs (w1), w2)) + combined_ovf = true; + + wide_int combined_cst_outer = wi::add (@2, @1, + TYPE_SIGN (type), + &combined_ovf2); + + // the combined constant is ok if its type does not have an + // undefined overflow or one of the combination options does not + // overflow + bool combined_cst_ok = !check_combine_ovf + || !(combined_ovf && combined_ovf2); + + if (check_inner_ovf ? !inner_ovf && combined_cst_ok : + combined_cst_ok) + combine_constants = true; + + if (combine_constants) + { + cst_inner = wide_int_to_tree (inner_type, combined_cst); + cst = wide_int_to_tree (type, combined_cst_outer); + + int s1 = tree_int_cst_sgn (@1); + int s2 = tree_int_cst_sgn (cst_inner); + + bool not_larger_abs = wi::leu_p (wi::abs (cst_inner), + wi::abs (@1)); + + // check for an inner overflow with the combined constant + bool inner_ovf2 = binop_overflow (inner_op, @0, cst_inner, + inner_type); + + if (check_inner_ovf ? !inner_ovf2 : not_larger_abs && + !combined_ovf) + { + cst = cst_inner; + } + else + { + // we need the outer type if the signs + // differ i.e. if we had an (a - 2) before and have an + // (a + 1) afterwards since the proved non-overflow + // would not necessarily hold + need_outer = true; + + // if we overflow in the inner type (with non-undefined + // type) we need to zero-extend the overflowed result + // in the outer type + if (check_inner_ovf && combined_ovf + && tree_fits_uhwi_p (cst_inner)) + { + HOST_WIDE_INT hw = tree_to_uhwi (cst_inner); + wide_int ww = wi::uhwi (hw, TYPE_PRECISION (type)); + cst = wide_int_to_tree (type, ww); + } + + // if we need the outer type but the operation overflows + // we cannot do anything + if (combined_ovf2) + combine_constants = false; + } + } + } + (if (combine_constants && cst && @0) + (switch + (if (!need_outer) + (convert (outer_op { @0; } { cst; }))) + (if (need_outer) + (outer_op (convert { @0; }) { cst; })) + )))))) +#endif + /* (A +- CST) +- CST -> A + CST */ (for outer_op (plus minus) (for inner_op (plus minus) diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c new file mode 100644 index 0000000..25ffcb5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c @@ -0,0 +1,151 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "forwprop1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 5 "vrp1" } } */ + +#include <limits.h> + +// should simplify, ignore possible signed overflow in (a - 2) and combine +// constants +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +// should simplify +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +// should simplify with combined constant in outer type +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +// should simplify +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +// should simplify (same as above) +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +// should simplify (same) +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +// should simplify with combined constant in inner type, no overflow in +// combined constant in inner type (1 - INT_MAX) although it looks like it :) +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +// should not simplify, LONG_MAX doesn't fit in int and combined constant +// overflows in long +long bao(int a) +{ + return (long)(a + 2) + LONG_MAX; +} + +// should not simplify although at first looks like foo on tree level, +// but a is promoted to long +long bas(int a) +{ + return (long)(a + UINT_MAX) + 1; +} + +// should not simplify although at first looks like baz on tree level +unsigned long bat(unsigned int a) +{ + return (unsigned long)(a + UINT_MAX) + 2; +} + +// should not simplify (except if (a - 1) does not overflow (via VRP)) +// overflow in combined constant in outer type but type wraps +unsigned long foo2(unsigned int a) +{ + return (unsigned long)(a - 1) + 1; +} + +// should not simplify +unsigned long bar2(unsigned int a) +{ + return (unsigned long)(a + 1) - 1; +} + +// should simplify (same as above with VRP) +unsigned long oof2(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a - 1) + 1; +} + +// same +unsigned long bap(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +// should simplify +unsigned long bar3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 5; +} + +// should simplify in outer type as we cannot prove non-overflow +unsigned long bar4(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 6; +} + +// should simplify +unsigned long baq(int a) +{ + return (unsigned long)(a - 2) + 1; +} + +// should not simplify +long baq2(unsigned int a) +{ + return (long)(a - 1) + 1; +} + +// should simplify +long baq3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (long)(a - 1) + 1; +} diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 4333d60..b31f848 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -8906,6 +8906,34 @@ simplify_truth_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) return true; } +/* Simplify ((type) (b +- CST1)) + CST2 to (type) b + (CST1 + CST2) + if (b +- CST1) does not underflow. */ + +static bool +simplify_plus_or_minus_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree op0 = gimple_assign_rhs1 (stmt); + tree op1 = gimple_assign_rhs2 (stmt); + + if ((code == PLUS_EXPR || code == MINUS_EXPR) && + op0 != NULL && op1 != NULL) + { + gimple *stmt1 = SSA_NAME_DEF_STMT (op0); + if (gassign *def = dyn_cast <gassign *> (stmt1)) + { + if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)) + && TREE_CODE (op1) == INTEGER_CST) + { + if (fold_stmt (gsi, follow_single_use_edges)) + return true; + } + } + } + + return false; +} + /* Simplify a division or modulo operator to a right shift or bitwise and if the first operand is unsigned or is greater than zero and the second operand is an exact power of two. @@ -9905,6 +9933,12 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) return simplify_min_or_max_using_ranges (stmt); break; + case MINUS_EXPR: + case PLUS_EXPR: + if (TREE_CODE (rhs1) == SSA_NAME) + return simplify_plus_or_minus_using_ranges (gsi, stmt); + break; + default: break; } diff --git a/gcc/tree.c b/gcc/tree.c index 2295111..bc477fa 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -1358,6 +1358,108 @@ force_fit_type (tree type, const wide_int_ref &cst, return wide_int_to_tree (type, cst); } +bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type) +{ + if (t1 == NULL_TREE || t2 == NULL_TREE || type == NULL_TREE) + return true; + + if (TYPE_OVERFLOW_UNDEFINED (type)) + return false; + + if (!INTEGRAL_TYPE_P (type)) + return true; + + if (TREE_CODE (t1) != SSA_NAME) + return true; + + if (op != PLUS_EXPR && op != MINUS_EXPR) + return true; + + wide_int min1; + wide_int max1; + + if (TREE_CODE (t1) != INTEGER_CST) + { + enum value_range_type vrtype1 = get_range_info (t1, &min1, &max1); + + if (!vrtype1 || (vrtype1 != VR_RANGE && vrtype1 != VR_ANTI_RANGE)) + return true; + + if (vrtype1 == VR_ANTI_RANGE) + { + bool ovf; + wide_int tmpmin = wi::add (min1, 1, TYPE_SIGN (TREE_TYPE (t1)), &ovf); + if (ovf) + return true; + wide_int tmpmax = wi::sub (max1, 1, TYPE_SIGN (TREE_TYPE (t1)), &ovf); + if (ovf) + return true; + min1 = tmpmin; + max1 = tmpmax; + } + } + else + { + min1 = t1; + max1 = t1; + } + + wide_int min2; + wide_int max2; + + if (TREE_CODE (t2) != INTEGER_CST) + { + enum value_range_type vrtype2 = get_range_info (t2, &min2, &max2); + + if (!vrtype2 || (vrtype2 != VR_RANGE && vrtype2 != VR_ANTI_RANGE)) + return true; + + if (vrtype2 == VR_ANTI_RANGE) + { + bool ovf; + wide_int tmpmin = wi::add (min2, 1, TYPE_SIGN (TREE_TYPE (t2)), &ovf); + if (ovf) + return true; + wide_int tmpmax = wi::sub (max2, 1, TYPE_SIGN (TREE_TYPE (t2)), &ovf); + if (ovf) + return true; + min2 = tmpmin; + max2 = tmpmax; + } + } + else + { + min2 = t2; + max2 = t2; + } + + wide_int wmin, wmax; + + bool ovf; + signop sgned = TYPE_SIGN (TREE_TYPE (t1)) == SIGNED + || TYPE_SIGN (TREE_TYPE (t2)) == SIGNED ? SIGNED : UNSIGNED; + if (op == MINUS_EXPR) + { + wmin = wi::sub (min1, min2, sgned, &ovf); + if (ovf) + return true; + wmax = wi::sub (max1, max2, sgned, &ovf); + if (ovf) + return true; + } + else + { + wmin = wi::add (min1, min2, sgned, &ovf); + if (ovf) + return true; + wmax = wi::add (max1, max2, sgned, &ovf); + if (ovf) + return true; + } + + return wi::gt_p (wmin, wmax, TYPE_SIGN (TREE_TYPE (t1))); +} + /* These are the hash table functions for the hash table of INTEGER_CST nodes of a sizetype. */ diff --git a/gcc/tree.h b/gcc/tree.h index 99ed8ff..641be00 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -5482,4 +5482,7 @@ desired_pro_or_demotion_p (const_tree to_type, const_tree from_type) return to_type_precision <= TYPE_PRECISION (from_type); } +extern bool +binop_overflow (enum tree_code op, tree t1, tree t2, tree type); + #endif /* GCC_TREE_H */ ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-07-21 10:42 [PATCH] Tree-level fix for PR 69526 Robin Dapp@ 2016-07-21 11:28 ` Richard Biener2016-08-22 14:58 ` Robin Dapp 2016-08-23 7:11 ` [PATCH] Tree-level fix for PR 69526 Robin Dapp 0 siblings, 2 replies; 47+ messages in thread From: Richard Biener @ 2016-07-21 11:28 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Thu, Jul 21, 2016 at 12:42 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: > As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526, we > currently fail to simplify cases like > > (unsigned long)(a - 1) + 1 > > to > > (unsigned long)a > > when VRP knows that (a - 1) does not overflow. > > This patch introduces a match.pd pattern as well as a helper function > that checks for overflow in a binary operation using VRP information and > simplifies when no overflow is present. > > Some effort was put in to stay in the inner type in cases like this > > (unsigned long)(a + CST1) - CST2 > -> > (unsigned long)(a + CST3) rather than > (unsigned long)a + CST3 > > where abs(CST3) = abs(CST1 - CST) <= abs(CST1). I wonder if this is > warranted, i.e. if it is always advantageous or if the evaluation should > rather involve a cost estimation - e.g. distinguish between costs for > operations in int vs. in long. > > Absence of signed overflow is also exploited: > > (long)(a + 2) - 1 > -> > (long)(a + 1) > > Bootstrapped and regression tested on s390x and x86_64. I find this a bit hard to follow (looking at the match.pd pattern). + if (check_inner_ovf) + { + // check if the inner binop does not overflow i.e. we have VRP + // information and can prove prove it + inner_ovf = binop_overflow (inner_op, @0, @1, inner_type); + } if !inner_ovf (just set that to false if !check_inner_ovf to simplify checks please). you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2 (if T is wider than the inner type which I think you should check and which should simplify things). You can then combine (T)@1 and @2 where I think you fail to handle mixed MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for integers). So you have (T)@0 + combined-in-T which you then can either emit or check whether combined-in-T fits in the inner type and @0 + combined-in-T does not overflow in which case (T)(@0 + combined-in-T) is safe. I believe that for simplicity we want to split this transform into two - one doing (T)(a + CST) - CST -> (T)a + CST' and one doing (T)a + CST -> (T)(a + CST). The testcase is rather unspecific as to what testcases shoudl fold and what not given your very simple scan and mixing should/should-not simplify cases. Please consider splitting it up and make it a run test that verifies the bogus transforms do not take place. Doing +static bool +simplify_plus_or_minus_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree op0 = gimple_assign_rhs1 (stmt); + tree op1 = gimple_assign_rhs2 (stmt); + + if ((code == PLUS_EXPR || code == MINUS_EXPR) && + op0 != NULL && op1 != NULL) + { + gimple *stmt1 = SSA_NAME_DEF_STMT (op0); + if (gassign *def = dyn_cast <gassign *> (stmt1)) + { + if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)) + && TREE_CODE (op1) == INTEGER_CST) + { + if (fold_stmt (gsi, follow_single_use_edges)) + return true; causes such stmts to be folded twice as substitute_and_fold does /* Some statements may be simplified using propagator specific information. Do this before propagating into the stmt to not disturb pass specific information. */ if (fold_fn && (*fold_fn)(&i)) { did_replace = true; prop_stats.num_stmts_folded++; stmt = gsi_stmt (i); update_stmt (stmt); } /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); /* If we made a replacement, fold the statement. */ if (did_replace) { fold_stmt (&i, follow_single_use_edges); stmt = gsi_stmt (i); which is less than ideal. I think that given we have fold_stmt following SSA edges and thus not only stmts we propagated into are possibly interesting to fold (but also their single-uses, recursively), we should evaluate the compile-time cost of doing the fold_stmt unconditionally. diff --git a/gcc/tree.c b/gcc/tree.c index 2295111..bc477fa 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -1358,6 +1358,108 @@ force_fit_type (tree type, const wide_int_ref &cst, return wide_int_to_tree (type, cst); } +bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type) +{ tree.c doesn't look like the best fit. I think putting it into tree-vrp.c is better and I think that extract_range_from_binary_expr_1 itself should compute what we want here as additional output. Conservative handling for all but plus/minus is ok with me. + if (t1 == NULL_TREE || t2 == NULL_TREE || type == NULL_TREE) + return true; + + if (TYPE_OVERFLOW_UNDEFINED (type)) + return false; + + if (!INTEGRAL_TYPE_P (type)) + return true; note that we'll ICE if you call TYPE_OVERFLOW_UNDEFINED on a type that is not ANY_INTEGRAL_TYPE_P, so better re-order the checks. Thanks, Richard. > Regards > Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-07-21 11:28 ` Richard Biener@ 2016-08-22 14:58 ` Robin Dapp2016-09-05 7:50 ` Robin Dapp 2016-09-14 13:08 ` Richard Biener 2016-08-23 7:11 ` [PATCH] Tree-level fix for PR 69526 Robin Dapp 1 sibling, 2 replies; 47+ messages in thread From: Robin Dapp @ 2016-08-22 14:58 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 3310 bytes --] > if !inner_ovf (just set that to false if !check_inner_ovf to simplify > checks please). > you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2 > (if T is wider than the inner type which I think you should check and > which should > simplify things). I simplified the control flow a little and split both parts of the combination into separate patterns. > You can then combine (T)@1 and @2 where I think you fail to handle mixed > MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for integers). Is it ok to do something like this (see patch) in order to deal with MINUS_EXPR and then perform a wi::add? if (inner_op == MINUS_EXPR) w1 = wi::neg (w1); if (outer_op == MINUS_EXPR) w2 = wi::neg (w2); > The testcase is rather unspecific as to what testcases shoudl fold and what not > given your very simple scan and mixing should/should-not simplify cases. Please > consider splitting it up and make it a run test that verifies the > bogus transforms > do not take place. I divided the testcases into signed/unsigned and should simplify/should not simplify. The bogus unsigned transforms are now checked via assert(). > Doing [...] > causes such stmts to be folded twice as substitute_and_fold does [...] > which is less than ideal. I think that given we have fold_stmt following > SSA edges and thus not only stmts we propagated into are possibly > interesting to fold (but also their single-uses, recursively), we should > evaluate the compile-time cost of doing the fold_stmt unconditionally. Only using update_stmt seems to avoid the double call to fold_stmt but is obviously the wrong thing to do as this then tries to update everything that matches the pattern in tree-vrp.c. Copying some checks from match.pd to tree-vrp.c would help but isn't there a proper way to prevent duplicating the checking logic? In addition, is the compile-time cost checking something I could do for this patch or a general thing to be done later? > tree.c doesn't look like the best fit. I think putting it into > tree-vrp.c is better > and I think that extract_range_from_binary_expr_1 itself should compute what we > want here as additional output. Conservative handling for all but plus/minus is > ok with me. I kept the extra function for now because I find extract_range_from_binary_expr_1 somewhat lengthy and hard to follow already :) Wouldn't it be better to "separate concerns"/split it up in the long run and merge the functionality needed here at some time? Bootstrapped and reg-tested on s390x, bootstrap on x86 running. Regards Robin gcc/ChangeLog: 2016-08-22 Robin Dapp <rdapp@linux.vnet.ibm.com> * match.pd: Introduce patterns to simplify binary operations wrapped inside a cast. * tree-vrp.c (bool binop_overflow): (simplify_plus_or_minus_using_ranges): Make use of newly introduced patterns. (simplify_stmt_using_ranges): Call simplify_plus_or_minus_using_ranges * tree-vrp.h: New file. * gimple-match-head.c: Include tree-vrp.h gcc/testsuite/ChangeLog: 2016-08-22 Robin Dapp <rdapp@linux.vnet.ibm.com> * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. [-- Attachment #2: gcc-indvar-v7-p1.diff --] [-- Type: text/x-patch, Size: 13952 bytes --] diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c index 2beadbc..d66fcb1 100644 --- a/gcc/gimple-match-head.c +++ b/gcc/gimple-match-head.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "internal-fn.h" #include "case-cfn-macros.h" #include "gimplify.h" +#include "tree-vrp.h" /* Forward declarations of the private auto-generated matchers. diff --git a/gcc/match.pd b/gcc/match.pd index b24bfb4..f54b734 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1119,6 +1119,113 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have to + check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. This fixes + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526 */ + bool inner_ovf = false; + + bool combine_ovf = true; + tree cst = NULL_TREE; + bool combine_constants = false; + bool types_ok = false; + + tree inner_type = TREE_TYPE (@3); + /* Check overflow in wrapped binop? */ + bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type); + /* Check overflow when combining constants? */ + bool check_combine_ovf = TYPE_OVERFLOW_UNDEFINED (type); + + signop s1 = TYPE_SIGN (type); + + if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type)) + { + types_ok = true; + + /* Check if the inner binop does not overflow i.e. we have VRP + information and can prove prove it. */ + if (check_inner_ovf) + inner_ovf = binop_overflow (inner_op, @0, @1, inner_type); + else + inner_ovf = false; + + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Convert to TYPE keeping possible signedness even when + dealing with unsigned types. */ + wide_int v1; + v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sign_bit + (@1) ? SIGNED : UNSIGNED); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type */ + combined_cst = wi::add (v1, w2, TYPE_SIGN (type), &combine_ovf); + + /* The combined constant is ok if its type does not exhibit + undefined overflow or there was no overflow when + combining. */ + bool combined_cst_ok = !check_combine_ovf || !combine_ovf; + if (!inner_ovf && combined_cst_ok) + { + /* convert to tree of outer type */ + cst = wide_int_to_tree (type, combined_cst); + combine_constants = true; + } + } + } + (if (types_ok && combine_constants && cst) + (outer_op (convert { @0; }) { cst; })) + )))) +#endif + + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool simplify = false; + + wide_int cst = @2; + tree inner_type = TREE_TYPE (@0); + tree cst_inner = wide_int_to_tree (inner_type, cst); + bool inner_ovf = true; + + if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type) + || TREE_CODE (inner_type) != INTEGER_TYPE) + simplify = false; + else + { + inner_ovf = binop_overflow (outer_op, @0, cst_inner, inner_type); + if (!inner_ovf) + simplify = true; + } + } + (if (simplify && @0) + (convert (outer_op @0 (convert { @2; })))) + ))) +#endif + /* (A +- CST) +- CST -> A + CST */ (for outer_op (plus minus) (for inner_op (plus minus) @@ -1132,6 +1239,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (inner_op @0 { cst; } )))))) + /* (CST - A) +- CST -> CST - A */ (for outer_op (plus minus) (simplify diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 0000000..0afe99a --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,76 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "forwprop1" } } */ + +#include <limits.h> + +// should simplify, ignore possible signed overflow in (a - 2) and combine +// constants +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +// should simplify +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +// should simplify with combined constant in outer type +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +// should simplify +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +// should simplify (same as above) +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +// should simplify (same) +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +// should simplify with combined constant in inner type, no overflow in +// combined constant in inner type (1 - INT_MAX) although it looks like it :) +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +// should simplify, assumes no inner overflow +unsigned long baq(int a) +{ + return (unsigned long)(a + 1) - 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c new file mode 100644 index 0000000..9b166f6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 0 "forwprop1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 0 "vrp1" } } */ + +#include <limits.h> +#include <assert.h> + +// should not simplify, 2 + LONG_MAX overflows +long bao(int a) +{ + return (long)(a + 2) + LONG_MAX; +} + +// should not simplify +long bar(int a) +{ + return (long)(a - 2) - LONG_MAX; +} + +// should not simplify although at first looks like (long)(a - 1) + 1 on +// tree level, but a is promoted to long +long bas(int a) +{ + return (long)(a + UINT_MAX) + 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c new file mode 100644 index 0000000..fb0463d --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "forwprop1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 5 "vrp1" } } */ + +#include <limits.h> + +// should simplify (same as above with VRP) +unsigned long oof2(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a - 1) + 1; +} + +// same +unsigned long bap(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +// should simplify +unsigned long bar3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 5; +} + +// should simplify in outer type as we cannot prove non-overflow +unsigned long bar4(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 6; +} + +// should simplify +unsigned long baq(int a) +{ + return (unsigned long)(a - 2) + 1; +} + +// should simplify +long baq3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (long)(a - 1) + 1; +} + diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c new file mode 100644 index 0000000..3843b6d --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +unsigned int a = 3; +unsigned int aa = UINT_MAX; + +int main() +{ + volatile unsigned long b = (unsigned long)(aa + 1) - 1; + assert (b == 18446744073709551615ul); + + volatile unsigned long c = (unsigned long)(a - 4) + 1; + assert (c == 4294967296); + + volatile unsigned long d = (unsigned long)(a + UINT_MAX - 4) + 2; + assert (d == 4294967296); + + volatile unsigned long e = (unsigned long)(a - UINT_MAX) + UINT_MAX; + assert (e == 4294967299); + + volatile unsigned long f = (unsigned long)(a + UINT_MAX) - UINT_MAX; + assert (f == 18446744069414584323ul); + + volatile long g = (long)(a - 4) + 1; + assert (g == 4294967296); +} diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 4333d60..f9bf2d4 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -3231,6 +3231,109 @@ extract_range_from_binary_expr (value_range *vr, } } +/* Check if the binary operation overflows. */ +bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type) +{ + if (t1 == NULL_TREE || t2 == NULL_TREE || type == NULL_TREE) + return true; + + if (TREE_CODE (type) != INTEGER_TYPE) + return true; + + return true; + if (TREE_CODE (t1) != SSA_NAME) + + if (op != PLUS_EXPR && op != MINUS_EXPR) + return true; + + wide_int min1; + wide_int max1; + signop st1 = TYPE_SIGN (TREE_TYPE (t1)); + + if (TREE_CODE (t1) != INTEGER_CST) + { + enum value_range_type vrtype1 = get_range_info (t1, &min1, &max1); + + if (!vrtype1 || (vrtype1 != VR_RANGE && vrtype1 != VR_ANTI_RANGE)) + return true; + + if (vrtype1 == VR_ANTI_RANGE) + { + bool ovf; + wide_int tmpmin = wi::add (min1, 1, st1, &ovf); + if (st1 == SIGNED && ovf) + return true; + wide_int tmpmax = wi::sub (max1, 1, st1, &ovf); + if (st1 == SIGNED && ovf) + return true; + min1 = tmpmin; + max1 = tmpmax; + } + } + else + { + min1 = t1; + max1 = t1; + } + + wide_int min2; + wide_int max2; + signop st2 = TYPE_SIGN (TREE_TYPE (t2)); + + if (TREE_CODE (t2) != INTEGER_CST) + { + enum value_range_type vrtype2 = get_range_info (t2, &min2, &max2); + + if (!vrtype2 || (vrtype2 != VR_RANGE && vrtype2 != VR_ANTI_RANGE)) + return true; + + if (vrtype2 == VR_ANTI_RANGE) + { + bool ovf; + wide_int tmpmin = wi::add (min2, 1, st2, &ovf); + if (st2 == SIGNED && ovf) + return true; + wide_int tmpmax = wi::sub (max2, 1, st2, &ovf); + if (st2 == SIGNED && ovf) + return true; + min2 = tmpmin; + max2 = tmpmax; + } + } + else + { + min2 = t2; + max2 = t2; + } + + wide_int wmin, wmax; + + bool ovf; + signop sgn = TYPE_SIGN (TREE_TYPE (t1)) == SIGNED + || TYPE_SIGN (TREE_TYPE (t2)) == SIGNED ? SIGNED : UNSIGNED; + if (op == MINUS_EXPR) + { + wmin = wi::sub (min1, min2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::gt_p (wmin, min1, sgn))) + return true; + wmax = wi::sub (max1, max2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::gt_p (wmax, max1, sgn))) + return true; + } + else + { + wmin = wi::add (min1, min2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::lt_p (wmin, min1, sgn))) + return true; + wmax = wi::add (max1, max2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::lt_p (wmax, max1, sgn))) + return true; + } + + return wi::gt_p (wmin, wmax, TYPE_SIGN (TREE_TYPE (t1))); +} + + /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ @@ -8906,6 +9009,37 @@ simplify_truth_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) return true; } +/* Simplify ((type) (b +- CST1)) + CST2 to (type) b + (CST1 + CST2) + if (b +- CST1) does not underflow. */ + +static bool +simplify_plus_or_minus_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree op0 = gimple_assign_rhs1 (stmt); + tree op1 = gimple_assign_rhs2 (stmt); + + if ((code == PLUS_EXPR || code == MINUS_EXPR) && + op0 != NULL && op1 != NULL) + { + gimple *stmt1 = SSA_NAME_DEF_STMT (op0); + if (gassign *def = dyn_cast <gassign *> (stmt1)) + { + if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)) + && TREE_CODE (op1) == INTEGER_CST) + { + /* Only mark the stmt as updated here, so fold_stmt in + tree-ssa-propagate will call the matching pattern in + match.pd. */ + update_stmt (gsi_stmt (*gsi)); + return true; + } + } + } + + return false; +} + /* Simplify a division or modulo operator to a right shift or bitwise and if the first operand is unsigned or is greater than zero and the second operand is an exact power of two. @@ -9905,6 +10039,12 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) return simplify_min_or_max_using_ranges (stmt); break; + case MINUS_EXPR: + case PLUS_EXPR: + if (TREE_CODE (rhs1) == SSA_NAME) + return simplify_plus_or_minus_using_ranges (gsi, stmt); + break; + default: break; } diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h new file mode 100644 index 0000000..5a867bc --- /dev/null +++ b/gcc/tree-vrp.h @@ -0,0 +1,2 @@ +extern bool +binop_overflow (enum tree_code op, tree t1, tree t2, tree type); ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-07-21 11:28 ` Richard Biener 2016-08-22 14:58 ` Robin Dapp@ 2016-08-23 7:11 ` Robin Dapp1 sibling, 0 replies; 47+ messages in thread From: Robin Dapp @ 2016-08-23 7:11 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches gah, this + return true; + if (TREE_CODE (t1) != SSA_NAME) should of course be like this + if (TREE_CODE (t1) != SSA_NAME) + return true; in the last patch. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-08-22 14:58 ` Robin Dapp@ 2016-09-05 7:50 ` Robin Dapp2016-09-14 13:08 ` Richard Biener 1 sibling, 0 replies; 47+ messages in thread From: Robin Dapp @ 2016-09-05 7:50 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 6 bytes --] Ping. [-- Attachment #2: gcc-indvar-v7-p1.diff --] [-- Type: text/x-patch, Size: 13954 bytes --] diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c index 2beadbc..d66fcb1 100644 --- a/gcc/gimple-match-head.c +++ b/gcc/gimple-match-head.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "internal-fn.h" #include "case-cfn-macros.h" #include "gimplify.h" +#include "tree-vrp.h" /* Forward declarations of the private auto-generated matchers. diff --git a/gcc/match.pd b/gcc/match.pd index b24bfb4..f54b734 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1119,6 +1119,113 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have to + check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. This fixes + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526 */ + bool inner_ovf = false; + + bool combine_ovf = true; + tree cst = NULL_TREE; + bool combine_constants = false; + bool types_ok = false; + + tree inner_type = TREE_TYPE (@3); + /* Check overflow in wrapped binop? */ + bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type); + /* Check overflow when combining constants? */ + bool check_combine_ovf = TYPE_OVERFLOW_UNDEFINED (type); + + signop s1 = TYPE_SIGN (type); + + if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type)) + { + types_ok = true; + + /* Check if the inner binop does not overflow i.e. we have VRP + information and can prove prove it. */ + if (check_inner_ovf) + inner_ovf = binop_overflow (inner_op, @0, @1, inner_type); + else + inner_ovf = false; + + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Convert to TYPE keeping possible signedness even when + dealing with unsigned types. */ + wide_int v1; + v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sign_bit + (@1) ? SIGNED : UNSIGNED); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type */ + combined_cst = wi::add (v1, w2, TYPE_SIGN (type), &combine_ovf); + + /* The combined constant is ok if its type does not exhibit + undefined overflow or there was no overflow when + combining. */ + bool combined_cst_ok = !check_combine_ovf || !combine_ovf; + if (!inner_ovf && combined_cst_ok) + { + /* convert to tree of outer type */ + cst = wide_int_to_tree (type, combined_cst); + combine_constants = true; + } + } + } + (if (types_ok && combine_constants && cst) + (outer_op (convert { @0; }) { cst; })) + )))) +#endif + + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool simplify = false; + + wide_int cst = @2; + tree inner_type = TREE_TYPE (@0); + tree cst_inner = wide_int_to_tree (inner_type, cst); + bool inner_ovf = true; + + if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type) + || TREE_CODE (inner_type) != INTEGER_TYPE) + simplify = false; + else + { + inner_ovf = binop_overflow (outer_op, @0, cst_inner, inner_type); + if (!inner_ovf) + simplify = true; + } + } + (if (simplify && @0) + (convert (outer_op @0 (convert { @2; })))) + ))) +#endif + /* (A +- CST) +- CST -> A + CST */ (for outer_op (plus minus) (for inner_op (plus minus) @@ -1132,6 +1239,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (inner_op @0 { cst; } )))))) + /* (CST - A) +- CST -> CST - A */ (for outer_op (plus minus) (simplify diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 0000000..0afe99a --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,76 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "forwprop1" } } */ + +#include <limits.h> + +// should simplify, ignore possible signed overflow in (a - 2) and combine +// constants +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +// should simplify +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +// should simplify with combined constant in outer type +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +// should simplify +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +// should simplify (same as above) +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +// should simplify (same) +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +// should simplify with combined constant in inner type, no overflow in +// combined constant in inner type (1 - INT_MAX) although it looks like it :) +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +// should simplify, assumes no inner overflow +unsigned long baq(int a) +{ + return (unsigned long)(a + 1) - 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c new file mode 100644 index 0000000..9b166f6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 0 "forwprop1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 0 "vrp1" } } */ + +#include <limits.h> +#include <assert.h> + +// should not simplify, 2 + LONG_MAX overflows +long bao(int a) +{ + return (long)(a + 2) + LONG_MAX; +} + +// should not simplify +long bar(int a) +{ + return (long)(a - 2) - LONG_MAX; +} + +// should not simplify although at first looks like (long)(a - 1) + 1 on +// tree level, but a is promoted to long +long bas(int a) +{ + return (long)(a + UINT_MAX) + 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c new file mode 100644 index 0000000..fb0463d --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-forwprop1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "forwprop1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 5 "vrp1" } } */ + +#include <limits.h> + +// should simplify (same as above with VRP) +unsigned long oof2(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a - 1) + 1; +} + +// same +unsigned long bap(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +// should simplify +unsigned long bar3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 5; +} + +// should simplify in outer type as we cannot prove non-overflow +unsigned long bar4(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 6; +} + +// should simplify +unsigned long baq(int a) +{ + return (unsigned long)(a - 2) + 1; +} + +// should simplify +long baq3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (long)(a - 1) + 1; +} + diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c new file mode 100644 index 0000000..3843b6d --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +unsigned int a = 3; +unsigned int aa = UINT_MAX; + +int main() +{ + volatile unsigned long b = (unsigned long)(aa + 1) - 1; + assert (b == 18446744073709551615ul); + + volatile unsigned long c = (unsigned long)(a - 4) + 1; + assert (c == 4294967296); + + volatile unsigned long d = (unsigned long)(a + UINT_MAX - 4) + 2; + assert (d == 4294967296); + + volatile unsigned long e = (unsigned long)(a - UINT_MAX) + UINT_MAX; + assert (e == 4294967299); + + volatile unsigned long f = (unsigned long)(a + UINT_MAX) - UINT_MAX; + assert (f == 18446744069414584323ul); + + volatile long g = (long)(a - 4) + 1; + assert (g == 4294967296); +} diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 4333d60..f9bf2d4 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -3231,6 +3231,109 @@ extract_range_from_binary_expr (value_range *vr, } } +/* Check if the binary operation overflows. */ +bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type) +{ + if (t1 == NULL_TREE || t2 == NULL_TREE || type == NULL_TREE) + return true; + + if (TREE_CODE (type) != INTEGER_TYPE) + return true; + + if (TREE_CODE (t1) != SSA_NAME) + return true; + + if (op != PLUS_EXPR && op != MINUS_EXPR) + return true; + + wide_int min1; + wide_int max1; + signop st1 = TYPE_SIGN (TREE_TYPE (t1)); + + if (TREE_CODE (t1) != INTEGER_CST) + { + enum value_range_type vrtype1 = get_range_info (t1, &min1, &max1); + + if (!vrtype1 || (vrtype1 != VR_RANGE && vrtype1 != VR_ANTI_RANGE)) + return true; + + if (vrtype1 == VR_ANTI_RANGE) + { + bool ovf; + wide_int tmpmin = wi::add (min1, 1, st1, &ovf); + if (st1 == SIGNED && ovf) + return true; + wide_int tmpmax = wi::sub (max1, 1, st1, &ovf); + if (st1 == SIGNED && ovf) + return true; + min1 = tmpmin; + max1 = tmpmax; + } + } + else + { + min1 = t1; + max1 = t1; + } + + wide_int min2; + wide_int max2; + signop st2 = TYPE_SIGN (TREE_TYPE (t2)); + + if (TREE_CODE (t2) != INTEGER_CST) + { + enum value_range_type vrtype2 = get_range_info (t2, &min2, &max2); + + if (!vrtype2 || (vrtype2 != VR_RANGE && vrtype2 != VR_ANTI_RANGE)) + return true; + + if (vrtype2 == VR_ANTI_RANGE) + { + bool ovf; + wide_int tmpmin = wi::add (min2, 1, st2, &ovf); + if (st2 == SIGNED && ovf) + return true; + wide_int tmpmax = wi::sub (max2, 1, st2, &ovf); + if (st2 == SIGNED && ovf) + return true; + min2 = tmpmin; + max2 = tmpmax; + } + } + else + { + min2 = t2; + max2 = t2; + } + + wide_int wmin, wmax; + + bool ovf; + signop sgn = TYPE_SIGN (TREE_TYPE (t1)) == SIGNED + || TYPE_SIGN (TREE_TYPE (t2)) == SIGNED ? SIGNED : UNSIGNED; + if (op == MINUS_EXPR) + { + wmin = wi::sub (min1, min2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::gt_p (wmin, min1, sgn))) + return true; + wmax = wi::sub (max1, max2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::gt_p (wmax, max1, sgn))) + return true; + } + else + { + wmin = wi::add (min1, min2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::lt_p (wmin, min1, sgn))) + return true; + wmax = wi::add (max1, max2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::lt_p (wmax, max1, sgn))) + return true; + } + + return wi::gt_p (wmin, wmax, TYPE_SIGN (TREE_TYPE (t1))); +} + + /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ @@ -8906,6 +9009,37 @@ simplify_truth_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) return true; } +/* Simplify ((type) (b +- CST1)) + CST2 to (type) b + (CST1 + CST2) + if (b +- CST1) does not underflow. */ + +static bool +simplify_plus_or_minus_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree op0 = gimple_assign_rhs1 (stmt); + tree op1 = gimple_assign_rhs2 (stmt); + + if ((code == PLUS_EXPR || code == MINUS_EXPR) && + op0 != NULL && op1 != NULL) + { + gimple *stmt1 = SSA_NAME_DEF_STMT (op0); + if (gassign *def = dyn_cast <gassign *> (stmt1)) + { + if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)) + && TREE_CODE (op1) == INTEGER_CST) + { + /* Only mark the stmt as updated here, so fold_stmt in + tree-ssa-propagate will call the matching pattern in + match.pd. */ + update_stmt (gsi_stmt (*gsi)); + return true; + } + } + } + + return false; +} + /* Simplify a division or modulo operator to a right shift or bitwise and if the first operand is unsigned or is greater than zero and the second operand is an exact power of two. @@ -9905,6 +10039,12 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) return simplify_min_or_max_using_ranges (stmt); break; + case MINUS_EXPR: + case PLUS_EXPR: + if (TREE_CODE (rhs1) == SSA_NAME) + return simplify_plus_or_minus_using_ranges (gsi, stmt); + break; + default: break; } diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h new file mode 100644 index 0000000..5a867bc --- /dev/null +++ b/gcc/tree-vrp.h @@ -0,0 +1,2 @@ +extern bool +binop_overflow (enum tree_code op, tree t1, tree t2, tree type); ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-08-22 14:58 ` Robin Dapp 2016-09-05 7:50 ` Robin Dapp@ 2016-09-14 13:08 ` Richard Biener2016-09-14 17:04 ` Jeff Law 2016-09-20 12:39 ` Robin Dapp 1 sibling, 2 replies; 47+ messages in thread From: Richard Biener @ 2016-09-14 13:08 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Mon, Aug 22, 2016 at 4:58 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >> if !inner_ovf (just set that to false if !check_inner_ovf to simplify >> checks please). >> you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2 >> (if T is wider than the inner type which I think you should check and >> which should >> simplify things). > > I simplified the control flow a little and split both parts of the > combination into separate patterns. > >> You can then combine (T)@1 and @2 where I think you fail to handle mixed >> MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for > integers). > > Is it ok to do something like this (see patch) in order to deal with > MINUS_EXPR and then perform a wi::add? > > if (inner_op == MINUS_EXPR) > w1 = wi::neg (w1); > > if (outer_op == MINUS_EXPR) > w2 = wi::neg (w2); Yes. >> The testcase is rather unspecific as to what testcases shoudl fold and > what not >> given your very simple scan and mixing should/should-not simplify > cases. Please >> consider splitting it up and make it a run test that verifies the >> bogus transforms >> do not take place. > > I divided the testcases into signed/unsigned and should simplify/should > not simplify. The bogus unsigned transforms are now checked via assert(). > >> Doing > [...] >> causes such stmts to be folded twice as substitute_and_fold does > [...] >> which is less than ideal. I think that given we have fold_stmt following >> SSA edges and thus not only stmts we propagated into are possibly >> interesting to fold (but also their single-uses, recursively), we should >> evaluate the compile-time cost of doing the fold_stmt unconditionally. > > Only using update_stmt seems to avoid the double call to fold_stmt but > is obviously the wrong thing to do as this then tries to update > everything that matches the pattern in tree-vrp.c. Copying some checks > from match.pd to tree-vrp.c would help but isn't there a proper way to > prevent duplicating the checking logic? I think what triggers it is that you return 'true', not that you call update_stmt. > In addition, is the compile-time cost checking something I could do for > this patch or a general thing to be done later? I meant to do sth like Index: tree-ssa-propagate.c =================================================================== --- tree-ssa-propagate.c (revision 240133) +++ tree-ssa-propagate.c (working copy) @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); - /* If we made a replacement, fold the statement. */ - if (did_replace) + /* Fold the statement. */ + if (fold_stmt (&i, follow_single_use_edges)) { - fold_stmt (&i, follow_single_use_edges); + did_replace = true; stmt = gsi_stmt (i); } this would need compile-time cost evaluation (and avoid the tree-vrp.c folding part of your patch). >> tree.c doesn't look like the best fit. I think putting it into >> tree-vrp.c is better >> and I think that extract_range_from_binary_expr_1 itself should > compute what we >> want here as additional output. Conservative handling for all but > plus/minus is >> ok with me. > > I kept the extra function for now because I find > extract_range_from_binary_expr_1 somewhat lengthy and hard to follow > already :) Heh, but I think duplicating code is even worse. Just looking at the simpler pattern right now: + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool simplify = false; + + wide_int cst = @2; + tree inner_type = TREE_TYPE (@0); + tree cst_inner = wide_int_to_tree (inner_type, cst); What prevents CST being truncated here? It looks like (long)intvar + 0xffff00000000L will be mishandled that way. + bool inner_ovf = true; + + if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type) + || TREE_CODE (inner_type) != INTEGER_TYPE) + simplify = false; + else + { + inner_ovf = binop_overflow (outer_op, @0, cst_inner, inner_type); + if (!inner_ovf) + simplify = true; + } + } + (if (simplify && @0) @0 is always "true" + (convert (outer_op @0 (convert { @2; })))) You already have @2 converted -- cst_inner. Thus just use (convert (outer_op @0 { cst_inner; })) + ))) +#endif Now onto the other pattern. + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have to + check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. This fixes + https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526 */ + bool inner_ovf = false; + + bool combine_ovf = true; + tree cst = NULL_TREE; + bool combine_constants = false; + bool types_ok = false; + + tree inner_type = TREE_TYPE (@3); + /* Check overflow in wrapped binop? */ + bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type); + /* Check overflow when combining constants? */ + bool check_combine_ovf = TYPE_OVERFLOW_UNDEFINED (type); I think that should be ! TYPE_OVERFLOW_WRAPS (type) instead. + + signop s1 = TYPE_SIGN (type); + + if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type)) + { OTOH given that you use this to decide whether you can use a combined constant that doesn't look necessary (if the constant is correct, that is) -- what you need to make sure is that the final operation, (T)(A) +- CST, does not overflow if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the original operation. I think that always holds, thus the combine_ovf check looks superfluous to me. Now looking at binop_overflow (that I don't like very much, but ...) +/* Check if the binary operation overflows. */ +bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type) +{ new-line after 'bool' + if (TREE_CODE (type) != INTEGER_TYPE) INTEGRAL_TYPE_P + return true; + + if (TREE_CODE (t1) != SSA_NAME) + return true; that seems unnecessary You have duplicate code to handle range-info for SSA name t1/t2, please split it out. + if (vrtype2 == VR_ANTI_RANGE) + { + bool ovf; + wide_int tmpmin = wi::add (min2, 1, st2, &ovf); + if (st2 == SIGNED && ovf) + return true; + wide_int tmpmax = wi::sub (max2, 1, st2, &ovf); + if (st2 == SIGNED && ovf) + return true; + min2 = tmpmin; + max2 = tmpmax; this looks wrong, from ~[0, 0] you get min == -1 and max == 1 while in reality min == INT_MIN and max == INT_MAX. I suggest to give up conservatively for anti-ranges. + signop sgn = TYPE_SIGN (TREE_TYPE (t1)) == SIGNED + || TYPE_SIGN (TREE_TYPE (t2)) == SIGNED ? SIGNED : UNSIGNED; uh, you want to handle mixed-sign inputs? why? + if (op == MINUS_EXPR) + { + wmin = wi::sub (min1, min2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::gt_p (wmin, min1, sgn))) + return true; + wmax = wi::sub (max1, max2, sgn, &ovf); + if (sgn == SIGNED && (ovf || wi::gt_p (wmax, max1, sgn))) + return true; + } I think you may want to simplify this and instead compute the min/max values into widest_ints, perform the operation (conservatively all four variants, min1 OP min2, min1 OP max2, max1 OP min2, max1 OP max2) and verify if the result fits the desired type via wi::fits_to_tree_p. > Wouldn't it be better to "separate concerns"/split it up in > the long run and merge the functionality needed here at some time? For sure splitting out the PLUS/MINUS handling from extract_range_from_binary_expr_1 is fine with me, not sure how it immediately helps this patch. Finally please mention the PR that made you work on this in the ChangeLog (does it actually fix the issue or at least improve it? Thanks, Richard. > Bootstrapped and reg-tested on s390x, bootstrap on x86 running. > > Regards > Robin > > > gcc/ChangeLog: > > 2016-08-22 Robin Dapp <rdapp@linux.vnet.ibm.com> > > * match.pd: Introduce patterns to simplify binary operations wrapped > inside a cast. > * tree-vrp.c (bool binop_overflow): > (simplify_plus_or_minus_using_ranges): Make use of newly introduced > patterns. > (simplify_stmt_using_ranges): Call simplify_plus_or_minus_using_ranges > * tree-vrp.h: New file. > * gimple-match-head.c: Include tree-vrp.h > > > gcc/testsuite/ChangeLog: > > 2016-08-22 Robin Dapp <rdapp@linux.vnet.ibm.com> > > * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. > * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. > * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. > * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-09-14 13:08 ` Richard Biener@ 2016-09-14 17:04 ` Jeff Law2016-10-14 8:33 ` Robin Dapp 2016-09-20 12:39 ` Robin Dapp 1 sibling, 1 reply; 47+ messages in thread From: Jeff Law @ 2016-09-14 17:04 UTC (permalink / raw) To: Richard Biener, Robin Dapp;+Cc:GCC Patches [ Another patch I'd started working through, but hadn't completed... ] On 09/14/2016 07:05 AM, Richard Biener wrote: > On Mon, Aug 22, 2016 at 4:58 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >>> if !inner_ovf (just set that to false if !check_inner_ovf to simplify >>> checks please). >>> you know it's valid to transform the op to (T)@0 innerop (T)@1 outerop @2 >>> (if T is wider than the inner type which I think you should check and >>> which should >>> simplify things). >> >> I simplified the control flow a little and split both parts of the >> combination into separate patterns. >> >>> You can then combine (T)@1 and @2 where I think you fail to handle mixed >>> MINUS_EXPR/PLUS_EXPR (practically you'll see only PLUS_EXPRs for >> integers). >> >> Is it ok to do something like this (see patch) in order to deal with >> MINUS_EXPR and then perform a wi::add? >> >> if (inner_op == MINUS_EXPR) >> w1 = wi::neg (w1); >> >> if (outer_op == MINUS_EXPR) >> w2 = wi::neg (w2); > > Yes. > I was concerned about the case where w1 or w2 might be the minimum (negative) integer for its type. That creates a constant that can't be represented. I'm not familiar enough with the rest of hte code yet to know if that's problematical. > >>> tree.c doesn't look like the best fit. I think putting it into >>> tree-vrp.c is better >>> and I think that extract_range_from_binary_expr_1 itself should >> compute what we >>> want here as additional output. Conservative handling for all but >> plus/minus is >>> ok with me. >> >> I kept the extra function for now because I find >> extract_range_from_binary_expr_1 somewhat lengthy and hard to follow >> already :) > > Heh, but I think duplicating code is even worse. I was going to suggest a pre-patch that just refactored the various cases in extract_range_from_binary_expr_1 into their own functions. THat might it easier to keep things manageable. [ ... ] > > Now looking at binop_overflow (that I don't like very much, but ...) Note that the naked "return true" in there makes 95% of that function unreachable code. That's where I stopped without looking deeply at the rest of the code. Jeff ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-09-14 13:08 ` Richard Biener 2016-09-14 17:04 ` Jeff Law@ 2016-09-20 12:39 ` Robin Dapp2016-09-20 15:31 ` Jeff Law ` (2 more replies) 1 sibling, 3 replies; 47+ messages in thread From: Robin Dapp @ 2016-09-20 12:39 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 3318 bytes --] Hi, > I meant to do sth like > > Index: tree-ssa-propagate.c > =================================================================== > --- tree-ssa-propagate.c (revision 240133) > +++ tree-ssa-propagate.c (working copy) > @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d > /* Replace real uses in the statement. */ > did_replace |= replace_uses_in (stmt, get_value_fn); > > - /* If we made a replacement, fold the statement. */ > - if (did_replace) > + /* Fold the statement. */ > + if (fold_stmt (&i, follow_single_use_edges)) > { > - fold_stmt (&i, follow_single_use_edges); > + did_replace = true; > stmt = gsi_stmt (i); > } > > this would need compile-time cost evaluation (and avoid the tree-vrp.c > folding part > of your patch). Using this causes the simplifications to be performed in ccp1 instead of fwprop1. I noticed two tests failing that rely on propagation being performed in fwprop. Should these be adapted or rather the patch be changed? > Heh, but I think duplicating code is even worse. Ok, I changed extract_range_... after all, it's not so bad as I had thought. Imho it should be simplified nevertheless, perhaps I'll do it in the future if I find some time. > + tree cst_inner = wide_int_to_tree (inner_type, cst); > > What prevents CST being truncated here? It looks like > (long)intvar + 0xffff00000000L will be mishandled that way. > Right, it may be truncated here, but the following TYPE_PRECISION () guard prevents the value from being used in that case. I pulled the line into the condition for clarity. > OTOH given that you use this to decide whether you can use a combined constant > that doesn't look necessary (if the constant is correct, that is) -- > what you need > to make sure is that the final operation, (T)(A) +- CST, does not overflow > if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the > original operation. I think that always holds, thus the combine_ovf check looks > superfluous to me. > I included this to account for cases like return (long)(a + 2) + LONG_MAX which should not simplify as opposed to return (unsigned long)(a + 2) + LONG_MAX where the constant is usable despite the overflow. Do you think it should be handled differently? Revised version attached. Regards Robin -- gcc/ChangeLog: 2016-09-20 Robin Dapp <rdapp@linux.vnet.ibm.com> PR middle-end/69526 This enables combining of wrapped binary operations and fixes the tree level part of the PR. * match.pd: Introduce patterns to simplify binary operations wrapped inside a cast. * tree-vrp.h: Export extract_range_from_binary_expr (). * tree-ssa-propagate.c (substitute_and_fold_dom_walker::before_dom_children): Try to fold regardless of prior use propagations. * tree-vrp.c (extract_range_from_binary_expr_1): Add overflow check arguments and wrapper function without them. (extract_range_from_binary_expr): Add wrapper function. gcc/testsuite/ChangeLog: 2016-09-20 Robin Dapp <rdapp@linux.vnet.ibm.com> * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. [-- Attachment #2: gcc-indvar-v8-p1.diff --] [-- Type: text/x-patch, Size: 17016 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index 123e754..f624e7e 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1138,6 +1138,130 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (with + { + /* If the inner operation is wrapped inside a conversion, we have + to check it overflows/underflows and can only then perform the + simplification, i.e. add the second constant to the first + (wrapped) and convert afterwards. */ + bool inner_ovf = false; + + bool combine_ovf = true; + tree cst = NULL_TREE; + bool combine_constants = false; + bool types_ok = false; + + tree inner_type = TREE_TYPE (@3); + /* Check overflow in wrapped binop? */ + bool check_inner_ovf = !TYPE_OVERFLOW_UNDEFINED (inner_type); + bool check_combine_ovf = !TYPE_OVERFLOW_WRAPS (type); + + signop s1 = TYPE_SIGN (type); + + if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type)) + { + types_ok = true; + + /* Check if the inner binop does not overflow i.e. we have VRP + information and can prove prove it. */ + if (check_inner_ovf) + { + value_range vr = VR_INITIALIZER; + if (@0 == NULL_TREE || @1 == NULL_TREE + || inner_type == NULL_TREE + || TREE_CODE (type) != INTEGER_TYPE) + { + inner_ovf = true; + } + else + { + extract_range_from_binary_expr (&vr, inner_op, + inner_type, @0, @1, true, &inner_ovf); + } + } + else + inner_ovf = false; + + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Convert to TYPE keeping possible signedness even when + dealing with unsigned types. */ + wide_int v1; + v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sgn (@1) ? + SIGNED : UNSIGNED); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type. */ + combined_cst = wi::add (v1, w2, + TYPE_SIGN (type), &combine_ovf); + + /* The combined constant is ok if its type does not exhibit + undefined overflow (caught by !inner_ovf) or there was + no overflow when combining. */ + bool combined_cst_ok = !check_combine_ovf || !combine_ovf; + if (!inner_ovf && combined_cst_ok) + { + /* convert to tree of outer type */ + cst = wide_int_to_tree (type, combined_cst); + combine_constants = true; + } + } + } + (if (types_ok && combine_constants && cst) + (outer_op (convert { @0; }) { cst; })) + )))) +#endif + + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool simplify = false; + + wide_int cst = @2; + tree inner_type = TREE_TYPE (@0); + tree cst_inner; + bool inner_ovf = true; + value_range vr = VR_INITIALIZER; + + if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type) + || TREE_CODE (inner_type) != INTEGER_TYPE + || @0 == NULL_TREE || inner_type == NULL_TREE + || TREE_CODE (type) != INTEGER_TYPE) + simplify = false; + else + { + cst_inner = wide_int_to_tree (inner_type, cst); + extract_range_from_binary_expr (&vr, outer_op, inner_type, + @0, cst_inner, true, &inner_ovf); + if (!inner_ovf) + simplify = true; + } + } + (if (simplify) + (convert (outer_op @0 { cst_inner; }))) + ))) +#endif + /* (A +- CST) +- CST -> A + CST */ (for outer_op (plus minus) (for inner_op (plus minus) @@ -1151,6 +1275,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (inner_op @0 { cst; } )))))) + /* (CST - A) +- CST -> CST - A */ (for outer_op (plus minus) (simplify diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 0000000..93c4e50 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,76 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "ccp1" } } */ + +#include <limits.h> + +// should simplify, ignore possible signed overflow in (a - 2) and combine +// constants +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +// should simplify +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +// should simplify with combined constant in outer type +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +// should simplify +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +// should simplify (same as above) +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +// should simplify (same) +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +// should simplify with combined constant in inner type, no overflow in +// combined constant in inner type (1 - INT_MAX) +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +// should simplify with combined constant in outer type, overflow in +// combined constant in inner type +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +// should simplify, assumes no inner overflow +unsigned long baq(int a) +{ + return (unsigned long)(a + 1) - 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c new file mode 100644 index 0000000..180bb8d --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 0 "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 0 "vrp1" } } */ + +#include <limits.h> +#include <assert.h> + +// should not simplify, 2 + LONG_MAX overflows +long bao(int a) +{ + return (long)(a + 2) + LONG_MAX; +} + +// should not simplify +long bar(int a) +{ + return (long)(a - 2) - LONG_MAX; +} + +// should not simplify although at first looks like (long)(a - 1) + 1 on +// tree level, but a is promoted to long +long bas(int a) +{ + return (long)(a + UINT_MAX) + 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c new file mode 100644 index 0000000..db4abb4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 5 "vrp1" } } */ + +#include <limits.h> + +// should simplify +unsigned long oof2(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a - 1) + 1; +} + +// same +unsigned long bap(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +// should simplify +unsigned long bar3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 5; +} + +// should simplify in outer type as we cannot prove non-overflow +unsigned long bar4(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 6; +} + +// should simplify +unsigned long baq(int a) +{ + return (unsigned long)(a - 2) + 1; +} + +// should simplify +long baq3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (long)(a - 1) + 1; +} + diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c new file mode 100644 index 0000000..3843b6d --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c @@ -0,0 +1,29 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +unsigned int a = 3; +unsigned int aa = UINT_MAX; + +int main() +{ + volatile unsigned long b = (unsigned long)(aa + 1) - 1; + assert (b == 18446744073709551615ul); + + volatile unsigned long c = (unsigned long)(a - 4) + 1; + assert (c == 4294967296); + + volatile unsigned long d = (unsigned long)(a + UINT_MAX - 4) + 2; + assert (d == 4294967296); + + volatile unsigned long e = (unsigned long)(a - UINT_MAX) + UINT_MAX; + assert (e == 4294967299); + + volatile unsigned long f = (unsigned long)(a + UINT_MAX) - UINT_MAX; + assert (f == 18446744069414584323ul); + + volatile long g = (long)(a - 4) + 1; + assert (g == 4294967296); +} diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c index c8cf078..f436ac3 100644 --- a/gcc/tree-ssa-propagate.c +++ b/gcc/tree-ssa-propagate.c @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb) /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); - /* If we made a replacement, fold the statement. */ - if (did_replace) + /* Fold the statement. */ + if (fold_stmt (&i, follow_single_use_edges)) { - fold_stmt (&i, follow_single_use_edges); + did_replace = true; stmt = gsi_stmt (i); } diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 45882c4..b670764 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -61,8 +61,6 @@ along with GCC; see the file COPYING3. If not see #include "params.h" #include "alloc-pool.h" -#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } - /* Allocation pools for tree-vrp allocations. */ static object_allocator<value_range> vrp_value_range_pool ("Tree VRP value ranges"); static bitmap_obstack vrp_equiv_obstack; @@ -2154,7 +2152,8 @@ extract_range_from_multiplicative_op_1 (value_range *vr, static void extract_range_from_binary_expr_1 (value_range *vr, enum tree_code code, tree expr_type, - value_range *vr0_, value_range *vr1_) + value_range *vr0_, value_range *vr1_, + bool get_ovf, bool *ovf) { value_range vr0 = *vr0_, vr1 = *vr1_; value_range vrtem0 = VR_INITIALIZER, vrtem1 = VR_INITIALIZER; @@ -2213,12 +2212,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr0.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr0, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, &vrtem0, vr1_); + extract_range_from_binary_expr_1 (vr, code, expr_type, + &vrtem0, vr1_, get_ovf, ovf); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - &vrtem1, vr1_); + &vrtem1, vr1_, get_ovf, ovf); vrp_meet (vr, &vrres); } return; @@ -2227,12 +2227,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr1.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr1, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, &vrtem0); + extract_range_from_binary_expr_1 (vr, code, expr_type, + vr0_, &vrtem0, get_ovf, ovf); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - vr0_, &vrtem1); + vr0_, &vrtem1, get_ovf, ovf); vrp_meet (vr, &vrres); } return; @@ -2445,6 +2446,11 @@ extract_range_from_binary_expr_1 (value_range *vr, max_ovf = 1; } + if (get_ovf) + { + *ovf = wi::gt_p (wmin, wmax, TYPE_SIGN (expr_type)); + } + /* If we have overflow for the constant part and the resulting range will be symbolic, drop to VR_VARYING. */ if ((min_ovf && sym_min_op0 != sym_min_op1) @@ -2787,7 +2793,7 @@ extract_range_from_binary_expr_1 (value_range *vr, saved_flag_wrapv = flag_wrapv; flag_wrapv = 1; extract_range_from_binary_expr_1 (vr, MULT_EXPR, expr_type, - &vr0, &vr1p); + &vr0, &vr1p, get_ovf, ovf); flag_wrapv = saved_flag_wrapv; return; } @@ -3165,35 +3171,67 @@ extract_range_from_binary_expr_1 (value_range *vr, set_value_range (vr, type, min, max, NULL); } +static void +extract_range_from_binary_expr_1 (value_range *vr, + enum tree_code code, tree expr_type, + value_range *vr0_, value_range *vr1_) +{ + bool tmp = true; + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, vr1_, + false, &tmp); +} + /* Extract range information from a binary expression OP0 CODE OP1 based on the ranges of each of its operands with resulting type EXPR_TYPE. The resulting range is stored in *VR. */ -static void +void extract_range_from_binary_expr (value_range *vr, enum tree_code code, - tree expr_type, tree op0, tree op1) + tree expr_type, tree op0, tree op1, + bool get_ovf, bool *ovf) { value_range vr0 = VR_INITIALIZER; value_range vr1 = VR_INITIALIZER; + if (get_ovf) + *ovf = true; /* Get value ranges for each operand. For constant operands, create a new value range with the operand to simplify processing. */ if (TREE_CODE (op0) == SSA_NAME) - vr0 = *(get_value_range (op0)); + { + value_range *vtmp = get_value_range (op0); + if (vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr0 = *vtmp; + } else if (is_gimple_min_invariant (op0)) set_value_range_to_value (&vr0, op0, NULL); else set_value_range_to_varying (&vr0); if (TREE_CODE (op1) == SSA_NAME) - vr1 = *(get_value_range (op1)); + { + value_range *vtmp = get_value_range (op1); + if (vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr1 = *vtmp; + } else if (is_gimple_min_invariant (op1)) set_value_range_to_value (&vr1, op1, NULL); else set_value_range_to_varying (&vr1); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1, + get_ovf, ovf); /* Try harder for PLUS and MINUS if the range of one operand is symbolic and based on the other operand, for example if it was deduced from a @@ -3221,7 +3259,8 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr1, VR_RANGE, op1, op1, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1, + get_ovf, ovf); } if (vr->type == VR_VARYING @@ -3245,10 +3284,21 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr0, VR_RANGE, op0, op0, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1, + get_ovf, ovf); } } +static void +extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1) +{ + bool tmp = true; + extract_range_from_binary_expr (vr, code, expr_type, op0, op1, + false, &tmp); +} + /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ @@ -10443,7 +10493,7 @@ identify_jump_threads (void) /* We're basically looking for a switch or any kind of conditional with integral or pointer type arguments. Note the type of the second argument will be the same as the first argument, so no need to - check it explicitly. + check it explicitly. We also handle the case where there are no statements in the block. This come up with forwarder blocks that are not diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h index 7ffb7e7..a0b3d1a 100644 --- a/gcc/tree-vrp.h +++ b/gcc/tree-vrp.h @@ -48,7 +48,13 @@ struct GTY(()) value_range bitmap equiv; }; +#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } + extern void vrp_intersect_ranges (value_range *vr0, value_range *vr1); extern void vrp_meet (value_range *vr0, const value_range *vr1); extern void dump_value_range (FILE *, const value_range *); - +extern bool binop_overflow (enum tree_code op, tree t1, tree t2, tree type); +extern void extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1, + bool get_ovf, bool *ovf); ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-09-20 12:39 ` Robin Dapp@ 2016-09-20 15:31 ` Jeff Law2016-10-05 10:40 ` Robin Dapp 2016-10-14 11:49 ` Richard Biener 2 siblings, 0 replies; 47+ messages in thread From: Jeff Law @ 2016-09-20 15:31 UTC (permalink / raw) To: Robin Dapp, Richard Biener;+Cc:GCC Patches On 09/20/2016 06:31 AM, Robin Dapp wrote: > Hi, > >> I meant to do sth like >> >> Index: tree-ssa-propagate.c >> =================================================================== >> --- tree-ssa-propagate.c (revision 240133) >> +++ tree-ssa-propagate.c (working copy) >> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >> /* Replace real uses in the statement. */ >> did_replace |= replace_uses_in (stmt, get_value_fn); >> >> - /* If we made a replacement, fold the statement. */ >> - if (did_replace) >> + /* Fold the statement. */ >> + if (fold_stmt (&i, follow_single_use_edges)) >> { >> - fold_stmt (&i, follow_single_use_edges); >> + did_replace = true; >> stmt = gsi_stmt (i); >> } >> >> this would need compile-time cost evaluation (and avoid the tree-vrp.c >> folding part >> of your patch). > > Using this causes the simplifications to be performed in ccp1 instead of > fwprop1. I noticed two tests failing that rely on propagation being > performed in fwprop. Should these be adapted or rather the patch be changed? ISTM this is an indication that something changed the IL without folding prior to ccp & forwprop. That's not in and of itself bad, but may be worth investigating to see if whatever prior pass that made this change ought to be adjusted. That investigation would likely guide you with what to do with the testcase. If you find that the earlier pass should have folded, then you fix it and change the testcase to verify the earlier pass folded properly. Else you change the testcase to verify ccp folds the statement. I'm going to let Richi own the review on this. Just thought I'd chime in on that one topic. jeff ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-09-20 12:39 ` Robin Dapp 2016-09-20 15:31 ` Jeff Law@ 2016-10-05 10:40 ` Robin Dapp2016-10-14 11:49 ` Richard Biener 2 siblings, 0 replies; 47+ messages in thread From: Robin Dapp @ 2016-10-05 10:40 UTC (permalink / raw) Cc: GCC Patches Ping. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-09-14 17:04 ` Jeff Law@ 2016-10-14 8:33 ` Robin Dapp0 siblings, 0 replies; 47+ messages in thread From: Robin Dapp @ 2016-10-14 8:33 UTC (permalink / raw) To: Richard Biener, Jeff Law;+Cc:GCC Patches Ping :) ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-09-20 12:39 ` Robin Dapp 2016-09-20 15:31 ` Jeff Law 2016-10-05 10:40 ` Robin Dapp@ 2016-10-14 11:49 ` Richard Biener2016-11-16 15:54 ` Robin Dapp 2 siblings, 1 reply; 47+ messages in thread From: Richard Biener @ 2016-10-14 11:49 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Tue, Sep 20, 2016 at 2:31 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: > Hi, > >> I meant to do sth like >> >> Index: tree-ssa-propagate.c >> =================================================================== >> --- tree-ssa-propagate.c (revision 240133) >> +++ tree-ssa-propagate.c (working copy) >> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >> /* Replace real uses in the statement. */ >> did_replace |= replace_uses_in (stmt, get_value_fn); >> >> - /* If we made a replacement, fold the statement. */ >> - if (did_replace) >> + /* Fold the statement. */ >> + if (fold_stmt (&i, follow_single_use_edges)) >> { >> - fold_stmt (&i, follow_single_use_edges); >> + did_replace = true; >> stmt = gsi_stmt (i); >> } >> >> this would need compile-time cost evaluation (and avoid the tree-vrp.c >> folding part >> of your patch). > > Using this causes the simplifications to be performed in ccp1 instead of > fwprop1. I noticed two tests failing that rely on propagation being > performed in fwprop. Should these be adapted or rather the patch be changed? > >> Heh, but I think duplicating code is even worse. > > Ok, I changed extract_range_... after all, it's not so bad as I had > thought. Imho it should be simplified nevertheless, perhaps I'll do it > in the future if I find some time. > >> + tree cst_inner = wide_int_to_tree (inner_type, cst); >> >> What prevents CST being truncated here? It looks like >> (long)intvar + 0xffff00000000L will be mishandled that way. >> > > Right, it may be truncated here, but the following TYPE_PRECISION () > guard prevents the value from being used in that case. I pulled the > line into the condition for clarity. > >> OTOH given that you use this to decide whether you can use a combined constant >> that doesn't look necessary (if the constant is correct, that is) -- >> what you need >> to make sure is that the final operation, (T)(A) +- CST, does not overflow >> if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the >> original operation. I think that always holds, thus the combine_ovf check looks >> superfluous to me. >> > > I included this to account for cases like > return (long)(a + 2) + LONG_MAX > which should not simplify as opposed to > return (unsigned long)(a + 2) + LONG_MAX > where the constant is usable despite the overflow. Do you think it > should be handled differently? Well, (long)(a + 2) + LONG_MAX -> (long)a + (LONG_MAX + 2) is indeed invalid because the resulting expression may overflow. But that's "easily fixable" by computing it in unsigned arithmetic, that is doing (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2)) which I believe is still desired? Anyway, since the desired transform in the end is (long)a + CST -> (long)(a + CST) this constant will not fit here anyway... Few comments on patch details: + if (TYPE_PRECISION (type) >= TYPE_PRECISION (inner_type)) please put the precision tests as a pattern if, thus (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) (if (TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3))) (with { ... that makes it more obvious you are matching a widening conversion. + if (@0 == NULL_TREE || @1 == NULL_TREE + || inner_type == NULL_TREE captures cannot be NULL_TREE nor can their type be NULL_TREE. If you run into such cases sth is broken elsewhere. You have tests against INTEGER_TYPE - please put those alongsize (even before) the precision test. Note that you probably want to use INTEGRAL_TYPE_P there to also catch ENUM_TYPE and BOOLEAN_TYPE. + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) you can use @0 to get at the type of inner_op. For my taste there is too many temporaries, like 'check_inner_ovf' where at the single point of use a !TYPE_OVERFLOW_UNDEFINED (inner_type); would be much more descriptive. + /* Convert to TYPE keeping possible signedness even when + dealing with unsigned types. */ + wide_int v1; + v1 = v1.from (w1, w2.get_precision(), tree_int_cst_sgn (@1) ? + SIGNED : UNSIGNED); better to comment it as /* Extend @1 to TYPE according to its sign. */ I also think you want to use TYPE_SIGN (inner_type) instead of tree_int_cst_sgn (@1) just to simplify the code. You also want to use TYPE_PRECISION (inner_type) instead of w2.get_precision (). + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); I think you want to negate v1 here? (better not introduce v1 but use w1 for the extension?) + /* Combine in outer, larger type. */ + combined_cst = wi::add (v1, w2 + TYPE_SIGN (type), &combine_ovf); So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2 does not overflow. But we do not really care for that, we want to know whether (T)X + CST' might invoke undefined behavior when the original expression did not. This involves range information on X. I don't see how we ensure this here. Richard. > > Revised version attached. > > Regards > Robin > > -- > > gcc/ChangeLog: > > 2016-09-20 Robin Dapp <rdapp@linux.vnet.ibm.com> > > PR middle-end/69526 > This enables combining of wrapped binary operations and fixes > the tree level part of the PR. > > * match.pd: Introduce patterns to simplify binary operations > wrapped inside a cast. > * tree-vrp.h: Export extract_range_from_binary_expr (). > * tree-ssa-propagate.c > (substitute_and_fold_dom_walker::before_dom_children): > Try to fold regardless of prior use propagations. > * tree-vrp.c (extract_range_from_binary_expr_1): > Add overflow check arguments and wrapper function without them. > (extract_range_from_binary_expr): > Add wrapper function. > > > gcc/testsuite/ChangeLog: > > 2016-09-20 Robin Dapp <rdapp@linux.vnet.ibm.com> > > * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. > * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. > * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. > * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-10-14 11:49 ` Richard Biener@ 2016-11-16 15:54 ` Robin Dapp2016-11-25 6:49 ` Robin Dapp 2016-11-25 13:47 ` Richard Biener 0 siblings, 2 replies; 47+ messages in thread From: Robin Dapp @ 2016-11-16 15:54 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 3345 bytes --] Found some time to look into this again. > Index: tree-ssa-propagate.c > =================================================================== > --- tree-ssa-propagate.c (revision 240133) > +++ tree-ssa-propagate.c (working copy) > @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d > /* Replace real uses in the statement. */ > did_replace |= replace_uses_in (stmt, get_value_fn); > > - /* If we made a replacement, fold the statement. */ > - if (did_replace) > + /* Fold the statement. */ > + if (fold_stmt (&i, follow_single_use_edges)) > { > - fold_stmt (&i, follow_single_use_edges); > + did_replace = true; > stmt = gsi_stmt (i); > } > > this would need compile-time cost evaluation (and avoid the tree-vrp.c > folding part > of your patch). This causes an ICE and bootstrap errors with newer revisions. I tested my patch on r240691 where it still works. How can this be done properly now? > OTOH given that you use this to decide whether you can use a combined constant > that doesn't look necessary (if the constant is correct, that is) -- > what you need > to make sure is that the final operation, (T)(A) +- CST, does not overflow > if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the > original operation. I think that always holds, thus the combine_ovf check looks > superfluous to me. Removed the check and addressed the other remarks. > So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2 > does not overflow. But we do not really care for that, we want to know > whether (T)X + CST' might invoke undefined behavior when the original > expression did not. This involves range information on X. I don't > see how we ensure this here. I guess I'm still missing an undefined behavior case. In which case can (T)X + CST' trigger undefined behavior where the original statement did not? I see the need for checking in the second pattern ((T)(X) + CST' -> (T)(X + CST')), of course. > But that's "easily fixable" by computing it in unsigned arithmetic, that is > doing > > (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2)) Does this also work if (unsigned long)a + (LONG_MAX + 2) does not fit into [0,LONG_MAX]? IIRC (unsigned long)(LONG_MAX + 2) is implementation-defined and not undefined so it should work? Revised patch version attached. One thing I'm still not sure about is the handling of sth. like (unsigned long)(a + UINT_MAX) + 1 for a = 0. In the current patch version I always do a sign-extend of the first constant (UINT_MAX here) which seems to cause no problems in the testsuite and some custom tests still worked. Can UINT_MAX, -1 and similar cases be disambiguated (and correctly converted to the outer type) when done in unsigned arithmetic? Thinking about the second pattern, on s390x it introduces more casts than just using the first one (e.g. in cases where the value will be sign-extended after the operation which wouldn't have happened when performing the operation in the larger type. Can we catch this with a cost function? On a side note: Can/should VRP infer ranges assuming no undefined behavior will take place when -fstrict-overflow is in use? I.e. inferring ~[INT_MIN,INT_MIN] for (long)(a - 1)? Would this even make sense? Regards Robin [-- Attachment #2: gcc-wrapped-binop.diff --] [-- Type: text/x-patch, Size: 15558 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index ba7e013..3d3f5bb 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1150,6 +1150,86 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (TREE_CODE (type) == INTEGER_TYPE && + TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3))) + (with + { + bool range_split = false; + tree cst = NULL_TREE; + tree inner_type = TREE_TYPE (@3); + value_range vr = VR_INITIALIZER; + + if (TYPE_OVERFLOW_UNDEFINED (inner_type)) + range_split = false; + else + { + /* Check the inner binop using VRP information. */ + extract_range_from_binary_expr (&vr, inner_op, + inner_type, @0, @1, &range_split); + } + + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Sign-extend @1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type. */ + bool combine_ovf = true; + combined_cst = wi::add (w1, w2, SIGNED, &combine_ovf); + + /* Convert combined constant to tree of outer type if + there was no value range split in the original operation. */ + if (!range_split) + { + cst = wide_int_to_tree (type, combined_cst); + } + } + (if (cst) + (outer_op (convert { @0; }) { cst; })) + ))))) +#endif + + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE + && TREE_CODE (type) == INTEGER_TYPE) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + tree cst_inner; + bool range_split = true; + + wide_int cst = @2; + cst_inner = wide_int_to_tree (TREE_TYPE (@0), cst); + value_range vr = VR_INITIALIZER; + extract_range_from_binary_expr (&vr, outer_op, TREE_TYPE (@0), + @0, cst_inner, &range_split); + } + (if (!range_split) + (convert (outer_op @0 { cst_inner; }))) + )))) +#endif + /* (A +- CST) +- CST -> A + CST */ (for outer_op (plus minus) (for inner_op (plus minus) @@ -1163,6 +1243,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (inner_op @0 { cst; } )))))) + /* (CST - A) +- CST -> CST - A */ (for outer_op (plus minus) (simplify diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 0000000..19b787b --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "ccp1" } } */ + +#include <limits.h> + +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +unsigned long baq(int a) +{ + return (unsigned long)(a + 1) - 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c new file mode 100644 index 0000000..8befc96 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details -fdump-tree-evrp-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "evrp" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 5 "vrp1" } } */ + +#include <limits.h> + +unsigned long oof2(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a - 1) + 1; +} + +unsigned long bap(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +unsigned long bar3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 5; +} + +unsigned long bar4(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 6; +} + +unsigned long baq(int a) +{ + return (unsigned long)(a - 2) + 1; +} + +long baq3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (long)(a - 1) + 1; +} + +// should simplify (combined constant is 0 not (unsigned long)UINT_MAX + 1 +// VRP creates an anti-range for a +unsigned long foo(short b) +{ + unsigned int a = b; + return (unsigned long)(a + UINT_MAX) + 1; +} + diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c new file mode 100644 index 0000000..fb6d3d9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c @@ -0,0 +1,32 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +unsigned int a = 3; +int aa = 3; + +int main() +{ + volatile unsigned long b = (unsigned long)(UINT_MAX + 1) - 1; + assert (b == 18446744073709551615ul); + + volatile unsigned long c = (unsigned long)(a - 4) + 1; + assert (c == 4294967296); + + volatile unsigned long d = (unsigned long)(a + UINT_MAX - 4) + 2; + assert (d == 4294967296); + + volatile unsigned long e = (unsigned long)(a - UINT_MAX) + UINT_MAX; + assert (e == 4294967299); + + volatile unsigned long f = (unsigned long)(a + UINT_MAX) - UINT_MAX; + assert (f == 18446744069414584323ul); + + volatile long g = (long)(a - 4) + 1; + assert (g == 4294967296); + + volatile long h = (long)(aa + UINT_MAX) + 1; + assert (h == 3); +} diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c index 97cfde5..16b4a6b 100644 --- a/gcc/tree-ssa-propagate.c +++ b/gcc/tree-ssa-propagate.c @@ -1106,9 +1106,10 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb) did_replace |= replace_uses_in (stmt, get_value_fn); /* If we made a replacement, fold the statement. */ - if (did_replace) + if (fold_stmt (&i, follow_single_use_edges)) { fold_stmt (&i, follow_single_use_edges); + did_replace = true; stmt = gsi_stmt (i); } diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 7a08be7..1140402 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -62,8 +62,7 @@ along with GCC; see the file COPYING3. If not see #include "alloc-pool.h" #include "domwalk.h" #include "tree-cfgcleanup.h" - -#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } +#include "tree-vrp.h" /* Allocation pools for tree-vrp allocations. */ static object_allocator<value_range> vrp_value_range_pool ("Tree VRP value ranges"); @@ -2199,7 +2198,8 @@ extract_range_from_multiplicative_op_1 (value_range *vr, static void extract_range_from_binary_expr_1 (value_range *vr, enum tree_code code, tree expr_type, - value_range *vr0_, value_range *vr1_) + value_range *vr0_, value_range *vr1_, + bool *range_split) { value_range vr0 = *vr0_, vr1 = *vr1_; value_range vrtem0 = VR_INITIALIZER, vrtem1 = VR_INITIALIZER; @@ -2258,12 +2258,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr0.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr0, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, &vrtem0, vr1_); + extract_range_from_binary_expr_1 (vr, code, expr_type, + &vrtem0, vr1_, range_split); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - &vrtem1, vr1_); + &vrtem1, vr1_, range_split); vrp_meet (vr, &vrres); } return; @@ -2272,12 +2273,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr1.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr1, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, &vrtem0); + extract_range_from_binary_expr_1 (vr, code, expr_type, + vr0_, &vrtem0, range_split); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - vr0_, &vrtem1); + vr0_, &vrtem1, range_split); vrp_meet (vr, &vrres); } return; @@ -2490,6 +2492,11 @@ extract_range_from_binary_expr_1 (value_range *vr, max_ovf = 1; } + if (range_split != NULL) + { + *range_split = wi::gt_p (wmin, wmax, TYPE_SIGN (expr_type)); + } + /* If we have overflow for the constant part and the resulting range will be symbolic, drop to VR_VARYING. */ if ((min_ovf && sym_min_op0 != sym_min_op1) @@ -2832,7 +2839,7 @@ extract_range_from_binary_expr_1 (value_range *vr, saved_flag_wrapv = flag_wrapv; flag_wrapv = 1; extract_range_from_binary_expr_1 (vr, MULT_EXPR, expr_type, - &vr0, &vr1p); + &vr0, &vr1p, range_split); flag_wrapv = saved_flag_wrapv; return; } @@ -3210,35 +3217,65 @@ extract_range_from_binary_expr_1 (value_range *vr, set_value_range (vr, type, min, max, NULL); } +static void +extract_range_from_binary_expr_1 (value_range *vr, + enum tree_code code, tree expr_type, + value_range *vr0_, value_range *vr1_) +{ + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, vr1_, NULL); +} + /* Extract range information from a binary expression OP0 CODE OP1 based on the ranges of each of its operands with resulting type EXPR_TYPE. The resulting range is stored in *VR. */ -static void +void extract_range_from_binary_expr (value_range *vr, enum tree_code code, - tree expr_type, tree op0, tree op1) + tree expr_type, tree op0, tree op1, + bool *range_split) { value_range vr0 = VR_INITIALIZER; value_range vr1 = VR_INITIALIZER; + if (range_split != NULL) + *range_split = true; /* Get value ranges for each operand. For constant operands, create a new value range with the operand to simplify processing. */ if (TREE_CODE (op0) == SSA_NAME) - vr0 = *(get_value_range (op0)); + { + value_range *vtmp = get_value_range (op0); + if (range_split != NULL && vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr0 = *vtmp; + } else if (is_gimple_min_invariant (op0)) set_value_range_to_value (&vr0, op0, NULL); else set_value_range_to_varying (&vr0); if (TREE_CODE (op1) == SSA_NAME) - vr1 = *(get_value_range (op1)); + { + value_range *vtmp = get_value_range (op1); + if (range_split != NULL && vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr1 = *vtmp; + } else if (is_gimple_min_invariant (op1)) set_value_range_to_value (&vr1, op1, NULL); else set_value_range_to_varying (&vr1); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1, + range_split); /* Try harder for PLUS and MINUS if the range of one operand is symbolic and based on the other operand, for example if it was deduced from a @@ -3266,7 +3303,8 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr1, VR_RANGE, op1, op1, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1, + range_split); } if (vr->type == VR_VARYING @@ -3290,10 +3328,19 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr0, VR_RANGE, op0, op0, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1, + range_split); } } +static void +extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1) +{ + extract_range_from_binary_expr (vr, code, expr_type, op0, op1, NULL); +} + /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ @@ -3699,14 +3746,32 @@ check_for_binary_op_overflow (enum tree_code subcode, tree type, value_range vr0 = VR_INITIALIZER; value_range vr1 = VR_INITIALIZER; if (TREE_CODE (op0) == SSA_NAME) - vr0 = *get_value_range (op0); + { + value_range *vrtmp = get_value_range (op0); + if (vrtmp != NULL) + vr0 = *vrtmp; + else + { + *ovf = true; + return true; + } + } else if (TREE_CODE (op0) == INTEGER_CST) set_value_range_to_value (&vr0, op0, NULL); else set_value_range_to_varying (&vr0); if (TREE_CODE (op1) == SSA_NAME) - vr1 = *get_value_range (op1); + { + value_range *vrtmp = get_value_range (op1); + if (vrtmp != NULL) + vr1 = *vrtmp; + else + { + *ovf = true; + return true; + } + } else if (TREE_CODE (op1) == INTEGER_CST) set_value_range_to_value (&vr1, op1, NULL); else @@ -10497,7 +10562,7 @@ identify_jump_threads (void) /* We're basically looking for a switch or any kind of conditional with integral or pointer type arguments. Note the type of the second argument will be the same as the first argument, so no need to - check it explicitly. + check it explicitly. We also handle the case where there are no statements in the block. This come up with forwarder blocks that are not diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h index 5cea709..793b584 100644 --- a/gcc/tree-vrp.h +++ b/gcc/tree-vrp.h @@ -17,6 +17,9 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. */ +#ifndef GCC_TREE_VRP_H +#define GCC_TREE_VRP_H + /* Type of value ranges. See value_range_d In tree-vrp.c for a description of these types. */ enum value_range_type { VR_UNDEFINED, VR_RANGE, @@ -48,6 +51,8 @@ struct GTY(()) value_range bitmap equiv; }; +#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } + extern void vrp_intersect_ranges (value_range *vr0, value_range *vr1); extern void vrp_meet (value_range *vr0, const value_range *vr1); extern void dump_value_range (FILE *, const value_range *); @@ -56,4 +61,8 @@ extern void extract_range_from_unary_expr (value_range *vr, tree type, value_range *vr0_, tree op0_type); - +extern void extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1, + bool *ovf); +#endif /* GCC_TREE_VRP_H */ ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-11-16 15:54 ` Robin Dapp@ 2016-11-25 6:49 ` Robin Dapp2016-11-25 13:47 ` Richard Biener 1 sibling, 0 replies; 47+ messages in thread From: Robin Dapp @ 2016-11-25 6:49 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches Ping. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-11-16 15:54 ` Robin Dapp 2016-11-25 6:49 ` Robin Dapp@ 2016-11-25 13:47 ` Richard Biener2016-11-28 11:13 ` Richard Biener 1 sibling, 1 reply; 47+ messages in thread From: Richard Biener @ 2016-11-25 13:47 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Wed, Nov 16, 2016 at 4:54 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: > Found some time to look into this again. > >> Index: tree-ssa-propagate.c >> =================================================================== >> --- tree-ssa-propagate.c (revision 240133) >> +++ tree-ssa-propagate.c (working copy) >> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >> /* Replace real uses in the statement. */ >> did_replace |= replace_uses_in (stmt, get_value_fn); >> >> - /* If we made a replacement, fold the statement. */ >> - if (did_replace) >> + /* Fold the statement. */ >> + if (fold_stmt (&i, follow_single_use_edges)) >> { >> - fold_stmt (&i, follow_single_use_edges); >> + did_replace = true; >> stmt = gsi_stmt (i); >> } >> >> this would need compile-time cost evaluation (and avoid the tree-vrp.c >> folding part >> of your patch). > > This causes an ICE and bootstrap errors with newer revisions. I tested > my patch on r240691 where it still works. How can this be done properly now? Not sure, I'd have to investigate. It should still just work (throwing off a bootstrap with just that change over the weekend). >> OTOH given that you use this to decide whether you can use a combined constant >> that doesn't look necessary (if the constant is correct, that is) -- >> what you need >> to make sure is that the final operation, (T)(A) +- CST, does not overflow >> if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the >> original operation. I think that always holds, thus the combine_ovf check looks >> superfluous to me. > > Removed the check and addressed the other remarks. > >> So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2 >> does not overflow. But we do not really care for that, we want to know >> whether (T)X + CST' might invoke undefined behavior when the original >> expression did not. This involves range information on X. I don't >> see how we ensure this here. > > I guess I'm still missing an undefined behavior case. In which case can > (T)X + CST' trigger undefined behavior where the original statement did > not? I see the need for checking in the second pattern ((T)(X) + CST' -> > (T)(X + CST')), of course. Looking at + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (TREE_CODE (type) == INTEGER_TYPE && + TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3))) so the conversion (T) is widening or sign-changing. (&& go to the next line) If A + CST overflows we cannot do the transform (you check that with extract_range_from_binary_expr setting 'range_split'). If A + CST does not overflow but is unsigned and we are just changing sign (precision ==) then (T)A + (CST + CST) might overflow. Consider (int)(INT_MAX + 1) + 1 -> INT_MAX + 2. I think here the important part is whether A + CST fits in T for the case where we just change the type to a type with !TYPE_OVERFLOW_WRAPS. Certainly restricting to widenings would avoid the issue. >> But that's "easily fixable" by computing it in unsigned arithmetic, that is >> doing >> >> (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2)) > > Does this also work if (unsigned long)a + (LONG_MAX + 2) does not fit > into [0,LONG_MAX]? IIRC (unsigned long)(LONG_MAX + 2) is > implementation-defined and not undefined so it should work? Yes, implementation-defined beavior is fine. > Revised patch version attached. One thing I'm still not sure about is > the handling of sth. like (unsigned long)(a + UINT_MAX) + 1 for a = 0. > In the current patch version I always do a sign-extend of the first > constant (UINT_MAX here) which seems to cause no problems in the > testsuite and some custom tests still worked. + /* Sign-extend @1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED); not sure why you do always sign-extend. If the inner op is unsigned and we widen then that's certainly bogus considering your UINT_MAX example above. Does w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); not work for some reason? + /* Combine in outer, larger type. */ + bool combine_ovf = true; + combined_cst = wi::add (w1, w2, SIGNED, &combine_ovf); as you ignore combine_ovf you can simply use combined_cst = wi::add (w1, w2); + /* Convert combined constant to tree of outer type if + there was no value range split in the original operation. */ + if (!range_split) + { I'd say you want to condition on range_split early, like with bool range_split; if (TYPE_OVERFLOW_UNDEFINED (inner_type) || ! (extract_range_from_binary_expr (...., &range_split), range_split)) { ... } and avoid all the work if you throw it away anyway. > Can UINT_MAX, -1 and > similar cases be disambiguated (and correctly converted to the outer > type) when done in unsigned arithmetic? see above how I expect it to "just work". > > Thinking about the second pattern, on s390x it introduces more casts > than just using the first one (e.g. in cases where the value will be > sign-extended after the operation which wouldn't have happened when > performing the operation in the larger type. Can we catch this > with a cost function? Not sure, if it's really too bad we can try merging the two patterns again, thus have (T)(A +- CST) +- CST -> (T)(A +- CST) again. Now that both patterns look simple enough that should be possible without too much hassle. + (if (cst) + (outer_op (convert { @0; }) { cst; })) you can write this as (if (cst) (outer_op (convert @0) { cst})) no need for the {}s around @0. + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE + && TREE_CODE (type) == INTEGER_TYPE) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + tree cst_inner; + bool range_split = true; + + wide_int cst = @2; + cst_inner = wide_int_to_tree (TREE_TYPE (@0), cst); not sure if that really does what you want (you're truncating the constant to the smaller type). I think you want to check int_fits_type_p (TREE_TYPE (@0), @2) and then simply use tree cst_inner = fold_convert (TREE_TYPE (@0), @2); as you do not allow a simple sign-change here if you merge the patterns disallowing it in the above one would simplify things as well. + value_range vr = VR_INITIALIZER; + extract_range_from_binary_expr (&vr, outer_op, TREE_TYPE (@0), + @0, cst_inner, &range_split); > > On a side note: Can/should VRP infer ranges assuming no undefined > behavior will take place when -fstrict-overflow is in use? I.e. > inferring ~[INT_MIN,INT_MIN] for (long)(a - 1)? Would this even make sense? It could do that but it does not at the moment. Richard. > Regards > Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-11-25 13:47 ` Richard Biener@ 2016-11-28 11:13 ` Richard Biener2016-11-28 13:26 ` Robin Dapp 0 siblings, 1 reply; 47+ messages in thread From: Richard Biener @ 2016-11-28 11:13 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Fri, Nov 25, 2016 at 2:46 PM, Richard Biener <richard.guenther@gmail.com> wrote> On Wed, Nov 16, 2016 at 4:54 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >> Found some time to look into this again. >> >>> Index: tree-ssa-propagate.c >>> =================================================================== >>> --- tree-ssa-propagate.c (revision 240133) >>> +++ tree-ssa-propagate.c (working copy) >>> @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker::before_d >>> /* Replace real uses in the statement. */ >>> did_replace |= replace_uses_in (stmt, get_value_fn); >>> >>> - /* If we made a replacement, fold the statement. */ >>> - if (did_replace) >>> + /* Fold the statement. */ >>> + if (fold_stmt (&i, follow_single_use_edges)) >>> { >>> - fold_stmt (&i, follow_single_use_edges); >>> + did_replace = true; >>> stmt = gsi_stmt (i); >>> } >>> >>> this would need compile-time cost evaluation (and avoid the tree-vrp.c >>> folding part >>> of your patch). >> >> This causes an ICE and bootstrap errors with newer revisions. I tested >> my patch on r240691 where it still works. How can this be done properly now? > > Not sure, I'd have to investigate. It should still just work > (throwing off a bootstrap > with just that change over the weekend). Index: gcc/tree-ssa-propagate.c =================================================================== --- gcc/tree-ssa-propagate.c (revision 242875) +++ gcc/tree-ssa-propagate.c (working copy) @@ -1065,13 +1065,15 @@ substitute_and_fold_dom_walker::before_d /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); + if (did_replace) + gimple_set_modified (stmt, true); /* If we made a replacement, fold the statement. */ - if (did_replace) + if (fold_stmt (&i, follow_single_use_edges)) { - fold_stmt (&i, follow_single_use_edges); stmt = gsi_stmt (i); gimple_set_modified (stmt, true); + did_replace = true; } /* Some statements may be simplified using propagator works fine for me. >>> OTOH given that you use this to decide whether you can use a combined constant >>> that doesn't look necessary (if the constant is correct, that is) -- >>> what you need >>> to make sure is that the final operation, (T)(A) +- CST, does not overflow >>> if ! TYPE_OVERFLOW_WRAPS and there wasn't any overflow in the >>> original operation. I think that always holds, thus the combine_ovf check looks >>> superfluous to me. >> >> Removed the check and addressed the other remarks. >> >>> So now we know that for (T)(X + CST1) + CST2, (T)CST1 + CST2 >>> does not overflow. But we do not really care for that, we want to know >>> whether (T)X + CST' might invoke undefined behavior when the original >>> expression did not. This involves range information on X. I don't >>> see how we ensure this here. >> >> I guess I'm still missing an undefined behavior case. In which case can >> (T)X + CST' trigger undefined behavior where the original statement did >> not? I see the need for checking in the second pattern ((T)(X) + CST' -> >> (T)(X + CST')), of course. > > Looking at > > + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ > +#if GIMPLE > + (for outer_op (plus minus) > + (for inner_op (plus minus) > + (simplify > + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) > + (if (TREE_CODE (type) == INTEGER_TYPE && > + TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@3))) > > so the conversion (T) is widening or sign-changing. (&& go to the next line) > > If A + CST overflows we cannot do the transform (you check that > with extract_range_from_binary_expr setting 'range_split'). > > If A + CST does not overflow but is unsigned and we are just changing sign > (precision ==) then (T)A + (CST + CST) might overflow. Consider > (int)(INT_MAX + 1) + 1 -> INT_MAX + 2. I think here the important part > is whether A + CST fits in T for the case where we just change the type > to a type with !TYPE_OVERFLOW_WRAPS. Certainly restricting to > widenings would avoid the issue. > >>> But that's "easily fixable" by computing it in unsigned arithmetic, that is >>> doing >>> >>> (long)(a + 2) + LONG_MAX -> (long)((unsigned long)a + (LONG_MAX + 2)) >> >> Does this also work if (unsigned long)a + (LONG_MAX + 2) does not fit >> into [0,LONG_MAX]? IIRC (unsigned long)(LONG_MAX + 2) is >> implementation-defined and not undefined so it should work? > > Yes, implementation-defined beavior is fine. > >> Revised patch version attached. One thing I'm still not sure about is >> the handling of sth. like (unsigned long)(a + UINT_MAX) + 1 for a = 0. >> In the current patch version I always do a sign-extend of the first >> constant (UINT_MAX here) which seems to cause no problems in the >> testsuite and some custom tests still worked. > > + /* Sign-extend @1 to TYPE. */ > + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED); > > not sure why you do always sign-extend. If the inner op is unsigned > and we widen then that's certainly bogus considering your UINT_MAX > example above. Does > > w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); > > not work for some reason? > > + /* Combine in outer, larger type. */ > + bool combine_ovf = true; > + combined_cst = wi::add (w1, w2, SIGNED, &combine_ovf); > > as you ignore combine_ovf you can simply use > > combined_cst = wi::add (w1, w2); > > + /* Convert combined constant to tree of outer type if > + there was no value range split in the original operation. */ > + if (!range_split) > + { > > I'd say you want to condition on range_split early, like with > > bool range_split; > if (TYPE_OVERFLOW_UNDEFINED (inner_type) > || ! (extract_range_from_binary_expr (...., &range_split), range_split)) > { > ... > } > > and avoid all the work if you throw it away anyway. > >> Can UINT_MAX, -1 and >> similar cases be disambiguated (and correctly converted to the outer >> type) when done in unsigned arithmetic? > > see above how I expect it to "just work". > >> >> Thinking about the second pattern, on s390x it introduces more casts >> than just using the first one (e.g. in cases where the value will be >> sign-extended after the operation which wouldn't have happened when >> performing the operation in the larger type. Can we catch this >> with a cost function? > > Not sure, if it's really too bad we can try merging the two patterns again, > thus have (T)(A +- CST) +- CST -> (T)(A +- CST) again. Now that both > patterns look simple enough that should be possible without too much hassle. > > + (if (cst) > + (outer_op (convert { @0; }) { cst; })) > > you can write this as > > (if (cst) > (outer_op (convert @0) { cst})) > > no need for the {}s around @0. > > + /* ((T)(A)) +- CST -> (T)(A +- CST) */ > +#if GIMPLE > + (for outer_op (plus minus) > + (simplify > + (outer_op (convert @0) INTEGER_CST@2) > + (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) > + && TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE > + && TREE_CODE (type) == INTEGER_TYPE) > + /* Perform binary operation inside the cast if the constant fits > + and there is no overflow. */ > + (with > + { > + tree cst_inner; > + bool range_split = true; > + > + wide_int cst = @2; > + cst_inner = wide_int_to_tree (TREE_TYPE (@0), cst); > > not sure if that really does what you want (you're truncating the constant > to the smaller type). I think you want to check > > int_fits_type_p (TREE_TYPE (@0), @2) > > and then simply use > > tree cst_inner = fold_convert (TREE_TYPE (@0), @2); > > as you do not allow a simple sign-change here if you merge the patterns > disallowing it in the above one would simplify things as well. > > + value_range vr = VR_INITIALIZER; > + extract_range_from_binary_expr (&vr, outer_op, TREE_TYPE (@0), > + @0, cst_inner, &range_split); > > >> >> On a side note: Can/should VRP infer ranges assuming no undefined >> behavior will take place when -fstrict-overflow is in use? I.e. >> inferring ~[INT_MIN,INT_MIN] for (long)(a - 1)? Would this even make sense? > > It could do that but it does not at the moment. > > Richard. > >> Regards >> Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-11-28 11:13 ` Richard Biener@ 2016-11-28 13:26 ` Robin Dapp2016-12-05 7:57 ` Robin Dapp 2016-12-06 13:03 ` Richard Biener 0 siblings, 2 replies; 47+ messages in thread From: Robin Dapp @ 2016-11-28 13:26 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches >> + /* Sign-extend @1 to TYPE. */ >> + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED); >> >> not sure why you do always sign-extend. If the inner op is unsigned >> and we widen then that's certainly bogus considering your UINT_MAX >> example above. Does >> >> w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); >> >> not work for some reason? With TYPE_SIGN (inner_type), I encountered situations like this in the testsuite (mostly Fortran but also 20000422-1.c): Folding statement: _1 = _9 + 1; Applying pattern match.pd:1214, gimple-match.c:8719 gimple_simplified to _93 = (sizetype) _8; _1 = _93 + 4294967296; Folded into: _1 = _93 + 4294967296; in _8 = (unsigned int) i_73; _5 = _8 + 4294967295; _9 = (sizetype) _5; _93 = (sizetype) _8; _1 = _93 + 4294967296; TYPE_SIGN (inner_type) is PLUS here, although it should be MINUS in order to combine -1 and +1 to 0. Perhaps this can be handled differently with keeping TYPE_SIGN (inner_type)? On the other hand, using SIGNED instead of TYPE_SIGN (inner_type) worked for all cases I looked at, like if (a > 1 && a < 10) return (unsigned long)(a + UINT_MAX) + 1; For if (a > 0 && a < 10) extract_range...() would not find a non-VR_VARYING range although we should probably still be able to simplify this. if (a > 0) Here, vrp figures out [0,4294967294] and the simplification takes place. For if (a <= 10) return (unsigned long)(a + (UINT_MAX - 10)) + 1; 003t.original already reads return (long unsigned int) a + 4294967286 From this I hand-wavingly deduced, we'd only see instances where a sign-extension of the constant is fine (test suites on s390x and x86 affirm this for what it's woth) but I don't have a cogent reason hence my doubts :) I'm ok with omitting the sign-changing case (I hadn't though of it anyway) and adapted the patch. I haven't attached the updated version though, because I'm still unsure about the issue above (despite the clean test suite). Regards Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-11-28 13:26 ` Robin Dapp@ 2016-12-05 7:57 ` Robin Dapp2016-12-06 13:03 ` Richard Biener 1 sibling, 0 replies; 47+ messages in thread From: Robin Dapp @ 2016-12-05 7:57 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches Ping. Any idea how to tackle this? ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-11-28 13:26 ` Robin Dapp 2016-12-05 7:57 ` Robin Dapp@ 2016-12-06 13:03 ` Richard Biener2016-12-07 16:15 ` Robin Dapp 1 sibling, 1 reply; 47+ messages in thread From: Richard Biener @ 2016-12-06 13:03 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Mon, Nov 28, 2016 at 2:26 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >>> + /* Sign-extend @1 to TYPE. */ >>> + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED); >>> >>> not sure why you do always sign-extend. If the inner op is unsigned >>> and we widen then that's certainly bogus considering your UINT_MAX >>> example above. Does >>> >>> w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); >>> >>> not work for some reason? > > With TYPE_SIGN (inner_type), I encountered situations like this in the > testsuite (mostly Fortran but also 20000422-1.c): > > Folding statement: _1 = _9 + 1; > Applying pattern match.pd:1214, gimple-match.c:8719 > gimple_simplified to _93 = (sizetype) _8; > _1 = _93 + 4294967296; > Folded into: _1 = _93 + 4294967296; > > in > _8 = (unsigned int) i_73; > _5 = _8 + 4294967295; > _9 = (sizetype) _5; > _93 = (sizetype) _8; > _1 = _93 + 4294967296; > > TYPE_SIGN (inner_type) is PLUS here, although it should be MINUS in > order to combine -1 and +1 to 0. Perhaps this can be handled differently > with keeping TYPE_SIGN (inner_type)? So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type) produces (uint64_t)uint32 + -1U + 1. This simply means that we cannot ignore overflow of the inner operation and for some reason your change to extract_range_from_binary_expr didn't catch this. That is _8 + 4294967295 overflows but we ignored that. > On the other hand, using SIGNED instead of TYPE_SIGN (inner_type) worked > for all cases I looked at, like > if (a > 1 && a < 10) > return (unsigned long)(a + UINT_MAX) + 1; > For > if (a > 0 && a < 10) > extract_range...() would not find a non-VR_VARYING range although we > should probably still be able to simplify this. > > if (a > 0) > Here, vrp figures out [0,4294967294] and the simplification takes place. > > For > if (a <= 10) > return (unsigned long)(a + (UINT_MAX - 10)) + 1; > 003t.original already reads > return (long unsigned int) a + 4294967286 > > From this I hand-wavingly deduced, we'd only see instances where a > sign-extension of the constant is fine (test suites on s390x and x86 > affirm this for what it's woth) but I don't have a cogent reason hence > my doubts :) Well, even if sign-extending you can probably construct a testcase where that's still wrong with respect to overflow. Richard. > I'm ok with omitting the sign-changing case (I hadn't though of it > anyway) and adapted the patch. I haven't attached the updated version > though, because I'm still unsure about the issue above (despite the > clean test suite). > > Regards > Robin > ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-12-06 13:03 ` Richard Biener@ 2016-12-07 16:15 ` Robin Dapp2016-12-13 14:13 ` Richard Biener 0 siblings, 1 reply; 47+ messages in thread From: Robin Dapp @ 2016-12-07 16:15 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 1053 bytes --] > So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type) > produces (uint64_t)uint32 + -1U + 1. This simply means that we cannot ignore > overflow of the inner operation and for some reason your change > to extract_range_from_binary_expr didn't catch this. That is _8 + 4294967295 > overflows but we ignored that. In this case the range of _8 was [1,1] so no overflow. I think the problem is rather about the interpretation of the inner constant. I tried discerning two cases now, a range split (i.e. when the single range of a binop variable becomes an anti range or two ranges after combining it with the binop's other range) and an overflow of the range's min and max. If the range didn't split, we can perform the simplification. If there was a min and max overflow, we have to interpret the inner constant as signed and perform a sign extension before converting it to the outer type. Without overflow we can use TYPE_SIGN (inner_type). Does this make sense? Included the remarks and attached the new version. Regards Robin [-- Attachment #2: gcc-wrapped-binop.diff --] [-- Type: text/x-patch, Size: 15187 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index feaa4a1..19df142 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1195,6 +1195,81 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (minus @0 (minus @0 @1)) @1) + /* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (TREE_CODE (type) == INTEGER_TYPE + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) + (with + { + struct vr_binop_overflow ovf; + tree cst = NULL_TREE; + tree inner_type = TREE_TYPE (@3); + value_range vr = VR_INITIALIZER; + + /* Convert combined constant to tree of outer type if + there was no value range split in the original operation. */ + if (TYPE_OVERFLOW_UNDEFINED (inner_type) + || (extract_range_from_binary_expr (&vr, inner_op, + inner_type, @0, @1, &ovf), !ovf.range_split)) + { + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Extend @1 to TYPE. Perform sign extension if the range + overflowed but did not split. */ + w1 = w1.from (w1, TYPE_PRECISION (type), ovf.ovf ? SIGNED : + TYPE_SIGN (inner_type)); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type. */ + bool combine_ovf = true; + combined_cst = wi::add (w1, w2); + + cst = wide_int_to_tree (type, combined_cst); + } + } + (if (cst) + (outer_op (convert @0) { cst; })) + ))))) +#endif + + /* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE + && TREE_CODE (type) == INTEGER_TYPE) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + struct vr_binop_overflow ovf; + + int_fits_type_p (@2, TREE_TYPE (@0)); + tree cst_inner = fold_convert (TREE_TYPE (@0), @2); + + value_range vr = VR_INITIALIZER; + extract_range_from_binary_expr (&vr, outer_op, TREE_TYPE (@0), + @0, cst_inner, &ovf); + } + (if (!ovf.range_split) + (convert (outer_op @0 { cst_inner; }))) + )))) +#endif + /* (A +- CST1) +- CST2 -> A + CST3 */ (for outer_op (plus minus) (for inner_op (plus minus) diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 0000000..19b787b --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "ccp1" } } */ + +#include <limits.h> + +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +unsigned long baq(int a) +{ + return (unsigned long)(a + 1) - 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c new file mode 100644 index 0000000..8befc96 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details -fdump-tree-evrp-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "evrp" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 5 "vrp1" } } */ + +#include <limits.h> + +unsigned long oof2(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a - 1) + 1; +} + +unsigned long bap(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +unsigned long bar3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 5; +} + +unsigned long bar4(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 6; +} + +unsigned long baq(int a) +{ + return (unsigned long)(a - 2) + 1; +} + +long baq3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (long)(a - 1) + 1; +} + +// should simplify (combined constant is 0 not (unsigned long)UINT_MAX + 1 +// VRP creates an anti-range for a +unsigned long foo(short b) +{ + unsigned int a = b; + return (unsigned long)(a + UINT_MAX) + 1; +} + diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c new file mode 100644 index 0000000..fb6d3d9 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c @@ -0,0 +1,32 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +unsigned int a = 3; +int aa = 3; + +int main() +{ + volatile unsigned long b = (unsigned long)(UINT_MAX + 1) - 1; + assert (b == 18446744073709551615ul); + + volatile unsigned long c = (unsigned long)(a - 4) + 1; + assert (c == 4294967296); + + volatile unsigned long d = (unsigned long)(a + UINT_MAX - 4) + 2; + assert (d == 4294967296); + + volatile unsigned long e = (unsigned long)(a - UINT_MAX) + UINT_MAX; + assert (e == 4294967299); + + volatile unsigned long f = (unsigned long)(a + UINT_MAX) - UINT_MAX; + assert (f == 18446744069414584323ul); + + volatile long g = (long)(a - 4) + 1; + assert (g == 4294967296); + + volatile long h = (long)(aa + UINT_MAX) + 1; + assert (h == 3); +} diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 600634d..7b8714d 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -62,8 +62,7 @@ along with GCC; see the file COPYING3. If not see #include "alloc-pool.h" #include "domwalk.h" #include "tree-cfgcleanup.h" - -#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } +#include "tree-vrp.h" /* Allocation pools for tree-vrp allocations. */ static object_allocator<value_range> vrp_value_range_pool ("Tree VRP value ranges"); @@ -2214,7 +2213,8 @@ extract_range_from_multiplicative_op_1 (value_range *vr, static void extract_range_from_binary_expr_1 (value_range *vr, enum tree_code code, tree expr_type, - value_range *vr0_, value_range *vr1_) + value_range *vr0_, value_range *vr1_, + struct vr_binop_overflow *ovf) { value_range vr0 = *vr0_, vr1 = *vr1_; value_range vrtem0 = VR_INITIALIZER, vrtem1 = VR_INITIALIZER; @@ -2273,12 +2273,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr0.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr0, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, &vrtem0, vr1_); + extract_range_from_binary_expr_1 (vr, code, expr_type, + &vrtem0, vr1_, ovf); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - &vrtem1, vr1_); + &vrtem1, vr1_, ovf); vrp_meet (vr, &vrres); } return; @@ -2287,12 +2288,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr1.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr1, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, &vrtem0); + extract_range_from_binary_expr_1 (vr, code, expr_type, + vr0_, &vrtem0, ovf); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - vr0_, &vrtem1); + vr0_, &vrtem1, ovf); vrp_meet (vr, &vrres); } return; @@ -2505,6 +2507,13 @@ extract_range_from_binary_expr_1 (value_range *vr, max_ovf = 1; } + if (ovf != NULL) + { + ovf->range_split = min_ovf != max_ovf; + ovf->ovf = (min_ovf == 1 && max_ovf == 1) + || (min_ovf == -1 && max_ovf == -1); + } + /* If we have overflow for the constant part and the resulting range will be symbolic, drop to VR_VARYING. */ if ((min_ovf && sym_min_op0 != sym_min_op1) @@ -2847,7 +2856,7 @@ extract_range_from_binary_expr_1 (value_range *vr, saved_flag_wrapv = flag_wrapv; flag_wrapv = 1; extract_range_from_binary_expr_1 (vr, MULT_EXPR, expr_type, - &vr0, &vr1p); + &vr0, &vr1p, ovf); flag_wrapv = saved_flag_wrapv; return; } @@ -3225,35 +3234,70 @@ extract_range_from_binary_expr_1 (value_range *vr, set_value_range (vr, type, min, max, NULL); } +static void +extract_range_from_binary_expr_1 (value_range *vr, + enum tree_code code, tree expr_type, + value_range *vr0_, value_range *vr1_) +{ + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, vr1_, + NULL); +} + /* Extract range information from a binary expression OP0 CODE OP1 based on the ranges of each of its operands with resulting type EXPR_TYPE. The resulting range is stored in *VR. */ -static void +void extract_range_from_binary_expr (value_range *vr, enum tree_code code, - tree expr_type, tree op0, tree op1) + tree expr_type, tree op0, tree op1, + struct vr_binop_overflow *ovf) { value_range vr0 = VR_INITIALIZER; value_range vr1 = VR_INITIALIZER; + /* + if (ovf != NULL) + { + ovf->range_split = true; + ovf->ovf = true; + } + */ /* Get value ranges for each operand. For constant operands, create a new value range with the operand to simplify processing. */ if (TREE_CODE (op0) == SSA_NAME) - vr0 = *(get_value_range (op0)); + { + value_range *vtmp = get_value_range (op0); + if (ovf != NULL && vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr0 = *vtmp; + } else if (is_gimple_min_invariant (op0)) set_value_range_to_value (&vr0, op0, NULL); else set_value_range_to_varying (&vr0); if (TREE_CODE (op1) == SSA_NAME) - vr1 = *(get_value_range (op1)); + { + value_range *vtmp = get_value_range (op1); + if (ovf != NULL && vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr1 = *vtmp; + } else if (is_gimple_min_invariant (op1)) set_value_range_to_value (&vr1, op1, NULL); else set_value_range_to_varying (&vr1); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1, ovf); /* Try harder for PLUS and MINUS if the range of one operand is symbolic and based on the other operand, for example if it was deduced from a @@ -3281,7 +3325,8 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr1, VR_RANGE, op1, op1, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1, + ovf); } if (vr->type == VR_VARYING @@ -3305,10 +3350,19 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr0, VR_RANGE, op0, op0, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1, + ovf); } } +static void +extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1) +{ + extract_range_from_binary_expr (vr, code, expr_type, op0, op1, NULL); +} + /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ @@ -3714,14 +3768,32 @@ check_for_binary_op_overflow (enum tree_code subcode, tree type, value_range vr0 = VR_INITIALIZER; value_range vr1 = VR_INITIALIZER; if (TREE_CODE (op0) == SSA_NAME) - vr0 = *get_value_range (op0); + { + value_range *vrtmp = get_value_range (op0); + if (vrtmp != NULL) + vr0 = *vrtmp; + else + { + *ovf = true; + return true; + } + } else if (TREE_CODE (op0) == INTEGER_CST) set_value_range_to_value (&vr0, op0, NULL); else set_value_range_to_varying (&vr0); if (TREE_CODE (op1) == SSA_NAME) - vr1 = *get_value_range (op1); + { + value_range *vrtmp = get_value_range (op1); + if (vrtmp != NULL) + vr1 = *vrtmp; + else + { + *ovf = true; + return true; + } + } else if (TREE_CODE (op1) == INTEGER_CST) set_value_range_to_value (&vr1, op1, NULL); else @@ -10553,7 +10625,7 @@ identify_jump_threads (void) /* We're basically looking for a switch or any kind of conditional with integral or pointer type arguments. Note the type of the second argument will be the same as the first argument, so no need to - check it explicitly. + check it explicitly. We also handle the case where there are no statements in the block. This come up with forwarder blocks that are not diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h index 5cea709..3ddffad 100644 --- a/gcc/tree-vrp.h +++ b/gcc/tree-vrp.h @@ -17,6 +17,9 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. */ +#ifndef GCC_TREE_VRP_H +#define GCC_TREE_VRP_H + /* Type of value ranges. See value_range_d In tree-vrp.c for a description of these types. */ enum value_range_type { VR_UNDEFINED, VR_RANGE, @@ -48,6 +51,21 @@ struct GTY(()) value_range bitmap equiv; }; +/* Overflow properties of a binop's value range. */ +struct GTY(()) vr_binop_overflow +{ + /* True if both min_ovf and max_ovf occurred. */ + bool ovf; + /* True if the range split during range calculation i.e. either + min_ovf or max_ovf occurred e.g. [0,1] + [UINT_MAX, UINT_MAX]. */ + bool range_split; + + vr_binop_overflow() : + ovf(true), range_split(true) {} +}; + +#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } + extern void vrp_intersect_ranges (value_range *vr0, value_range *vr1); extern void vrp_meet (value_range *vr0, const value_range *vr1); extern void dump_value_range (FILE *, const value_range *); @@ -56,4 +74,9 @@ extern void extract_range_from_unary_expr (value_range *vr, tree type, value_range *vr0_, tree op0_type); +extern void extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1, + struct vr_binop_overflow *overflow); +#endif /* GCC_TREE_VRP_H */ [-- Attachment #3: gcc-propagate.diff --] [-- Type: text/x-patch, Size: 763 bytes --] diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c index fbe7e13..110587d 100644 --- a/gcc/tree-ssa-propagate.c +++ b/gcc/tree-ssa-propagate.c @@ -1065,13 +1065,15 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb) /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); + if (did_replace) + gimple_set_modified (stmt, true); /* If we made a replacement, fold the statement. */ - if (did_replace) + if (fold_stmt (&i, follow_single_use_edges)) { - fold_stmt (&i, follow_single_use_edges); stmt = gsi_stmt (i); gimple_set_modified (stmt, true); + did_replace = true; } /* Some statements may be simplified using propagator ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-12-07 16:15 ` Robin Dapp@ 2016-12-13 14:13 ` Richard Biener2017-01-10 13:33 ` Robin Dapp 0 siblings, 1 reply; 47+ messages in thread From: Richard Biener @ 2016-12-13 14:13 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Wed, Dec 7, 2016 at 5:14 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >> So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type) >> produces (uint64_t)uint32 + -1U + 1. This simply means that we cannot ignore >> overflow of the inner operation and for some reason your change >> to extract_range_from_binary_expr didn't catch this. That is _8 + 4294967295 >> overflows but we ignored that. > > In this case the range of _8 was [1,1] so no overflow. > I think the problem is rather about the interpretation of the inner > constant. I tried discerning two cases now, a range split (i.e. when the > single range of a binop variable becomes an anti range or two ranges > after combining it with the binop's other range) and an overflow of the > range's min and max. > If the range didn't split, we can perform the simplification. If there > was a min and max overflow, we have to interpret the inner constant as > signed and perform a sign extension before converting it to the outer > type. Without overflow we can use TYPE_SIGN (inner_type). > Does this make sense? I'm not sure there is anything to "interpret" -- the operation is unsigned and overflow is when the operation may wrap around zero. There might be clever ways of re-writing the expression to (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1) avoiding the overflow and thus allowing the transform but I'm not sure that's good. A related thing would be canonicalizing unsigned X plus CST to unsigned X minus CST' if CST' has a smaller absolute value than CST. I think currently we simply canonicalize to 'plus CST' but also only in fold-const.c, not in match.pd (ah we do, but only in a simplified manner). That said, can we leave that "trick" out of the patch? I think your more complicated "overflows" result from extract_range_from_binary_expr_1 doesn't apply to all ops (like MULT_EXPR where more complicated cases can arise). Richard. > Included the remarks and attached the new version. > > Regards > Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262016-12-13 14:13 ` Richard Biener@ 2017-01-10 13:33 ` Robin Dapp2017-01-17 7:34 ` Robin Dapp 2017-01-17 9:48 ` Richard Biener 0 siblings, 2 replies; 47+ messages in thread From: Robin Dapp @ 2017-01-10 13:33 UTC (permalink / raw) To: Richard Biener, GCC Patches Perhaps I'm still missing how some cases are handled or not handled, sorry for the noise. > I'm not sure there is anything to "interpret" -- the operation is unsigned > and overflow is when the operation may wrap around zero. There might > be clever ways of re-writing the expression to > (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1) > avoiding the overflow and thus allowing the transform but I'm not sure that's > good. The extra work I introduced was to discern between (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a), (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a) + (uint64_t)(UINT_MAX + 1), For a's range of [1,1] there is an overflow in both cases. We still want to simplify the first case by combining UINT_MAX + 1 -> 0. If "interpreting" UINT_MAX as -1 is not the correct thing to do, perhaps (uint64_t)((uint32_t)(UINT_MAX + 1)) is? This fails, however, if the outer constant is larger than UINT_MAX. What else can we do here? Do we see cases like the second one at all? If it's not needed, the extra work is likely not needed. > A related thing would be canonicalizing unsigned X plus CST to > unsigned X minus CST' > if CST' has a smaller absolute value than CST. I think currently we > simply canonicalize > to 'plus CST' but also only in fold-const.c, not in match.pd (ah we > do, but only in a simplified manner). I can imagine this could simplify the treatment of some cases, yet I'm already a bit lost with the current cases :) > That said, can we leave that "trick" out of the patch? I think your > more complicated > "overflows" result from extract_range_from_binary_expr_1 doesn't apply to all > ops (like MULT_EXPR where more complicated cases can arise). There is certainly additional work to be done for MULT_EXPR, I disregarded it so far. For this patch, I'd rather conservatively assume overflow. Regards Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262017-01-10 13:33 ` Robin Dapp@ 2017-01-17 7:34 ` Robin Dapp2017-01-17 9:48 ` Richard Biener 1 sibling, 0 replies; 47+ messages in thread From: Robin Dapp @ 2017-01-17 7:34 UTC (permalink / raw) To: Richard Biener, GCC Patches Ping. To put it shortly, I'm not sure how to differentiate between: example range of a: [3,3] (ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(-1 + 1), sign extend example range of a: [0,0] (ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(UINT_MAX + 1), no sign extend In this case, there is an "ok" overflow in the first example's range and no overflow in the second, but I don't think this suffices as criterion. As said, your proposed canonicalization (" unsigned X plus CST to unsigned X minus CST' if CST' has a smaller absolute value than CST) might help here, but I'm unsure how to do it currently. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262017-01-10 13:33 ` Robin Dapp 2017-01-17 7:34 ` Robin Dapp@ 2017-01-17 9:48 ` Richard Biener2017-02-02 9:27 ` Robin Dapp 2017-05-11 15:08 ` Bin.Cheng 1 sibling, 2 replies; 47+ messages in thread From: Richard Biener @ 2017-01-17 9:48 UTC (permalink / raw) To: Robin Dapp;+Cc:GCC Patches On Tue, Jan 10, 2017 at 2:32 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: > Perhaps I'm still missing how some cases are handled or not handled, > sorry for the noise. > >> I'm not sure there is anything to "interpret" -- the operation is unsigned >> and overflow is when the operation may wrap around zero. There might >> be clever ways of re-writing the expression to >> (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1) >> avoiding the overflow and thus allowing the transform but I'm not sure that's >> good. > > The extra work I introduced was to discern between > > (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a), > (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a) + (uint64_t)(UINT_MAX + 1), > > For a's range of [1,1] there is an overflow in both cases. > We still want to simplify the first case by combining UINT_MAX + 1 -> 0. So there's the case where a == 0 where (uint64_t)(0 + UINT_MAX) + 1 is not zero. I think we're still searching for the proper condition on when it is allowed to re-write (uint64_t)(uint32_t + uint32_t-CST) + uint64_t-CST to (uint64_t)(uint32_t) + (uint64_t)uint32_t-CST + uint64_t-CST or to (uint64_t)(uint32_t) + (uint64_t)(uint32_t-CST + (uint32_t)uint64_t-CST) > If "interpreting" UINT_MAX as -1 is not the correct thing to do, perhaps > (uint64_t)((uint32_t)(UINT_MAX + 1)) is? This fails, however, if the > outer constant is larger than UINT_MAX. What else can we do here? > Do we see cases like the second one at all? If it's not needed, the > extra work is likely not needed. We do have the need to associate and simplfy across conversions but of course we have to do it conservatively correct. >> A related thing would be canonicalizing unsigned X plus CST to >> unsigned X minus CST' >> if CST' has a smaller absolute value than CST. I think currently we >> simply canonicalize >> to 'plus CST' but also only in fold-const.c, not in match.pd (ah we >> do, but only in a simplified manner). > > I can imagine this could simplify the treatment of some cases, yet I'm > already a bit lost with the current cases :) Yes, but I am lost a bit as well. I don't know the correct conditions to test off-head -- I know we may not introduce new undefined overflow ops and of course we need to not compute wrong numbers either. >> That said, can we leave that "trick" out of the patch? I think your >> more complicated >> "overflows" result from extract_range_from_binary_expr_1 doesn't apply to all >> ops (like MULT_EXPR where more complicated cases can arise). > > There is certainly additional work to be done for MULT_EXPR, I > disregarded it so far. For this patch, I'd rather conservatively assume > overflow. Ok... Richard. > Regards > Robin > ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262017-01-17 9:48 ` Richard Biener@ 2017-02-02 9:27 ` Robin Dapp2017-05-09 7:31 ` Robin Dapp 2017-05-11 15:08 ` Bin.Cheng 1 sibling, 1 reply; 47+ messages in thread From: Robin Dapp @ 2017-02-02 9:27 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 1424 bytes --] I skimmed through the code to see where transformation like (a - 1) -> (a + UINT_MAX) are performed. It seems there are only two places, match.pd (/* A - B -> A + (-B) if B is easily negatable. */) and fold-const.c. In order to be able to reliably know whether to zero-extend or to sign-extend the constant (1) in (ulong)(a - 1) + 1 -> (ulong)(a) + 0 or -> (ulong)(a) + (ulong)(UINT_MAX + 1) we'd have to know if the constant was converted by a transformation like above. I first tried removing the two transformations altogether but this causes around 20 new regressions on s390x and I didn't go through all of them to see whether they can be fixed. Is there a rationale for applying the transformations other than canonicalization? I saw some optimizations that only apply to PLUS_EXPRs but that can easily be changed to also include MINUS_EXPR. My other attempt entails an additional flag TREE_WAS_SIGNED for the lack of a better naming idea. It is set when the above transformation takes place. I used (NODE)->base.deprecated_flag but there may be better choices. Tentative/example patch attached for reference. Using this, the type extension in my original patch can be changed to w1 = w1.from (w1, TYPE_PRECISION (type), TREE_WAS_SIGNED (@1) ? SIGNED : TYPE_SIGN (inner_type)); which works for me and does not introduce regressions on s390x. Is this a viable approach? Regards Robin [-- Attachment #2: gcc-was-signed.diff --] [-- Type: text/x-patch, Size: 2736 bytes --] diff --git a/gcc/fold-const.c b/gcc/fold-const.c index c649e54..cbb7469 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -9648,9 +9648,15 @@ fold_binary_loc (location_t loc, && (TREE_CODE (op1) != REAL_CST || REAL_VALUE_NEGATIVE (TREE_REAL_CST (op1)))) || INTEGRAL_TYPE_P (type))) - return fold_build2_loc (loc, PLUS_EXPR, type, - fold_convert_loc (loc, type, arg0), - negate_expr (op1)); + { + tree negtem = negate_expr (op1); + if (CONSTANT_CLASS_P (negtem)) + TREE_WAS_SIGNED (negtem) = 1; + tree tem = fold_build2_loc (loc, PLUS_EXPR, type, + fold_convert_loc (loc, type, arg0), + negtem); + return tem; + } /* Fold &a[i] - &a[j] to i-j. */ if (TREE_CODE (arg0) == ADDR_EXPR @@ -13620,6 +13626,7 @@ fold_negate_const (tree arg0, tree type) t = force_fit_type (type, val, 1, (overflow | TREE_OVERFLOW (arg0)) && !TYPE_UNSIGNED (type)); + TREE_WAS_SIGNED (t) = TREE_WAS_SIGNED (arg0); break; } diff --git a/gcc/match.pd b/gcc/match.pd index b3f2063..e791942 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -870,7 +870,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (simplify (minus @0 negate_expr_p@1) (if (!FIXED_POINT_TYPE_P (type)) - (plus @0 (negate @1)))) + (with { + if (CONSTANT_CLASS_P (@1)) + TREE_WAS_SIGNED (@1) = 1; + } + (plus @0 (negate @1))))) /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)) when profitable. @@ -1223,8 +1227,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Extend @1 to TYPE. Perform sign extension if the range overflowed but did not split. */ - w1 = w1.from (w1, TYPE_PRECISION (type), ovf.ovf ? SIGNED : - TYPE_SIGN (inner_type)); + w1 = w1.from (w1, TYPE_PRECISION (type), TREE_WAS_SIGNED (@1) + ? SIGNED : TYPE_SIGN (inner_type)); /* w1 = w1.from (w1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); */ diff --git a/gcc/tree.h b/gcc/tree.h index 62cd7bb..1e7efd9 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -735,11 +735,18 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int, #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag) +/* Foo. */ + +#define TREE_WAS_SIGNED(NODE) (CST_CHECK (NODE)->base.deprecated_flag) + /* TREE_OVERFLOW can only be true for EXPR of CONSTANT_CLASS_P. */ #define TREE_OVERFLOW_P(EXPR) \ (CONSTANT_CLASS_P (EXPR) && TREE_OVERFLOW (EXPR)) +#define TREE_WAS_SIGNED_P(EXPR) \ + (CONSTANT_CLASS_P (EXPR) && TREE_WAS_SIGNED (EXPR)) + /* In a VAR_DECL, FUNCTION_DECL, NAMESPACE_DECL or TYPE_DECL, nonzero means name is to be accessible from outside this translation unit. In an IDENTIFIER_NODE, nonzero means an external declaration ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262017-02-02 9:27 ` Robin Dapp@ 2017-05-09 7:31 ` Robin Dapp0 siblings, 0 replies; 47+ messages in thread From: Robin Dapp @ 2017-05-09 7:31 UTC (permalink / raw) To: Richard Biener;+Cc:GCC Patches ping. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262017-01-17 9:48 ` Richard Biener 2017-02-02 9:27 ` Robin Dapp@ 2017-05-11 15:08 ` Bin.Cheng2017-05-18 14:47 ` Robin Dapp ` (3 more replies) 1 sibling, 4 replies; 47+ messages in thread From: Bin.Cheng @ 2017-05-11 15:08 UTC (permalink / raw) To: Richard Biener;+Cc:Robin Dapp, GCC Patches On Tue, Jan 17, 2017 at 9:48 AM, Richard Biener <richard.guenther@gmail.com> wrote: > On Tue, Jan 10, 2017 at 2:32 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >> Perhaps I'm still missing how some cases are handled or not handled, >> sorry for the noise. >> >>> I'm not sure there is anything to "interpret" -- the operation is unsigned >>> and overflow is when the operation may wrap around zero. There might >>> be clever ways of re-writing the expression to >>> (uint64_t)((uint32_t)((int32_t)uint32 + -1) + 1) >>> avoiding the overflow and thus allowing the transform but I'm not sure that's >>> good. >> >> The extra work I introduced was to discern between >> >> (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a), >> (uint64_t)(a + UINT_MAX) + 1 -> (uint64_t)(a) + (uint64_t)(UINT_MAX + 1), >> >> For a's range of [1,1] there is an overflow in both cases. >> We still want to simplify the first case by combining UINT_MAX + 1 -> 0. > > So there's the case where a == 0 where (uint64_t)(0 + UINT_MAX) + 1 is not zero. > > I think we're still searching for the proper condition on when it is allowed to > re-write > > (uint64_t)(uint32_t + uint32_t-CST) + uint64_t-CST > > to > > (uint64_t)(uint32_t) + (uint64_t)uint32_t-CST + uint64_t-CST Hmm, won't (uint32_t + uint32_t-CST) doesn't overflow be sufficient condition for such transformation? > > or to > > (uint64_t)(uint32_t) + (uint64_t)(uint32_t-CST + (uint32_t)uint64_t-CST) We don't actually need to prove this? What complicates this is when it's safe to transform: (uint64_t)(uint32_t + uint32_t-CST) + uint64_t-CST into (uint64_t)(uint32_t + uint32_t-CSTx) where uint32_t-CSTx = uint32_t-CST + (uint32_t)uint64_t-CST Am I misunderstanding the whole thing? Thanks, bin > >> If "interpreting" UINT_MAX as -1 is not the correct thing to do, perhaps >> (uint64_t)((uint32_t)(UINT_MAX + 1)) is? This fails, however, if the >> outer constant is larger than UINT_MAX. What else can we do here? >> Do we see cases like the second one at all? If it's not needed, the >> extra work is likely not needed. > > We do have the need to associate and simplfy across conversions but of > course we have to do it conservatively correct. > >>> A related thing would be canonicalizing unsigned X plus CST to >>> unsigned X minus CST' >>> if CST' has a smaller absolute value than CST. I think currently we >>> simply canonicalize >>> to 'plus CST' but also only in fold-const.c, not in match.pd (ah we >>> do, but only in a simplified manner). >> >> I can imagine this could simplify the treatment of some cases, yet I'm >> already a bit lost with the current cases :) > > Yes, but I am lost a bit as well. I don't know the correct conditions to test > off-head -- I know we may not introduce new undefined overflow ops and > of course we need to not compute wrong numbers either. > >>> That said, can we leave that "trick" out of the patch? I think your >>> more complicated >>> "overflows" result from extract_range_from_binary_expr_1 doesn't apply to all >>> ops (like MULT_EXPR where more complicated cases can arise). >> >> There is certainly additional work to be done for MULT_EXPR, I >> disregarded it so far. For this patch, I'd rather conservatively assume >> overflow. > > Ok... > > Richard. > >> Regards >> Robin >> ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH] Tree-level fix for PR 695262017-05-11 15:08 ` Bin.Cheng@ 2017-05-18 14:47 ` Robin Dapp2017-05-18 14:48 ` [PATCH 1/3] Simplify wrapped binops Robin Dapp ` (2 subsequent siblings) 3 siblings, 0 replies; 47+ messages in thread From: Robin Dapp @ 2017-05-18 14:47 UTC (permalink / raw) To: Bin.Cheng, Richard Biener;+Cc:GCC Patches > Hmm, won't (uint32_t + uint32_t-CST) doesn't overflow be sufficient > condition for such transformation? Yes, in principle this should suffice. What we're actually looking for is something like a "proper" (or no) overflow, i.e. an overflow in both min and max of the value range. In (a + cst1) + cst2 an overflow of (a + cst1) will still produce a valid range as long as overflow(a_min + cst1) == overflow(a_max + cst1). When max overflows but min does not we must not simplify. Currently, I'm checking if the range of (a + cst1) is still a VR_RANGE, for now disregarding ANTI_RANGEs which could most likely be dealt with as well. A major discussion point in my last try was to wrongly always use sign-extend when extending cst1 to the outer type. This is now changed to use sign-extension when (a + cst1) "properly" overflows as in ASSUME (a > 0); (unsigned long)(a + UINT_MAX) + 1; resulting in (unsigned long)(a) + (unsigned long)0, or use the type sign of the constant like in ASSUME (a < 2); (unsigned long)(a + 4294967294u) + 10; resulting in (unsigned long)(a) + (unsigned long)((unsigned long)4294967294 + (unsigned long)10). The additional flag from the last patch is not necessary. Test suite is clean on s390x and x86, bootstraps successful. Regards Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*[PATCH 1/3] Simplify wrapped binops2017-05-11 15:08 ` Bin.Cheng 2017-05-18 14:47 ` Robin Dapp@ 2017-05-18 14:48 ` Robin Dapp2017-05-18 14:49 ` [PATCH 2/3] " Robin Dapp 2017-05-18 15:08 ` [PATCH 3/3] " Robin Dapp 3 siblings, 0 replies; 47+ messages in thread From: Robin Dapp @ 2017-05-18 14:48 UTC (permalink / raw) To: Bin.Cheng, Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 546 bytes --] This tries to fold unconditionally and fixes some test cases. gcc/ChangeLog: 2017-05-18 Robin Dapp <rdapp@linux.vnet.ibm.com> * tree-ssa-propagate.c (substitute_and_fold_dom_walker::before_dom_children): Always try to fold. gcc/testsuite/ChangeLog: 2017-05-18 Robin Dapp <rdapp@linux.vnet.ibm.com> * g++.dg/tree-ssa/ssa-dse-2.C: Fix testcase. * gcc.dg/pr35691-1.c: Likewise. * gcc.dg/pr35691-2.c: Likewise. * gcc.dg/tree-ssa/forwprop-16.c: Likewise. * gcc.dg/tree-ssa/pr52631.c: Likewise. * gcc.dg/tree-ssa/vrp23.c: Likewise. [-- Attachment #2: gcc-wrapped-binop-p1.diff --] [-- Type: text/x-patch, Size: 4785 bytes --] diff --git a/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C index bee8651..3de195d 100644 --- a/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C +++ b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-dse2-details" } */ +/* { dg-options "-O2 -fdump-tree-dse2-details -fdump-tree-vrp1-details" } */ typedef __SIZE_TYPE__ size_t; extern "C" @@ -54,6 +54,5 @@ fill_vec_av_set (av_set_t av) } /* { dg-final { scan-tree-dump-not "Trimming statement .head = -" "dse2" } } */ -/* { dg-final { scan-tree-dump "Deleted dead call: " "dse2" } } */ - - +/* { dg-final { scan-tree-dump "Folded into: GIMPLE_NOP" "vrp1" } } */ +/* { dg-final { scan-tree-dump-not "memmove" "dse2" } } */ diff --git a/gcc/testsuite/gcc.dg/pr35691-1.c b/gcc/testsuite/gcc.dg/pr35691-1.c index 125923d..2e1d8bd 100644 --- a/gcc/testsuite/gcc.dg/pr35691-1.c +++ b/gcc/testsuite/gcc.dg/pr35691-1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-forwprop-details" } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */ int foo(int z0, unsigned z1) { @@ -9,4 +9,4 @@ int foo(int z0, unsigned z1) return t2; } -/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) z1_\[0-9\]*\\(D\\);" "forwprop1" } } */ +/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) z1_\[0-9\]*\\(D\\);" "ccp1" } } */ diff --git a/gcc/testsuite/gcc.dg/pr35691-2.c b/gcc/testsuite/gcc.dg/pr35691-2.c index 70f68a6..d5e335b 100644 --- a/gcc/testsuite/gcc.dg/pr35691-2.c +++ b/gcc/testsuite/gcc.dg/pr35691-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-forwprop-details" } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */ int foo(int z0, unsigned z1) { @@ -9,4 +9,4 @@ int foo(int z0, unsigned z1) return t2; } -/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) z1_\[0-9\]*\\(D\\);" "forwprop1" } } */ +/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) z1_\[0-9\]*\\(D\\);" "ccp1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-16.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-16.c index 1bef7e7..a153fb9 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-16.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-16.c @@ -10,4 +10,4 @@ int foo (double xx, double xy) return 2; } -/* { dg-final { scan-tree-dump "if \\\(x" "forwprop1" } } */ +/* { dg-final { scan-tree-dump "if \\\(x" "forwprop2" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr52631.c b/gcc/testsuite/gcc.dg/tree-ssa/pr52631.c index c18a5d5..2e99d11 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr52631.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr52631.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fno-tree-forwprop -fdump-tree-fre1-details" } */ +/* { dg-options "-O2 -fno-tree-forwprop -fdump-tree-ccp1-details" } */ unsigned f(unsigned a) { @@ -12,6 +12,6 @@ unsigned f(unsigned a) } /* We want to verify that we replace the b & 1 with b. */ -/* { dg-final { scan-tree-dump-times "Replaced b_\[0-9\]+ & 1 with b_\[0-9\]+ in" 1 "fre1"} } */ +/* { dg-final { scan-tree-dump-times "Match-and-simplified b_\[0-9\]+ & 1 to b_\[0-9\]+" 1 "ccp1"} } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c index ae68c090..2db8fa2 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fno-tree-forwprop -fdump-tree-vrp1-details" } */ +/* { dg-options "-O2 -fno-tree-forwprop -fdump-tree-ccp2-details" } */ void aa (void); void aos (void); @@ -45,5 +45,5 @@ L8: /* The n_sets > 0 test can be simplified into n_sets == 1 since the only way to reach the test is when n_sets <= 1, and the only value which satisfies both conditions is n_sets == 1. */ -/* { dg-final { scan-tree-dump-times "Simplified relational" 1 "vrp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified" 1 "ccp2" } } */ diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c index 0693802..a2abffe 100644 --- a/gcc/tree-ssa-propagate.c +++ b/gcc/tree-ssa-propagate.c @@ -1065,11 +1065,13 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb) /* Replace real uses in the statement. */ did_replace |= replace_uses_in (stmt, get_value_fn); + if (did_replace) + gimple_set_modified (stmt, true); /* If we made a replacement, fold the statement. */ - if (did_replace) + if (fold_stmt (&i, follow_single_use_edges)) { - fold_stmt (&i, follow_single_use_edges); + did_replace = true; stmt = gsi_stmt (i); gimple_set_modified (stmt, true); } ^ permalink raw reply [flat|nested] 47+ messages in thread

*[PATCH 2/3] Simplify wrapped binops2017-05-11 15:08 ` Bin.Cheng 2017-05-18 14:47 ` Robin Dapp 2017-05-18 14:48 ` [PATCH 1/3] Simplify wrapped binops Robin Dapp@ 2017-05-18 14:49 ` Robin Dapp2017-05-18 15:46 ` Bin.Cheng 2017-05-18 15:08 ` [PATCH 3/3] " Robin Dapp 3 siblings, 1 reply; 47+ messages in thread From: Robin Dapp @ 2017-05-18 14:49 UTC (permalink / raw) To: Bin.Cheng, Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 289 bytes --] match.pd part of the patch. gcc/ChangeLog: 2017-05-18 Robin Dapp <rdapp@linux.vnet.ibm.com> * match.pd: Simplify wrapped binary operations. * tree-vrp.c (extract_range_from_binary_expr_1): Add overflow parameter. (extract_range_from_binary_expr): Likewise. * tree-vrp.h: Export. [-- Attachment #2: gcc-wrapped-binop-p2.diff --] [-- Type: text/x-patch, Size: 9121 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index 80a17ba..3fa18b9 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1290,6 +1290,85 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (plus { cst; } @0)))) +/* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (TREE_CODE (type) == INTEGER_TYPE + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) + (with + { + bool ovf = true; + + tree cst = NULL_TREE; + tree inner_type = TREE_TYPE (@3); + value_range vr = VR_INITIALIZER; + + /* Convert combined constant to tree of outer type if + there was no overflow in the original operation. */ + wide_int minv, maxv; + if (TYPE_OVERFLOW_UNDEFINED (inner_type) + || (extract_range_from_binary_expr (&vr, inner_op, + inner_type, @0, @1, &ovf), vr.type == VR_RANGE)) + { + wide_int w1 = @1; + wide_int w2 = @2; + + wide_int combined_cst; + + /* Extend @1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), + ovf ? SIGNED : TYPE_SIGN (TREE_TYPE (@1))); + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + /* Combine in outer, larger type. */ + combined_cst = wi::add (w1, w2); + + cst = wide_int_to_tree (type, combined_cst); + } + } + (if (cst) + (outer_op (convert @0) { cst; })) + ))))) +#endif + +/* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert @0) INTEGER_CST@2) + (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE + && TREE_CODE (type) == INTEGER_TYPE) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool ovf = true; + tree cst_inner = NULL_TREE; + value_range vr = VR_INITIALIZER; + + bool fits = int_fits_type_p (@2, TREE_TYPE (@0)); + if (fits) + { + tree cst_inner = fold_convert (TREE_TYPE (@0), @2); + + extract_range_from_binary_expr (&vr, outer_op, TREE_TYPE (@0), + @0, cst_inner, &ovf); + } + } + (if (vr.type == VR_RANGE && cst_inner) + (convert (outer_op @0 { cst_inner; }))) + )))) +#endif + /* ~A + A -> -1 */ (simplify (plus:c (bit_not @0) @0) diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index 0db8a3c..00f99a8 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -63,8 +63,6 @@ along with GCC; see the file COPYING3. If not see #include "domwalk.h" #include "tree-cfgcleanup.h" -#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } - /* Allocation pools for tree-vrp allocations. */ static object_allocator<value_range> vrp_value_range_pool ("Tree VRP value ranges"); static bitmap_obstack vrp_equiv_obstack; @@ -1940,7 +1938,8 @@ extract_range_from_multiplicative_op_1 (value_range *vr, static void extract_range_from_binary_expr_1 (value_range *vr, enum tree_code code, tree expr_type, - value_range *vr0_, value_range *vr1_) + value_range *vr0_, value_range *vr1_, + bool *ovf) { value_range vr0 = *vr0_, vr1 = *vr1_; value_range vrtem0 = VR_INITIALIZER, vrtem1 = VR_INITIALIZER; @@ -2012,12 +2011,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr0.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr0, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, &vrtem0, vr1_); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vrtem0, vr1_, + ovf); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - &vrtem1, vr1_); + &vrtem1, vr1_, ovf); vrp_meet (vr, &vrres); } return; @@ -2026,12 +2026,13 @@ extract_range_from_binary_expr_1 (value_range *vr, if (vr1.type == VR_ANTI_RANGE && ranges_from_anti_range (&vr1, &vrtem0, &vrtem1)) { - extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, &vrtem0); + extract_range_from_binary_expr_1 (vr, code, expr_type, vr0_, &vrtem0, + ovf); if (vrtem1.type != VR_UNDEFINED) { value_range vrres = VR_INITIALIZER; extract_range_from_binary_expr_1 (&vrres, code, expr_type, - vr0_, &vrtem1); + vr0_, &vrtem1, ovf); vrp_meet (vr, &vrres); } return; @@ -2270,6 +2271,10 @@ extract_range_from_binary_expr_1 (value_range *vr, max_ovf = 1; } + if (ovf != NULL) + *ovf = (min_ovf == 1 && max_ovf == 1) + || (min_ovf == -1 && max_ovf == -1); + /* If we have overflow for the constant part and the resulting range will be symbolic, drop to VR_VARYING. */ if ((min_ovf && sym_min_op0 != sym_min_op1) @@ -2589,7 +2594,7 @@ extract_range_from_binary_expr_1 (value_range *vr, saved_flag_wrapv = flag_wrapv; flag_wrapv = 1; extract_range_from_binary_expr_1 (vr, MULT_EXPR, expr_type, - &vr0, &vr1p); + &vr0, &vr1p, ovf); flag_wrapv = saved_flag_wrapv; return; } @@ -3012,14 +3017,24 @@ extract_range_from_binary_expr_1 (value_range *vr, set_value_range (vr, type, min, max, NULL); } +static void +extract_range_from_binary_expr_1 (value_range *vr, + enum tree_code code, tree expr_type, + value_range *vr0_, value_range *vr1_) +{ + extract_range_from_binary_expr_1 (vr, code, expr_type, + vr0_, vr1_, NULL); +} + /* Extract range information from a binary expression OP0 CODE OP1 based on the ranges of each of its operands with resulting type EXPR_TYPE. The resulting range is stored in *VR. */ -static void +void extract_range_from_binary_expr (value_range *vr, enum tree_code code, - tree expr_type, tree op0, tree op1) + tree expr_type, tree op0, tree op1, + bool *ovf) { value_range vr0 = VR_INITIALIZER; value_range vr1 = VR_INITIALIZER; @@ -3027,20 +3042,38 @@ extract_range_from_binary_expr (value_range *vr, /* Get value ranges for each operand. For constant operands, create a new value range with the operand to simplify processing. */ if (TREE_CODE (op0) == SSA_NAME) - vr0 = *(get_value_range (op0)); + { + value_range *vtmp = get_value_range (op0); + if (ovf != NULL && vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr0 = *vtmp; + } else if (is_gimple_min_invariant (op0)) set_value_range_to_value (&vr0, op0, NULL); else set_value_range_to_varying (&vr0); if (TREE_CODE (op1) == SSA_NAME) - vr1 = *(get_value_range (op1)); + { + value_range *vtmp = get_value_range (op1); + if (ovf != NULL && vtmp == NULL) + { + vr->type = VR_VARYING; + return; + } + else + vr1 = *vtmp; + } else if (is_gimple_min_invariant (op1)) set_value_range_to_value (&vr1, op1, NULL); else set_value_range_to_varying (&vr1); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &vr1, ovf); /* Try harder for PLUS and MINUS if the range of one operand is symbolic and based on the other operand, for example if it was deduced from a @@ -3068,7 +3101,7 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr1, VR_RANGE, op1, op1, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &vr0, &n_vr1, ovf); } if (vr->type == VR_VARYING @@ -3092,7 +3125,7 @@ extract_range_from_binary_expr (value_range *vr, else set_value_range (&n_vr0, VR_RANGE, op0, op0, NULL); - extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1); + extract_range_from_binary_expr_1 (vr, code, expr_type, &n_vr0, &vr1, ovf); } /* If we didn't derive a range for MINUS_EXPR, and @@ -3111,6 +3144,14 @@ extract_range_from_binary_expr (value_range *vr, set_value_range_to_nonnull (vr, TREE_TYPE (op0)); } +void +extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1) +{ + extract_range_from_binary_expr (vr, code, expr_type, op0, op1, NULL); +} + /* Extract range information from a unary operation CODE based on the range of its operand *VR0 with type OP0_TYPE with resulting type TYPE. The resulting range is stored in *VR. */ diff --git a/gcc/tree-vrp.h b/gcc/tree-vrp.h index ef2c68a..ad55ad2 100644 --- a/gcc/tree-vrp.h +++ b/gcc/tree-vrp.h @@ -56,4 +56,9 @@ extern void extract_range_from_unary_expr (value_range *vr, tree type, value_range *vr0_, tree op0_type); +extern void extract_range_from_binary_expr (value_range *vr, + enum tree_code code, + tree expr_type, tree op0, tree op1, + bool *ovf); +#define VR_INITIALIZER { VR_UNDEFINED, NULL_TREE, NULL_TREE, NULL } ^ permalink raw reply [flat|nested] 47+ messages in thread

*[PATCH 3/3] Simplify wrapped binops2017-05-11 15:08 ` Bin.Cheng ` (2 preceding siblings ...) 2017-05-18 14:49 ` [PATCH 2/3] " Robin Dapp@ 2017-05-18 15:08 ` Robin Dapp3 siblings, 0 replies; 47+ messages in thread From: Robin Dapp @ 2017-05-18 15:08 UTC (permalink / raw) To: Bin.Cheng, Richard Biener;+Cc:GCC Patches [-- Attachment #1: Type: text/plain, Size: 264 bytes --] New testcases. gcc/testsuite/ChangeLog: 2017-05-18 Robin Dapp <rdapp@linux.vnet.ibm.com> * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. [-- Attachment #2: gcc-wrapped-binop-p3.diff --] [-- Type: text/x-patch, Size: 3641 bytes --] diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 0000000..19b787b --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "ccp1" } } */ + +#include <limits.h> + +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +unsigned long baq(int a) +{ + return (unsigned long)(a + 1) - 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c new file mode 100644 index 0000000..aaaa850 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "ccp1" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 5 "vrp1" } } */ + +#include <limits.h> + +unsigned long oof2(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a - 1) + 1; +} + +unsigned long bap(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +unsigned long bar3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 5; +} + +unsigned long bar4(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (unsigned long)(a + 1) - 6; +} + +unsigned long baq(int a) +{ + return (unsigned long)(a - 2) + 1; +} + +long baq3(unsigned int a) +{ + if (a > 3 && a < INT_MAX - 100) + return (long)(a - 1) + 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c new file mode 100644 index 0000000..6a900ef --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c @@ -0,0 +1,42 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +unsigned int a = 3; +int aa = 3; +int bb = 1; +int cc = 4; + +int main() +{ + volatile unsigned long b = (unsigned long)(UINT_MAX + 1) - 1; + assert (b == 18446744073709551615ul); + + volatile unsigned long c = (unsigned long)(a - 4) + 1; + assert (c == 4294967296); + + volatile unsigned long d = (unsigned long)(a + UINT_MAX - 4) + 2; + assert (d == 4294967296); + + volatile unsigned long e = (unsigned long)(a - UINT_MAX) + UINT_MAX; + assert (e == 4294967299); + + volatile unsigned long f = (unsigned long)(a + UINT_MAX) - UINT_MAX; + assert (f == 18446744069414584323ul); + + volatile long g = (long)(a - 4) + 1; + assert (g == 4294967296); + + volatile long h = (long)(aa + UINT_MAX) + 1; + assert (h == 3); + + /* Zero-extend. */ + volatile unsigned long i = (unsigned long)(bb + 4294967294u) + 5000000000; + assert (i == 9294967295); + + /* Sign-extend. */ + volatile unsigned long j = (unsigned long)(cc + 4294967294u) + 8000000000; + assert (j == 8000000002); +} ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-05-18 14:49 ` [PATCH 2/3] " Robin Dapp@ 2017-05-18 15:46 ` Bin.Cheng2017-05-18 16:09 ` Robin Dapp 0 siblings, 1 reply; 47+ messages in thread From: Bin.Cheng @ 2017-05-18 15:46 UTC (permalink / raw) To: Robin Dapp;+Cc:Richard Biener, GCC Patches On Thu, May 18, 2017 at 3:47 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: > match.pd part of the patch. > > gcc/ChangeLog: > > 2017-05-18 Robin Dapp <rdapp@linux.vnet.ibm.com> > > * match.pd: Simplify wrapped binary operations. > * tree-vrp.c (extract_range_from_binary_expr_1): Add overflow > parameter. > (extract_range_from_binary_expr): Likewise. > * tree-vrp.h: Export. Hi, I didn't follow this issue from the beginning, so might asking stupid questions. > diff --git a/gcc/match.pd b/gcc/match.pd > index 80a17ba..3fa18b9 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -1290,6 +1290,85 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > (if (cst && !TREE_OVERFLOW (cst)) > (plus { cst; } @0)))) > > +/* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ > +#if GIMPLE > + (for outer_op (plus minus) > + (for inner_op (plus minus) > + (simplify > + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) > + (if (TREE_CODE (type) == INTEGER_TYPE > + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) > + (with > + { > + bool ovf = true; > + > + tree cst = NULL_TREE; > + tree inner_type = TREE_TYPE (@3); > + value_range vr = VR_INITIALIZER; > + > + /* Convert combined constant to tree of outer type if > + there was no overflow in the original operation. */ > + wide_int minv, maxv; > + if (TYPE_OVERFLOW_UNDEFINED (inner_type) > + || (extract_range_from_binary_expr (&vr, inner_op, > + inner_type, @0, @1, &ovf), vr.type == VR_RANGE)) Any reason to expose tree-vrp.c internal interface here? The function looks quite expensive. Overflow check can be done by get_range_info and simple wi::cmp calls. Existing code like in tree-ssa-loop-niters.c already does that. Also could you avoid using comma expressions in condition please? It only makes the code harder to be read. Thanks, bin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-05-18 15:46 ` Bin.Cheng@ 2017-05-18 16:09 ` Robin Dapp2017-05-18 17:15 ` Bin.Cheng 0 siblings, 1 reply; 47+ messages in thread From: Robin Dapp @ 2017-05-18 16:09 UTC (permalink / raw) To: Bin.Cheng;+Cc:Richard Biener, GCC Patches > Any reason to expose tree-vrp.c internal interface here? The function > looks quite expensive. Overflow check can be done by get_range_info > and simple wi::cmp calls. Existing code like in > tree-ssa-loop-niters.c already does that. Also could you avoid using > comma expressions in condition please? It only makes the code harder > to be read. I tried get_range_info () as well and got a FAIL in gcc.c-torture/execute/bitfld-5.c. where get_range_info () returns a VR_RANGE but extract...() gives VR_VARYING. The test case relies on not simplifying, i.e. would expect a VR_VARYING here but I didn't look into it more. Also, is there a possibility to know if there was an "ok" overflow or not from get_range_info ()'s output? Would I have to compare the result with the involved variable's range? Regards Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-05-18 16:09 ` Robin Dapp@ 2017-05-18 17:15 ` Bin.Cheng2017-05-19 10:13 ` Robin Dapp 0 siblings, 1 reply; 47+ messages in thread From: Bin.Cheng @ 2017-05-18 17:15 UTC (permalink / raw) To: Robin Dapp;+Cc:Richard Biener, GCC Patches On Thu, May 18, 2017 at 5:08 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >> Any reason to expose tree-vrp.c internal interface here? The function >> looks quite expensive. Overflow check can be done by get_range_info >> and simple wi::cmp calls. Existing code like in >> tree-ssa-loop-niters.c already does that. Also could you avoid using >> comma expressions in condition please? It only makes the code harder >> to be read. > > I tried get_range_info () as well and got a FAIL in > gcc.c-torture/execute/bitfld-5.c. > where get_range_info () returns a VR_RANGE but extract...() gives > VR_VARYING. The test case relies on not simplifying, i.e. would expect > a VR_VARYING here but I didn't look into it more. I can guess what is happening here. It's a 40 bits unsigned long long field, (s.b-8) will be like: _1 = s.b _2 = _1 + 0xfffffffff8 Also get_range_info returns value range [0, 0xFFFFFFFFFF] for _1. You'd need to check if _1(with range [0, 0xFFFFFFFFFF]) + 0xfffffffff8 overflows against precision of the bit-field which is 40 bits precision. The failure might because overflowness is checked against unsigned long long's precision which is 64 bits. > > Also, is there a possibility to know if there was an "ok" overflow or > not from get_range_info ()'s output? Would I have to compare the result > with the involved variable's range? I think you have to check it manually against max/min value of that type precision. Thanks, bin > > Regards > Robin > ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-05-18 17:15 ` Bin.Cheng@ 2017-05-19 10:13 ` Robin Dapp2017-05-19 10:22 ` Bin.Cheng 0 siblings, 1 reply; 47+ messages in thread From: Robin Dapp @ 2017-05-19 10:13 UTC (permalink / raw) To: Bin.Cheng;+Cc:Richard Biener, GCC Patches > I can guess what is happening here. It's a 40 bits unsigned long long > field, (s.b-8) will be like: > _1 = s.b > _2 = _1 + 0xfffffffff8 > Also get_range_info returns value range [0, 0xFFFFFFFFFF] for _1. > You'd need to check if _1(with range [0, 0xFFFFFFFFFF]) + 0xfffffffff8 > overflows against precision of the bit-field which is 40 bits > precision. The failure might because overflowness is checked against > unsigned long long's precision which is 64 bits. >> Also, is there a possibility to know if there was an "ok" overflow or >> not from get_range_info ()'s output? Would I have to compare the result >> with the involved variable's range? > I think you have to check it manually against max/min value of that > type precision. Currently, extract_... () does that all that for me, is it really too expensive to call? I guess, using get_range_info first and calling extract when get_range_info returns a VR_RANGE is not really a favorable thing to do either? :) Regards Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-05-19 10:13 ` Robin Dapp@ 2017-05-19 10:22 ` Bin.Cheng2017-05-19 10:32 ` Richard Biener 2017-06-20 13:08 ` Robin Dapp 0 siblings, 2 replies; 47+ messages in thread From: Bin.Cheng @ 2017-05-19 10:22 UTC (permalink / raw) To: Robin Dapp;+Cc:Richard Biener, GCC Patches On Fri, May 19, 2017 at 11:09 AM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >> I can guess what is happening here. It's a 40 bits unsigned long long >> field, (s.b-8) will be like: >> _1 = s.b >> _2 = _1 + 0xfffffffff8 >> Also get_range_info returns value range [0, 0xFFFFFFFFFF] for _1. >> You'd need to check if _1(with range [0, 0xFFFFFFFFFF]) + 0xfffffffff8 >> overflows against precision of the bit-field which is 40 bits >> precision. The failure might because overflowness is checked against >> unsigned long long's precision which is 64 bits. > >>> Also, is there a possibility to know if there was an "ok" overflow or >>> not from get_range_info ()'s output? Would I have to compare the result >>> with the involved variable's range? >> I think you have to check it manually against max/min value of that >> type precision. > > Currently, extract_... () does that all that for me, is it really too > expensive to call? I guess, using get_range_info first and calling > extract when get_range_info returns a VR_RANGE is not really a favorable > thing to do either? :) Not only the cost, we should avoid introducing more interfaces while old ones can do the work. Anyway, it's Richard's call here. Thanks, bin > > Regards > Robin > ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-05-19 10:22 ` Bin.Cheng@ 2017-05-19 10:32 ` Richard Biener2017-06-20 13:08 ` Robin Dapp 1 sibling, 0 replies; 47+ messages in thread From: Richard Biener @ 2017-05-19 10:32 UTC (permalink / raw) To: Bin.Cheng;+Cc:Robin Dapp, GCC Patches On Fri, May 19, 2017 at 12:13 PM, Bin.Cheng <amker.cheng@gmail.com> wrote: > On Fri, May 19, 2017 at 11:09 AM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >>> I can guess what is happening here. It's a 40 bits unsigned long long >>> field, (s.b-8) will be like: >>> _1 = s.b >>> _2 = _1 + 0xfffffffff8 >>> Also get_range_info returns value range [0, 0xFFFFFFFFFF] for _1. >>> You'd need to check if _1(with range [0, 0xFFFFFFFFFF]) + 0xfffffffff8 >>> overflows against precision of the bit-field which is 40 bits >>> precision. The failure might because overflowness is checked against >>> unsigned long long's precision which is 64 bits. >> >>>> Also, is there a possibility to know if there was an "ok" overflow or >>>> not from get_range_info ()'s output? Would I have to compare the result >>>> with the involved variable's range? >>> I think you have to check it manually against max/min value of that >>> type precision. >> >> Currently, extract_... () does that all that for me, is it really too >> expensive to call? I guess, using get_range_info first and calling >> extract when get_range_info returns a VR_RANGE is not really a favorable >> thing to do either? :) > Not only the cost, we should avoid introducing more interfaces while > old ones can do the work. Anyway, it's Richard's call here. Using get_range_info and wi:: is prefered, I didn't look into the issue you are running into but wi:: do have proper bit-precision tracking. Maybe the overflow checks are not implemented correctly there though. Richard. > Thanks, > bin >> >> Regards >> Robin >> ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-05-19 10:22 ` Bin.Cheng 2017-05-19 10:32 ` Richard Biener@ 2017-06-20 13:08 ` Robin Dapp2017-06-20 13:49 ` Richard Biener 1 sibling, 1 reply; 47+ messages in thread From: Robin Dapp @ 2017-06-20 13:08 UTC (permalink / raw) To: Bin.Cheng;+Cc:Richard Biener, GCC Patches [-- Attachment #1: Type: text/plain, Size: 806 bytes --] >> Currently, extract_... () does that all that for me, is it really too >> expensive to call? I guess, using get_range_info first and calling >> extract when get_range_info returns a VR_RANGE is not really a favorable >> thing to do either? :) > Not only the cost, we should avoid introducing more interfaces while > old ones can do the work. Anyway, it's Richard's call here. I rewrote the match.pd patterns to use get_range_info () now, keeping track of an "ok" overflow (both min and max overflow) and one which does not allow us to continue (min xor max overflow, split/anti range). Test suite on s390x has no regressions, bootstrap is ok, x86 running. Regards Robin -- gcc/ChangeLog: 2017-06-19 Robin Dapp <rdapp@linux.vnet.ibm.com> * match.pd: Simplify wrapped binary operations. [-- Attachment #2: gcc-wrapped-binop-p2.diff --] [-- Type: text/x-patch, Size: 3533 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index 80a17ba..66c37f6 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1290,6 +1290,128 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (plus { cst; } @0)))) +/* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (TREE_CODE (type) == INTEGER_TYPE + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) + (with + { + tree cst = NULL_TREE; + tree inner_type = TREE_TYPE (@3); + wide_int wmin, wmax; + wide_int wmin0, wmax0; + + bool ovf = true; + bool ovf_undef = TYPE_OVERFLOW_UNDEFINED (inner_type); + + enum value_range_type vr_outer = + get_range_info (@3, &wmin, &wmax); + enum value_range_type vr0 = + get_range_info (@0, &wmin0, &wmax0); + + /* Convert combined constant to tree of outer type if + there was no overflow in the original operation. */ + if (ovf_undef || vr_outer == VR_RANGE) + { + wide_int w1 = @1; + wide_int w2 = @2; + + if (ovf_undef || vr0 == VR_RANGE) + { + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + bool split_range = true; + + if (!ovf_undef && vr0 == VR_RANGE) + { + int max_ovf = 0; + int min_ovf = 0; + + signop sgn = TYPE_SIGN (inner_type); + + wmin = wi::add (wmin0, w1); + min_ovf = wi::cmp (wmin, w1, sgn) < 0; + + wmax = wi::add (wmax0, w1); + max_ovf = wi::cmp (wmax, w1, sgn) < 0; + + ovf = min_ovf || max_ovf; + + split_range = ((min_ovf && !max_ovf) + || (!min_ovf && max_ovf)); + } + + if (ovf_undef || !split_range) + { + /* Extend @1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), + ovf ? SIGNED : TYPE_SIGN (TREE_TYPE (@1))); + + /* Combine in outer, larger type. */ + wide_int combined_cst; + combined_cst = wi::add (w1, w2); + + cst = wide_int_to_tree (type, combined_cst); + } + } + } + } +(if (cst) + (outer_op (convert @0) { cst; })) + ))))) +#endif + +/* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert SSA_NAME@0) INTEGER_CST@2) + (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE + && TREE_CODE (type) == INTEGER_TYPE) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool split_range = true; + tree cst_inner = NULL_TREE; + enum value_range_type vr = VR_VARYING; + tree inner_type = TREE_TYPE (@0); + + if (int_fits_type_p (@2, inner_type)) + { + cst_inner = fold_convert (inner_type, @2); + + wide_int wmin0, wmax0; + wide_int w1 = cst_inner; + signop sgn = TYPE_SIGN (inner_type); + vr = get_range_info (@0, &wmin0, &wmax0); + + if (vr == VR_RANGE) + { + wide_int wmin = wi::add (wmin0, w1); + bool min_ovf = wi::cmp (wmin, w1, sgn) < 0; + + wide_int wmax = wi::add (wmax0, w1); + bool max_ovf = wi::cmp (wmax, w1, sgn) < 0; + + split_range = (min_ovf && !max_ovf) || (!min_ovf && max_ovf); + } + } + } + (if (cst_inner && !split_range) + (convert (outer_op @0 { cst_inner; }))) + )))) +#endif + /* ~A + A -> -1 */ (simplify (plus:c (bit_not @0) @0) ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-06-20 13:08 ` Robin Dapp@ 2017-06-20 13:49 ` Richard Biener2017-06-21 11:44 ` Robin Dapp 0 siblings, 1 reply; 47+ messages in thread From: Richard Biener @ 2017-06-20 13:49 UTC (permalink / raw) To: Robin Dapp;+Cc:Bin.Cheng, GCC Patches On Tue, Jun 20, 2017 at 3:08 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >>> Currently, extract_... () does that all that for me, is it really too >>> expensive to call? I guess, using get_range_info first and calling >>> extract when get_range_info returns a VR_RANGE is not really a favorable >>> thing to do either? :) >> Not only the cost, we should avoid introducing more interfaces while >> old ones can do the work. Anyway, it's Richard's call here. > > I rewrote the match.pd patterns to use get_range_info () now, keeping > track of an "ok" overflow (both min and max overflow) and one which does > not allow us to continue (min xor max overflow, split/anti range). Test > suite on s390x has no regressions, bootstrap is ok, x86 running. + (if (TREE_CODE (type) == INTEGER_TYPE + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) + (with use INTEGRAL_TYPE_P. + bool ovf_undef = TYPE_OVERFLOW_UNDEFINED (inner_type); + so this is overflow behavior of the inner op. + /* Convert combined constant to tree of outer type if + there was no overflow in the original operation. */ "in the original inner operation." you then go on and use ovf_undef also for the outer operation: + if (ovf_undef || vr_outer == VR_RANGE) + { but you do not actually _use_ vr_outer. Do you think that if vr_outer is a VR_RANGE then the outer operation may not possibly have wrapped? That's a false conclusion. But I don't see how overflow in the original outer operation matters and the code lacks comments as to explaining that as well. So if you have a vr0 then you can compute whether the inner operation cannot overflow. You do this here: + if (!ovf_undef && vr0 == VR_RANGE) + { + int max_ovf = 0; + int min_ovf = 0; + + signop sgn = TYPE_SIGN (inner_type); + + wmin = wi::add (wmin0, w1); + min_ovf = wi::cmp (wmin, w1, sgn) < 0; + + wmax = wi::add (wmax0, w1); + max_ovf = wi::cmp (wmax, w1, sgn) < 0; + + ovf = min_ovf || max_ovf; + + split_range = ((min_ovf && !max_ovf) + || (!min_ovf && max_ovf)); ah, here's the use of the outer value-range. This lacks a comment (and it looks fishy given the outer value-range is a conservative approximation and thus could be [-INF, +INF]). Why's this not using the wi::add overload with the overflow flag? ISTR you want to handle "negative" unsigned constants somehow, but then I don't see how the above works. I'd say if wmin/wmax interpreted as signed are positive and then using a signed op to add w1 results in a still positive number you're fine (you don't seem to restrict the widening cast to either zero- or sign-extending). + if (ovf_undef || !split_range) + { + /* Extend @1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), + ovf ? SIGNED : TYPE_SIGN (TREE_TYPE (@1))); ideally you could always interpret w1 as signed? + /* Combine in outer, larger type. */ + wide_int combined_cst; + combined_cst = wi::add (w1, w2); +(if (cst) + (outer_op (convert @0) { cst; })) + ))))) bogus indent. +/* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert SSA_NAME@0) INTEGER_CST@2) + (if (TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && TREE_CODE (TREE_TYPE (@0)) == INTEGER_TYPE + && TREE_CODE (type) == INTEGER_TYPE) INTEGRAL_TYPE_P and do that first before looking at TYPE_PRECISION. + if (vr == VR_RANGE) + { + wide_int wmin = wi::add (wmin0, w1); + bool min_ovf = wi::cmp (wmin, w1, sgn) < 0; + + wide_int wmax = wi::add (wmax0, w1); + bool max_ovf = wi::cmp (wmax, w1, sgn) < 0; + + split_range = (min_ovf && !max_ovf) || (!min_ovf && max_ovf); similar why not use wi:add overload with the overflow flag? Btw, I find (with { tree x = NULL; if (...) x = non-NULL; } (if (x) (....)))) ugly. Use (with { ... } (if (...) (... { non-NULL } ) or sth like that which makes control flow more easily visible. Richard. > Regards > Robin > > -- > > gcc/ChangeLog: > > 2017-06-19 Robin Dapp <rdapp@linux.vnet.ibm.com> > > * match.pd: Simplify wrapped binary operations. ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-06-20 13:49 ` Richard Biener@ 2017-06-21 11:44 ` Robin Dapp2017-06-27 7:17 ` Robin Dapp 2017-06-27 12:14 ` Richard Biener 0 siblings, 2 replies; 47+ messages in thread From: Robin Dapp @ 2017-06-21 11:44 UTC (permalink / raw) To: Richard Biener;+Cc:Bin.Cheng, GCC Patches [-- Attachment #1: Type: text/plain, Size: 1038 bytes --] > use INTEGRAL_TYPE_P. Done. > but you do not actually _use_ vr_outer. Do you think that if > vr_outer is a VR_RANGE then the outer operation may not > possibly have wrapped? That's a false conclusion. These were remains of a previous version. vr_outer is indeed not needed anymore; removed. > wi::add overload with the overflow flag? ISTR you want to handle "negative" > unsigned constants somehow, but then I don't see how the above works. > I'd say if wmin/wmax interpreted as signed are positive and then using > a signed op to add w1 results in a still positive number you're fine > (you don't seem > to restrict the widening cast to either zero- or sign-extending). Changed to using wi:add overload now. In essence, three cases are being handled: - wrapped_range --> do not simplify - !wrapped_range && ovf ("negative" unsigned) --> simplify and combine with sign extension in the outer type - !wrapped_range && !ovf ("positive" unsigned) --> simplify and combine with zero extension in the outer type. Regards Robin [-- Attachment #2: gcc-wrapped-binop-p2.diff --] [-- Type: text/x-patch, Size: 3176 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index 80a17ba..ec1af69 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1290,6 +1290,116 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (plus { cst; } @0)))) +/* ((T)(A +- CST)) +- CST -> (T)(A) +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (for inner_op (plus minus) + (simplify + (outer_op (convert (inner_op@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (INTEGRAL_TYPE_P (type) + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) + (with + { + tree cst; + tree inner_type = TREE_TYPE (@3); + wide_int wmin0, wmax0; + + bool ovf = true; + bool ovf_undef = TYPE_OVERFLOW_UNDEFINED (inner_type); + + enum value_range_type vr0 = + get_range_info (@0, &wmin0, &wmax0); + + bool wrapped_range = true; + + /* Convert combined constant to tree of outer type if + there was no overflow in the original inner operation. */ + if (ovf_undef || vr0 == VR_RANGE) + { + wide_int w1 = @1; + wide_int w2 = @2; + + if (inner_op == MINUS_EXPR) + w1 = wi::neg (w1); + + if (outer_op == MINUS_EXPR) + w2 = wi::neg (w2); + + bool ovf; + + if (!ovf_undef && vr0 == VR_RANGE) + { + bool max_ovf; + bool min_ovf; + + signop sgn = TYPE_SIGN (inner_type); + wi::add (wmin0, w1, sgn, &min_ovf); + wi::add (wmax0, w1, sgn, &max_ovf); + + ovf = min_ovf || max_ovf; + wrapped_range = ((min_ovf && !max_ovf) + || (!min_ovf && max_ovf)); + } + + /* Extend @1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), + ovf ? SIGNED : TYPE_SIGN (inner_type)); + + /* Combine in outer, larger type. */ + wide_int combined_cst; + combined_cst = wi::add (w1, w2); + + cst = wide_int_to_tree (type, combined_cst); + } + } + (if (ovf_undef || !wrapped_range) + (outer_op (convert @0) { cst; })) + ))))) +#endif + +/* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert SSA_NAME@0) INTEGER_CST@2) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && INTEGRAL_TYPE_P (type) + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0))) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool wrapped_range = true; + tree cst_inner = NULL_TREE; + enum value_range_type vr = VR_VARYING; + tree inner_type = TREE_TYPE (@0); + + if (int_fits_type_p (@2, inner_type)) + { + cst_inner = fold_convert (inner_type, @2); + + wide_int wmin0, wmax0; + wide_int w1 = cst_inner; + signop sgn = TYPE_SIGN (inner_type); + vr = get_range_info (@0, &wmin0, &wmax0); + + if (vr == VR_RANGE) + { + bool min_ovf; + wi::add (wmin0, w1, sgn, &min_ovf); + + bool max_ovf; + wi::add (wmax0, w1, sgn, &max_ovf); + + wrapped_range = (min_ovf && !max_ovf) || (!min_ovf && max_ovf); + } + } + } + (if (cst_inner && !wrapped_range) + (convert (outer_op @0 { cst_inner; }))) + )))) +#endif + /* ~A + A -> -1 */ (simplify (plus:c (bit_not @0) @0) ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-06-21 11:44 ` Robin Dapp@ 2017-06-27 7:17 ` Robin Dapp2017-06-27 12:14 ` Richard Biener 1 sibling, 0 replies; 47+ messages in thread From: Robin Dapp @ 2017-06-27 7:17 UTC (permalink / raw) To: Richard Biener;+Cc:Bin.Cheng, GCC Patches Ping. ^ permalink raw reply [flat|nested] 47+ messages in thread

*2017-06-21 11:44 ` Robin Dapp 2017-06-27 7:17 ` Robin DappRe: [PATCH 2/3] Simplify wrapped binops@ 2017-06-27 12:14 ` Richard Biener2017-06-28 14:35 ` Robin Dapp 1 sibling, 1 reply; 47+ messages in thread From: Richard Biener @ 2017-06-27 12:14 UTC (permalink / raw) To: Robin Dapp;+Cc:Bin.Cheng, GCC Patches On Wed, Jun 21, 2017 at 1:44 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote: >> use INTEGRAL_TYPE_P. > > Done. > >> but you do not actually _use_ vr_outer. Do you think that if >> vr_outer is a VR_RANGE then the outer operation may not >> possibly have wrapped? That's a false conclusion. > > These were remains of a previous version. vr_outer is indeed not needed > anymore; removed. > >> wi::add overload with the overflow flag? ISTR you want to handle "negative" >> unsigned constants somehow, but then I don't see how the above works. >> I'd say if wmin/wmax interpreted as signed are positive and then using >> a signed op to add w1 results in a still positive number you're fine >> (you don't seem >> to restrict the widening cast to either zero- or sign-extending). > > Changed to using wi:add overload now. > > In essence, three cases are being handled: > - wrapped_range --> do not simplify > - !wrapped_range && ovf ("negative" unsigned) --> simplify and combine > with sign extension in the outer type > - !wrapped_range && !ovf ("positive" unsigned) --> simplify and combine > with zero extension in the outer type. Let's split this and look at the simpler case: +/* ((T)(A)) +- CST -> (T)(A +- CST) */ +#if GIMPLE + (for outer_op (plus minus) + (simplify + (outer_op (convert SSA_NAME@0) INTEGER_CST@2) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && INTEGRAL_TYPE_P (type) + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0))) + /* Perform binary operation inside the cast if the constant fits + and there is no overflow. */ + (with + { + bool wrapped_range = true; + tree cst_inner = NULL_TREE; + enum value_range_type vr = VR_VARYING; + tree inner_type = TREE_TYPE (@0); + + if (int_fits_type_p (@2, inner_type)) do && get_range_info (...) == VR_RANGE) here. That avoids vr and its initialization and you get all of the "work" when you know it will eventually succeed. + { + cst_inner = fold_convert (inner_type, @2); ideally you'd use a wide-int here and defer the tree allocation to the result wide_int wi = wi::from (@2, TYPE_PRECISION (inner_type), TYPE_SIGN (inner_type)); + wide_int wmin0, wmax0; + wide_int w1 = cst_inner; + signop sgn = TYPE_SIGN (inner_type); + vr = get_range_info (@0, &wmin0, &wmax0); + + if (vr == VR_RANGE) + { + bool min_ovf; + wi::add (wmin0, w1, sgn, &min_ovf); + + bool max_ovf; + wi::add (wmax0, w1, sgn, &max_ovf); So I guess we never run into the outer_op == minus case as the above is clearly wrong for that? The comment above says "if there is no overflow" but below you allow min_ovf && max_ovf without any further explanation. + wrapped_range = (min_ovf && !max_ovf) || (!min_ovf && max_ovf); + } + } + } + (if (cst_inner && !wrapped_range) + (convert (outer_op @0 { cst_inner; }))) thus (if ((min_ovf && !max_ovf) || ....) (convert (outer_op @0 { wide_int_to_tree (inner_type, w1); } )))) try to keep vertical spacing in patterns minimal -- I belive that patterns should be small enough to fit in a terminal window (24 lines). Richard. + )))) +#endif > Regards > Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-06-27 12:14 ` Richard Biener@ 2017-06-28 14:35 ` Robin Dapp2017-07-03 13:10 ` Richard Biener 0 siblings, 1 reply; 47+ messages in thread From: Robin Dapp @ 2017-06-28 14:35 UTC (permalink / raw) To: Richard Biener;+Cc:Bin.Cheng, GCC Patches [-- Attachment #1: Type: text/plain, Size: 921 bytes --] > ideally you'd use a wide-int here and defer the tree allocation to the result Did that in the attached version. > So I guess we never run into the outer_op == minus case as the above is > clearly wrong for that? Right, damn, not only was the treatment for this missing but it was bogus in the other pattern as well. Since we are mostly dealing with PLUS_EXPR anyways it's probably better to defer the MINUS_EXPR case for now. This will also slim down the patterns a bit. > try to keep vertical spacing in patterns minimal -- I belive that patterns > should be small enough to fit in a terminal window (24 lines). I find using the expanded wrapped_range condition in the simplification somewhat cumbersome, especially because I need the condition to evaluate to true by default making the initialization unintuitive. Yet, I guess setting wrapped_range = true was not terribly intuitive either... Regards Robin [-- Attachment #2: gcc-wrapped-binop-p2.diff --] [-- Type: text/x-patch, Size: 2783 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index 80a17ba..ed497cf 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1290,6 +1290,79 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (plus { cst; } @0)))) +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST */ +#if GIMPLE + (simplify + (plus (convert (plus@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (INTEGRAL_TYPE_P (type) + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) + /* Combine CST1 and CST2 to CST and convert to outer type if + (A + CST1)'s range does not wrap. */ + (with + { + tree inner_type = TREE_TYPE (@3); + wide_int wmin0, wmax0; + wide_int w1 = @1; + wide_int w2 = @2; + wide_int combined_cst; + + bool ovf_undef = TYPE_OVERFLOW_UNDEFINED (inner_type); + bool min_ovf = true, max_ovf = false; + + enum value_range_type vr0 = + get_range_info (@0, &wmin0, &wmax0); + + if (ovf_undef || vr0 == VR_RANGE) + { + bool ovf = true; + if (!ovf_undef && vr0 == VR_RANGE) + { + wi::add (wmin0, w1, TYPE_SIGN (inner_type), &min_ovf); + wi::add (wmax0, w1, TYPE_SIGN (inner_type), &max_ovf); + ovf = min_ovf || max_ovf; + } + + /* Extend CST1 to TYPE. */ + w1 = w1.from (w1, TYPE_PRECISION (type), + ovf ? SIGNED : TYPE_SIGN (inner_type)); + } + } + (if (ovf_undef || !((min_ovf && !max_ovf) || (!min_ovf && max_ovf))) + (plus (convert @0) { wide_int_to_tree (type, wi::add (w1, w2)); } + ))))) +#endif + +/* ((T)(A)) + CST -> (T)(A + CST) */ +#if GIMPLE + (simplify + (plus (convert SSA_NAME@0) INTEGER_CST@1) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && INTEGRAL_TYPE_P (type) + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && int_fits_type_p (@1, TREE_TYPE (@0))) + /* Perform binary operation inside the cast if the constant fits + and (A + CST)'s range does not wrap. */ + (with + { + bool min_ovf = true, max_ovf = false; + tree inner_type = TREE_TYPE (@0); + + wide_int w1 = @1; + w1 = w1.from (w1, TYPE_PRECISION (inner_type), TYPE_SIGN + (inner_type)); + + wide_int wmin0, wmax0; + if (get_range_info (@0, &wmin0, &wmax0) == VR_RANGE) + { + wi::add (wmin0, w1, TYPE_SIGN (inner_type), &min_ovf); + wi::add (wmax0, w1, TYPE_SIGN (inner_type), &max_ovf); + } + } + (if (!((min_ovf && !max_ovf) || (!min_ovf && max_ovf)) ) + (convert (plus @0 { {wide_int_to_tree (TREE_TYPE (@0), w1)}; }))) + ))) +#endif + /* ~A + A -> -1 */ (simplify (plus:c (bit_not @0) @0) ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-06-28 14:35 ` Robin Dapp@ 2017-07-03 13:10 ` Richard Biener2017-07-05 8:51 ` Robin Dapp 0 siblings, 1 reply; 47+ messages in thread From: Richard Biener @ 2017-07-03 13:10 UTC (permalink / raw) To: Robin Dapp;+Cc:Bin.Cheng, GCC Patches On Wed, Jun 28, 2017 at 4:34 PM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote >> ideally you'd use a wide-int here and defer the tree allocation to the result > > Did that in the attached version. > >> So I guess we never run into the outer_op == minus case as the above is >> clearly wrong for that? > > Right, damn, not only was the treatment for this missing but it was > bogus in the other pattern as well. Since we are mostly dealing with > PLUS_EXPR anyways it's probably better to defer the MINUS_EXPR case for > now. This will also slim down the patterns a bit. > >> try to keep vertical spacing in patterns minimal -- I belive that patterns >> should be small enough to fit in a terminal window (24 lines). > > I find using the expanded wrapped_range condition in the simplification > somewhat cumbersome, especially because I need the condition to evaluate > to true by default making the initialization unintuitive. Yet, I guess > setting wrapped_range = true was not terribly intuitive either... + /* Perform binary operation inside the cast if the constant fits + and (A + CST)'s range does not wrap. */ + (with + { + bool min_ovf = true, max_ovf = false; While the initialization value doesn't matter (wi::add will overwrite it) better initialize both to false ;) Ah, you mean because we want to transform only if get_range_info returned VR_RANGE. Indeed somewhat unintuitive (but still the best variant for now). + wide_int w1 = @1; + w1 = w1.from (w1, TYPE_PRECISION (inner_type), TYPE_SIGN + (inner_type)); I think wi::from (@1, ....) should work as well. + (if (!((min_ovf && !max_ovf) || (!min_ovf && max_ovf)) ) + (convert (plus @0 { {wide_int_to_tree (TREE_TYPE (@0), w1)}; }))) so I'm still missing a comment on why min_ovf && max_ovf is ok. The simple-minded would have written (if (! min_ovf && ! max_ovf) ... I'd like to see testcase(s) with this patch, preferably exactly also for the case of min_ovf && max_ovf. That said, consider (long)[0xfffffffe, 0xffffffff] + 2 which should have min_ovf and max_ovf which results in [0x0, 0x1] in type unsigned int but [0x100000000, 0x100000001] in type long. Richard. > Regards > Robin ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-07-03 13:10 ` Richard Biener@ 2017-07-05 8:51 ` Robin Dapp2017-07-05 8:54 ` Robin Dapp 2017-07-15 9:58 ` Marc Glisse 0 siblings, 2 replies; 47+ messages in thread From: Robin Dapp @ 2017-07-05 8:51 UTC (permalink / raw) To: Richard Biener;+Cc:Bin.Cheng, GCC Patches [-- Attachment #1: Type: text/plain, Size: 732 bytes --] > While the initialization value doesn't matter (wi::add will overwrite it) > better initialize both to false ;) Ah, you mean because we want to > transform only if get_range_info returned VR_RANGE. Indeed somewhat > unintuitive (but still the best variant for now). > so I'm still missing a comment on why min_ovf && max_ovf is ok. > The simple-minded would have written [...] I suppose it's more a matter of considering too many things at the same time for me... I was still thinking of including more cases than necessary for the regression. Guess the attached version will do as well and should not contain any more surprises. If needed, I'll add additional cases some time. Tests in a followup message. Regards Robin [-- Attachment #2: gcc-wrapped-binop-p2.diff --] [-- Type: text/x-patch, Size: 2507 bytes --] diff --git a/gcc/match.pd b/gcc/match.pd index 80a17ba..3acf8be 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1290,6 +1290,70 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (if (cst && !TREE_OVERFLOW (cst)) (plus { cst; } @0)))) +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST */ +#if GIMPLE + (simplify + (plus (convert (plus@3 @0 INTEGER_CST@1)) INTEGER_CST@2) + (if (INTEGRAL_TYPE_P (type) + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@3))) + /* Combine CST1 and CST2 to CST and convert to outer type if + (A + CST1)'s range does not overflow. */ + (with + { + tree inner_type = TREE_TYPE (@3); + wide_int wmin0, wmax0; + wide_int w1 = @1; + + bool ovf_undef = TYPE_OVERFLOW_UNDEFINED (inner_type); + bool min_ovf = true, max_ovf = true; + + enum value_range_type vr0 = get_range_info (@0, &wmin0, &wmax0); + + if (ovf_undef || vr0 == VR_RANGE) + { + if (!ovf_undef && vr0 == VR_RANGE) + { + wi::add (wmin0, w1, TYPE_SIGN (inner_type), &min_ovf); + wi::add (wmax0, w1, TYPE_SIGN (inner_type), &max_ovf); + } + w1 = w1.from (@1, TYPE_PRECISION (type), TYPE_SIGN (inner_type)); + } + } + (if (ovf_undef || !(min_ovf || max_ovf)) + (plus (convert @0) { wide_int_to_tree (type, wi::add (w1, @2)); } + ))))) +#endif + +/* ((T)(A)) + CST -> (T)(A + CST) */ +#if GIMPLE + (simplify + (plus (convert SSA_NAME@0) INTEGER_CST@1) + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + && INTEGRAL_TYPE_P (type) + && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)) + && int_fits_type_p (@1, TREE_TYPE (@0))) + /* Perform binary operation inside the cast if the constant fits + and (A + CST)'s range does not overflow. */ + (with + { + bool min_ovf = true, max_ovf = true; + tree inner_type = TREE_TYPE (@0); + + wide_int w1 = w1.from (@1, TYPE_PRECISION (inner_type), TYPE_SIGN + (inner_type)); + + wide_int wmin0, wmax0; + if (get_range_info (@0, &wmin0, &wmax0) == VR_RANGE) + { + wi::add (wmin0, w1, TYPE_SIGN (inner_type), &min_ovf); + wi::add (wmax0, w1, TYPE_SIGN (inner_type), &max_ovf); + } + } + (if (!min_ovf && !max_ovf) + (convert (plus @0 { {wide_int_to_tree (TREE_TYPE (@0), w1)}; }))) + ))) +#endif + /* ~A + A -> -1 */ (simplify (plus:c (bit_not @0) @0) ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-07-05 8:51 ` Robin Dapp@ 2017-07-05 8:54 ` Robin Dapp2017-07-15 9:58 ` Marc Glisse 1 sibling, 0 replies; 47+ messages in thread From: Robin Dapp @ 2017-07-05 8:54 UTC (permalink / raw) To: Richard Biener;+Cc:Bin.Cheng, GCC Patches [-- Attachment #1: Type: text/plain, Size: 347 bytes --] [3/3] Tests -- gcc/testsuite/ChangeLog: 2017-07-05 Robin Dapp <rdapp@linux.vnet.ibm.com> * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. [-- Attachment #2: gcc-wrapped-binop-p3.diff --] [-- Type: text/x-patch, Size: 5904 bytes --] diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c new file mode 100644 index 0000000..2571a07 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c @@ -0,0 +1,65 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 12 "ccp1" } } */ + +#include <limits.h> + +long foo(int a) +{ + return (long)(a - 2) + 1; +} + +long bar(int a) +{ + return (long)(a + 3) - 1; +} + +long baz(int a) +{ + return (long)(a - 1) + 2; +} + +long baf(int a) +{ + return (long)(a + 1) - 2; +} + +long bak(int a) +{ + return (long)(a + 1) + 3; +} + +long bal(int a) +{ + return (long)(a - 7) - 4; +} + +long bam(int a) +{ + return (long)(a - 1) - INT_MAX; +} + +long bam2(int a) +{ + return (long)(a + 1) + INT_MAX; +} + +long ban(int a) +{ + return (long)(a - 1) + INT_MIN; +} + +long ban2(int a) +{ + return (long)(a + 1) - INT_MIN; +} + +unsigned long baq(int a) +{ + return (unsigned long)(a + 1) - 1; +} + +unsigned long baq2(int a) +{ + return (unsigned long)(a - 2) + 1; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c new file mode 100644 index 0000000..5c897ba --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-2.c @@ -0,0 +1,39 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +int aa = -3; + +__attribute__((noinline)) +long foo (int a) +{ + return (long)(a - INT_MIN) + 1; +} + +__attribute__((noinline)) +long foo2 (int a) +{ + if (a > -10 && a < 10) + return (long)(a + 2) - 1; +} + +__attribute__((noinline)) +long foo3 (int a) +{ + if (a > -10 && a < 10) + return (long)(a) - 3; +} + +int main() +{ + volatile long h = foo (aa); + assert (h == 2147483646); + + volatile long i = foo2 (aa); + assert (i == -2); + + volatile long j = foo3 (aa); + assert (j == -6); +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c new file mode 100644 index 0000000..04a7ca49 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-evrp-details -fdump-tree-ccp2-details -fdump-tree-vrp1-details" } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 2 "evrp" } } */ +/* { dg-final { scan-tree-dump-times "Match-and-simplified" 2 "ccp2" } } */ +/* { dg-final { scan-tree-dump-times "gimple_simplified to" 3 "vrp1" } } */ + +#include <limits.h> + +unsigned long oof2(unsigned int a) +{ + if (a > 0) + return (unsigned long)(a - 1) + 1; +} + +unsigned long bah (unsigned int a) +{ + if (a > 0) + return (unsigned long)(a - 1) - 1; +} + +long baq3(unsigned int a) +{ + if (a > 0) + return (long)(a - 1) + 1; +} + +unsigned long bap(unsigned int a) +{ + if (a < UINT_MAX) + return (unsigned long)(a + 1) + ULONG_MAX; +} + +unsigned long bar3(unsigned int a) +{ + if (a < UINT_MAX) + return (unsigned long)(a + 1) - 5; +} + +unsigned long bar4(unsigned int a) +{ + if (a < UINT_MAX) + return (unsigned long)(a + 1) - 6; +} diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c new file mode 100644 index 0000000..46290e7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-2.c @@ -0,0 +1,125 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include <assert.h> +#include <limits.h> + +unsigned int a = 3; +int aa = 3; +int bb = 1; +int cc = 4; +unsigned int dd = 0; +unsigned int ee = 4294967294u; + +__attribute__((noinline)) +unsigned long foo1 (unsigned int a) +{ + return (unsigned long)(UINT_MAX + 1) - 1; +} + +__attribute__((noinline)) +unsigned long foo2 (unsigned int a) +{ + if (a < 4) + return (unsigned long)(a - 4) + 1; +} + +__attribute__((noinline)) +unsigned long foo3 (unsigned int a) +{ + if (a > 2) + return (unsigned long)(a + UINT_MAX - 4) + 2; +} + +__attribute__((noinline)) +unsigned long foo4 (unsigned int a) +{ + if (a > 2) + return (unsigned long)(a - UINT_MAX) + UINT_MAX; +} + +__attribute__((noinline)) +unsigned long foo5 (unsigned int a) +{ + if (a > 2) + return (unsigned long)(a + UINT_MAX) - UINT_MAX; +} + +__attribute__((noinline)) +long foo6 (unsigned int a) +{ + if (a > 2) + return (long)(a - 4) + 1; +} + +__attribute__((noinline)) +long foo7 (unsigned int a) +{ + if (a > 2) + return (long)(a + UINT_MAX) + 1; +} + +__attribute__((noinline)) +unsigned long foo8 (unsigned int a) +{ + if (a < 2) + return (unsigned long)(a + 4294967294u) + 5000000000; +} + +__attribute__((noinline)) +unsigned long foo9 (unsigned int a) +{ + if (a > 2) + return (unsigned long)(a + 4294967294u) + 8000000000; +} + +__attribute__((noinline)) +unsigned long foo10 (unsigned int a) +{ + if (a < 2) + return (unsigned long)(a + 4294967294u) + 2; +} + +__attribute__((noinline)) +unsigned long foo11 (unsigned int a) +{ + if (a > 4294967293u) + return (unsigned long)(a + 2u) + 2; +} + + +int main() +{ + unsigned long b = foo1 (UINT_MAX); + assert (b == 18446744073709551615ul); + + unsigned long c = foo2 (a); + assert (c == 4294967296u); + + unsigned long d = foo3 (a); + assert (d == 4294967296ul); + + unsigned long e = foo4 (a); + assert (e == 4294967299ul); + + unsigned long f = foo5 (a); + assert (f == 18446744069414584323ul); + + long g = foo6 (a); + assert (g == 4294967296ul); + + long h = foo7 (aa); + assert (h == 3); + + unsigned long i = foo8 (bb); + assert (i == 9294967295ul); + + unsigned long j = foo9 (cc); + assert (j == 8000000002); + + unsigned long k = foo10 (dd); + assert (k == 0x100000000); + + unsigned long l = foo11 (ee); + assert (l == 2); +} ^ permalink raw reply [flat|nested] 47+ messages in thread

*Re: [PATCH 2/3] Simplify wrapped binops2017-07-05 8:51 ` Robin Dapp 2017-07-05 8:54 ` Robin Dapp@ 2017-07-15 9:58 ` Marc Glisse1 sibling, 0 replies; 47+ messages in thread From: Marc Glisse @ 2017-07-15 9:58 UTC (permalink / raw) To: Robin Dapp;+Cc:Richard Biener, Bin.Cheng, GCC Patches On Wed, 5 Jul 2017, Robin Dapp wrote: >> While the initialization value doesn't matter (wi::add will overwrite it) >> better initialize both to false ;) Ah, you mean because we want to >> transform only if get_range_info returned VR_RANGE. Indeed somewhat >> unintuitive (but still the best variant for now). > >> so I'm still missing a comment on why min_ovf && max_ovf is ok. >> The simple-minded would have written [...] > > I suppose it's more a matter of considering too many things at the same > time for me... I was still thinking of including more cases than > necessary for the regression. Guess the attached version will do as > well and should not contain any more surprises. If needed, I'll add > additional cases some time. What happens for (long)(X+10)+LONG_MAX where X has type int and is in [-30, -20]? It looks like wi::add will overflow and you will generate X+negative which overflows at runtime. (It looks like you don't need to name @3, you could just use the type of @0 instead) -- Marc Glisse ^ permalink raw reply [flat|nested] 47+ messages in thread

end of thread, other threads:[~2017-07-15 9:58 UTC | newest]Thread overview:47+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-07-21 10:42 [PATCH] Tree-level fix for PR 69526 Robin Dapp 2016-07-21 11:28 ` Richard Biener 2016-08-22 14:58 ` Robin Dapp 2016-09-05 7:50 ` Robin Dapp 2016-09-14 13:08 ` Richard Biener 2016-09-14 17:04 ` Jeff Law 2016-10-14 8:33 ` Robin Dapp 2016-09-20 12:39 ` Robin Dapp 2016-09-20 15:31 ` Jeff Law 2016-10-05 10:40 ` Robin Dapp 2016-10-14 11:49 ` Richard Biener 2016-11-16 15:54 ` Robin Dapp 2016-11-25 6:49 ` Robin Dapp 2016-11-25 13:47 ` Richard Biener 2016-11-28 11:13 ` Richard Biener 2016-11-28 13:26 ` Robin Dapp 2016-12-05 7:57 ` Robin Dapp 2016-12-06 13:03 ` Richard Biener 2016-12-07 16:15 ` Robin Dapp 2016-12-13 14:13 ` Richard Biener 2017-01-10 13:33 ` Robin Dapp 2017-01-17 7:34 ` Robin Dapp 2017-01-17 9:48 ` Richard Biener 2017-02-02 9:27 ` Robin Dapp 2017-05-09 7:31 ` Robin Dapp 2017-05-11 15:08 ` Bin.Cheng 2017-05-18 14:47 ` Robin Dapp 2017-05-18 14:48 ` [PATCH 1/3] Simplify wrapped binops Robin Dapp 2017-05-18 14:49 ` [PATCH 2/3] " Robin Dapp 2017-05-18 15:46 ` Bin.Cheng 2017-05-18 16:09 ` Robin Dapp 2017-05-18 17:15 ` Bin.Cheng 2017-05-19 10:13 ` Robin Dapp 2017-05-19 10:22 ` Bin.Cheng 2017-05-19 10:32 ` Richard Biener 2017-06-20 13:08 ` Robin Dapp 2017-06-20 13:49 ` Richard Biener 2017-06-21 11:44 ` Robin Dapp 2017-06-27 7:17 ` Robin Dapp 2017-06-27 12:14 ` Richard Biener 2017-06-28 14:35 ` Robin Dapp 2017-07-03 13:10 ` Richard Biener 2017-07-05 8:51 ` Robin Dapp 2017-07-05 8:54 ` Robin Dapp 2017-07-15 9:58 ` Marc Glisse 2017-05-18 15:08 ` [PATCH 3/3] " Robin Dapp 2016-08-23 7:11 ` [PATCH] Tree-level fix for PR 69526 Robin Dapp

This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).