From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 24773 invoked by alias); 17 Jul 2014 09:09:29 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 24672 invoked by uid 89); 17 Jul 2014 09:09:28 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: service87.mimecast.com Received: from service87.mimecast.com (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 17 Jul 2014 09:09:25 +0000 Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Thu, 17 Jul 2014 10:09:22 +0100 Received: from shawin233 ([10.1.255.212]) by cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Thu, 17 Jul 2014 10:09:22 +0100 From: "Bin Cheng" To: Subject: [PATCH 3/3]Improve induction variable elimination Date: Thu, 17 Jul 2014 09:32:00 -0000 Message-ID: <003201cfa19e$cc5a7a20$650f6e60$@arm.com> MIME-Version: 1.0 X-MC-Unique: 114071710092202301 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0033_01CFA1A7.2E1FA570" X-IsSubscribed: yes X-SW-Source: 2014-07/txt/msg01201.txt.bz2 This is a multipart message in MIME format. ------=_NextPart_000_0033_01CFA1A7.2E1FA570 Content-Type: text/plain; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable Content-length: 1725 Hi, Function iv_elimination_compare_lt is used to eliminate induction variable when the loop's latch could run for zero time (i.e., may_be_zero in loop niter information evaluates to true). As stated in the first message, it only handles very specific case that rarely happens for either GCC bootstrap or spec2k/spec2k6 compilation. The function has two restrictions which could be improved: a) When checking that candidate iv doesn't overflow, it only handles candidates that are computed in a type that guarantees no overflows. More complex analysis can be used to prove the non-overflow ness, as in this patch. b) The function only handles the original form of may_be_zero like "a + 1 > b", but that expression could have been folded into other forms. This patch handles three folded forms and does iv elimination as well. I think this isn't a very corner case, because for many loops iterating from "0" (i.e., we have "a =3D=3D 0"), the expression will be folded. I also refactored period check from may_eliminate_iv into a single function so that it can be reused. Thanks, bin 2014-07-17 Bin Cheng * tree-ssa-loop-ivopts.c (iv_nowrap_period) (nowrap_cand_for_loop_niter_p): New functions. (period_greater_niter_exit): New function refactored from may_eliminate_iv. (iv_elimination_compare_lt): New parameter. Check wrapping behavior for candidate of wrapping type. Handle folded forms of may_be_zero expression. (may_eliminate_iv): Call period_greater_niter_exit. Pass new argument for iv_elimination_compare_lt. gcc/testsuite/ChangeLog 2014-07-17 Bin Cheng * gcc.dg/tree-ssa/ivopts-lt-3.c: New test. * gcc.dg/tree-ssa/ivopts-lt-4.c: New test.= ------=_NextPart_000_0033_01CFA1A7.2E1FA570 Content-Type: text/plain; name=iv_elimination-improve-c-20140716.txt Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="iv_elimination-improve-c-20140716.txt" Content-length: 14365 Index: gcc/tree-ssa-loop-ivopts.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- gcc/tree-ssa-loop-ivopts.c (revision 212387) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -4432,6 +4432,44 @@ iv_period (struct iv *iv) return period; } =20 +/* Returns no wrapping period of induction variable IV. For now + only unsigned type IV is handled, we could extend it in case + of non-overflow for signed ones. Return zero if it can't be + decided. */ + +static tree +iv_nowrap_period (struct iv *iv) +{ + bool overflow; + tree type; + tree base =3D iv->base, step =3D iv->step; + widest_int base_val, step_val, max_val, span, period; + + gcc_assert (step && TREE_CODE (step) =3D=3D INTEGER_CST); + + type =3D TREE_TYPE (base); + if (!TYPE_UNSIGNED (type) || TREE_CODE (base) !=3D INTEGER_CST) + return integer_zero_node; + + base_val =3D wi::to_widest (base); + step_val =3D wi::to_widest (step); + if (!POINTER_TYPE_P (type) && TYPE_MAX_VALUE (type) + && TREE_CODE (TYPE_MAX_VALUE (type)) =3D=3D INTEGER_CST) + max_val =3D wi::to_widest (TYPE_MAX_VALUE (type)); + else + { + wide_int max_wi =3D wi::max_value (TYPE_PRECISION (type), UNSIGNED); + max_val =3D wi::to_widest (wide_int_to_tree (type, max_wi)); + } + + span =3D max_val - base_val + step_val - 1; + period =3D wi::div_trunc (span, step_val, UNSIGNED, &overflow); + if (overflow) + return integer_zero_node; + + return wide_int_to_tree (type, period); +} + /* Returns the comparison operator used when eliminating the iv USE. */ =20 static enum tree_code @@ -4560,7 +4598,84 @@ difference_cannot_overflow_p (tree base, tree offs } } =20 -/* Tries to replace loop exit by one formulated in terms of a LT_EXPR +/* Check whether PERIOD of CAND is greater than the number of iterations + described by DESC for which the exit condition is true. The exit + condition is comparison against USE. */ + +static bool +period_greater_niter_exit (struct ivopts_data *data, + struct iv_use *use, struct iv_cand *cand, + tree period, struct tree_niter_desc *desc) +{ + struct loop *loop =3D data->current_loop; + + /* If the number of iterations is constant, compare against it directly.= */ + if (TREE_CODE (desc->niter) =3D=3D INTEGER_CST) + { + /* See cand_value_at. */ + if (stmt_after_increment (loop, cand, use->stmt)) + { + if (!tree_int_cst_lt (desc->niter, period)) + return false; + } + else + { + if (tree_int_cst_lt (period, desc->niter)) + return false; + } + } + + /* If not, and if this is the only possible exit of the loop, see whether + we can get a conservative estimate on the number of iterations of the + entire loop and compare against that instead. */ + else + { + widest_int period_value, max_niter; + + max_niter =3D desc->max; + if (stmt_after_increment (loop, cand, use->stmt)) + max_niter +=3D 1; + period_value =3D wi::to_widest (period); + if (wi::gtu_p (max_niter, period_value)) + { + /* See if we can take advantage of inferred loop bound informati= on. */ + if (data->loop_single_exit_p) + { + if (!max_loop_iterations (loop, &max_niter)) + return false; + /* The loop bound is already adjusted by adding 1. */ + if (wi::gtu_p (max_niter, period_value)) + return false; + } + else + return false; + } + } + + return true; +} + +/* Determine whether no wrapping period of CAND is greater than + the number of iterations described by DESC for which exit + conditionis true. The exit condition is comparison against USE. */ + +static bool +nowrap_cand_for_loop_niter_p (struct ivopts_data *data, + struct iv_use *use, + struct iv_cand *cand, + struct tree_niter_desc *desc) +{ + tree period; +=20=20 + period =3D iv_nowrap_period (cand->iv); + if (period !=3D integer_zero_node + && period_greater_niter_exit (data, use, cand, period, desc)) + return true; + + return false; +} + +/* Tries to replace loop exit in USE by one formulated in terms of a LT_EX= PR comparison with CAND. NITER describes the number of iterations of the loops. If successful, the comparison in COMP_P is altered accordin= gly. =20 @@ -4597,11 +4712,18 @@ difference_cannot_overflow_p (tree base, tree offs 1) if a + 1 <=3D b, then p_0 - a + b is the final value of p, hence the= re is no overflow in computing it or the values of p. 2) if a + 1 > b, then we need to verify that the expression p_0 - a doe= s not - overflow. To prove this, we use the fact that p_0 =3D base + a. */ + overflow. To prove this, we use the fact that p_0 =3D base + a. =20 + + Moreover, with more complex overflow analysis for unsigned type, it is + possible to eliminate I with P if P is an induction candidate of unsign= ed + integer type other than pointer if we can prove that P doesn't overflow + during all iterations of current loop. Also special forms of MAY_BE_ZE= RO + in NITER is checked because "a + 1 > b" could be folded. */ + static bool -iv_elimination_compare_lt (struct ivopts_data *data, - struct iv_cand *cand, enum tree_code *comp_p, +iv_elimination_compare_lt (struct ivopts_data *data, struct iv_use *use, + struct iv_cand *cand, enum tree_code *comp_p, struct tree_niter_desc *niter) { tree cand_type, a, b, mbz, nit_type =3D TREE_TYPE (niter->niter), offset; @@ -4610,11 +4732,13 @@ static bool HOST_WIDE_INT step; =20 /* We need to know that the candidate induction variable does not overfl= ow. - While more complex analysis may be used to prove this, for now just - check that the variable appears in the original program and that it - is computed in a type that guarantees no overflows. */ + Apart from checking that the variable appears in the original program + and that is is computed in a type that guarantees no overflows, we al= so + check no wrapping periord of the variable covers the niter if it has + unsigned type. More complex analysis may be used to prove this. */ cand_type =3D TREE_TYPE (cand->iv->base); - if (cand->pos !=3D IP_ORIGINAL || !nowrap_type_p (cand_type)) + if ((cand->pos !=3D IP_ORIGINAL || !nowrap_type_p (cand_type)) + && !nowrap_cand_for_loop_niter_p (data, use, cand, niter)) return false; =20 /* Make sure that the loop iterates till the loop bound is hit, as other= wise @@ -4657,6 +4781,67 @@ static bool else return false; } + else if (TREE_CODE (mbz) =3D=3D EQ_EXPR) + { + /* Check whether may_be_zero is in below three forms: + 1) b' =3D=3D TYPE_MAX_VALUE (TREE_TYPE (b')) + where b' is of type nit_type. The expression could be + folded from "b' + 1 < 1". In this case, A is "0" and B + is "b' + 1". + 2) b =3D=3D 0 + where b is of type nit_type. The expression could be + folded from "b < 1". In this case, A is "0" and B is + "b". + 3) b' =3D=3D -1 + where b' is of signed type which has nit_type as its + unsigned counterpart. The expression could be folded + from "(nit_type)b' + 1 < 1". In this case, A is "0" + and "(nit_type)b' + 1". */ + tree type; + tree op0 =3D TREE_OPERAND (mbz, 0); + tree op1 =3D TREE_OPERAND (mbz, 1); + + if (TREE_CODE (op1) !=3D INTEGER_CST) + { + tree tmp =3D op0; + op0 =3D op1; + op1 =3D tmp; + } + type =3D TREE_TYPE (op0); + if (TREE_CODE (op1) !=3D INTEGER_CST) + return false; + + /* Convert case 3 to case 1. */ + if (!TYPE_UNSIGNED (type) + && operand_equal_p (op1, integer_minus_one_node, 0) + && types_compatible_p (unsigned_type_for (type), nit_type)) + { + type =3D nit_type; + op0 =3D fold_convert_loc (UNKNOWN_LOCATION, nit_type, op0); + op1 =3D fold_convert_loc (UNKNOWN_LOCATION, nit_type, op1); + } + + if (!TYPE_UNSIGNED (type)) + return false; + + /* Case 1. */ + if (TYPE_MAX_VALUE (type) + && TREE_CODE (op1) =3D=3D INTEGER_CST + && operand_equal_p (op1, TYPE_MAX_VALUE (type), 0)) + { + a =3D integer_zero_node; + b =3D fold_build2 (PLUS_EXPR, type, op0, integer_one_node); + } + /* Case 2. */ + else if (TREE_CODE (op1) =3D=3D INTEGER_CST + && operand_equal_p (op1, integer_zero_node, 0)) + { + a =3D integer_zero_node; + b =3D op0; + } + else + return false; + } else return false; =20 @@ -4673,10 +4858,14 @@ static bool return false; =20 /* Finally, check that CAND->IV->BASE - CAND->IV->STEP * A does not - overflow. */ - offset =3D fold_build2 (MULT_EXPR, TREE_TYPE (cand->iv->step), - cand->iv->step, - fold_convert (TREE_TYPE (cand->iv->step), a)); + overflow. It is uncessary to build offset when A equals to ZERO. */ + if (a !=3D integer_zero_node) + offset =3D fold_build2 (MULT_EXPR, TREE_TYPE (cand->iv->step), + cand->iv->step, + fold_convert (TREE_TYPE (cand->iv->step), a)); + else + offset =3D a; + if (!difference_cannot_overflow_p (cand->iv->base, offset)) return false; =20 @@ -4733,50 +4922,9 @@ may_eliminate_iv (struct ivopts_data *data, This is the case iff the period of the induction variable is greater than the number of iterations for which the exit condition is true. = */ period =3D iv_period (cand->iv); + if (!period_greater_niter_exit (data, use, cand, period, desc)) + return false; =20 - /* If the number of iterations is constant, compare against it directly.= */ - if (TREE_CODE (desc->niter) =3D=3D INTEGER_CST) - { - /* See cand_value_at. */ - if (stmt_after_increment (loop, cand, use->stmt)) - { - if (!tree_int_cst_lt (desc->niter, period)) - return false; - } - else - { - if (tree_int_cst_lt (period, desc->niter)) - return false; - } - } - - /* If not, and if this is the only possible exit of the loop, see whether - we can get a conservative estimate on the number of iterations of the - entire loop and compare against that instead. */ - else - { - widest_int period_value, max_niter; - - max_niter =3D desc->max; - if (stmt_after_increment (loop, cand, use->stmt)) - max_niter +=3D 1; - period_value =3D wi::to_widest (period); - if (wi::gtu_p (max_niter, period_value)) - { - /* See if we can take advantage of inferred loop bound informati= on. */ - if (data->loop_single_exit_p) - { - if (!max_loop_iterations (loop, &max_niter)) - return false; - /* The loop bound is already adjusted by adding 1. */ - if (wi::gtu_p (max_niter, period_value)) - return false; - } - else - return false; - } - } - cand_value_at (loop, cand, use->stmt, desc->niter, &bnd); =20 *bound =3D fold_convert (TREE_TYPE (cand->iv->base), @@ -4796,7 +4944,7 @@ may_eliminate_iv (struct ivopts_data *data, base the exit condition on it. However, that is often too expensive. */ if (!integer_zerop (desc->may_be_zero)) - return iv_elimination_compare_lt (data, cand, comp, desc); + return iv_elimination_compare_lt (data, use, cand, comp, desc); =20 return true; } Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-3.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-3.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-3.c (revision 0) @@ -0,0 +1,38 @@ +/* { dg-do compile { target { lp64 } } } */ +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */ + +typedef long unsigned int size_t; +extern float a[100], b[100]; + +int foo (int M, unsigned int l) +{ + unsigned ivtmp =3D 0, niters, _37, _38, bnd; + size_t _67, _1; + float *vectp_a, *vectp_b, *vectp_a2; + float vect__6, vect__7, vect__8; + + _38 =3D (unsigned int)l; + bnd =3D _38 + 1; + + _1 =3D (size_t) M; + _67 =3D _1 * 4; + vectp_a =3D a; vectp_b =3D b; vectp_a2 =3D a + _67; + + do + { + vect__6 =3D *vectp_a; + vect__7 =3D *vectp_b; + vect__8 =3D vect__6 + vect__7; + *vectp_a =3D vect__8; + vectp_a =3D vectp_a + 4; + vectp_b =3D vectp_b + 4; + vectp_a2 =3D vectp_a2 + 4; + ivtmp +=3D 1; + } + while (ivtmp < bnd); + + return 0; +} + +/* { dg-final { scan-tree-dump-times "Replacing exit test: if \\(bnd.*\\)"= 1 "ivopts" } } */ +/* { dg-final { cleanup-tree-dump "ivopts" } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-4.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-4.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt-4.c (revision 0) @@ -0,0 +1,38 @@ +/* { dg-do compile { target { lp64 } } } */ +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */ + +typedef long unsigned int size_t; +extern float a[100], b[100]; + +int foo (int M, int l) +{ + unsigned ivtmp =3D 0, niters, _37, _38, bnd; + size_t _67, _1; + float *vectp_a, *vectp_b, *vectp_a2; + float vect__6, vect__7, vect__8; + + _38 =3D (unsigned int)l; + bnd =3D _38 + 1; + + _1 =3D (size_t) M; + _67 =3D _1 * 4; + vectp_a =3D a; vectp_b =3D b; vectp_a2 =3D a + _67; + + do + { + vect__6 =3D *vectp_a; + vect__7 =3D *vectp_b; + vect__8 =3D vect__6 + vect__7; + *vectp_a =3D vect__8; + vectp_a =3D vectp_a + 4; + vectp_b =3D vectp_b + 4; + vectp_a2 =3D vectp_a2 + 4; + ivtmp +=3D 1; + } + while (ivtmp < bnd); + + return 0; +} + +/* { dg-final { scan-tree-dump-times "Replacing exit test: if \\(bnd.*\\)"= 1 "ivopts" } } */ +/* { dg-final { cleanup-tree-dump "ivopts" } } */ ------=_NextPart_000_0033_01CFA1A7.2E1FA570--