From: "Bin.Cheng" <amker.cheng@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Bin Cheng <bin.cheng@arm.com>, GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH GCC]Improve bound information in loop niter analysis
Date: Mon, 17 Aug 2015 10:06:00 -0000 [thread overview]
Message-ID: <CAHFci29KkT64t-aAkhCPdFippjn1Rgg9E+Xp_ETFU0C8ihB90A@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc0PS9riH1LKKtQjuQsPwfw66kHpbCfZ6d_fp2ykhAXjKA@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3686 bytes --]
Thanks for all your reviews.
On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>> Hi,
>> Loop niter computes inaccurate bound information for different loops. This
>> patch is to improve it by using loop initial condition in
>> determine_value_range. Generally, loop niter is computed by subtracting
>> start var from end var in loop exit condition. Moreover, loop bound is
>> computed using value range information of both start and end variables.
>> Basic idea of this patch is to check if loop initial condition implies more
>> range information for both start/end variables. If yes, we refine range
>> information and use that to compute loop bound.
>> With this improvement, more accurate loop bound information is computed for
>> test cases added by this patch.
>
> + c0 = fold_convert (type, c0);
> + c1 = fold_convert (type, c1);
> +
> + if (operand_equal_p (var, c0, 0))
>
> I believe if c0 is not already of type type operand-equal_p will never succeed.
It's quite specific case targeting comparison between var and it's
range bounds. Given c0 is in form of "var + offc0", then the
comparison "var + offc0 != range bounds" doesn't have any useful
information. Maybe useless type conversion can be handled here
though, it might be even corner case.
>
> (side-note: we should get rid of the GMP use, that's expensive and now we
> have wide-int available which should do the trick as well)
>
> + /* Case of comparing with the bounds of the type. */
> + if (TYPE_MIN_VALUE (type)
> + && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
> + cmp = GT_EXPR;
> + if (TYPE_MAX_VALUE (type)
> + && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
> + cmp = LT_EXPR;
>
> don't use TYPE_MIN/MAX_VALUE. Instead use the types precision
> and all wide_int operations (see match.pd wi::max_value use).
Done.
>
> + else if (!operand_equal_p (var, varc0, 0))
> + goto end_2;
>
> ick - goto. We need sth like a auto_mpz class with a destructor.
Label end_2 removed.
>
> struct auto_mpz
> {
> auto_mpz () { mpz_init (m_val); }
> ~auto_mpz () { mpz_clear (m_val); }
> mpz& operator() { return m_val; }
> mpz m_val;
> };
>
>> Is it OK?
>
> I see the code follows existing practice in niter analysis even though
> my overall plan was to transition its copying of value-range related
> optimizations to use VRP infrastructure.
Yes, I think it's easy to push it to VRP infrastructure. Actually
from the name of the function, it's more vrp related. For now, the
function is called only by bound_difference, not so many as vrp
queries. We need cache facility in vrp otherwise it would be
expensive.
>
> I'm still ok with improving the existing code on the basis that I won't
> get to that for GCC 6.
>
> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>
> Refactoring with auto_mpz welcome.
That will be an independent patch, so I skipped it in this one.
New version attached. Bootstrap and test on x86_64.
Thanks,
bin
>
> Thanks,
> RIchard.
>
>> Thanks,
>> bin
>>
>> 2015-07-28 Bin Cheng <bin.cheng@arm.com>
>>
>> * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>> (determine_value_range): Call refine_value_range_using_guard for
>> each loop initial condition to improve value range.
>>
>> gcc/testsuite/ChangeLog
>> 2015-07-28 Bin Cheng <bin.cheng@arm.com>
>>
>> * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>> * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>> * gcc.dg/tree-ssa/loop-bound-5.c: New test.
[-- Attachment #2: improve-loop-bound-analysis-20150817.txt --]
[-- Type: text/plain, Size: 11632 bytes --]
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s, unsigned char l)
+{
+ unsigned char i;
+ int sum = 0;
+
+ for (i = s; i > l; i -= 1)
+ {
+ sum += a[i];
+ }
+
+ return sum;
+}
+
+/* Check loop niter bound information. */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s)
+{
+ unsigned char i;
+ int sum = 0;
+
+ for (i = s; i > 0; i -= 1)
+ {
+ sum += a[i];
+ }
+
+ return sum;
+}
+
+/* Check loop niter bound information. */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c (revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s, unsigned char l)
+{
+ unsigned char i;
+ int sum = 0;
+
+ for (i = s; i < l; i += 1)
+ {
+ sum += a[i];
+ }
+
+ return sum;
+}
+
+/* Check loop niter bound information. */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c (revision 225916)
+++ gcc/tree-ssa-loop-niter.c (working copy)
@@ -122,6 +122,237 @@ split_to_var_and_offset (tree expr, tree *var, mpz
}
}
+/* From condition C0 CMP C1 derives information regarding the value range
+ of VAR, which is of TYPE. Results are stored in to BELOW and UP. */
+
+static void
+refine_value_range_using_guard (tree type, tree var,
+ tree c0, enum tree_code cmp, tree c1,
+ mpz_t below, mpz_t up)
+{
+ tree varc0, varc1, ctype;
+ mpz_t offc0, offc1;
+ mpz_t mint, maxt, minc1, maxc1;
+ wide_int minv, maxv;
+ bool no_wrap = nowrap_type_p (type);
+ bool c0_ok, c1_ok;
+ signop sgn = TYPE_SIGN (type);
+
+ switch (cmp)
+ {
+ case LT_EXPR:
+ case LE_EXPR:
+ case GT_EXPR:
+ case GE_EXPR:
+ STRIP_SIGN_NOPS (c0);
+ STRIP_SIGN_NOPS (c1);
+ ctype = TREE_TYPE (c0);
+ if (!useless_type_conversion_p (ctype, type))
+ return;
+
+ break;
+
+ case EQ_EXPR:
+ /* We could derive quite precise information from EQ_EXPR, however,
+ such a guard is unlikely to appear, so we do not bother with
+ handling it. */
+ return;
+
+ case NE_EXPR:
+ /* NE_EXPR comparisons do not contain much of useful information,
+ except for cases of comparing with bounds. */
+ if (TREE_CODE (c1) != INTEGER_CST
+ || !INTEGRAL_TYPE_P (type))
+ return;
+
+ /* Ensure that the condition speaks about an expression in the same
+ type as X and Y. */
+ ctype = TREE_TYPE (c0);
+ if (TYPE_PRECISION (ctype) != TYPE_PRECISION (type))
+ return;
+ c0 = fold_convert (type, c0);
+ c1 = fold_convert (type, c1);
+
+ if (operand_equal_p (var, c0, 0))
+ {
+ mpz_t valc1;
+
+ /* Case of comparing VAR with its below/up bounds. */
+ mpz_init (valc1);
+ wi::to_mpz (c1, valc1, TYPE_SIGN (type));
+ if (mpz_cmp (valc1, below) == 0)
+ cmp = GT_EXPR;
+ if (mpz_cmp (valc1, up) == 0)
+ cmp = LT_EXPR;
+
+ mpz_clear (valc1);
+ }
+ else
+ {
+ /* Case of comparing with the bounds of the type. */
+ wide_int min = wi::min_value (type);
+ wide_int max = wi::max_value (type);
+
+ if (wi::eq_p (c1, min))
+ cmp = GT_EXPR;
+ if (wi::eq_p (c1, max))
+ cmp = LT_EXPR;
+ }
+
+ /* Quick return if no useful information. */
+ if (cmp == NE_EXPR)
+ return;
+
+ break;
+
+ default:
+ return;
+ }
+
+ mpz_init (offc0);
+ mpz_init (offc1);
+ split_to_var_and_offset (expand_simple_operations (c0), &varc0, offc0);
+ split_to_var_and_offset (expand_simple_operations (c1), &varc1, offc1);
+
+ /* We are only interested in comparisons of expressions based on VAR. */
+ if (operand_equal_p (var, varc1, 0))
+ {
+ std::swap (varc0, varc1);
+ mpz_swap (offc0, offc1);
+ cmp = swap_tree_comparison (cmp);
+ }
+ else if (!operand_equal_p (var, varc0, 0))
+ {
+ mpz_clear (offc0);
+ mpz_clear (offc1);
+ return;
+ }
+
+ mpz_init (mint);
+ mpz_init (maxt);
+ get_type_static_bounds (type, mint, maxt);
+ mpz_init (minc1);
+ mpz_init (maxc1);
+ /* Setup range information for varc1. */
+ if (integer_zerop (varc1))
+ {
+ wi::to_mpz (integer_zero_node, minc1, TYPE_SIGN (type));
+ wi::to_mpz (integer_zero_node, maxc1, TYPE_SIGN (type));
+ }
+ else if (TREE_CODE (varc1) == SSA_NAME
+ && INTEGRAL_TYPE_P (type)
+ && get_range_info (varc1, &minv, &maxv) == VR_RANGE)
+ {
+ gcc_assert (wi::le_p (minv, maxv, sgn));
+ wi::to_mpz (minv, minc1, sgn);
+ wi::to_mpz (maxv, maxc1, sgn);
+ }
+ else
+ {
+ mpz_set (minc1, mint);
+ mpz_set (maxc1, maxt);
+ }
+
+ /* Compute valid range information for varc1 + offc1. Note nothing
+ useful can be derived if it overflows or underflows. Overflow or
+ underflow could happen when:
+
+ offc1 > 0 && varc1 + offc1 > MAX_VAL (type)
+ offc1 < 0 && varc1 + offc1 < MIN_VAL (type). */
+ mpz_add (minc1, minc1, offc1);
+ mpz_add (maxc1, maxc1, offc1);
+ c1_ok = (no_wrap
+ || mpz_sgn (offc1) == 0
+ || (mpz_sgn (offc1) < 0 && mpz_cmp (minc1, mint) >= 0)
+ || (mpz_sgn (offc1) > 0 && mpz_cmp (maxc1, maxt) <= 0));
+ if (!c1_ok)
+ goto end;
+
+ if (mpz_cmp (minc1, mint) < 0)
+ mpz_set (minc1, mint);
+ if (mpz_cmp (maxc1, maxt) > 0)
+ mpz_set (maxc1, maxt);
+
+ if (cmp == LT_EXPR)
+ {
+ cmp = LE_EXPR;
+ mpz_sub_ui (maxc1, maxc1, 1);
+ }
+ if (cmp == GT_EXPR)
+ {
+ cmp = GE_EXPR;
+ mpz_add_ui (minc1, minc1, 1);
+ }
+
+ /* Compute range information for varc0. If there is no overflow,
+ the condition implied that
+
+ (varc0) cmp (varc1 + offc1 - offc0)
+
+ We can possibly improve the upper bound of varc0 if cmp is LE_EXPR,
+ or the below bound if cmp is GE_EXPR.
+
+ To prove there is no overflow/underflow, we need to check below
+ four cases:
+ 1) cmp == LE_EXPR && offc0 > 0
+
+ (varc0 + offc0) doesn't overflow
+ && (varc1 + offc1 - offc0) doesn't underflow
+
+ 2) cmp == LE_EXPR && offc0 < 0
+
+ (varc0 + offc0) doesn't underflow
+ && (varc1 + offc1 - offc0) doesn't overfloe
+
+ In this case, (varc0 + offc0) will never underflow if we can
+ prove (varc1 + offc1 - offc0) doesn't overflow.
+
+ 3) cmp == GE_EXPR && offc0 < 0
+
+ (varc0 + offc0) doesn't underflow
+ && (varc1 + offc1 - offc0) doesn't overflow
+
+ 4) cmp == GE_EXPR && offc0 > 0
+
+ (varc0 + offc0) doesn't overflow
+ && (varc1 + offc1 - offc0) doesn't underflow
+
+ In this case, (varc0 + offc0) will never overflow if we can
+ prove (varc1 + offc1 - offc0) doesn't underflow.
+
+ Note we only handle case 2 and 4 in below code. */
+
+ mpz_sub (minc1, minc1, offc0);
+ mpz_sub (maxc1, maxc1, offc0);
+ c0_ok = (no_wrap
+ || mpz_sgn (offc0) == 0
+ || (cmp == LE_EXPR
+ && mpz_sgn (offc0) < 0 && mpz_cmp (maxc1, maxt) <= 0)
+ || (cmp == GE_EXPR
+ && mpz_sgn (offc0) > 0 && mpz_cmp (minc1, mint) >= 0));
+ if (!c0_ok)
+ goto end;
+
+ if (cmp == LE_EXPR)
+ {
+ if (mpz_cmp (up, maxc1) > 0)
+ mpz_set (up, maxc1);
+ }
+ else
+ {
+ if (mpz_cmp (below, minc1) < 0)
+ mpz_set (below, minc1);
+ }
+
+end:
+ mpz_clear (mint);
+ mpz_clear (maxt);
+ mpz_clear (minc1);
+ mpz_clear (maxc1);
+ mpz_clear (offc0);
+ mpz_clear (offc1);
+}
+
/* Stores estimate on the minimum/maximum value of the expression VAR + OFF
in TYPE to MIN and MAX. */
@@ -129,6 +360,9 @@ static void
determine_value_range (struct loop *loop, tree type, tree var, mpz_t off,
mpz_t min, mpz_t max)
{
+ int cnt = 0;
+ mpz_t minm, maxm;
+ basic_block bb;
wide_int minv, maxv;
enum value_range_type rtype = VR_VARYING;
@@ -183,35 +417,69 @@ determine_value_range (struct loop *loop, tree typ
}
}
}
- if (rtype == VR_RANGE)
+ mpz_init (minm);
+ mpz_init (maxm);
+ if (rtype != VR_RANGE)
{
- mpz_t minm, maxm;
+ mpz_set (minm, min);
+ mpz_set (maxm, max);
+ }
+ else
+ {
gcc_assert (wi::le_p (minv, maxv, sgn));
- mpz_init (minm);
- mpz_init (maxm);
wi::to_mpz (minv, minm, sgn);
wi::to_mpz (maxv, maxm, sgn);
- mpz_add (minm, minm, off);
- mpz_add (maxm, maxm, off);
- /* If the computation may not wrap or off is zero, then this
- is always fine. If off is negative and minv + off isn't
- smaller than type's minimum, or off is positive and
- maxv + off isn't bigger than type's maximum, use the more
- precise range too. */
- if (nowrap_type_p (type)
- || mpz_sgn (off) == 0
- || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0)
- || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0))
- {
- mpz_set (min, minm);
- mpz_set (max, maxm);
- mpz_clear (minm);
- mpz_clear (maxm);
- return;
- }
+ }
+ /* Now walk the dominators of the loop header and use the entry
+ guards to refine the estimates. */
+ for (bb = loop->header;
+ bb != ENTRY_BLOCK_PTR_FOR_FN (cfun) && cnt < MAX_DOMINATORS_TO_WALK;
+ bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+ {
+ edge e;
+ tree c0, c1;
+ gimple cond;
+ enum tree_code cmp;
+
+ if (!single_pred_p (bb))
+ continue;
+ e = single_pred_edge (bb);
+
+ if (!(e->flags & (EDGE_TRUE_VALUE | EDGE_FALSE_VALUE)))
+ continue;
+
+ cond = last_stmt (e->src);
+ c0 = gimple_cond_lhs (cond);
+ cmp = gimple_cond_code (cond);
+ c1 = gimple_cond_rhs (cond);
+
+ if (e->flags & EDGE_FALSE_VALUE)
+ cmp = invert_tree_comparison (cmp, false);
+
+ refine_value_range_using_guard (type, var, c0, cmp, c1, minm, maxm);
+ ++cnt;
+ }
+
+ mpz_add (minm, minm, off);
+ mpz_add (maxm, maxm, off);
+ /* If the computation may not wrap or off is zero, then this
+ is always fine. If off is negative and minv + off isn't
+ smaller than type's minimum, or off is positive and
+ maxv + off isn't bigger than type's maximum, use the more
+ precise range too. */
+ if (nowrap_type_p (type)
+ || mpz_sgn (off) == 0
+ || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0)
+ || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0))
+ {
+ mpz_set (min, minm);
+ mpz_set (max, maxm);
mpz_clear (minm);
mpz_clear (maxm);
+ return;
}
+ mpz_clear (minm);
+ mpz_clear (maxm);
}
/* If the computation may wrap, we know nothing about the value, except for
next prev parent reply other threads:[~2015-08-17 10:01 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-28 10:11 Bin Cheng
2015-08-13 8:27 ` Bin.Cheng
2015-08-13 22:10 ` Jeff Law
2015-08-14 3:13 ` Bin.Cheng
2015-08-14 18:32 ` Jeff Law
2015-08-14 8:28 ` Richard Biener
2015-08-14 18:47 ` Jeff Law
2015-08-17 10:06 ` Bin.Cheng [this message]
2015-08-17 11:16 ` Ajit Kumar Agarwal
2015-08-17 11:38 ` Ajit Kumar Agarwal
2015-08-18 7:53 ` Bin.Cheng
2015-08-18 8:16 ` Ajit Kumar Agarwal
2015-08-18 8:51 ` Richard Biener
2015-08-18 9:09 ` Bin.Cheng
2015-08-18 18:38 ` Jeff Law
2015-08-19 2:45 ` Bin.Cheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHFci29KkT64t-aAkhCPdFippjn1Rgg9E+Xp_ETFU0C8ihB90A@mail.gmail.com \
--to=amker.cheng@gmail.com \
--cc=bin.cheng@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).