[PATCH GCC]Improve bound information in loop niter analysis

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH GCC]Improve bound information in loop niter analysis
@ 2015-07-28 10:11 Bin Cheng
  2015-08-13  8:27 ` Bin.Cheng
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Bin Cheng @ 2015-07-28 10:11 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1145 bytes --]

Hi,
Loop niter computes inaccurate bound information for different loops.  This
patch is to improve it by using loop initial condition in
determine_value_range.  Generally, loop niter is computed by subtracting
start var from end var in loop exit condition.  Moreover, loop bound is
computed using value range information of both start and end variables.
Basic idea of this patch is to check if loop initial condition implies more
range information for both start/end variables.  If yes, we refine range
information and use that to compute loop bound.
With this improvement, more accurate loop bound information is computed for
test cases added by this patch.

Is it OK?

Thanks,
bin

2015-07-28  Bin Cheng  <bin.cheng@arm.com>

	* tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
	(determine_value_range): Call refine_value_range_using_guard for
	each loop initial condition to improve value range.

gcc/testsuite/ChangeLog
2015-07-28  Bin Cheng  <bin.cheng@arm.com>

	* gcc.dg/tree-ssa/loop-bound-1.c: New test.
	* gcc.dg/tree-ssa/loop-bound-3.c: New test.
	* gcc.dg/tree-ssa/loop-bound-5.c: New test.

[-- Attachment #2: improve-loop-bound-analysis-20150728.txt --]
[-- Type: text/plain, Size: 12041 bytes --]

Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s, unsigned char l)
+{
+  unsigned char i;
+  int sum = 0;
+
+  for (i = s; i > l; i -= 1)
+    {
+      sum += a[i];
+    }
+
+  return sum;
+}
+
+/* Check loop niter bound information.  */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s)
+{
+  unsigned char i;
+  int sum = 0;
+
+  for (i = s; i > 0; i -= 1)
+    {
+      sum += a[i];
+    }
+
+  return sum;
+}
+
+/* Check loop niter bound information.  */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s, unsigned char l)
+{
+  unsigned char i;
+  int sum = 0;
+
+  for (i = s; i < l; i += 1)
+    {
+      sum += a[i];
+    }
+
+  return sum;
+}
+
+/* Check loop niter bound information.  */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c	(revision 225859)
+++ gcc/tree-ssa-loop-niter.c	(working copy)
@@ -122,6 +122,233 @@ split_to_var_and_offset (tree expr, tree *var, mpz
     }
 }
 
+/* From condition C0 CMP C1 derives information regarding the value range
+   of VAR, which is of TYPE.  Results are stored in to BELOW and UP.  */
+
+static void
+refine_value_range_using_guard (tree type, tree var,
+				tree c0, enum tree_code cmp, tree c1,
+				mpz_t below, mpz_t up)
+{
+  tree varc0, varc1, ctype;
+  mpz_t offc0, offc1;
+  mpz_t mint, maxt, minc1, maxc1;
+  wide_int minv, maxv;
+  bool no_wrap = nowrap_type_p (type);
+  bool c0_ok, c1_ok;
+  signop sgn = TYPE_SIGN (type);
+
+  switch (cmp)
+    {
+    case LT_EXPR:
+    case LE_EXPR:
+    case GT_EXPR:
+    case GE_EXPR:
+      STRIP_SIGN_NOPS (c0);
+      STRIP_SIGN_NOPS (c1);
+      ctype = TREE_TYPE (c0);
+      if (!useless_type_conversion_p (ctype, type))
+	return;
+
+      break;
+
+    case EQ_EXPR:
+      /* We could derive quite precise information from EQ_EXPR, however,
+	 such a guard is unlikely to appear, so we do not bother with
+	 handling it.  */
+      return;
+
+    case NE_EXPR:
+      /* NE_EXPR comparisons do not contain much of useful information,
+	 except for cases of comparing with bounds.  */
+      if (TREE_CODE (c1) != INTEGER_CST
+	  || !INTEGRAL_TYPE_P (type))
+	return;
+
+      /* Ensure that the condition speaks about an expression in the same
+	 type as X and Y.  */
+      ctype = TREE_TYPE (c0);
+      if (TYPE_PRECISION (ctype) != TYPE_PRECISION (type))
+	return;
+      c0 = fold_convert (type, c0);
+      c1 = fold_convert (type, c1);
+
+      if (operand_equal_p (var, c0, 0))
+	{
+	  mpz_t valc1;
+
+	  /* Case of comparing VAR with its below/up bounds.  */
+	  mpz_init (valc1);
+	  wi::to_mpz (c1, valc1, TYPE_SIGN (type));
+	  if (mpz_cmp (valc1, below) == 0)
+	    cmp = GT_EXPR;
+	  if (mpz_cmp (valc1, up) == 0)
+	    cmp = LT_EXPR;
+
+	  mpz_clear (valc1);
+	}
+      else
+	{
+	  /* Case of comparing with the bounds of the type.  */
+	  if (TYPE_MIN_VALUE (type)
+	      && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
+	    cmp = GT_EXPR;
+	  if (TYPE_MAX_VALUE (type)
+	      && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
+	    cmp = LT_EXPR;
+	}
+
+      /* Quick return if no useful information.  */
+      if (cmp == NE_EXPR)
+	return;
+
+      break;
+
+    default:
+      return;
+    }
+
+  mpz_init (offc0);
+  mpz_init (offc1);
+  split_to_var_and_offset (expand_simple_operations (c0), &varc0, offc0);
+  split_to_var_and_offset (expand_simple_operations (c1), &varc1, offc1);
+
+  /* We are only interested in comparisons of expressions based on VAR.  */
+  if (operand_equal_p (var, varc1, 0))
+    {
+      std::swap (varc0, varc1);
+      mpz_swap (offc0, offc1);
+      cmp = swap_tree_comparison (cmp);
+    }
+  else if (!operand_equal_p (var, varc0, 0))
+    goto end_2;
+
+  mpz_init (mint);
+  mpz_init (maxt);
+  get_type_static_bounds (type, mint, maxt);
+  mpz_init (minc1);
+  mpz_init (maxc1);
+  /* Setup range information for varc1.  */
+  if (integer_zerop (varc1))
+    {
+      wi::to_mpz (integer_zero_node, minc1, TYPE_SIGN (type));
+      wi::to_mpz (integer_zero_node, maxc1, TYPE_SIGN (type));
+    }
+  else if (TREE_CODE (varc1) == SSA_NAME
+	   && INTEGRAL_TYPE_P (type)
+	   && get_range_info (varc1, &minv, &maxv) == VR_RANGE)
+    {
+      gcc_assert (wi::le_p (minv, maxv, sgn));
+      wi::to_mpz (minv, minc1, sgn);
+      wi::to_mpz (maxv, maxc1, sgn);
+    }
+  else
+    {
+      mpz_set (minc1, mint);
+      mpz_set (maxc1, maxt);
+    }
+
+  /* Compute valid range information for varc1 + offc1.  Note nothing
+     useful can be derived if it overflows or underflows.  Overflow or
+     underflow could happen when:
+
+       offc1 > 0 && varc1 + offc1 > MAX_VAL (type)
+       offc1 < 0 && varc1 + offc1 < MIN_VAL (type).  */
+  mpz_add (minc1, minc1, offc1);
+  mpz_add (maxc1, maxc1, offc1);
+  c1_ok = (no_wrap
+	   || mpz_sgn (offc1) == 0
+	   || (mpz_sgn (offc1) < 0 && mpz_cmp (minc1, mint) >= 0)
+	   || (mpz_sgn (offc1) > 0 && mpz_cmp (maxc1, maxt) <= 0));
+  if (!c1_ok)
+    goto end_1;
+
+  if (mpz_cmp (minc1, mint) < 0)
+    mpz_set (minc1, mint);
+  if (mpz_cmp (maxc1, maxt) > 0)
+    mpz_set (maxc1, maxt);
+
+  if (cmp == LT_EXPR)
+    {
+      cmp = LE_EXPR;
+      mpz_sub_ui (maxc1, maxc1, 1);
+    }
+  if (cmp == GT_EXPR)
+    {
+      cmp = GE_EXPR;
+      mpz_add_ui (minc1, minc1, 1);
+    }
+
+  /* Compute range information for varc0.  If there is no overflow,
+     the condition implied that
+
+       (varc0) cmp (varc1 + offc1 - offc0)
+
+     We can possibly improve the upper bound of varc0 if cmp is LE_EXPR,
+     or the below bound if cmp is GE_EXPR.
+
+     To prove there is no overflow/underflow, we need to check below
+     four cases:
+       1) cmp == LE_EXPR && offc0 > 0
+
+	    (varc0 + offc0) doesn't overflow
+	    && (varc1 + offc1 - offc0) doesn't underflow
+
+       2) cmp == LE_EXPR && offc0 < 0
+
+	    (varc0 + offc0) doesn't underflow
+	    && (varc1 + offc1 - offc0) doesn't overfloe
+
+	  In this case, (varc0 + offc0) will never underflow if we can
+	  prove (varc1 + offc1 - offc0) doesn't overflow.
+
+       3) cmp == GE_EXPR && offc0 < 0
+
+	    (varc0 + offc0) doesn't underflow
+	    && (varc1 + offc1 - offc0) doesn't overflow
+
+       4) cmp == GE_EXPR && offc0 > 0
+
+	    (varc0 + offc0) doesn't overflow
+	    && (varc1 + offc1 - offc0) doesn't underflow
+
+	  In this case, (varc0 + offc0) will never overflow if we can
+	  prove (varc1 + offc1 - offc0) doesn't underflow.
+
+     Note we only handle case 2 and 4 in below code.  */
+
+  mpz_sub (minc1, minc1, offc0);
+  mpz_sub (maxc1, maxc1, offc0);
+  c0_ok = (no_wrap
+	   || mpz_sgn (offc0) == 0
+	   || (cmp == LE_EXPR
+	       && mpz_sgn (offc0) < 0 && mpz_cmp (maxc1, maxt) <= 0)
+	   || (cmp == GE_EXPR
+	       && mpz_sgn (offc0) > 0 && mpz_cmp (minc1, mint) >= 0));
+  if (!c0_ok)
+    goto end_1;
+
+  if (cmp == LE_EXPR)
+    {
+      if (mpz_cmp (up, maxc1) > 0)
+	mpz_set (up, maxc1);
+    }
+  else
+    {
+      if (mpz_cmp (below, minc1) < 0)
+	mpz_set (below, minc1);
+    }
+
+end_1:
+  mpz_clear (mint);
+  mpz_clear (maxt);
+  mpz_clear (minc1);
+  mpz_clear (maxc1);
+end_2:
+  mpz_clear (offc0);
+  mpz_clear (offc1);
+}
+
 /* Stores estimate on the minimum/maximum value of the expression VAR + OFF
    in TYPE to MIN and MAX.  */
 
@@ -129,6 +356,9 @@ static void
 determine_value_range (struct loop *loop, tree type, tree var, mpz_t off,
 		       mpz_t min, mpz_t max)
 {
+  int cnt = 0;
+  mpz_t minm, maxm;
+  basic_block bb;
   wide_int minv, maxv;
   enum value_range_type rtype = VR_VARYING;
 
@@ -183,35 +413,69 @@ determine_value_range (struct loop *loop, tree typ
 		}
 	    }
 	}
-      if (rtype == VR_RANGE)
+      mpz_init (minm);
+      mpz_init (maxm);
+      if (rtype != VR_RANGE)
 	{
-	  mpz_t minm, maxm;
+	  mpz_set (minm, min);
+	  mpz_set (maxm, max);
+	}
+      else
+	{
 	  gcc_assert (wi::le_p (minv, maxv, sgn));
-	  mpz_init (minm);
-	  mpz_init (maxm);
 	  wi::to_mpz (minv, minm, sgn);
 	  wi::to_mpz (maxv, maxm, sgn);
-	  mpz_add (minm, minm, off);
-	  mpz_add (maxm, maxm, off);
-	  /* If the computation may not wrap or off is zero, then this
-	     is always fine.  If off is negative and minv + off isn't
-	     smaller than type's minimum, or off is positive and
-	     maxv + off isn't bigger than type's maximum, use the more
-	     precise range too.  */
-	  if (nowrap_type_p (type)
-	      || mpz_sgn (off) == 0
-	      || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0)
-	      || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0))
-	    {
-	      mpz_set (min, minm);
-	      mpz_set (max, maxm);
-	      mpz_clear (minm);
-	      mpz_clear (maxm);
-	      return;
-	    }
+	}
+      /* Now walk the dominators of the loop header and use the entry
+	 guards to refine the estimates.  */
+      for (bb = loop->header;
+	   bb != ENTRY_BLOCK_PTR_FOR_FN (cfun) && cnt < MAX_DOMINATORS_TO_WALK;
+	   bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+	{
+	  edge e;
+	  tree c0, c1;
+	  gimple cond;
+	  enum tree_code cmp;
+
+	  if (!single_pred_p (bb))
+	    continue;
+	  e = single_pred_edge (bb);
+
+	  if (!(e->flags & (EDGE_TRUE_VALUE | EDGE_FALSE_VALUE)))
+	    continue;
+
+	  cond = last_stmt (e->src);
+	  c0 = gimple_cond_lhs (cond);
+	  cmp = gimple_cond_code (cond);
+	  c1 = gimple_cond_rhs (cond);
+
+	  if (e->flags & EDGE_FALSE_VALUE)
+	    cmp = invert_tree_comparison (cmp, false);
+
+	  refine_value_range_using_guard (type, var, c0, cmp, c1, minm, maxm);
+	  ++cnt;
+	}
+
+      mpz_add (minm, minm, off);
+      mpz_add (maxm, maxm, off);
+      /* If the computation may not wrap or off is zero, then this
+	 is always fine.  If off is negative and minv + off isn't
+	 smaller than type's minimum, or off is positive and
+	 maxv + off isn't bigger than type's maximum, use the more
+	 precise range too.  */
+      if (nowrap_type_p (type)
+	  || mpz_sgn (off) == 0
+	  || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0)
+	  || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0))
+	{
+	  mpz_set (min, minm);
+	  mpz_set (max, maxm);
 	  mpz_clear (minm);
 	  mpz_clear (maxm);
+	  return;
 	}
+      mpz_clear (minm);
+      mpz_clear (maxm);
     }
 
   /* If the computation may wrap, we know nothing about the value, except for

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-07-28 10:11 [PATCH GCC]Improve bound information in loop niter analysis Bin Cheng
@ 2015-08-13  8:27 ` Bin.Cheng
  2015-08-13 22:10 ` Jeff Law
  2015-08-14  8:28 ` Richard Biener
  2 siblings, 0 replies; 16+ messages in thread
From: Bin.Cheng @ 2015-08-13  8:27 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches List

Ping.

Thanks,
bin

On Tue, Jul 28, 2015 at 5:36 PM, Bin Cheng <bin.cheng@arm.com> wrote:
> Hi,
> Loop niter computes inaccurate bound information for different loops.  This
> patch is to improve it by using loop initial condition in
> determine_value_range.  Generally, loop niter is computed by subtracting
> start var from end var in loop exit condition.  Moreover, loop bound is
> computed using value range information of both start and end variables.
> Basic idea of this patch is to check if loop initial condition implies more
> range information for both start/end variables.  If yes, we refine range
> information and use that to compute loop bound.
> With this improvement, more accurate loop bound information is computed for
> test cases added by this patch.
>
> Is it OK?
>
> Thanks,
> bin
>
> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>
>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>         (determine_value_range): Call refine_value_range_using_guard for
>         each loop initial condition to improve value range.
>
> gcc/testsuite/ChangeLog
> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>
>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-07-28 10:11 [PATCH GCC]Improve bound information in loop niter analysis Bin Cheng
  2015-08-13  8:27 ` Bin.Cheng
@ 2015-08-13 22:10 ` Jeff Law
  2015-08-14  3:13   ` Bin.Cheng
  2015-08-14  8:28 ` Richard Biener
  2 siblings, 1 reply; 16+ messages in thread
From: Jeff Law @ 2015-08-13 22:10 UTC (permalink / raw)
  To: Bin Cheng, gcc-patches

On 07/28/2015 03:36 AM, Bin Cheng wrote:
> Hi,
> Loop niter computes inaccurate bound information for different loops.  This
> patch is to improve it by using loop initial condition in
> determine_value_range.  Generally, loop niter is computed by subtracting
> start var from end var in loop exit condition.  Moreover, loop bound is
> computed using value range information of both start and end variables.
> Basic idea of this patch is to check if loop initial condition implies more
> range information for both start/end variables.  If yes, we refine range
> information and use that to compute loop bound.
> With this improvement, more accurate loop bound information is computed for
> test cases added by this patch.
>
> Is it OK?
>
> Thanks,
> bin
>
> 2015-07-28  Bin Cheng<bin.cheng@arm.com>
>
> 	* tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
> 	(determine_value_range): Call refine_value_range_using_guard for
> 	each loop initial condition to improve value range.
>
> gcc/testsuite/ChangeLog
> 2015-07-28  Bin Cheng<bin.cheng@arm.com>
>
> 	* gcc.dg/tree-ssa/loop-bound-1.c: New test.
> 	* gcc.dg/tree-ssa/loop-bound-3.c: New test.
> 	* gcc.dg/tree-ssa/loop-bound-5.c: New test.
>
>
> improve-loop-bound-analysis-20150728.txt
>
>
> Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c	(revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c	(revision 0)
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
> +
> +int *a;
> +
> +int
> +foo (unsigned char s, unsigned char l)
> +{
> +  unsigned char i;
> +  int sum = 0;
> +
> +  for (i = s; i > l; i -= 1)
So is this really bounded by 254 iterations?  ISTM it's bounded by 255 
iterations when called with s = 255, l = 0.   What am I missing here? 
Am I mis-interpreting the dump output in some way?

Similarly for the other tests.

Jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-13 22:10 ` Jeff Law
@ 2015-08-14  3:13   ` Bin.Cheng
  2015-08-14 18:32     ` Jeff Law
  0 siblings, 1 reply; 16+ messages in thread
From: Bin.Cheng @ 2015-08-14  3:13 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bin Cheng, gcc-patches List

On Fri, Aug 14, 2015 at 6:08 AM, Jeff Law <law@redhat.com> wrote:
> On 07/28/2015 03:36 AM, Bin Cheng wrote:
>>
>> Hi,
>> Loop niter computes inaccurate bound information for different loops.
>> This
>> patch is to improve it by using loop initial condition in
>> determine_value_range.  Generally, loop niter is computed by subtracting
>> start var from end var in loop exit condition.  Moreover, loop bound is
>> computed using value range information of both start and end variables.
>> Basic idea of this patch is to check if loop initial condition implies
>> more
>> range information for both start/end variables.  If yes, we refine range
>> information and use that to compute loop bound.
>> With this improvement, more accurate loop bound information is computed
>> for
>> test cases added by this patch.
>>
>> Is it OK?
>>
>> Thanks,
>> bin
>>
>> 2015-07-28  Bin Cheng<bin.cheng@arm.com>
>>
>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>         (determine_value_range): Call refine_value_range_using_guard for
>>         each loop initial condition to improve value range.
>>
>> gcc/testsuite/ChangeLog
>> 2015-07-28  Bin Cheng<bin.cheng@arm.com>
>>
>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.
>>
>>
>> improve-loop-bound-analysis-20150728.txt
>>
>>
>> Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c
>> ===================================================================
>> --- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c        (revision 0)
>> +++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c        (revision 0)
>> @@ -0,0 +1,22 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
>> +
>> +int *a;
>> +
>> +int
>> +foo (unsigned char s, unsigned char l)
>> +{
>> +  unsigned char i;
>> +  int sum = 0;
>> +
>> +  for (i = s; i > l; i -= 1)
>
> So is this really bounded by 254 iterations?  ISTM it's bounded by 255
> iterations when called with s = 255, l = 0.   What am I missing here? Am I
> mis-interpreting the dump output in some way?

Thanks for the comment.
IIUC, the niter information in struct tree_niter_desc means the number
of executions of the latch of the loop, so it's 254, rather than 255.
And same the bound information.  Of course, statements in the loop
body could execute bound+1 times depending on its position in loop.
For example, struct nb_iter_bound is to describe number of iterations
of statements(exit conditions).
Moreover, if we modify the test as in below:

>> +int *a;
>> +
>> +int
>> +foo (void)
>> +{
>> +  unsigned char i;
>> +  int sum = 0;
>> +
>> +  for (i = 255; i > 0; i -= 1)

GCC now successfully analyzes the loop's bound as 254.

I might be wrong about the code, so please correct me.

Thanks,
bin
>
> Similarly for the other tests.
>
> Jeff
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-07-28 10:11 [PATCH GCC]Improve bound information in loop niter analysis Bin Cheng
  2015-08-13  8:27 ` Bin.Cheng
  2015-08-13 22:10 ` Jeff Law
@ 2015-08-14  8:28 ` Richard Biener
  2015-08-14 18:47   ` Jeff Law
  2015-08-17 10:06   ` Bin.Cheng
  2 siblings, 2 replies; 16+ messages in thread
From: Richard Biener @ 2015-08-14  8:28 UTC (permalink / raw)
  To: Bin Cheng; +Cc: GCC Patches

On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
> Hi,
> Loop niter computes inaccurate bound information for different loops.  This
> patch is to improve it by using loop initial condition in
> determine_value_range.  Generally, loop niter is computed by subtracting
> start var from end var in loop exit condition.  Moreover, loop bound is
> computed using value range information of both start and end variables.
> Basic idea of this patch is to check if loop initial condition implies more
> range information for both start/end variables.  If yes, we refine range
> information and use that to compute loop bound.
> With this improvement, more accurate loop bound information is computed for
> test cases added by this patch.

+      c0 = fold_convert (type, c0);
+      c1 = fold_convert (type, c1);
+
+      if (operand_equal_p (var, c0, 0))

I believe if c0 is not already of type type operand-equal_p will never succeed.

(side-note: we should get rid of the GMP use, that's expensive and now we
have wide-int available which should do the trick as well)

+         /* Case of comparing with the bounds of the type.  */
+         if (TYPE_MIN_VALUE (type)
+             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
+           cmp = GT_EXPR;
+         if (TYPE_MAX_VALUE (type)
+             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
+           cmp = LT_EXPR;

don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision
and all wide_int operations (see match.pd wi::max_value use).

+  else if (!operand_equal_p (var, varc0, 0))
+    goto end_2;

ick - goto.  We need sth like a auto_mpz class with a destructor.

struct auto_mpz
{
  auto_mpz () { mpz_init (m_val); }
  ~auto_mpz () { mpz_clear (m_val); }
  mpz& operator() { return m_val; }
  mpz m_val;
};

> Is it OK?

I see the code follows existing practice in niter analysis even though
my overall plan was to transition its copying of value-range related
optimizations to use VRP infrastructure.

I'm still ok with improving the existing code on the basis that I won't
get to that for GCC 6.

So - ok with the TYPE_MIN/MAX_VALUE change suggested above.

Refactoring with auto_mpz welcome.

Thanks,
RIchard.

> Thanks,
> bin
>
> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>
>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>         (determine_value_range): Call refine_value_range_using_guard for
>         each loop initial condition to improve value range.
>
> gcc/testsuite/ChangeLog
> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>
>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-14  3:13   ` Bin.Cheng
@ 2015-08-14 18:32     ` Jeff Law
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Law @ 2015-08-14 18:32 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Bin Cheng, gcc-patches List

On 08/13/2015 09:06 PM, Bin.Cheng wrote:
>
> Thanks for the comment.
> IIUC, the niter information in struct tree_niter_desc means the number
> of executions of the latch of the loop, so it's 254, rather than 255.
> And same the bound information.  Of course, statements in the loop
> body could execute bound+1 times depending on its position in loop.
> For example, struct nb_iter_bound is to describe number of iterations
> of statements(exit conditions).
Ah, if we go to tree_niter_desc, niter is defined as the # executions of 
the loop latch.  That's what I was missing.  Now back to the real review 
work :-)

jeff


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-14  8:28 ` Richard Biener
@ 2015-08-14 18:47   ` Jeff Law
  2015-08-17 10:06   ` Bin.Cheng
  1 sibling, 0 replies; 16+ messages in thread
From: Jeff Law @ 2015-08-14 18:47 UTC (permalink / raw)
  To: Richard Biener, Bin Cheng; +Cc: GCC Patches

On 08/14/2015 02:17 AM, Richard Biener wrote:
> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>> Hi,
>> Loop niter computes inaccurate bound information for different loops.  This
>> patch is to improve it by using loop initial condition in
>> determine_value_range.  Generally, loop niter is computed by subtracting
>> start var from end var in loop exit condition.  Moreover, loop bound is
>> computed using value range information of both start and end variables.
>> Basic idea of this patch is to check if loop initial condition implies more
>> range information for both start/end variables.  If yes, we refine range
>> information and use that to compute loop bound.
>> With this improvement, more accurate loop bound information is computed for
>> test cases added by this patch.
>
> +      c0 = fold_convert (type, c0);
> +      c1 = fold_convert (type, c1);
> +
> +      if (operand_equal_p (var, c0, 0))
>
> I believe if c0 is not already of type type operand-equal_p will never succeed.
It's a bit looser than that.  Given two user defined types with the same 
signedness & precision operand_equal_p will consider the types OK.  One 
could argue that it ought to be using the useless type conversion check.

jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-14  8:28 ` Richard Biener
  2015-08-14 18:47   ` Jeff Law
@ 2015-08-17 10:06   ` Bin.Cheng
  2015-08-17 11:16     ` Ajit Kumar Agarwal
  2015-08-18 18:38     ` Jeff Law
  1 sibling, 2 replies; 16+ messages in thread
From: Bin.Cheng @ 2015-08-17 10:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: Bin Cheng, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 3686 bytes --]

Thanks for all your reviews.

On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>> Hi,
>> Loop niter computes inaccurate bound information for different loops.  This
>> patch is to improve it by using loop initial condition in
>> determine_value_range.  Generally, loop niter is computed by subtracting
>> start var from end var in loop exit condition.  Moreover, loop bound is
>> computed using value range information of both start and end variables.
>> Basic idea of this patch is to check if loop initial condition implies more
>> range information for both start/end variables.  If yes, we refine range
>> information and use that to compute loop bound.
>> With this improvement, more accurate loop bound information is computed for
>> test cases added by this patch.
>
> +      c0 = fold_convert (type, c0);
> +      c1 = fold_convert (type, c1);
> +
> +      if (operand_equal_p (var, c0, 0))
>
> I believe if c0 is not already of type type operand-equal_p will never succeed.
It's quite specific case targeting comparison between var and it's
range bounds.  Given c0 is in form of "var + offc0", then the
comparison "var + offc0 != range bounds" doesn't have any useful
information.  Maybe useless type conversion can be handled here
though, it might be even corner case.

>
> (side-note: we should get rid of the GMP use, that's expensive and now we
> have wide-int available which should do the trick as well)
>
> +         /* Case of comparing with the bounds of the type.  */
> +         if (TYPE_MIN_VALUE (type)
> +             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
> +           cmp = GT_EXPR;
> +         if (TYPE_MAX_VALUE (type)
> +             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
> +           cmp = LT_EXPR;
>
> don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision
> and all wide_int operations (see match.pd wi::max_value use).
Done.

>
> +  else if (!operand_equal_p (var, varc0, 0))
> +    goto end_2;
>
> ick - goto.  We need sth like a auto_mpz class with a destructor.
Label end_2 removed.

>
> struct auto_mpz
> {
>   auto_mpz () { mpz_init (m_val); }
>   ~auto_mpz () { mpz_clear (m_val); }
>   mpz& operator() { return m_val; }
>   mpz m_val;
> };
>
>> Is it OK?
>
> I see the code follows existing practice in niter analysis even though
> my overall plan was to transition its copying of value-range related
> optimizations to use VRP infrastructure.
Yes, I think it's easy to push it to VRP infrastructure.  Actually
from the name of the function, it's more vrp related.  For now, the
function is called only by bound_difference, not so many as vrp
queries.  We need cache facility in vrp otherwise it would be
expensive.

>
> I'm still ok with improving the existing code on the basis that I won't
> get to that for GCC 6.
>
> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>
> Refactoring with auto_mpz welcome.
That will be an independent patch, so I skipped it in this one.

New version attached.  Bootstrap and test on x86_64.

Thanks,
bin
>
> Thanks,
> RIchard.
>
>> Thanks,
>> bin
>>
>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>         (determine_value_range): Call refine_value_range_using_guard for
>>         each loop initial condition to improve value range.
>>
>> gcc/testsuite/ChangeLog
>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

[-- Attachment #2: improve-loop-bound-analysis-20150817.txt --]
[-- Type: text/plain, Size: 11632 bytes --]

Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-3.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s, unsigned char l)
+{
+  unsigned char i;
+  int sum = 0;
+
+  for (i = s; i > l; i -= 1)
+    {
+      sum += a[i];
+    }
+
+  return sum;
+}
+
+/* Check loop niter bound information.  */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-5.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s)
+{
+  unsigned char i;
+  int sum = 0;
+
+  for (i = s; i > 0; i -= 1)
+    {
+      sum += a[i];
+    }
+
+  return sum;
+}
+
+/* Check loop niter bound information.  */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/loop-bound-1.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+int *a;
+
+int
+foo (unsigned char s, unsigned char l)
+{
+  unsigned char i;
+  int sum = 0;
+
+  for (i = s; i < l; i += 1)
+    {
+      sum += a[i];
+    }
+
+  return sum;
+}
+
+/* Check loop niter bound information.  */
+/* { dg-final { scan-tree-dump "bounded by 254" "ivopts" } } */
+/* { dg-final { scan-tree-dump-not "bounded by 255" "ivopts" } } */
Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c	(revision 225916)
+++ gcc/tree-ssa-loop-niter.c	(working copy)
@@ -122,6 +122,237 @@ split_to_var_and_offset (tree expr, tree *var, mpz
     }
 }
 
+/* From condition C0 CMP C1 derives information regarding the value range
+   of VAR, which is of TYPE.  Results are stored in to BELOW and UP.  */
+
+static void
+refine_value_range_using_guard (tree type, tree var,
+				tree c0, enum tree_code cmp, tree c1,
+				mpz_t below, mpz_t up)
+{
+  tree varc0, varc1, ctype;
+  mpz_t offc0, offc1;
+  mpz_t mint, maxt, minc1, maxc1;
+  wide_int minv, maxv;
+  bool no_wrap = nowrap_type_p (type);
+  bool c0_ok, c1_ok;
+  signop sgn = TYPE_SIGN (type);
+
+  switch (cmp)
+    {
+    case LT_EXPR:
+    case LE_EXPR:
+    case GT_EXPR:
+    case GE_EXPR:
+      STRIP_SIGN_NOPS (c0);
+      STRIP_SIGN_NOPS (c1);
+      ctype = TREE_TYPE (c0);
+      if (!useless_type_conversion_p (ctype, type))
+	return;
+
+      break;
+
+    case EQ_EXPR:
+      /* We could derive quite precise information from EQ_EXPR, however,
+	 such a guard is unlikely to appear, so we do not bother with
+	 handling it.  */
+      return;
+
+    case NE_EXPR:
+      /* NE_EXPR comparisons do not contain much of useful information,
+	 except for cases of comparing with bounds.  */
+      if (TREE_CODE (c1) != INTEGER_CST
+	  || !INTEGRAL_TYPE_P (type))
+	return;
+
+      /* Ensure that the condition speaks about an expression in the same
+	 type as X and Y.  */
+      ctype = TREE_TYPE (c0);
+      if (TYPE_PRECISION (ctype) != TYPE_PRECISION (type))
+	return;
+      c0 = fold_convert (type, c0);
+      c1 = fold_convert (type, c1);
+
+      if (operand_equal_p (var, c0, 0))
+	{
+	  mpz_t valc1;
+
+	  /* Case of comparing VAR with its below/up bounds.  */
+	  mpz_init (valc1);
+	  wi::to_mpz (c1, valc1, TYPE_SIGN (type));
+	  if (mpz_cmp (valc1, below) == 0)
+	    cmp = GT_EXPR;
+	  if (mpz_cmp (valc1, up) == 0)
+	    cmp = LT_EXPR;
+
+	  mpz_clear (valc1);
+	}
+      else
+	{
+	  /* Case of comparing with the bounds of the type.  */
+	  wide_int min = wi::min_value (type);
+	  wide_int max = wi::max_value (type);
+
+	  if (wi::eq_p (c1, min))
+	    cmp = GT_EXPR;
+	  if (wi::eq_p (c1, max))
+	    cmp = LT_EXPR;
+	}
+
+      /* Quick return if no useful information.  */
+      if (cmp == NE_EXPR)
+	return;
+
+      break;
+
+    default:
+      return;
+    }
+
+  mpz_init (offc0);
+  mpz_init (offc1);
+  split_to_var_and_offset (expand_simple_operations (c0), &varc0, offc0);
+  split_to_var_and_offset (expand_simple_operations (c1), &varc1, offc1);
+
+  /* We are only interested in comparisons of expressions based on VAR.  */
+  if (operand_equal_p (var, varc1, 0))
+    {
+      std::swap (varc0, varc1);
+      mpz_swap (offc0, offc1);
+      cmp = swap_tree_comparison (cmp);
+    }
+  else if (!operand_equal_p (var, varc0, 0))
+    {
+      mpz_clear (offc0);
+      mpz_clear (offc1);
+      return;
+    }
+
+  mpz_init (mint);
+  mpz_init (maxt);
+  get_type_static_bounds (type, mint, maxt);
+  mpz_init (minc1);
+  mpz_init (maxc1);
+  /* Setup range information for varc1.  */
+  if (integer_zerop (varc1))
+    {
+      wi::to_mpz (integer_zero_node, minc1, TYPE_SIGN (type));
+      wi::to_mpz (integer_zero_node, maxc1, TYPE_SIGN (type));
+    }
+  else if (TREE_CODE (varc1) == SSA_NAME
+	   && INTEGRAL_TYPE_P (type)
+	   && get_range_info (varc1, &minv, &maxv) == VR_RANGE)
+    {
+      gcc_assert (wi::le_p (minv, maxv, sgn));
+      wi::to_mpz (minv, minc1, sgn);
+      wi::to_mpz (maxv, maxc1, sgn);
+    }
+  else
+    {
+      mpz_set (minc1, mint);
+      mpz_set (maxc1, maxt);
+    }
+
+  /* Compute valid range information for varc1 + offc1.  Note nothing
+     useful can be derived if it overflows or underflows.  Overflow or
+     underflow could happen when:
+
+       offc1 > 0 && varc1 + offc1 > MAX_VAL (type)
+       offc1 < 0 && varc1 + offc1 < MIN_VAL (type).  */
+  mpz_add (minc1, minc1, offc1);
+  mpz_add (maxc1, maxc1, offc1);
+  c1_ok = (no_wrap
+	   || mpz_sgn (offc1) == 0
+	   || (mpz_sgn (offc1) < 0 && mpz_cmp (minc1, mint) >= 0)
+	   || (mpz_sgn (offc1) > 0 && mpz_cmp (maxc1, maxt) <= 0));
+  if (!c1_ok)
+    goto end;
+
+  if (mpz_cmp (minc1, mint) < 0)
+    mpz_set (minc1, mint);
+  if (mpz_cmp (maxc1, maxt) > 0)
+    mpz_set (maxc1, maxt);
+
+  if (cmp == LT_EXPR)
+    {
+      cmp = LE_EXPR;
+      mpz_sub_ui (maxc1, maxc1, 1);
+    }
+  if (cmp == GT_EXPR)
+    {
+      cmp = GE_EXPR;
+      mpz_add_ui (minc1, minc1, 1);
+    }
+
+  /* Compute range information for varc0.  If there is no overflow,
+     the condition implied that
+
+       (varc0) cmp (varc1 + offc1 - offc0)
+
+     We can possibly improve the upper bound of varc0 if cmp is LE_EXPR,
+     or the below bound if cmp is GE_EXPR.
+
+     To prove there is no overflow/underflow, we need to check below
+     four cases:
+       1) cmp == LE_EXPR && offc0 > 0
+
+	    (varc0 + offc0) doesn't overflow
+	    && (varc1 + offc1 - offc0) doesn't underflow
+
+       2) cmp == LE_EXPR && offc0 < 0
+
+	    (varc0 + offc0) doesn't underflow
+	    && (varc1 + offc1 - offc0) doesn't overfloe
+
+	  In this case, (varc0 + offc0) will never underflow if we can
+	  prove (varc1 + offc1 - offc0) doesn't overflow.
+
+       3) cmp == GE_EXPR && offc0 < 0
+
+	    (varc0 + offc0) doesn't underflow
+	    && (varc1 + offc1 - offc0) doesn't overflow
+
+       4) cmp == GE_EXPR && offc0 > 0
+
+	    (varc0 + offc0) doesn't overflow
+	    && (varc1 + offc1 - offc0) doesn't underflow
+
+	  In this case, (varc0 + offc0) will never overflow if we can
+	  prove (varc1 + offc1 - offc0) doesn't underflow.
+
+     Note we only handle case 2 and 4 in below code.  */
+
+  mpz_sub (minc1, minc1, offc0);
+  mpz_sub (maxc1, maxc1, offc0);
+  c0_ok = (no_wrap
+	   || mpz_sgn (offc0) == 0
+	   || (cmp == LE_EXPR
+	       && mpz_sgn (offc0) < 0 && mpz_cmp (maxc1, maxt) <= 0)
+	   || (cmp == GE_EXPR
+	       && mpz_sgn (offc0) > 0 && mpz_cmp (minc1, mint) >= 0));
+  if (!c0_ok)
+    goto end;
+
+  if (cmp == LE_EXPR)
+    {
+      if (mpz_cmp (up, maxc1) > 0)
+	mpz_set (up, maxc1);
+    }
+  else
+    {
+      if (mpz_cmp (below, minc1) < 0)
+	mpz_set (below, minc1);
+    }
+
+end:
+  mpz_clear (mint);
+  mpz_clear (maxt);
+  mpz_clear (minc1);
+  mpz_clear (maxc1);
+  mpz_clear (offc0);
+  mpz_clear (offc1);
+}
+
 /* Stores estimate on the minimum/maximum value of the expression VAR + OFF
    in TYPE to MIN and MAX.  */
 
@@ -129,6 +360,9 @@ static void
 determine_value_range (struct loop *loop, tree type, tree var, mpz_t off,
 		       mpz_t min, mpz_t max)
 {
+  int cnt = 0;
+  mpz_t minm, maxm;
+  basic_block bb;
   wide_int minv, maxv;
   enum value_range_type rtype = VR_VARYING;
 
@@ -183,35 +417,69 @@ determine_value_range (struct loop *loop, tree typ
 		}
 	    }
 	}
-      if (rtype == VR_RANGE)
+      mpz_init (minm);
+      mpz_init (maxm);
+      if (rtype != VR_RANGE)
 	{
-	  mpz_t minm, maxm;
+	  mpz_set (minm, min);
+	  mpz_set (maxm, max);
+	}
+      else
+	{
 	  gcc_assert (wi::le_p (minv, maxv, sgn));
-	  mpz_init (minm);
-	  mpz_init (maxm);
 	  wi::to_mpz (minv, minm, sgn);
 	  wi::to_mpz (maxv, maxm, sgn);
-	  mpz_add (minm, minm, off);
-	  mpz_add (maxm, maxm, off);
-	  /* If the computation may not wrap or off is zero, then this
-	     is always fine.  If off is negative and minv + off isn't
-	     smaller than type's minimum, or off is positive and
-	     maxv + off isn't bigger than type's maximum, use the more
-	     precise range too.  */
-	  if (nowrap_type_p (type)
-	      || mpz_sgn (off) == 0
-	      || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0)
-	      || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0))
-	    {
-	      mpz_set (min, minm);
-	      mpz_set (max, maxm);
-	      mpz_clear (minm);
-	      mpz_clear (maxm);
-	      return;
-	    }
+	}
+      /* Now walk the dominators of the loop header and use the entry
+	 guards to refine the estimates.  */
+      for (bb = loop->header;
+	   bb != ENTRY_BLOCK_PTR_FOR_FN (cfun) && cnt < MAX_DOMINATORS_TO_WALK;
+	   bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+	{
+	  edge e;
+	  tree c0, c1;
+	  gimple cond;
+	  enum tree_code cmp;
+
+	  if (!single_pred_p (bb))
+	    continue;
+	  e = single_pred_edge (bb);
+
+	  if (!(e->flags & (EDGE_TRUE_VALUE | EDGE_FALSE_VALUE)))
+	    continue;
+
+	  cond = last_stmt (e->src);
+	  c0 = gimple_cond_lhs (cond);
+	  cmp = gimple_cond_code (cond);
+	  c1 = gimple_cond_rhs (cond);
+
+	  if (e->flags & EDGE_FALSE_VALUE)
+	    cmp = invert_tree_comparison (cmp, false);
+
+	  refine_value_range_using_guard (type, var, c0, cmp, c1, minm, maxm);
+	  ++cnt;
+	}
+
+      mpz_add (minm, minm, off);
+      mpz_add (maxm, maxm, off);
+      /* If the computation may not wrap or off is zero, then this
+	 is always fine.  If off is negative and minv + off isn't
+	 smaller than type's minimum, or off is positive and
+	 maxv + off isn't bigger than type's maximum, use the more
+	 precise range too.  */
+      if (nowrap_type_p (type)
+	  || mpz_sgn (off) == 0
+	  || (mpz_sgn (off) < 0 && mpz_cmp (minm, min) >= 0)
+	  || (mpz_sgn (off) > 0 && mpz_cmp (maxm, max) <= 0))
+	{
+	  mpz_set (min, minm);
+	  mpz_set (max, maxm);
 	  mpz_clear (minm);
 	  mpz_clear (maxm);
+	  return;
 	}
+      mpz_clear (minm);
+      mpz_clear (maxm);
     }
 
   /* If the computation may wrap, we know nothing about the value, except for

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-17 10:06   ` Bin.Cheng
@ 2015-08-17 11:16     ` Ajit Kumar Agarwal
  2015-08-17 11:38       ` Ajit Kumar Agarwal
  2015-08-18  7:53       ` Bin.Cheng
  2015-08-18 18:38     ` Jeff Law
  1 sibling, 2 replies; 16+ messages in thread
From: Ajit Kumar Agarwal @ 2015-08-17 11:16 UTC (permalink / raw)
  To: Bin.Cheng, Richard Biener
  Cc: Bin Cheng, GCC Patches, Vinod Kathail, Shail Aditya Gupta,
	Vidhumouli Hunsigida, Nagaraju Mekala

All:

Does the Logic to calculate the Loop bound information through Value Range Analyis uses the post dominator and
Dominator info. The iteration branches instead of Loop exit condition can be calculated through post dominator info.
If the node in the Loop has two successors and post dominates the two successors then the iteration branch can be
The same node. 

For All the nodes L in the Loop B
If (L1, L2  belongs to successors of (L) && L1,L2 belongs to PosDom(Header of Loop))
{
  I = I union L1
}

Thus "I" will have all set of iteration branches. This will handle more cases of Loop bound information that 
Will be accurate through the exact iteration count that are known cases along with Value Range Information
Where the condition is instead not the Loop exits but other nodes in the Loop.

Thanks & Regards
Ajit
 

-----Original Message-----
From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Bin.Cheng
Sent: Monday, August 17, 2015 3:32 PM
To: Richard Biener
Cc: Bin Cheng; GCC Patches
Subject: Re: [PATCH GCC]Improve bound information in loop niter analysis

Thanks for all your reviews.

On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener <richard.guenther@gmail.com> wrote:
> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>> Hi,
>> Loop niter computes inaccurate bound information for different loops.  
>> This patch is to improve it by using loop initial condition in 
>> determine_value_range.  Generally, loop niter is computed by 
>> subtracting start var from end var in loop exit condition.  Moreover, 
>> loop bound is computed using value range information of both start and end variables.
>> Basic idea of this patch is to check if loop initial condition 
>> implies more range information for both start/end variables.  If yes, 
>> we refine range information and use that to compute loop bound.
>> With this improvement, more accurate loop bound information is 
>> computed for test cases added by this patch.
>
> +      c0 = fold_convert (type, c0);
> +      c1 = fold_convert (type, c1);
> +
> +      if (operand_equal_p (var, c0, 0))
>
> I believe if c0 is not already of type type operand-equal_p will never succeed.
It's quite specific case targeting comparison between var and it's range bounds.  Given c0 is in form of "var + offc0", then the comparison "var + offc0 != range bounds" doesn't have any useful information.  Maybe useless type conversion can be handled here though, it might be even corner case.

>
> (side-note: we should get rid of the GMP use, that's expensive and now 
> we have wide-int available which should do the trick as well)
>
> +         /* Case of comparing with the bounds of the type.  */
> +         if (TYPE_MIN_VALUE (type)
> +             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
> +           cmp = GT_EXPR;
> +         if (TYPE_MAX_VALUE (type)
> +             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
> +           cmp = LT_EXPR;
>
> don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision and all 
> wide_int operations (see match.pd wi::max_value use).
Done.

>
> +  else if (!operand_equal_p (var, varc0, 0))
> +    goto end_2;
>
> ick - goto.  We need sth like a auto_mpz class with a destructor.
Label end_2 removed.

>
> struct auto_mpz
> {
>   auto_mpz () { mpz_init (m_val); }
>   ~auto_mpz () { mpz_clear (m_val); }
>   mpz& operator() { return m_val; }
>   mpz m_val;
> };
>
>> Is it OK?
>
> I see the code follows existing practice in niter analysis even though 
> my overall plan was to transition its copying of value-range related 
> optimizations to use VRP infrastructure.
Yes, I think it's easy to push it to VRP infrastructure.  Actually from the name of the function, it's more vrp related.  For now, the function is called only by bound_difference, not so many as vrp queries.  We need cache facility in vrp otherwise it would be expensive.

>
> I'm still ok with improving the existing code on the basis that I 
> won't get to that for GCC 6.
>
> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>
> Refactoring with auto_mpz welcome.
That will be an independent patch, so I skipped it in this one.

New version attached.  Bootstrap and test on x86_64.

Thanks,
bin
>
> Thanks,
> RIchard.
>
>> Thanks,
>> bin
>>
>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>         (determine_value_range): Call refine_value_range_using_guard for
>>         each loop initial condition to improve value range.
>>
>> gcc/testsuite/ChangeLog
>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-17 11:16     ` Ajit Kumar Agarwal
@ 2015-08-17 11:38       ` Ajit Kumar Agarwal
  2015-08-18  7:53       ` Bin.Cheng
  1 sibling, 0 replies; 16+ messages in thread
From: Ajit Kumar Agarwal @ 2015-08-17 11:38 UTC (permalink / raw)
  To: Ajit Kumar Agarwal, Bin.Cheng, Richard Biener
  Cc: Bin Cheng, GCC Patches, Vinod Kathail, Shail Aditya Gupta,
	Vidhumouli Hunsigida, Nagaraju Mekala

Oops, there is a typo error instead of L it was typed as L1.
Here is the corrected one.

For All the nodes L in the Loop B
If (L1, L2  belongs to successors of (L) && L1,L2 belongs to PosDom(Header of Loop)) {
  I = I union L;
}

Thanks & Regards
Ajit
-----Original Message-----
From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal
Sent: Monday, August 17, 2015 4:19 PM
To: Bin.Cheng; Richard Biener
Cc: Bin Cheng; GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: RE: [PATCH GCC]Improve bound information in loop niter analysis

All:

Does the Logic to calculate the Loop bound information through Value Range Analyis uses the post dominator and Dominator info. The iteration branches instead of Loop exit condition can be calculated through post dominator info.
If the node in the Loop has two successors and post dominates the two successors then the iteration branch can be The same node. 

For All the nodes L in the Loop B
If (L1, L2  belongs to successors of (L) && L1,L2 belongs to PosDom(Header of Loop)) {
  I = I union L1
}

Thus "I" will have all set of iteration branches. This will handle more cases of Loop bound information that Will be accurate through the exact iteration count that are known cases along with Value Range Information Where the condition is instead not the Loop exits but other nodes in the Loop.

Thanks & Regards
Ajit
 

-----Original Message-----
From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Bin.Cheng
Sent: Monday, August 17, 2015 3:32 PM
To: Richard Biener
Cc: Bin Cheng; GCC Patches
Subject: Re: [PATCH GCC]Improve bound information in loop niter analysis

Thanks for all your reviews.

On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener <richard.guenther@gmail.com> wrote:
> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>> Hi,
>> Loop niter computes inaccurate bound information for different loops.  
>> This patch is to improve it by using loop initial condition in 
>> determine_value_range.  Generally, loop niter is computed by 
>> subtracting start var from end var in loop exit condition.  Moreover, 
>> loop bound is computed using value range information of both start and end variables.
>> Basic idea of this patch is to check if loop initial condition 
>> implies more range information for both start/end variables.  If yes, 
>> we refine range information and use that to compute loop bound.
>> With this improvement, more accurate loop bound information is 
>> computed for test cases added by this patch.
>
> +      c0 = fold_convert (type, c0);
> +      c1 = fold_convert (type, c1);
> +
> +      if (operand_equal_p (var, c0, 0))
>
> I believe if c0 is not already of type type operand-equal_p will never succeed.
It's quite specific case targeting comparison between var and it's range bounds.  Given c0 is in form of "var + offc0", then the comparison "var + offc0 != range bounds" doesn't have any useful information.  Maybe useless type conversion can be handled here though, it might be even corner case.

>
> (side-note: we should get rid of the GMP use, that's expensive and now 
> we have wide-int available which should do the trick as well)
>
> +         /* Case of comparing with the bounds of the type.  */
> +         if (TYPE_MIN_VALUE (type)
> +             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
> +           cmp = GT_EXPR;
> +         if (TYPE_MAX_VALUE (type)
> +             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
> +           cmp = LT_EXPR;
>
> don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision and all 
> wide_int operations (see match.pd wi::max_value use).
Done.

>
> +  else if (!operand_equal_p (var, varc0, 0))
> +    goto end_2;
>
> ick - goto.  We need sth like a auto_mpz class with a destructor.
Label end_2 removed.

>
> struct auto_mpz
> {
>   auto_mpz () { mpz_init (m_val); }
>   ~auto_mpz () { mpz_clear (m_val); }
>   mpz& operator() { return m_val; }
>   mpz m_val;
> };
>
>> Is it OK?
>
> I see the code follows existing practice in niter analysis even though 
> my overall plan was to transition its copying of value-range related 
> optimizations to use VRP infrastructure.
Yes, I think it's easy to push it to VRP infrastructure.  Actually from the name of the function, it's more vrp related.  For now, the function is called only by bound_difference, not so many as vrp queries.  We need cache facility in vrp otherwise it would be expensive.

>
> I'm still ok with improving the existing code on the basis that I 
> won't get to that for GCC 6.
>
> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>
> Refactoring with auto_mpz welcome.
That will be an independent patch, so I skipped it in this one.

New version attached.  Bootstrap and test on x86_64.

Thanks,
bin
>
> Thanks,
> RIchard.
>
>> Thanks,
>> bin
>>
>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>         (determine_value_range): Call refine_value_range_using_guard for
>>         each loop initial condition to improve value range.
>>
>> gcc/testsuite/ChangeLog
>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-17 11:16     ` Ajit Kumar Agarwal
  2015-08-17 11:38       ` Ajit Kumar Agarwal
@ 2015-08-18  7:53       ` Bin.Cheng
  2015-08-18  8:16         ` Ajit Kumar Agarwal
  1 sibling, 1 reply; 16+ messages in thread
From: Bin.Cheng @ 2015-08-18  7:53 UTC (permalink / raw)
  To: Ajit Kumar Agarwal
  Cc: Richard Biener, Bin Cheng, GCC Patches, Vinod Kathail,
	Shail Aditya Gupta, Vidhumouli Hunsigida, Nagaraju Mekala

On Mon, Aug 17, 2015 at 6:49 PM, Ajit Kumar Agarwal
<ajit.kumar.agarwal@xilinx.com> wrote:
> All:
>
> Does the Logic to calculate the Loop bound information through Value Range Analyis uses the post dominator and
> Dominator info. The iteration branches instead of Loop exit condition can be calculated through post dominator info.
> If the node in the Loop has two successors and post dominates the two successors then the iteration branch can be
> The same node.
>
> For All the nodes L in the Loop B
> If (L1, L2  belongs to successors of (L) && L1,L2 belongs to PosDom(Header of Loop))
> {
>   I = I union L1
> }
>
> Thus "I" will have all set of iteration branches. This will handle more cases of Loop bound information that
> Will be accurate through the exact iteration count that are known cases along with Value Range Information
> Where the condition is instead not the Loop exits but other nodes in the Loop.

I don't quite follow your words here.  Could you please give a simple
example about it?  Especially I don't know how post-dom helps the loop
bound analysis.  Seems your pseudo code is collecting some comparison
basic block of loop?

Thanks,
bin
>
> Thanks & Regards
> Ajit
>
>
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Bin.Cheng
> Sent: Monday, August 17, 2015 3:32 PM
> To: Richard Biener
> Cc: Bin Cheng; GCC Patches
> Subject: Re: [PATCH GCC]Improve bound information in loop niter analysis
>
> Thanks for all your reviews.
>
> On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>>> Hi,
>>> Loop niter computes inaccurate bound information for different loops.
>>> This patch is to improve it by using loop initial condition in
>>> determine_value_range.  Generally, loop niter is computed by
>>> subtracting start var from end var in loop exit condition.  Moreover,
>>> loop bound is computed using value range information of both start and end variables.
>>> Basic idea of this patch is to check if loop initial condition
>>> implies more range information for both start/end variables.  If yes,
>>> we refine range information and use that to compute loop bound.
>>> With this improvement, more accurate loop bound information is
>>> computed for test cases added by this patch.
>>
>> +      c0 = fold_convert (type, c0);
>> +      c1 = fold_convert (type, c1);
>> +
>> +      if (operand_equal_p (var, c0, 0))
>>
>> I believe if c0 is not already of type type operand-equal_p will never succeed.
> It's quite specific case targeting comparison between var and it's range bounds.  Given c0 is in form of "var + offc0", then the comparison "var + offc0 != range bounds" doesn't have any useful information.  Maybe useless type conversion can be handled here though, it might be even corner case.
>
>>
>> (side-note: we should get rid of the GMP use, that's expensive and now
>> we have wide-int available which should do the trick as well)
>>
>> +         /* Case of comparing with the bounds of the type.  */
>> +         if (TYPE_MIN_VALUE (type)
>> +             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
>> +           cmp = GT_EXPR;
>> +         if (TYPE_MAX_VALUE (type)
>> +             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
>> +           cmp = LT_EXPR;
>>
>> don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision and all
>> wide_int operations (see match.pd wi::max_value use).
> Done.
>
>>
>> +  else if (!operand_equal_p (var, varc0, 0))
>> +    goto end_2;
>>
>> ick - goto.  We need sth like a auto_mpz class with a destructor.
> Label end_2 removed.
>
>>
>> struct auto_mpz
>> {
>>   auto_mpz () { mpz_init (m_val); }
>>   ~auto_mpz () { mpz_clear (m_val); }
>>   mpz& operator() { return m_val; }
>>   mpz m_val;
>> };
>>
>>> Is it OK?
>>
>> I see the code follows existing practice in niter analysis even though
>> my overall plan was to transition its copying of value-range related
>> optimizations to use VRP infrastructure.
> Yes, I think it's easy to push it to VRP infrastructure.  Actually from the name of the function, it's more vrp related.  For now, the function is called only by bound_difference, not so many as vrp queries.  We need cache facility in vrp otherwise it would be expensive.
>
>>
>> I'm still ok with improving the existing code on the basis that I
>> won't get to that for GCC 6.
>>
>> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>>
>> Refactoring with auto_mpz welcome.
> That will be an independent patch, so I skipped it in this one.
>
> New version attached.  Bootstrap and test on x86_64.
>
> Thanks,
> bin
>>
>> Thanks,
>> RIchard.
>>
>>> Thanks,
>>> bin
>>>
>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>>         (determine_value_range): Call refine_value_range_using_guard for
>>>         each loop initial condition to improve value range.
>>>
>>> gcc/testsuite/ChangeLog
>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-18  7:53       ` Bin.Cheng
@ 2015-08-18  8:16         ` Ajit Kumar Agarwal
  2015-08-18  8:51           ` Richard Biener
  2015-08-18  9:09           ` Bin.Cheng
  0 siblings, 2 replies; 16+ messages in thread
From: Ajit Kumar Agarwal @ 2015-08-18  8:16 UTC (permalink / raw)
  To: Bin.Cheng
  Cc: Richard Biener, Bin Cheng, GCC Patches, Vinod Kathail,
	Shail Aditya Gupta, Vidhumouli Hunsigida, Nagaraju Mekala



-----Original Message-----
From: Bin.Cheng [mailto:amker.cheng@gmail.com] 
Sent: Tuesday, August 18, 2015 1:08 PM
To: Ajit Kumar Agarwal
Cc: Richard Biener; Bin Cheng; GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [PATCH GCC]Improve bound information in loop niter analysis

On Mon, Aug 17, 2015 at 6:49 PM, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
> All:
>
> Does the Logic to calculate the Loop bound information through Value 
> Range Analyis uses the post dominator and Dominator info. The iteration branches instead of Loop exit condition can be calculated through post dominator info.
> If the node in the Loop has two successors and post dominates the two 
> successors then the iteration branch can be The same node.
>
> For All the nodes L in the Loop B
> If (L1, L2  belongs to successors of (L) && L1,L2 belongs to 
> PosDom(Header of Loop)) {
>   I = I union L1
> }
>
> Thus "I" will have all set of iteration branches. This will handle 
> more cases of Loop bound information that Will be accurate through the 
> exact iteration count that are known cases along with Value Range Information Where the condition is instead not the Loop exits but other nodes in the Loop.

>>I don't quite follow your words here.  Could you please give a simple example about it?  Especially I don't know how post-dom helps the loop bound analysis.  >>Seems your pseudo code is collecting some comparison basic block of loop?

The Algorithm I have given above is based on Post Dominator Info. This helps to calculate the iteration branches. The iteration branches are the
Branches that determine the loop exit condition. Based on the condition it either branches to the header of the Loop, Or it may branch to the 
Block dominated by the header or exit from the loop. The above Algorithm finds out such iteration branches and thus decides on the Loop bound
Or iteration count. If such iteration branches are not at the back edge node and it may be a node inside the loop based on some conditions.
Finding out such iteration branches can be done through the post dominator info using the above algorithm.  Based on the iteration branches the 
conditions can be analyzed and that helps in finding out the iteration bound for Known cases. Know cases are the cases where the loop bound can be determined at compile time.

 One Example would be Multi-Exits Loops where the Loop exit condition can be at the back edge or it may be a block inside the Loop based on the 
IF conditions and breaks out based on the conditions. Thus having multiple exits. Such iteration branches can be found using The above Algorithm.

Thanks & Regards
Ajit 

Thanks,
bin
>
> Thanks & Regards
> Ajit
>
>
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org 
> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Bin.Cheng
> Sent: Monday, August 17, 2015 3:32 PM
> To: Richard Biener
> Cc: Bin Cheng; GCC Patches
> Subject: Re: [PATCH GCC]Improve bound information in loop niter 
> analysis
>
> Thanks for all your reviews.
>
> On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>>> Hi,
>>> Loop niter computes inaccurate bound information for different loops.
>>> This patch is to improve it by using loop initial condition in 
>>> determine_value_range.  Generally, loop niter is computed by 
>>> subtracting start var from end var in loop exit condition.  
>>> Moreover, loop bound is computed using value range information of both start and end variables.
>>> Basic idea of this patch is to check if loop initial condition 
>>> implies more range information for both start/end variables.  If 
>>> yes, we refine range information and use that to compute loop bound.
>>> With this improvement, more accurate loop bound information is 
>>> computed for test cases added by this patch.
>>
>> +      c0 = fold_convert (type, c0);
>> +      c1 = fold_convert (type, c1);
>> +
>> +      if (operand_equal_p (var, c0, 0))
>>
>> I believe if c0 is not already of type type operand-equal_p will never succeed.
> It's quite specific case targeting comparison between var and it's range bounds.  Given c0 is in form of "var + offc0", then the comparison "var + offc0 != range bounds" doesn't have any useful information.  Maybe useless type conversion can be handled here though, it might be even corner case.
>
>>
>> (side-note: we should get rid of the GMP use, that's expensive and 
>> now we have wide-int available which should do the trick as well)
>>
>> +         /* Case of comparing with the bounds of the type.  */
>> +         if (TYPE_MIN_VALUE (type)
>> +             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
>> +           cmp = GT_EXPR;
>> +         if (TYPE_MAX_VALUE (type)
>> +             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
>> +           cmp = LT_EXPR;
>>
>> don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision and 
>> all wide_int operations (see match.pd wi::max_value use).
> Done.
>
>>
>> +  else if (!operand_equal_p (var, varc0, 0))
>> +    goto end_2;
>>
>> ick - goto.  We need sth like a auto_mpz class with a destructor.
> Label end_2 removed.
>
>>
>> struct auto_mpz
>> {
>>   auto_mpz () { mpz_init (m_val); }
>>   ~auto_mpz () { mpz_clear (m_val); }
>>   mpz& operator() { return m_val; }
>>   mpz m_val;
>> };
>>
>>> Is it OK?
>>
>> I see the code follows existing practice in niter analysis even 
>> though my overall plan was to transition its copying of value-range 
>> related optimizations to use VRP infrastructure.
> Yes, I think it's easy to push it to VRP infrastructure.  Actually from the name of the function, it's more vrp related.  For now, the function is called only by bound_difference, not so many as vrp queries.  We need cache facility in vrp otherwise it would be expensive.
>
>>
>> I'm still ok with improving the existing code on the basis that I 
>> won't get to that for GCC 6.
>>
>> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>>
>> Refactoring with auto_mpz welcome.
> That will be an independent patch, so I skipped it in this one.
>
> New version attached.  Bootstrap and test on x86_64.
>
> Thanks,
> bin
>>
>> Thanks,
>> RIchard.
>>
>>> Thanks,
>>> bin
>>>
>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>>         (determine_value_range): Call refine_value_range_using_guard for
>>>         each loop initial condition to improve value range.
>>>
>>> gcc/testsuite/ChangeLog
>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-18  8:16         ` Ajit Kumar Agarwal
@ 2015-08-18  8:51           ` Richard Biener
  2015-08-18  9:09           ` Bin.Cheng
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Biener @ 2015-08-18  8:51 UTC (permalink / raw)
  To: Ajit Kumar Agarwal
  Cc: Bin.Cheng, Bin Cheng, GCC Patches, Vinod Kathail,
	Shail Aditya Gupta, Vidhumouli Hunsigida, Nagaraju Mekala

On Tue, Aug 18, 2015 at 10:02 AM, Ajit Kumar Agarwal
<ajit.kumar.agarwal@xilinx.com> wrote:
>
>
> -----Original Message-----
> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
> Sent: Tuesday, August 18, 2015 1:08 PM
> To: Ajit Kumar Agarwal
> Cc: Richard Biener; Bin Cheng; GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: [PATCH GCC]Improve bound information in loop niter analysis
>
> On Mon, Aug 17, 2015 at 6:49 PM, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
>> All:
>>
>> Does the Logic to calculate the Loop bound information through Value
>> Range Analyis uses the post dominator and Dominator info. The iteration branches instead of Loop exit condition can be calculated through post dominator info.
>> If the node in the Loop has two successors and post dominates the two
>> successors then the iteration branch can be The same node.
>>
>> For All the nodes L in the Loop B
>> If (L1, L2  belongs to successors of (L) && L1,L2 belongs to
>> PosDom(Header of Loop)) {
>>   I = I union L1
>> }
>>
>> Thus "I" will have all set of iteration branches. This will handle
>> more cases of Loop bound information that Will be accurate through the
>> exact iteration count that are known cases along with Value Range Information Where the condition is instead not the Loop exits but other nodes in the Loop.
>
>>>I don't quite follow your words here.  Could you please give a simple example about it?  Especially I don't know how post-dom helps the loop bound analysis.  >>Seems your pseudo code is collecting some comparison basic block of loop?
>
> The Algorithm I have given above is based on Post Dominator Info. This helps to calculate the iteration branches. The iteration branches are the
> Branches that determine the loop exit condition. Based on the condition it either branches to the header of the Loop, Or it may branch to the
> Block dominated by the header or exit from the loop. The above Algorithm finds out such iteration branches and thus decides on the Loop bound
> Or iteration count. If such iteration branches are not at the back edge node and it may be a node inside the loop based on some conditions.
> Finding out such iteration branches can be done through the post dominator info using the above algorithm.  Based on the iteration branches the
> conditions can be analyzed and that helps in finding out the iteration bound for Known cases. Know cases are the cases where the loop bound can be determined at compile time.
>
>  One Example would be Multi-Exits Loops where the Loop exit condition can be at the back edge or it may be a block inside the Loop based on the
> IF conditions and breaks out based on the conditions. Thus having multiple exits. Such iteration branches can be found using The above Algorithm.

We already have code to compute the iteration count for each loop exit
and combine that into an overall iteration count.

Richard.

> Thanks & Regards
> Ajit
>
> Thanks,
> bin
>>
>> Thanks & Regards
>> Ajit
>>
>>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org
>> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Bin.Cheng
>> Sent: Monday, August 17, 2015 3:32 PM
>> To: Richard Biener
>> Cc: Bin Cheng; GCC Patches
>> Subject: Re: [PATCH GCC]Improve bound information in loop niter
>> analysis
>>
>> Thanks for all your reviews.
>>
>> On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>>>> Hi,
>>>> Loop niter computes inaccurate bound information for different loops.
>>>> This patch is to improve it by using loop initial condition in
>>>> determine_value_range.  Generally, loop niter is computed by
>>>> subtracting start var from end var in loop exit condition.
>>>> Moreover, loop bound is computed using value range information of both start and end variables.
>>>> Basic idea of this patch is to check if loop initial condition
>>>> implies more range information for both start/end variables.  If
>>>> yes, we refine range information and use that to compute loop bound.
>>>> With this improvement, more accurate loop bound information is
>>>> computed for test cases added by this patch.
>>>
>>> +      c0 = fold_convert (type, c0);
>>> +      c1 = fold_convert (type, c1);
>>> +
>>> +      if (operand_equal_p (var, c0, 0))
>>>
>>> I believe if c0 is not already of type type operand-equal_p will never succeed.
>> It's quite specific case targeting comparison between var and it's range bounds.  Given c0 is in form of "var + offc0", then the comparison "var + offc0 != range bounds" doesn't have any useful information.  Maybe useless type conversion can be handled here though, it might be even corner case.
>>
>>>
>>> (side-note: we should get rid of the GMP use, that's expensive and
>>> now we have wide-int available which should do the trick as well)
>>>
>>> +         /* Case of comparing with the bounds of the type.  */
>>> +         if (TYPE_MIN_VALUE (type)
>>> +             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
>>> +           cmp = GT_EXPR;
>>> +         if (TYPE_MAX_VALUE (type)
>>> +             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
>>> +           cmp = LT_EXPR;
>>>
>>> don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision and
>>> all wide_int operations (see match.pd wi::max_value use).
>> Done.
>>
>>>
>>> +  else if (!operand_equal_p (var, varc0, 0))
>>> +    goto end_2;
>>>
>>> ick - goto.  We need sth like a auto_mpz class with a destructor.
>> Label end_2 removed.
>>
>>>
>>> struct auto_mpz
>>> {
>>>   auto_mpz () { mpz_init (m_val); }
>>>   ~auto_mpz () { mpz_clear (m_val); }
>>>   mpz& operator() { return m_val; }
>>>   mpz m_val;
>>> };
>>>
>>>> Is it OK?
>>>
>>> I see the code follows existing practice in niter analysis even
>>> though my overall plan was to transition its copying of value-range
>>> related optimizations to use VRP infrastructure.
>> Yes, I think it's easy to push it to VRP infrastructure.  Actually from the name of the function, it's more vrp related.  For now, the function is called only by bound_difference, not so many as vrp queries.  We need cache facility in vrp otherwise it would be expensive.
>>
>>>
>>> I'm still ok with improving the existing code on the basis that I
>>> won't get to that for GCC 6.
>>>
>>> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>>>
>>> Refactoring with auto_mpz welcome.
>> That will be an independent patch, so I skipped it in this one.
>>
>> New version attached.  Bootstrap and test on x86_64.
>>
>> Thanks,
>> bin
>>>
>>> Thanks,
>>> RIchard.
>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>>>         (determine_value_range): Call refine_value_range_using_guard for
>>>>         each loop initial condition to improve value range.
>>>>
>>>> gcc/testsuite/ChangeLog
>>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-18  8:16         ` Ajit Kumar Agarwal
  2015-08-18  8:51           ` Richard Biener
@ 2015-08-18  9:09           ` Bin.Cheng
  1 sibling, 0 replies; 16+ messages in thread
From: Bin.Cheng @ 2015-08-18  9:09 UTC (permalink / raw)
  To: Ajit Kumar Agarwal
  Cc: Richard Biener, Bin Cheng, GCC Patches, Vinod Kathail,
	Shail Aditya Gupta, Vidhumouli Hunsigida, Nagaraju Mekala

On Tue, Aug 18, 2015 at 4:02 PM, Ajit Kumar Agarwal
<ajit.kumar.agarwal@xilinx.com> wrote:
>
>
> -----Original Message-----
> From: Bin.Cheng [mailto:amker.cheng@gmail.com]
> Sent: Tuesday, August 18, 2015 1:08 PM
> To: Ajit Kumar Agarwal
> Cc: Richard Biener; Bin Cheng; GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
> Subject: Re: [PATCH GCC]Improve bound information in loop niter analysis
>
> On Mon, Aug 17, 2015 at 6:49 PM, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
>> All:
>>
>> Does the Logic to calculate the Loop bound information through Value
>> Range Analyis uses the post dominator and Dominator info. The iteration branches instead of Loop exit condition can be calculated through post dominator info.
>> If the node in the Loop has two successors and post dominates the two
>> successors then the iteration branch can be The same node.
>>
>> For All the nodes L in the Loop B
>> If (L1, L2  belongs to successors of (L) && L1,L2 belongs to
>> PosDom(Header of Loop)) {
>>   I = I union L1
>> }
>>
>> Thus "I" will have all set of iteration branches. This will handle
>> more cases of Loop bound information that Will be accurate through the
>> exact iteration count that are known cases along with Value Range Information Where the condition is instead not the Loop exits but other nodes in the Loop.
>
>>>I don't quite follow your words here.  Could you please give a simple example about it?  Especially I don't know how post-dom helps the loop bound analysis.  >>Seems your pseudo code is collecting some comparison basic block of loop?
>
> The Algorithm I have given above is based on Post Dominator Info. This helps to calculate the iteration branches. The iteration branches are the
> Branches that determine the loop exit condition. Based on the condition it either branches to the header of the Loop, Or it may branch to the
> Block dominated by the header or exit from the loop. The above Algorithm finds out such iteration branches and thus decides on the Loop bound
> Or iteration count. If such iteration branches are not at the back edge node and it may be a node inside the loop based on some conditions.
> Finding out such iteration branches can be done through the post dominator info using the above algorithm.  Based on the iteration branches the
> conditions can be analyzed and that helps in finding out the iteration bound for Known cases. Know cases are the cases where the loop bound can be determined at compile time.
>
>  One Example would be Multi-Exits Loops where the Loop exit condition can be at the back edge or it may be a block inside the Loop based on the
> IF conditions and breaks out based on the conditions. Thus having multiple exits. Such iteration branches can be found using The above Algorithm.
As far as I understand, GCC already do loop niter analysis one the
basis of exit edge (and the corresponding block and condition).  And
yes, it uses scev/vrp information for the analysis.  The original
patch is to improve the analysis with flow sensitive range
information.  It's not about GCC can't work on exit edge.

Thanks,
bin
>
> Thanks & Regards
> Ajit
>
> Thanks,
> bin
>>
>> Thanks & Regards
>> Ajit
>>
>>
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org
>> [mailto:gcc-patches-owner@gcc.gnu.org] On Behalf Of Bin.Cheng
>> Sent: Monday, August 17, 2015 3:32 PM
>> To: Richard Biener
>> Cc: Bin Cheng; GCC Patches
>> Subject: Re: [PATCH GCC]Improve bound information in loop niter
>> analysis
>>
>> Thanks for all your reviews.
>>
>> On Fri, Aug 14, 2015 at 4:17 PM, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Tue, Jul 28, 2015 at 11:36 AM, Bin Cheng <bin.cheng@arm.com> wrote:
>>>> Hi,
>>>> Loop niter computes inaccurate bound information for different loops.
>>>> This patch is to improve it by using loop initial condition in
>>>> determine_value_range.  Generally, loop niter is computed by
>>>> subtracting start var from end var in loop exit condition.
>>>> Moreover, loop bound is computed using value range information of both start and end variables.
>>>> Basic idea of this patch is to check if loop initial condition
>>>> implies more range information for both start/end variables.  If
>>>> yes, we refine range information and use that to compute loop bound.
>>>> With this improvement, more accurate loop bound information is
>>>> computed for test cases added by this patch.
>>>
>>> +      c0 = fold_convert (type, c0);
>>> +      c1 = fold_convert (type, c1);
>>> +
>>> +      if (operand_equal_p (var, c0, 0))
>>>
>>> I believe if c0 is not already of type type operand-equal_p will never succeed.
>> It's quite specific case targeting comparison between var and it's range bounds.  Given c0 is in form of "var + offc0", then the comparison "var + offc0 != range bounds" doesn't have any useful information.  Maybe useless type conversion can be handled here though, it might be even corner case.
>>
>>>
>>> (side-note: we should get rid of the GMP use, that's expensive and
>>> now we have wide-int available which should do the trick as well)
>>>
>>> +         /* Case of comparing with the bounds of the type.  */
>>> +         if (TYPE_MIN_VALUE (type)
>>> +             && operand_equal_p (c1, TYPE_MIN_VALUE (type), 0))
>>> +           cmp = GT_EXPR;
>>> +         if (TYPE_MAX_VALUE (type)
>>> +             && operand_equal_p (c1, TYPE_MAX_VALUE (type), 0))
>>> +           cmp = LT_EXPR;
>>>
>>> don't use TYPE_MIN/MAX_VALUE.  Instead use the types precision and
>>> all wide_int operations (see match.pd wi::max_value use).
>> Done.
>>
>>>
>>> +  else if (!operand_equal_p (var, varc0, 0))
>>> +    goto end_2;
>>>
>>> ick - goto.  We need sth like a auto_mpz class with a destructor.
>> Label end_2 removed.
>>
>>>
>>> struct auto_mpz
>>> {
>>>   auto_mpz () { mpz_init (m_val); }
>>>   ~auto_mpz () { mpz_clear (m_val); }
>>>   mpz& operator() { return m_val; }
>>>   mpz m_val;
>>> };
>>>
>>>> Is it OK?
>>>
>>> I see the code follows existing practice in niter analysis even
>>> though my overall plan was to transition its copying of value-range
>>> related optimizations to use VRP infrastructure.
>> Yes, I think it's easy to push it to VRP infrastructure.  Actually from the name of the function, it's more vrp related.  For now, the function is called only by bound_difference, not so many as vrp queries.  We need cache facility in vrp otherwise it would be expensive.
>>
>>>
>>> I'm still ok with improving the existing code on the basis that I
>>> won't get to that for GCC 6.
>>>
>>> So - ok with the TYPE_MIN/MAX_VALUE change suggested above.
>>>
>>> Refactoring with auto_mpz welcome.
>> That will be an independent patch, so I skipped it in this one.
>>
>> New version attached.  Bootstrap and test on x86_64.
>>
>> Thanks,
>> bin
>>>
>>> Thanks,
>>> RIchard.
>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * tree-ssa-loop-niter.c (refine_value_range_using_guard): New.
>>>>         (determine_value_range): Call refine_value_range_using_guard for
>>>>         each loop initial condition to improve value range.
>>>>
>>>> gcc/testsuite/ChangeLog
>>>> 2015-07-28  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * gcc.dg/tree-ssa/loop-bound-1.c: New test.
>>>>         * gcc.dg/tree-ssa/loop-bound-3.c: New test.
>>>>         * gcc.dg/tree-ssa/loop-bound-5.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-17 10:06   ` Bin.Cheng
  2015-08-17 11:16     ` Ajit Kumar Agarwal
@ 2015-08-18 18:38     ` Jeff Law
  2015-08-19  2:45       ` Bin.Cheng
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff Law @ 2015-08-18 18:38 UTC (permalink / raw)
  To: Bin.Cheng, Richard Biener; +Cc: Bin Cheng, GCC Patches

On 08/17/2015 04:01 AM, Bin.Cheng wrote:
>>
>> +      c0 = fold_convert (type, c0);
>> +      c1 = fold_convert (type, c1);
>> +
>> +      if (operand_equal_p (var, c0, 0))
>>
>> I believe if c0 is not already of type type operand-equal_p will never succeed.
> It's quite specific case targeting comparison between var and it's
> range bounds.  Given c0 is in form of "var + offc0", then the
> comparison "var + offc0 != range bounds" doesn't have any useful
> information.  Maybe useless type conversion can be handled here
> though, it might be even corner case.
My comment about useless type conversions was more about a deficiency in 
operand_equal_p's implementation.  It wasn't something I felt needed to 
be addressed in your patch.

I think using operand_equal_p is fine here.


Jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC]Improve bound information in loop niter analysis
  2015-08-18 18:38     ` Jeff Law
@ 2015-08-19  2:45       ` Bin.Cheng
  0 siblings, 0 replies; 16+ messages in thread
From: Bin.Cheng @ 2015-08-19  2:45 UTC (permalink / raw)
  To: Jeff Law; +Cc: Richard Biener, Bin Cheng, GCC Patches

On Wed, Aug 19, 2015 at 2:14 AM, Jeff Law <law@redhat.com> wrote:
> On 08/17/2015 04:01 AM, Bin.Cheng wrote:
>>>
>>>
>>> +      c0 = fold_convert (type, c0);
>>> +      c1 = fold_convert (type, c1);
>>> +
>>> +      if (operand_equal_p (var, c0, 0))
>>>
>>> I believe if c0 is not already of type type operand-equal_p will never
>>> succeed.
>>
>> It's quite specific case targeting comparison between var and it's
>> range bounds.  Given c0 is in form of "var + offc0", then the
>> comparison "var + offc0 != range bounds" doesn't have any useful
>> information.  Maybe useless type conversion can be handled here
>> though, it might be even corner case.
>
> My comment about useless type conversions was more about a deficiency in
> operand_equal_p's implementation.  It wasn't something I felt needed to be
> addressed in your patch.
>
> I think using operand_equal_p is fine here.
Hi Jeff,
I misunderstood the point.  Thanks for explanation.  Given the
approval, new version patch is applied.

Thanks,
bin
>
>
> Jeff

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-08-19  1:54 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-28 10:11 [PATCH GCC]Improve bound information in loop niter analysis Bin Cheng
2015-08-13  8:27 ` Bin.Cheng
2015-08-13 22:10 ` Jeff Law
2015-08-14  3:13   ` Bin.Cheng
2015-08-14 18:32     ` Jeff Law
2015-08-14  8:28 ` Richard Biener
2015-08-14 18:47   ` Jeff Law
2015-08-17 10:06   ` Bin.Cheng
2015-08-17 11:16     ` Ajit Kumar Agarwal
2015-08-17 11:38       ` Ajit Kumar Agarwal
2015-08-18  7:53       ` Bin.Cheng
2015-08-18  8:16         ` Ajit Kumar Agarwal
2015-08-18  8:51           ` Richard Biener
2015-08-18  9:09           ` Bin.Cheng
2015-08-18 18:38     ` Jeff Law
2015-08-19  2:45       ` Bin.Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).