[PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
@ 2017-01-26  1:03 Jeff Law
  2017-01-26  7:38 ` Marc Glisse
  2017-01-26 10:00 ` Richard Biener
  0 siblings, 2 replies; 17+ messages in thread
From: Jeff Law @ 2017-01-26  1:03 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

As has been discussed extensively, we're not doing a good job at 
simplifying overflow tests, particularly those which collapse down to an 
EQ/NE test.

x + -1 > x  -> x == 0
x + -1 < x  -> x != 0
x + 1 < x   -> x == -1U
x + 1 > x   -> x != -1U

The simplifications allow us to propagate a constant for X into one ARM 
of the associated IF/ELSE construct.  For C++ std::vector operations 
those propagations can eliminate lots of unnecessary code.

Those propagations also eliminate (by way of removing unnecessary code) 
false positive warnings for memset calls that come from std::vector 
operations.

This patch does two things.

1. It adds special case patterns to the A+CST CMP A pattern for cases 
where CST is 1 or -1 where the result turns into A EQ/NE 0 or A EQ/NE 
-1U.  These special patterns are applied regardless of the single_use 
status of the expression.

2. It adds a call to fold_stmt in simplify_cond_using_ranges.  This 
allows VRP to transform the code early and the first DOM pass to often 
see the simpified conditional and thus optimize better, rather than 
waiting for forwprop3 to simplify the conditional and the last DOM pass 
to optimize the code.

Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?


[-- Attachment #2: P --]
[-- Type: text/plain, Size: 4747 bytes --]

	PR tree-optimization/79095
	* match.pd (A + CST CMP A): Add special cases for CST of 1 or -1.
	* tree-vrp.c (simplify_cond_using_ranges): Accept GSI rather than statement.
	Callers changed.  Just fold the conditional if no other simplifications
	were possible.

	PR tree-optimization/79095
	* g++.dg/pr79095: New test.
	* gcc.c-torture/execute/arith-1.c: Test additional cases.

diff --git a/gcc/match.pd b/gcc/match.pd
index 7b96800..8178e9c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3019,15 +3019,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    ADD_OVERFLOW detection in tree-ssa-math-opts.c.
    A + CST CMP A  ->  A CMP' CST' */
 (for cmp (lt le ge gt)
+     out_eqneq_zero (ne ne eq eq)
+     out_eqneq_m1 (eq eq ne ne)
      out (gt gt le le)
  (simplify
   (cmp:c (plus@2 @0 INTEGER_CST@1) @0)
   (if (TYPE_UNSIGNED (TREE_TYPE (@0))
        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+       && wi::eq_p (@1, 1))
+   (out_eqneq_m1 @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
+	       (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED)); })
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
+       && wi::eq_p (@1, -1))
+   (out_eqneq_zero @0 { fold_convert (TREE_TYPE (@0), integer_zero_node) ; })
+  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
+       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
        && wi::ne_p (@1, 0)
        && single_use (@2))
    (out @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
-	       (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }))))
+	       (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }))))))
 
 /* To detect overflow in unsigned A - B, A < B is simpler than A - B > A.
    However, the detection logic for SUB_OVERFLOW in tree-ssa-math-opts.c
diff --git a/gcc/testsuite/g++.dg/pr79095.C b/gcc/testsuite/g++.dg/pr79095.C
new file mode 100644
index 0000000..edf3739
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr79095.C
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-Wuninitialized -O2" } */
+
+typedef __SIZE_TYPE__ size_t;
+
+struct S {
+  int *p0, *p1, *p2;
+
+  size_t size () const { return p1 - p0; }
+
+  void f (size_t n) {
+    if (n > size ())       // can't happen because
+      foo (n - size ());   //   n is in [1, MIN(size() - 1, 3)]
+    else if (n < size ())
+      bar (p0 + n);
+  }
+
+  void foo (size_t n)
+  {
+    size_t left = (size_t)(p2 - p1);
+    if (left >= n)
+      __builtin_memset (p2, 0, n * sizeof *p2); /*  { dg-bogus "maximum object" "false warning" } */
+  }
+
+  void bar (int*);
+};
+
+void f (S &s)
+{
+  size_t n = s.size ();
+  if (n > 1 && n < 5)
+    s.f (n - 1);
+}
+
+
diff --git a/gcc/testsuite/gcc.c-torture/execute/arith-1.c b/gcc/testsuite/gcc.c-torture/execute/arith-1.c
index 58df322..6168d77 100644
--- a/gcc/testsuite/gcc.c-torture/execute/arith-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/arith-1.c
@@ -7,9 +7,41 @@ sat_add (unsigned i)
   return ret;
 }
 
+unsigned
+sat_add2 (unsigned i)
+{
+  unsigned ret = i + 1;
+  if (ret > i)
+    return ret;
+  return i;
+}
+
+unsigned
+sat_add3 (unsigned i)
+{
+  unsigned ret = i - 1;
+  if (ret > i)
+    ret = i;
+  return ret;
+}
+
+unsigned
+sat_add4 (unsigned i)
+{
+  unsigned ret = i - 1;
+  if (ret < i)
+    return ret;
+  return i;
+}
 main ()
 {
   if (sat_add (~0U) != ~0U)
     abort ();
+  if (sat_add2 (~0U) != ~0U)
+    abort ();
+  if (sat_add3 (0U) != 0U)
+    abort ();
+  if (sat_add4 (0U) != 0U)
+    abort ();
   exit (0);
 }
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index d7d7a0d..b4b6d8a 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -9550,8 +9550,9 @@ range_fits_type_p (value_range *vr, unsigned dest_precision, signop dest_sgn)
    the original conditional.  */
 
 static bool
-simplify_cond_using_ranges (gcond *stmt)
+simplify_cond_using_ranges (gimple_stmt_iterator *gsi)
 {
+  gcond *stmt = as_a <gcond *> (gsi_stmt (*gsi));
   tree op0 = gimple_cond_lhs (stmt);
   tree op1 = gimple_cond_rhs (stmt);
   enum tree_code cond_code = gimple_cond_code (stmt);
@@ -9720,6 +9721,14 @@ simplify_cond_using_ranges (gcond *stmt)
 	}
     }
 
+  /* Finally, just try match.pd simplification.  This can convert some
+     overflow checks into simple equality checks (for example).  */
+  if (fold_stmt (gsi, NULL))
+    {
+      update_stmt (gsi_stmt (*gsi));
+      return true;
+    }
+
   return false;
 }
 
@@ -10318,7 +10328,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi)
 	}
     }
   else if (gimple_code (stmt) == GIMPLE_COND)
-    return simplify_cond_using_ranges (as_a <gcond *> (stmt));
+    return simplify_cond_using_ranges (gsi);
   else if (gimple_code (stmt) == GIMPLE_SWITCH)
     return simplify_switch_using_ranges (as_a <gswitch *> (stmt));
   else if (is_gimple_call (stmt)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-26  1:03 [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings Jeff Law
@ 2017-01-26  7:38 ` Marc Glisse
  2017-01-26 15:57   ` Jeff Law
  2017-01-26 18:32   ` Jeff Law
  2017-01-26 10:00 ` Richard Biener
  1 sibling, 2 replies; 17+ messages in thread
From: Marc Glisse @ 2017-01-26  7:38 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Wed, 25 Jan 2017, Jeff Law wrote:

> As has been discussed extensively, we're not doing a good job at simplifying 
> overflow tests, particularly those which collapse down to an EQ/NE test.
>
> x + -1 > x  -> x == 0
> x + -1 < x  -> x != 0
> x + 1 < x   -> x == -1U
> x + 1 > x   -> x != -1U
>
> The simplifications allow us to propagate a constant for X into one ARM of 
> the associated IF/ELSE construct.  For C++ std::vector operations those 
> propagations can eliminate lots of unnecessary code.
>
> Those propagations also eliminate (by way of removing unnecessary code) false 
> positive warnings for memset calls that come from std::vector operations.
>
> This patch does two things.
>
> 1. It adds special case patterns to the A+CST CMP A pattern for cases where 
> CST is 1 or -1 where the result turns into A EQ/NE 0 or A EQ/NE -1U.  These 
> special patterns are applied regardless of the single_use status of the 
> expression.
>
> 2. It adds a call to fold_stmt in simplify_cond_using_ranges.  This allows 
> VRP to transform the code early and the first DOM pass to often see the 
> simpified conditional and thus optimize better, rather than waiting for 
> forwprop3 to simplify the conditional and the last DOM pass to optimize the 
> code.
>
> Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?

I assume this causes a regression for code like

unsigned f(unsigned a){
   unsigned b=a+1;
   if(b<a)return 42;
   return b;
}

? On the other hand, the optimization is already very fragile, if I write 
b<=a (which is equivalent since 1 != 0), it doesn't apply.

We currently get
 	addl	$1, %edi
 	movl	$42, %eax
 	cmovnc	%edi, %eax
or almost as good with b==0
 	movl	%edi, %eax
 	movl	$42, %edx
 	addl	$1, %eax
 	cmove	%edx, %eax
while with a==-1 we have the redundant comparison
 	leal	1(%rdi), %eax
 	cmpl	$-1, %edi
 	movl	$42, %edx
 	cmove	%edx, %eax

Simplifying x + 1 < x to x + 1 == 0 might not be enough to simplify your 
examples though I guess?

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-26  1:03 [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings Jeff Law
  2017-01-26  7:38 ` Marc Glisse
@ 2017-01-26 10:00 ` Richard Biener
  2017-01-26 16:09   ` Jeff Law
  2017-01-27  9:03   ` Marc Glisse
  1 sibling, 2 replies; 17+ messages in thread
From: Richard Biener @ 2017-01-26 10:00 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Thu, Jan 26, 2017 at 1:06 AM, Jeff Law <law@redhat.com> wrote:
> As has been discussed extensively, we're not doing a good job at simplifying
> overflow tests, particularly those which collapse down to an EQ/NE test.
>
> x + -1 > x  -> x == 0
> x + -1 < x  -> x != 0
> x + 1 < x   -> x == -1U
> x + 1 > x   -> x != -1U
>
> The simplifications allow us to propagate a constant for X into one ARM of
> the associated IF/ELSE construct.  For C++ std::vector operations those
> propagations can eliminate lots of unnecessary code.
>
> Those propagations also eliminate (by way of removing unnecessary code)
> false positive warnings for memset calls that come from std::vector
> operations.
>
> This patch does two things.
>
> 1. It adds special case patterns to the A+CST CMP A pattern for cases where
> CST is 1 or -1 where the result turns into A EQ/NE 0 or A EQ/NE -1U.  These
> special patterns are applied regardless of the single_use status of the
> expression.
>
> 2. It adds a call to fold_stmt in simplify_cond_using_ranges.  This allows
> VRP to transform the code early and the first DOM pass to often see the
> simpified conditional and thus optimize better, rather than waiting for
> forwprop3 to simplify the conditional and the last DOM pass to optimize the
> code.
>
> Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?

Marc mentions some reasons we have a lot of single_use predicates on condition
simplifications.  You may want to evaluate some of the examples (usually
they boil down to RTL if-conversion missed optimizations).  Other cases happen
when we can use CC flags of a previous computation that is live over the
conditional like

int foo (int b)
{
   int a = b - 1;
   if (a == 0)
     return b;
   return a;
}

if we transform that to if (b == 1) we lose because we don't see to undo that
on RTL (and we do seem to do that transform somewhere since basically
forever). gcc asm:

        movl    %edi, %eax
        movl    $1, %edx
        subl    $1, %eax
        cmove   %edx, %eax
        ret

clang asm:

        movl    %edi, %eax
        decl    %eax
        cmovel  %edi, %eax
        retq

in this case probably a missed uncprop opportunity on our side (rematerialize
the constant propagated 1 from b).

Comments below

>
>         PR tree-optimization/79095
>         * match.pd (A + CST CMP A): Add special cases for CST of 1 or -1.
>         * tree-vrp.c (simplify_cond_using_ranges): Accept GSI rather than
> statement.
>         Callers changed.  Just fold the conditional if no other
> simplifications
>         were possible.
>
>         PR tree-optimization/79095
>         * g++.dg/pr79095: New test.
>         * gcc.c-torture/execute/arith-1.c: Test additional cases.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7b96800..8178e9c 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3019,15 +3019,26 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>     ADD_OVERFLOW detection in tree-ssa-math-opts.c.
>     A + CST CMP A  ->  A CMP' CST' */
>  (for cmp (lt le ge gt)
> +     out_eqneq_zero (ne ne eq eq)
> +     out_eqneq_m1 (eq eq ne ne)
>       out (gt gt le le)
>   (simplify
>    (cmp:c (plus@2 @0 INTEGER_CST@1) @0)
>    (if (TYPE_UNSIGNED (TREE_TYPE (@0))
>         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
> +       && wi::eq_p (@1, 1))
> +   (out_eqneq_m1 @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
> +              (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED)); })

build_minus_one_cst (TREE_TYPE (@0))

> +  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
> +       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
> +       && wi::eq_p (@1, -1))
> +   (out_eqneq_zero @0 { fold_convert (TREE_TYPE (@0), integer_zero_node) ;
> })

build_zero_cst (TREE_TYPE (@0))

> +  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
> +       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>         && wi::ne_p (@1, 0)
>         && single_use (@2))
>     (out @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
> -              (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }))))
> +              (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }))))))

For a little CSE I'd use

    (if (TYPE_UNSIGNED (TREE_TYPE (@0))
        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
     (switch
      (if (wi::eq_p (@1, 1))
        ...)
      (if (wi::eq_p (@1, -1))
       ...)
      (if (wi::ne_p (@1, 0) && single_use (@2))
          ...)))

which is also nicer for indentation.

I also notice we dont' handle 1 - A CMP A anywhere.

Richard.

>  /* To detect overflow in unsigned A - B, A < B is simpler than A - B > A.
>     However, the detection logic for SUB_OVERFLOW in tree-ssa-math-opts.c
> diff --git a/gcc/testsuite/g++.dg/pr79095.C b/gcc/testsuite/g++.dg/pr79095.C
> new file mode 100644
> index 0000000..edf3739
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr79095.C
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Wuninitialized -O2" } */
> +
> +typedef __SIZE_TYPE__ size_t;
> +
> +struct S {
> +  int *p0, *p1, *p2;
> +
> +  size_t size () const { return p1 - p0; }
> +
> +  void f (size_t n) {
> +    if (n > size ())       // can't happen because
> +      foo (n - size ());   //   n is in [1, MIN(size() - 1, 3)]
> +    else if (n < size ())
> +      bar (p0 + n);
> +  }
> +
> +  void foo (size_t n)
> +  {
> +    size_t left = (size_t)(p2 - p1);
> +    if (left >= n)
> +      __builtin_memset (p2, 0, n * sizeof *p2); /*  { dg-bogus "maximum
> object" "false warning" } */
> +  }
> +
> +  void bar (int*);
> +};
> +
> +void f (S &s)
> +{
> +  size_t n = s.size ();
> +  if (n > 1 && n < 5)
> +    s.f (n - 1);
> +}
> +
> +
> diff --git a/gcc/testsuite/gcc.c-torture/execute/arith-1.c
> b/gcc/testsuite/gcc.c-torture/execute/arith-1.c
> index 58df322..6168d77 100644
> --- a/gcc/testsuite/gcc.c-torture/execute/arith-1.c
> +++ b/gcc/testsuite/gcc.c-torture/execute/arith-1.c
> @@ -7,9 +7,41 @@ sat_add (unsigned i)
>    return ret;
>  }
>
> +unsigned
> +sat_add2 (unsigned i)
> +{
> +  unsigned ret = i + 1;
> +  if (ret > i)
> +    return ret;
> +  return i;
> +}
> +
> +unsigned
> +sat_add3 (unsigned i)
> +{
> +  unsigned ret = i - 1;
> +  if (ret > i)
> +    ret = i;
> +  return ret;
> +}
> +
> +unsigned
> +sat_add4 (unsigned i)
> +{
> +  unsigned ret = i - 1;
> +  if (ret < i)
> +    return ret;
> +  return i;
> +}
>  main ()
>  {
>    if (sat_add (~0U) != ~0U)
>      abort ();
> +  if (sat_add2 (~0U) != ~0U)
> +    abort ();
> +  if (sat_add3 (0U) != 0U)
> +    abort ();
> +  if (sat_add4 (0U) != 0U)
> +    abort ();
>    exit (0);
>  }
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index d7d7a0d..b4b6d8a 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -9550,8 +9550,9 @@ range_fits_type_p (value_range *vr, unsigned
> dest_precision, signop dest_sgn)
>     the original conditional.  */
>
>  static bool
> -simplify_cond_using_ranges (gcond *stmt)
> +simplify_cond_using_ranges (gimple_stmt_iterator *gsi)
>  {
> +  gcond *stmt = as_a <gcond *> (gsi_stmt (*gsi));
>    tree op0 = gimple_cond_lhs (stmt);
>    tree op1 = gimple_cond_rhs (stmt);
>    enum tree_code cond_code = gimple_cond_code (stmt);
> @@ -9720,6 +9721,14 @@ simplify_cond_using_ranges (gcond *stmt)
>         }
>      }
>
> +  /* Finally, just try match.pd simplification.  This can convert some
> +     overflow checks into simple equality checks (for example).  */
> +  if (fold_stmt (gsi, NULL))
> +    {
> +      update_stmt (gsi_stmt (*gsi));
> +      return true;
> +    }
> +
>    return false;
>  }
>
> @@ -10318,7 +10328,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator
> *gsi)
>         }
>      }
>    else if (gimple_code (stmt) == GIMPLE_COND)
> -    return simplify_cond_using_ranges (as_a <gcond *> (stmt));
> +    return simplify_cond_using_ranges (gsi);
>    else if (gimple_code (stmt) == GIMPLE_SWITCH)
>      return simplify_switch_using_ranges (as_a <gswitch *> (stmt));
>    else if (is_gimple_call (stmt)
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-26  7:38 ` Marc Glisse
@ 2017-01-26 15:57   ` Jeff Law
  2017-01-27  9:13     ` Marc Glisse
  2017-01-26 18:32   ` Jeff Law
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Law @ 2017-01-26 15:57 UTC (permalink / raw)
  To: gcc-patches

On 01/25/2017 11:09 PM, Marc Glisse wrote:
> On Wed, 25 Jan 2017, Jeff Law wrote:
>
>> As has been discussed extensively, we're not doing a good job at
>> simplifying overflow tests, particularly those which collapse down to
>> an EQ/NE test.
>>
>> x + -1 > x  -> x == 0
>> x + -1 < x  -> x != 0
>> x + 1 < x   -> x == -1U
>> x + 1 > x   -> x != -1U
>>
>> The simplifications allow us to propagate a constant for X into one
>> ARM of the associated IF/ELSE construct.  For C++ std::vector
>> operations those propagations can eliminate lots of unnecessary code.
>>
>> Those propagations also eliminate (by way of removing unnecessary
>> code) false positive warnings for memset calls that come from
>> std::vector operations.
>>
>> This patch does two things.
>>
>> 1. It adds special case patterns to the A+CST CMP A pattern for cases
>> where CST is 1 or -1 where the result turns into A EQ/NE 0 or A EQ/NE
>> -1U.  These special patterns are applied regardless of the single_use
>> status of the expression.
>>
>> 2. It adds a call to fold_stmt in simplify_cond_using_ranges.  This
>> allows VRP to transform the code early and the first DOM pass to often
>> see the simpified conditional and thus optimize better, rather than
>> waiting for forwprop3 to simplify the conditional and the last DOM
>> pass to optimize the code.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the
>> trunk?
>
> I assume this causes a regression for code like
>
> unsigned f(unsigned a){
>   unsigned b=a+1;
>   if(b<a)return 42;
>   return b;
> }
Yes.  The transformation ruins the conversion into ADD_OVERFLOW for the 
+- 1 case.  However, ISTM that we could potentially recover the 
ADD_OVERFLOW in phi-opt.  It's a very simple pattern that would be 
presented to phi-opt, so it might not be terrible to recover -- which 
has the advantage that if a user wrote an optimized overflow test we'd 
be able to recover ADD_OVERFLOW for it.


>
> ? On the other hand, the optimization is already very fragile, if I
> write b<=a (which is equivalent since 1 != 0), it doesn't apply.
True, the existing ADD_OVERFLOW is missing this case.

Note that if we use my pattern and tried to recover ADD_OVERFLOW in 
phi-opt, both the b<a and b<=a look the same, so we'd catch both.



>
> We currently get
>     addl    $1, %edi
>     movl    $42, %eax
>     cmovnc    %edi, %eax
> or almost as good with b==0
>     movl    %edi, %eax
>     movl    $42, %edx
>     addl    $1, %eax
>     cmove    %edx, %eax
> while with a==-1 we have the redundant comparison
>     leal    1(%rdi), %eax
>     cmpl    $-1, %edi
>     movl    $42, %edx
>     cmove    %edx, %eax
>
> Simplifying x + 1 < x to x + 1 == 0 might not be enough to simplify your
> examples though I guess?
It's an interesting idea.  But x+1 == 0 will get canonicalized back into 
x == -1.

But it does raise an interesting idea that I only briefly pondered.  We 
could perhaps do back-propagation to arrive at a known value for x in 
one arm of the conditional.  That may be enough.  And DOM already has 
some ability to do that kind of back propagation.

So given:

   _1 = _13 + 18446744073709551615;
   if (_1 > _13)
     goto <bb 4>; [29.56%]
   else
     goto <bb 27>; [70.44%]


DOM could see the conditional 1 > 13.  Walk the use-def chain to the 
assignment of _1 to see what values for _1 would make the condition 
true.  The only value that satisfies is _13 == 0.  Then it would enter 
_13 = 0 into its const/copies table for the duration of the THEN 
conditional.

That leaves the conditional alone, so presumably it would still be seen 
as ADD_OVERFLOW later, but gets the const propagation into the THEN arm 
that we need.

It seems worthwhile enough to play with.

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-26 10:00 ` Richard Biener
@ 2017-01-26 16:09   ` Jeff Law
  2017-01-27  9:03   ` Marc Glisse
  1 sibling, 0 replies; 17+ messages in thread
From: Jeff Law @ 2017-01-26 16:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 01/26/2017 02:38 AM, Richard Biener wrote:
>
> Marc mentions some reasons we have a lot of single_use predicates on condition
> simplifications.  You may want to evaluate some of the examples (usually
> they boil down to RTL if-conversion missed optimizations).  Other cases happen
> when we can use CC flags of a previous computation that is live over the
> conditional like
>
> int foo (int b)
> {
>    int a = b - 1;
>    if (a == 0)
>      return b;
>    return a;
> }
>
> if we transform that to if (b == 1) we lose because we don't see to undo that
> on RTL (and we do seem to do that transform somewhere since basically
I'm pretty sure my pattern wouldn't make that transformation.  However, 
I would expect other existing patterns + forwprop to do that 
transformation.  It's easy enough to check.  ANd yes, we're terrible at 
re-using CC flags.


>> +       && wi::eq_p (@1, 1))
>> +   (out_eqneq_m1 @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
>> +              (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED)); })
>
> build_minus_one_cst (TREE_TYPE (@0))
>
>> +  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
>> +       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>> +       && wi::eq_p (@1, -1))
>> +   (out_eqneq_zero @0 { fold_convert (TREE_TYPE (@0), integer_zero_node) ;
>> })
>
> build_zero_cst (TREE_TYPE (@0))
Yup.  Should have used the existing build_XXX routines.  Thanks.

>
>> +  (if (TYPE_UNSIGNED (TREE_TYPE (@0))
>> +       && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>>         && wi::ne_p (@1, 0)
>>         && single_use (@2))
>>     (out @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
>> -              (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }))))
>> +              (TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }))))))
>
> For a little CSE I'd use
>
>     (if (TYPE_UNSIGNED (TREE_TYPE (@0))
>         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
>      (switch
>       (if (wi::eq_p (@1, 1))
>         ...)
>       (if (wi::eq_p (@1, -1))
>        ...)
>       (if (wi::ne_p (@1, 0) && single_use (@2))
>           ...)))
>
> which is also nicer for indentation.
Will fix if we go with the match.pd approach.

>
> I also notice we dont' handle 1 - A CMP A anywhere.
Yea, I thought about that case, but decided against inclusion due to 
stage4 and the lack of a testcase.  It's enough enough to handle though 
once we decide on the final form.

I've got a couple things to play with from Marc's message.  Will post an 
update on those findings ASAP.

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-26  7:38 ` Marc Glisse
  2017-01-26 15:57   ` Jeff Law
@ 2017-01-26 18:32   ` Jeff Law
  1 sibling, 0 replies; 17+ messages in thread
From: Jeff Law @ 2017-01-26 18:32 UTC (permalink / raw)
  To: gcc-patches

On 01/25/2017 11:09 PM, Marc Glisse wrote:
>
> I assume this causes a regression for code like
>
> unsigned f(unsigned a){
>   unsigned b=a+1;
>   if(b<a)return 42;
>   return b;
> }
So, it's not terrible to catch this in a manner similar to how we find 
other ADD_OVERFLOW cases.  And the positive is that we obviously pick it 
up for cases created by my match.pd pattern or if the user wrote it 
explicitly.

With my match.pd pattern + some more code in tree-ssa-mathopts.c we get:

         addl    $1, %edi
         movl    $42, %eax
         cmovnc  %edi, %eax

Which is precisely what we want.  An enterprising individual could 
probably extend it further and catch overflows from increments other 
than +-1.

The widening_mul pass (where this happens) is quite late in the 
optimizer pipeline, so we're not losing the original benefits we were 
looking for.

>
> ? On the other hand, the optimization is already very fragile, if I
> write b<=a (which is equivalent since 1 != 0), it doesn't apply.
Looks like it's just an oversight.

WRT back-propagating in DOM.  It's possible, but we have to open code 
the detection, which is what I didn't like about my original tweak to 
VRP -- detecting in match.pd is *so* much simpler.

I'm going to push the improvement to tree-ssa-mathopts.c further to make 
sure I didn't goof anything badly there.  That seems the most promising 
path to me.

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-26 10:00 ` Richard Biener
  2017-01-26 16:09   ` Jeff Law
@ 2017-01-27  9:03   ` Marc Glisse
  1 sibling, 0 replies; 17+ messages in thread
From: Marc Glisse @ 2017-01-27  9:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, gcc-patches

On Thu, 26 Jan 2017, Richard Biener wrote:

> I also notice we dont' handle 1 - A CMP A anywhere.

Hmm, what are we supposed to transform that into? For ==, that would be 
false (parity), but for < it is false for 0, true for 1, false for 2, then 
true again after MAX/2... Nothing obvious.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-26 15:57   ` Jeff Law
@ 2017-01-27  9:13     ` Marc Glisse
  2017-01-27 12:10       ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Marc Glisse @ 2017-01-27  9:13 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Thu, 26 Jan 2017, Jeff Law wrote:

>> I assume this causes a regression for code like
>> 
>> unsigned f(unsigned a){
>>   unsigned b=a+1;
>>   if(b<a)return 42;
>>   return b;
>> }
> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for the +- 1 
> case.  However, ISTM that we could potentially recover the ADD_OVERFLOW in 
> phi-opt.  It's a very simple pattern that would be presented to phi-opt, so 
> it might not be terrible to recover -- which has the advantage that if a user 
> wrote an optimized overflow test we'd be able to recover ADD_OVERFLOW for it.

phi-opt is a bit surprising at first glance because there can be overflow 
checking without condition/PHI, but if it is convenient to catch many 
cases...

>> We currently get
>>     addl    $1, %edi
>>     movl    $42, %eax
>>     cmovnc    %edi, %eax
>> or almost as good with b==0
>>     movl    %edi, %eax
>>     movl    $42, %edx
>>     addl    $1, %eax
>>     cmove    %edx, %eax
>> while with a==-1 we have the redundant comparison
>>     leal    1(%rdi), %eax
>>     cmpl    $-1, %edi
>>     movl    $42, %edx
>>     cmove    %edx, %eax
>> 
>> Simplifying x + 1 < x to x + 1 == 0 might not be enough to simplify your
>> examples though I guess?
> It's an interesting idea.  But x+1 == 0 will get canonicalized back into x == 
> -1.

Well, it doesn't always get canonicalized since I got different code above 
for b==0 and for a==-1. There may be another single_use check in there. 
But that was probably not a good idea anyway.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-27  9:13     ` Marc Glisse
@ 2017-01-27 12:10       ` Richard Biener
  2017-01-27 19:20         ` Jeff Law
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2017-01-27 12:10 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jeff Law

On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Thu, 26 Jan 2017, Jeff Law wrote:
>
>>> I assume this causes a regression for code like
>>>
>>> unsigned f(unsigned a){
>>>   unsigned b=a+1;
>>>   if(b<a)return 42;
>>>   return b;
>>> }
>>
>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for the +-
>> 1 case.  However, ISTM that we could potentially recover the ADD_OVERFLOW in
>> phi-opt.  It's a very simple pattern that would be presented to phi-opt, so
>> it might not be terrible to recover -- which has the advantage that if a
>> user wrote an optimized overflow test we'd be able to recover ADD_OVERFLOW
>> for it.
>
>
> phi-opt is a bit surprising at first glance because there can be overflow
> checking without condition/PHI, but if it is convenient to catch many
> cases...

Yeah, and it's still on my TODO to add some helpers exercising
match.pd COND_EXPR
patterns from PHI nodes and their controlling condition.

>>> We currently get
>>>     addl    $1, %edi
>>>     movl    $42, %eax
>>>     cmovnc    %edi, %eax
>>> or almost as good with b==0
>>>     movl    %edi, %eax
>>>     movl    $42, %edx
>>>     addl    $1, %eax
>>>     cmove    %edx, %eax
>>> while with a==-1 we have the redundant comparison
>>>     leal    1(%rdi), %eax
>>>     cmpl    $-1, %edi
>>>     movl    $42, %edx
>>>     cmove    %edx, %eax
>>>
>>> Simplifying x + 1 < x to x + 1 == 0 might not be enough to simplify your
>>> examples though I guess?
>>
>> It's an interesting idea.  But x+1 == 0 will get canonicalized back into x
>> == -1.
>
>
> Well, it doesn't always get canonicalized since I got different code above
> for b==0 and for a==-1. There may be another single_use check in there. But
> that was probably not a good idea anyway.
>
> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-27 12:10       ` Richard Biener
@ 2017-01-27 19:20         ` Jeff Law
  2017-01-27 22:02           ` Richard Biener
  2017-01-27 22:36           ` Andrew Pinski
  0 siblings, 2 replies; 17+ messages in thread
From: Jeff Law @ 2017-01-27 19:20 UTC (permalink / raw)
  To: Richard Biener, GCC Patches

On 01/27/2017 05:08 AM, Richard Biener wrote:
> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>
>>>> I assume this causes a regression for code like
>>>>
>>>> unsigned f(unsigned a){
>>>>   unsigned b=a+1;
>>>>   if(b<a)return 42;
>>>>   return b;
>>>> }
>>>
>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for the +-
>>> 1 case.  However, ISTM that we could potentially recover the ADD_OVERFLOW in
>>> phi-opt.  It's a very simple pattern that would be presented to phi-opt, so
>>> it might not be terrible to recover -- which has the advantage that if a
>>> user wrote an optimized overflow test we'd be able to recover ADD_OVERFLOW
>>> for it.
>>
>>
>> phi-opt is a bit surprising at first glance because there can be overflow
>> checking without condition/PHI, but if it is convenient to catch many
>> cases...
>
> Yeah, and it's still on my TODO to add some helpers exercising
> match.pd COND_EXPR
> patterns from PHI nodes and their controlling condition.
It turns out to be better to fix the existing machinery to detect 
ADD_OVERFLOW in the transformed case than to add new detection to phi-opt.

The problem with improving the detection of ADD_OVERFLOW is that the 
transformed test may allow the ADD/SUB to be sunk.  So by the time we 
run the pass to detect ADD_OVERFLOW, the test and arithmetic may be in 
different blocks -- ugh.

The more I keep thinking about this the more I wonder if transforming 
the conditional is just more of a headache than its worth -- the main 
need here is to drive propagation of known constants into the THEN/ELSE 
clauses.  Transforming the conditional makes that easy for VRP & DOM to 
discover those constant and the transform is easy to write in match.pd. 
But we could just go back to discovering the case in VRP or DOM via 
open-coding detection, then propagating the known constants without 
transforming the conditional.

Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-27 19:20         ` Jeff Law
@ 2017-01-27 22:02           ` Richard Biener
  2017-01-27 22:10             ` Jeff Law
  2017-01-27 22:29             ` Jeff Law
  2017-01-27 22:36           ` Andrew Pinski
  1 sibling, 2 replies; 17+ messages in thread
From: Richard Biener @ 2017-01-27 22:02 UTC (permalink / raw)
  To: Jeff Law, GCC Patches

On January 27, 2017 7:30:07 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>On 01/27/2017 05:08 AM, Richard Biener wrote:
>> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr>
>wrote:
>>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>>
>>>>> I assume this causes a regression for code like
>>>>>
>>>>> unsigned f(unsigned a){
>>>>>   unsigned b=a+1;
>>>>>   if(b<a)return 42;
>>>>>   return b;
>>>>> }
>>>>
>>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for
>the +-
>>>> 1 case.  However, ISTM that we could potentially recover the
>ADD_OVERFLOW in
>>>> phi-opt.  It's a very simple pattern that would be presented to
>phi-opt, so
>>>> it might not be terrible to recover -- which has the advantage that
>if a
>>>> user wrote an optimized overflow test we'd be able to recover
>ADD_OVERFLOW
>>>> for it.
>>>
>>>
>>> phi-opt is a bit surprising at first glance because there can be
>overflow
>>> checking without condition/PHI, but if it is convenient to catch
>many
>>> cases...
>>
>> Yeah, and it's still on my TODO to add some helpers exercising
>> match.pd COND_EXPR
>> patterns from PHI nodes and their controlling condition.
>It turns out to be better to fix the existing machinery to detect 
>ADD_OVERFLOW in the transformed case than to add new detection to
>phi-opt.
>
>The problem with improving the detection of ADD_OVERFLOW is that the 
>transformed test may allow the ADD/SUB to be sunk.  So by the time we 
>run the pass to detect ADD_OVERFLOW, the test and arithmetic may be in 
>different blocks -- ugh.
>
>The more I keep thinking about this the more I wonder if transforming 
>the conditional is just more of a headache than its worth -- the main 
>need here is to drive propagation of known constants into the THEN/ELSE
>
>clauses.  Transforming the conditional makes that easy for VRP & DOM to
>
>discover those constant and the transform is easy to write in match.pd.
>
>But we could just go back to discovering the case in VRP or DOM via 
>open-coding detection, then propagating the known constants without 
>transforming the conditional.

Indeed we can do that.  And in fact with named patterns in match.pd you could even avoid the open-coding.

Richard.

>Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-27 22:02           ` Richard Biener
@ 2017-01-27 22:10             ` Jeff Law
  2017-01-27 22:29             ` Jeff Law
  1 sibling, 0 replies; 17+ messages in thread
From: Jeff Law @ 2017-01-27 22:10 UTC (permalink / raw)
  To: Richard Biener, GCC Patches

On 01/27/2017 02:35 PM, Richard Biener wrote:
> On January 27, 2017 7:30:07 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>> On 01/27/2017 05:08 AM, Richard Biener wrote:
>>> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr>
>> wrote:
>>>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>>>
>>>>>> I assume this causes a regression for code like
>>>>>>
>>>>>> unsigned f(unsigned a){
>>>>>>   unsigned b=a+1;
>>>>>>   if(b<a)return 42;
>>>>>>   return b;
>>>>>> }
>>>>>
>>>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for
>> the +-
>>>>> 1 case.  However, ISTM that we could potentially recover the
>> ADD_OVERFLOW in
>>>>> phi-opt.  It's a very simple pattern that would be presented to
>> phi-opt, so
>>>>> it might not be terrible to recover -- which has the advantage that
>> if a
>>>>> user wrote an optimized overflow test we'd be able to recover
>> ADD_OVERFLOW
>>>>> for it.
>>>>
>>>>
>>>> phi-opt is a bit surprising at first glance because there can be
>> overflow
>>>> checking without condition/PHI, but if it is convenient to catch
>> many
>>>> cases...
>>>
>>> Yeah, and it's still on my TODO to add some helpers exercising
>>> match.pd COND_EXPR
>>> patterns from PHI nodes and their controlling condition.
>> It turns out to be better to fix the existing machinery to detect
>> ADD_OVERFLOW in the transformed case than to add new detection to
>> phi-opt.
>>
>> The problem with improving the detection of ADD_OVERFLOW is that the
>> transformed test may allow the ADD/SUB to be sunk.  So by the time we
>> run the pass to detect ADD_OVERFLOW, the test and arithmetic may be in
>> different blocks -- ugh.
>>
>> The more I keep thinking about this the more I wonder if transforming
>> the conditional is just more of a headache than its worth -- the main
>> need here is to drive propagation of known constants into the THEN/ELSE
>>
>> clauses.  Transforming the conditional makes that easy for VRP & DOM to
>>
>> discover those constant and the transform is easy to write in match.pd.
>>
>> But we could just go back to discovering the case in VRP or DOM via
>> open-coding detection, then propagating the known constants without
>> transforming the conditional.
>
> Indeed we can do that.  And in fact with named patterns in match.pd you could even avoid the open-coding.
?!? Is there an example of this somewhere?

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-27 22:02           ` Richard Biener
  2017-01-27 22:10             ` Jeff Law
@ 2017-01-27 22:29             ` Jeff Law
  2017-01-30  9:55               ` Richard Biener
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Law @ 2017-01-27 22:29 UTC (permalink / raw)
  To: Richard Biener, GCC Patches

On 01/27/2017 02:35 PM, Richard Biener wrote:
> On January 27, 2017 7:30:07 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>> On 01/27/2017 05:08 AM, Richard Biener wrote:
>>> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr>
>> wrote:
>>>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>>>
>>>>>> I assume this causes a regression for code like
>>>>>>
>>>>>> unsigned f(unsigned a){
>>>>>>   unsigned b=a+1;
>>>>>>   if(b<a)return 42;
>>>>>>   return b;
>>>>>> }
>>>>>
>>>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for
>> the +-
>>>>> 1 case.  However, ISTM that we could potentially recover the
>> ADD_OVERFLOW in
>>>>> phi-opt.  It's a very simple pattern that would be presented to
>> phi-opt, so
>>>>> it might not be terrible to recover -- which has the advantage that
>> if a
>>>>> user wrote an optimized overflow test we'd be able to recover
>> ADD_OVERFLOW
>>>>> for it.
>>>>
>>>>
>>>> phi-opt is a bit surprising at first glance because there can be
>> overflow
>>>> checking without condition/PHI, but if it is convenient to catch
>> many
>>>> cases...
>>>
>>> Yeah, and it's still on my TODO to add some helpers exercising
>>> match.pd COND_EXPR
>>> patterns from PHI nodes and their controlling condition.
>> It turns out to be better to fix the existing machinery to detect
>> ADD_OVERFLOW in the transformed case than to add new detection to
>> phi-opt.
>>
>> The problem with improving the detection of ADD_OVERFLOW is that the
>> transformed test may allow the ADD/SUB to be sunk.  So by the time we
>> run the pass to detect ADD_OVERFLOW, the test and arithmetic may be in
>> different blocks -- ugh.
>>
>> The more I keep thinking about this the more I wonder if transforming
>> the conditional is just more of a headache than its worth -- the main
>> need here is to drive propagation of known constants into the THEN/ELSE
>>
>> clauses.  Transforming the conditional makes that easy for VRP & DOM to
>>
>> discover those constant and the transform is easy to write in match.pd.
>>
>> But we could just go back to discovering the case in VRP or DOM via
>> open-coding detection, then propagating the known constants without
>> transforming the conditional.
>
> Indeed we can do that.  And in fact with named patterns in match.pd you could even avoid the open-coding.
negate_expr_p being the example?  That does look mighty interesting... 
After recognition we'd still have to extract the operands, but it does 
look like it might handle the detection part.

jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-27 19:20         ` Jeff Law
  2017-01-27 22:02           ` Richard Biener
@ 2017-01-27 22:36           ` Andrew Pinski
  1 sibling, 0 replies; 17+ messages in thread
From: Andrew Pinski @ 2017-01-27 22:36 UTC (permalink / raw)
  To: Jeff Law; +Cc: Richard Biener, GCC Patches

On Fri, Jan 27, 2017 at 10:30 AM, Jeff Law <law@redhat.com> wrote:
> On 01/27/2017 05:08 AM, Richard Biener wrote:
>>
>> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr>
>> wrote:
>>>
>>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>>
>>>>> I assume this causes a regression for code like
>>>>>
>>>>> unsigned f(unsigned a){
>>>>>   unsigned b=a+1;
>>>>>   if(b<a)return 42;
>>>>>   return b;
>>>>> }
>>>>
>>>>
>>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for the
>>>> +-
>>>> 1 case.  However, ISTM that we could potentially recover the
>>>> ADD_OVERFLOW in
>>>> phi-opt.  It's a very simple pattern that would be presented to phi-opt,
>>>> so
>>>> it might not be terrible to recover -- which has the advantage that if a
>>>> user wrote an optimized overflow test we'd be able to recover
>>>> ADD_OVERFLOW
>>>> for it.
>>>
>>>
>>>
>>> phi-opt is a bit surprising at first glance because there can be overflow
>>> checking without condition/PHI, but if it is convenient to catch many
>>> cases...
>>
>>
>> Yeah, and it's still on my TODO to add some helpers exercising
>> match.pd COND_EXPR
>> patterns from PHI nodes and their controlling condition.
>
> It turns out to be better to fix the existing machinery to detect
> ADD_OVERFLOW in the transformed case than to add new detection to phi-opt.

I had a start on improving PHI-OPT to use match.pd directly.  I
started a prototype patch to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25290 .  It needs many
cleanups which I hope to do early March.  As I said in the bug report,
the patch started out on as doing conditional move; I disabled that
part but did not remove it.  This is part of the cleanup which I hope
to do.  There is some interesting interactions with doing the match.pd
part on generic which I did not have time to debug yet either.  Though
it was able to do simple simplifications using match.pd now.  I hope
to clean up the patch and finish it up even removing the more complex
parts of phi-opt too.

Thanks,
Andrew

>
> The problem with improving the detection of ADD_OVERFLOW is that the
> transformed test may allow the ADD/SUB to be sunk.  So by the time we run
> the pass to detect ADD_OVERFLOW, the test and arithmetic may be in different
> blocks -- ugh.
>
> The more I keep thinking about this the more I wonder if transforming the
> conditional is just more of a headache than its worth -- the main need here
> is to drive propagation of known constants into the THEN/ELSE clauses.
> Transforming the conditional makes that easy for VRP & DOM to discover those
> constant and the transform is easy to write in match.pd. But we could just
> go back to discovering the case in VRP or DOM via open-coding detection,
> then propagating the known constants without transforming the conditional.
>
> Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-27 22:29             ` Jeff Law
@ 2017-01-30  9:55               ` Richard Biener
  2017-01-30 18:08                 ` Jeff Law
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2017-01-30  9:55 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches

On Fri, Jan 27, 2017 at 11:21 PM, Jeff Law <law@redhat.com> wrote:
> On 01/27/2017 02:35 PM, Richard Biener wrote:
>>
>> On January 27, 2017 7:30:07 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>>>
>>> On 01/27/2017 05:08 AM, Richard Biener wrote:
>>>>
>>>> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr>
>>>
>>> wrote:
>>>>>
>>>>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>>>>
>>>>>>> I assume this causes a regression for code like
>>>>>>>
>>>>>>> unsigned f(unsigned a){
>>>>>>>   unsigned b=a+1;
>>>>>>>   if(b<a)return 42;
>>>>>>>   return b;
>>>>>>> }
>>>>>>
>>>>>>
>>>>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for
>>>
>>> the +-
>>>>>>
>>>>>> 1 case.  However, ISTM that we could potentially recover the
>>>
>>> ADD_OVERFLOW in
>>>>>>
>>>>>> phi-opt.  It's a very simple pattern that would be presented to
>>>
>>> phi-opt, so
>>>>>>
>>>>>> it might not be terrible to recover -- which has the advantage that
>>>
>>> if a
>>>>>>
>>>>>> user wrote an optimized overflow test we'd be able to recover
>>>
>>> ADD_OVERFLOW
>>>>>>
>>>>>> for it.
>>>>>
>>>>>
>>>>>
>>>>> phi-opt is a bit surprising at first glance because there can be
>>>
>>> overflow
>>>>>
>>>>> checking without condition/PHI, but if it is convenient to catch
>>>
>>> many
>>>>>
>>>>> cases...
>>>>
>>>>
>>>> Yeah, and it's still on my TODO to add some helpers exercising
>>>> match.pd COND_EXPR
>>>> patterns from PHI nodes and their controlling condition.
>>>
>>> It turns out to be better to fix the existing machinery to detect
>>> ADD_OVERFLOW in the transformed case than to add new detection to
>>> phi-opt.
>>>
>>> The problem with improving the detection of ADD_OVERFLOW is that the
>>> transformed test may allow the ADD/SUB to be sunk.  So by the time we
>>> run the pass to detect ADD_OVERFLOW, the test and arithmetic may be in
>>> different blocks -- ugh.
>>>
>>> The more I keep thinking about this the more I wonder if transforming
>>> the conditional is just more of a headache than its worth -- the main
>>> need here is to drive propagation of known constants into the THEN/ELSE
>>>
>>> clauses.  Transforming the conditional makes that easy for VRP & DOM to
>>>
>>> discover those constant and the transform is easy to write in match.pd.
>>>
>>> But we could just go back to discovering the case in VRP or DOM via
>>> open-coding detection, then propagating the known constants without
>>> transforming the conditional.
>>
>>
>> Indeed we can do that.  And in fact with named patterns in match.pd you
>> could even avoid the open-coding.
>
> negate_expr_p being the example?  That does look mighty interesting... After
> recognition we'd still have to extract the operands, but it does look like
> it might handle the detection part.

Yes, the (match ..) stuff is actually exported in gimple-match.c (just
no declarations
are emitted to headers yet).  logical_inverted_value might be a better example
given it has an "output":

(match (logical_inverted_value @0)
 (truth_not @0))

bool
gimple_logical_inverted_value (tree t, tree *res_ops, tree
(*valueize)(tree) ATTRIBUTE_UNUSED)
{

you get @0 in res_ops[0] if that returns true.  I've at some point
written some of tree-vect-patterns.c
as match.pd (match...) but never really completed it.  In the end it
would be nice to write the patterns
in-inline at use points, say,

  /* (match (foo @0) (....)) */
  if (gimple_foo (...))
   {
   }

and have some gen-program extract those patterns from source files
(plus inserting the necessary
prototypes locally?).

Richard.



> jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-30  9:55               ` Richard Biener
@ 2017-01-30 18:08                 ` Jeff Law
  2017-01-30 19:53                   ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Law @ 2017-01-30 18:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On 01/30/2017 02:51 AM, Richard Biener wrote:
> On Fri, Jan 27, 2017 at 11:21 PM, Jeff Law <law@redhat.com> wrote:
>> On 01/27/2017 02:35 PM, Richard Biener wrote:
>>>
>>> On January 27, 2017 7:30:07 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>>>>
>>>> On 01/27/2017 05:08 AM, Richard Biener wrote:
>>>>>
>>>>> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse <marc.glisse@inria.fr>
>>>>
>>>> wrote:
>>>>>>
>>>>>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>>>>>
>>>>>>>> I assume this causes a regression for code like
>>>>>>>>
>>>>>>>> unsigned f(unsigned a){
>>>>>>>>   unsigned b=a+1;
>>>>>>>>   if(b<a)return 42;
>>>>>>>>   return b;
>>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW for
>>>>
>>>> the +-
>>>>>>>
>>>>>>> 1 case.  However, ISTM that we could potentially recover the
>>>>
>>>> ADD_OVERFLOW in
>>>>>>>
>>>>>>> phi-opt.  It's a very simple pattern that would be presented to
>>>>
>>>> phi-opt, so
>>>>>>>
>>>>>>> it might not be terrible to recover -- which has the advantage that
>>>>
>>>> if a
>>>>>>>
>>>>>>> user wrote an optimized overflow test we'd be able to recover
>>>>
>>>> ADD_OVERFLOW
>>>>>>>
>>>>>>> for it.
>>>>>>
>>>>>>
>>>>>>
>>>>>> phi-opt is a bit surprising at first glance because there can be
>>>>
>>>> overflow
>>>>>>
>>>>>> checking without condition/PHI, but if it is convenient to catch
>>>>
>>>> many
>>>>>>
>>>>>> cases...
>>>>>
>>>>>
>>>>> Yeah, and it's still on my TODO to add some helpers exercising
>>>>> match.pd COND_EXPR
>>>>> patterns from PHI nodes and their controlling condition.
>>>>
>>>> It turns out to be better to fix the existing machinery to detect
>>>> ADD_OVERFLOW in the transformed case than to add new detection to
>>>> phi-opt.
>>>>
>>>> The problem with improving the detection of ADD_OVERFLOW is that the
>>>> transformed test may allow the ADD/SUB to be sunk.  So by the time we
>>>> run the pass to detect ADD_OVERFLOW, the test and arithmetic may be in
>>>> different blocks -- ugh.
>>>>
>>>> The more I keep thinking about this the more I wonder if transforming
>>>> the conditional is just more of a headache than its worth -- the main
>>>> need here is to drive propagation of known constants into the THEN/ELSE
>>>>
>>>> clauses.  Transforming the conditional makes that easy for VRP & DOM to
>>>>
>>>> discover those constant and the transform is easy to write in match.pd.
>>>>
>>>> But we could just go back to discovering the case in VRP or DOM via
>>>> open-coding detection, then propagating the known constants without
>>>> transforming the conditional.
>>>
>>>
>>> Indeed we can do that.  And in fact with named patterns in match.pd you
>>> could even avoid the open-coding.
>>
>> negate_expr_p being the example?  That does look mighty interesting... After
>> recognition we'd still have to extract the operands, but it does look like
>> it might handle the detection part.
>
> Yes, the (match ..) stuff is actually exported in gimple-match.c (just
> no declarations
> are emitted to headers yet).  logical_inverted_value might be a better example
> given it has an "output":
>
> (match (logical_inverted_value @0)
>  (truth_not @0))
>
> bool
> gimple_logical_inverted_value (tree t, tree *res_ops, tree
> (*valueize)(tree) ATTRIBUTE_UNUSED)
> {
Thanks.  If I look at the generated code for this match:

(for cmp (lt le ge gt)
  (match (overflow_test @0 @1 @2)
   (cmp:c (plus@2 @0 INTEGER_CST@1) @0)
   (if (TYPE_UNSIGNED (TREE_TYPE (@0))
        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
        && wi::ne_p (@1, 0)))))


The generated code starts with...

  switch (TREE_CODE (t))
     {
     case SSA_NAME:
       if (do_valueize (valueize, t) != NULL_TREE)
         {
           gimple *def_stmt = SSA_NAME_DEF_STMT (t);
           if (gassign *def = dyn_cast <gassign *> (def_stmt))
             switch (gimple_assign_rhs_code (def))
               {
               case LT_EXPR:
                 {

So, it appears to want to match an SSA_NAME which is assigned from the 
conditional.  So it's not all that helpful because the expression we 
want to match shows up inside a GIMPLE_COND.  It also appears we'd have 
to build a tree to pass down to matching code.

Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings
  2017-01-30 18:08                 ` Jeff Law
@ 2017-01-30 19:53                   ` Richard Biener
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Biener @ 2017-01-30 19:53 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches

On January 30, 2017 7:02:38 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>On 01/30/2017 02:51 AM, Richard Biener wrote:
>> On Fri, Jan 27, 2017 at 11:21 PM, Jeff Law <law@redhat.com> wrote:
>>> On 01/27/2017 02:35 PM, Richard Biener wrote:
>>>>
>>>> On January 27, 2017 7:30:07 PM GMT+01:00, Jeff Law <law@redhat.com>
>wrote:
>>>>>
>>>>> On 01/27/2017 05:08 AM, Richard Biener wrote:
>>>>>>
>>>>>> On Fri, Jan 27, 2017 at 10:02 AM, Marc Glisse
><marc.glisse@inria.fr>
>>>>>
>>>>> wrote:
>>>>>>>
>>>>>>> On Thu, 26 Jan 2017, Jeff Law wrote:
>>>>>>>
>>>>>>>>> I assume this causes a regression for code like
>>>>>>>>>
>>>>>>>>> unsigned f(unsigned a){
>>>>>>>>>   unsigned b=a+1;
>>>>>>>>>   if(b<a)return 42;
>>>>>>>>>   return b;
>>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes.  The transformation ruins the conversion into ADD_OVERFLOW
>for
>>>>>
>>>>> the +-
>>>>>>>>
>>>>>>>> 1 case.  However, ISTM that we could potentially recover the
>>>>>
>>>>> ADD_OVERFLOW in
>>>>>>>>
>>>>>>>> phi-opt.  It's a very simple pattern that would be presented to
>>>>>
>>>>> phi-opt, so
>>>>>>>>
>>>>>>>> it might not be terrible to recover -- which has the advantage
>that
>>>>>
>>>>> if a
>>>>>>>>
>>>>>>>> user wrote an optimized overflow test we'd be able to recover
>>>>>
>>>>> ADD_OVERFLOW
>>>>>>>>
>>>>>>>> for it.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> phi-opt is a bit surprising at first glance because there can be
>>>>>
>>>>> overflow
>>>>>>>
>>>>>>> checking without condition/PHI, but if it is convenient to catch
>>>>>
>>>>> many
>>>>>>>
>>>>>>> cases...
>>>>>>
>>>>>>
>>>>>> Yeah, and it's still on my TODO to add some helpers exercising
>>>>>> match.pd COND_EXPR
>>>>>> patterns from PHI nodes and their controlling condition.
>>>>>
>>>>> It turns out to be better to fix the existing machinery to detect
>>>>> ADD_OVERFLOW in the transformed case than to add new detection to
>>>>> phi-opt.
>>>>>
>>>>> The problem with improving the detection of ADD_OVERFLOW is that
>the
>>>>> transformed test may allow the ADD/SUB to be sunk.  So by the time
>we
>>>>> run the pass to detect ADD_OVERFLOW, the test and arithmetic may
>be in
>>>>> different blocks -- ugh.
>>>>>
>>>>> The more I keep thinking about this the more I wonder if
>transforming
>>>>> the conditional is just more of a headache than its worth -- the
>main
>>>>> need here is to drive propagation of known constants into the
>THEN/ELSE
>>>>>
>>>>> clauses.  Transforming the conditional makes that easy for VRP &
>DOM to
>>>>>
>>>>> discover those constant and the transform is easy to write in
>match.pd.
>>>>>
>>>>> But we could just go back to discovering the case in VRP or DOM
>via
>>>>> open-coding detection, then propagating the known constants
>without
>>>>> transforming the conditional.
>>>>
>>>>
>>>> Indeed we can do that.  And in fact with named patterns in match.pd
>you
>>>> could even avoid the open-coding.
>>>
>>> negate_expr_p being the example?  That does look mighty
>interesting... After
>>> recognition we'd still have to extract the operands, but it does
>look like
>>> it might handle the detection part.
>>
>> Yes, the (match ..) stuff is actually exported in gimple-match.c
>(just
>> no declarations
>> are emitted to headers yet).  logical_inverted_value might be a
>better example
>> given it has an "output":
>>
>> (match (logical_inverted_value @0)
>>  (truth_not @0))
>>
>> bool
>> gimple_logical_inverted_value (tree t, tree *res_ops, tree
>> (*valueize)(tree) ATTRIBUTE_UNUSED)
>> {
>Thanks.  If I look at the generated code for this match:
>
>(for cmp (lt le ge gt)
>  (match (overflow_test @0 @1 @2)
>   (cmp:c (plus@2 @0 INTEGER_CST@1) @0)
>   (if (TYPE_UNSIGNED (TREE_TYPE (@0))
>        && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>        && wi::ne_p (@1, 0)))))
>
>
>The generated code starts with...
>
>  switch (TREE_CODE (t))
>     {
>     case SSA_NAME:
>       if (do_valueize (valueize, t) != NULL_TREE)
>         {
>           gimple *def_stmt = SSA_NAME_DEF_STMT (t);
>           if (gassign *def = dyn_cast <gassign *> (def_stmt))
>             switch (gimple_assign_rhs_code (def))
>               {
>               case LT_EXPR:
>                 {
>
>So, it appears to want to match an SSA_NAME which is assigned from the 
>conditional.  So it's not all that helpful because the expression we 
>want to match shows up inside a GIMPLE_COND.  It also appears we'd have

Yeah...

>to build a tree to pass down to matching code.

A tree?  Not sure what you mean.

As this isn't used outside of match.PD yet it surely needs some work in genmatch like for the case where the expression to match has no SSA value.

Richard.

>
>Jeff

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-01-30 19:15 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-26  1:03 [PATCH][RFA][PR tree-optimization/79095] Improve overflow test optimization and avoid invalid warnings Jeff Law
2017-01-26  7:38 ` Marc Glisse
2017-01-26 15:57   ` Jeff Law
2017-01-27  9:13     ` Marc Glisse
2017-01-27 12:10       ` Richard Biener
2017-01-27 19:20         ` Jeff Law
2017-01-27 22:02           ` Richard Biener
2017-01-27 22:10             ` Jeff Law
2017-01-27 22:29             ` Jeff Law
2017-01-30  9:55               ` Richard Biener
2017-01-30 18:08                 ` Jeff Law
2017-01-30 19:53                   ` Richard Biener
2017-01-27 22:36           ` Andrew Pinski
2017-01-26 18:32   ` Jeff Law
2017-01-26 10:00 ` Richard Biener
2017-01-26 16:09   ` Jeff Law
2017-01-27  9:03   ` Marc Glisse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).