public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]
@ 2023-10-25  3:50 Andrew Pinski
  2023-10-26  9:29 ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Pinski @ 2023-10-25  3:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: Andrew Pinski

I noticed we were missing optimizing `a / (1 << b)` when
we know that a is nonnegative but only due to ranger information.
This adds the use of the global ranger to tree_single_nonnegative_warnv_p
for SSA_NAME.
I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating
point nor to use the local ranger since I am not 100% sure it is safe where
all of the uses tree_expr_nonnegative_p would be safe.

Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
global ranges from __builtin_unreachable. It just happened to be optimized
before due to global ranges not being used as much.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/111959

gcc/ChangeLog:

	* fold-const.cc (tree_single_nonnegative_warnv_p): Use
	the global range to see if the SSA_NAME was nonnegative.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/forwprop-42.c: New test.
	* gcc.dg/pr80776-1.c: xfail and update comment.
---
 gcc/fold-const.cc                           | 36 +++++++++++++++------
 gcc/testsuite/gcc.dg/pr80776-1.c            |  8 ++---
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +++++++++
 3 files changed, 46 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 40767736389..2a2a90230f5 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool *strict_overflow_p, int depth)
       return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
 
     case SSA_NAME:
-      /* Limit the depth of recursion to avoid quadratic behavior.
-	 This is expected to catch almost all occurrences in practice.
-	 If this code misses important cases that unbounded recursion
-	 would not, passes that need this information could be revised
-	 to provide it through dataflow propagation.  */
-      return (!name_registered_for_update_p (t)
-	      && depth < param_max_ssa_name_query_depth
-	      && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
-						  strict_overflow_p, depth));
+      {
+	/* For integral types, querry the global range if possible. */
+	if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
+	  {
+	    value_range vr;
+	    if (get_global_range_query ()->range_of_expr (vr, t)
+		&& !vr.varying_p () && !vr.undefined_p ())
+	      {
+		/* If the range is nonnegative, return true. */
+		if (vr.nonnegative_p ())
+		  return true;
+
+		/* If the range is non-positive, then return false. */
+		if (vr.nonpositive_p ())
+		  return false;
+	      }
+	  }
+	/* Limit the depth of recursion to avoid quadratic behavior.
+	   This is expected to catch almost all occurrences in practice.
+	   If this code misses important cases that unbounded recursion
+	   would not, passes that need this information could be revised
+	   to provide it through dataflow propagation.  */
+	return (!name_registered_for_update_p (t)
+		&& depth < param_max_ssa_name_query_depth
+		&& gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
+						    strict_overflow_p, depth));
+      }
 
     default:
       return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE (t));
diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c b/gcc/testsuite/gcc.dg/pr80776-1.c
index b9bce62d982..f3d47aeda36 100644
--- a/gcc/testsuite/gcc.dg/pr80776-1.c
+++ b/gcc/testsuite/gcc.dg/pr80776-1.c
@@ -18,14 +18,14 @@ Foo (void)
   if (! (0 <= i && i <= 999999))
     __builtin_unreachable ();
 
-  /* Legacy evrp sets the range of i to [0, MAX] *before* the first conditional,
+  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
      and to [0,999999] *before* the second conditional.  This is because both
-     evrp and VRP use trickery to set global ranges when this particular use of
+     vrp use trickery to set global ranges when this particular use of
      a __builtin_unreachable is in play (see uses of
      assert_unreachable_fallthru_edge_p).
 
-     Setting these ranges at the definition site, causes VRP to remove the
+     Setting these ranges at the definition site, causes other passes to remove the
      unreachable code altogether, leaving the following sprintf unguarded.  This
      causes the bogus warning below.  */
-  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
+  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } */
 }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
new file mode 100644
index 00000000000..4e5421ed4d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* PR tree-optimization/111959 */
+
+int divbypow2(int a, int b)
+{
+  if (a & ~0xff) __builtin_unreachable();
+  return a / (1<<b);
+}
+
+/* divbypow2 should be able to optimize to just a/b as a is known to be always positive. */
+/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " << " "optimized" } } */
+/* { dg-final { scan-tree-dump-times " >> " 1 "optimized" } } */
+
-- 
2.39.3


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]
  2023-10-25  3:50 [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959] Andrew Pinski
@ 2023-10-26  9:29 ` Richard Biener
  2023-10-26 16:41   ` Mikael Morin
  2023-10-26 18:30   ` Andrew Pinski
  0 siblings, 2 replies; 5+ messages in thread
From: Richard Biener @ 2023-10-26  9:29 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski <pinskia@gmail.com> wrote:
>
> I noticed we were missing optimizing `a / (1 << b)` when
> we know that a is nonnegative but only due to ranger information.
> This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> for SSA_NAME.
> I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating
> point nor to use the local ranger since I am not 100% sure it is safe where
> all of the uses tree_expr_nonnegative_p would be safe.
>
> Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> global ranges from __builtin_unreachable. It just happened to be optimized
> before due to global ranges not being used as much.
>
> Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
>         PR tree-optimization/111959
>
> gcc/ChangeLog:
>
>         * fold-const.cc (tree_single_nonnegative_warnv_p): Use
>         the global range to see if the SSA_NAME was nonnegative.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/forwprop-42.c: New test.
>         * gcc.dg/pr80776-1.c: xfail and update comment.
> ---
>  gcc/fold-const.cc                           | 36 +++++++++++++++------
>  gcc/testsuite/gcc.dg/pr80776-1.c            |  8 ++---
>  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +++++++++
>  3 files changed, 46 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 40767736389..2a2a90230f5 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool *strict_overflow_p, int depth)
>        return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
>
>      case SSA_NAME:
> -      /* Limit the depth of recursion to avoid quadratic behavior.
> -        This is expected to catch almost all occurrences in practice.
> -        If this code misses important cases that unbounded recursion
> -        would not, passes that need this information could be revised
> -        to provide it through dataflow propagation.  */
> -      return (!name_registered_for_update_p (t)
> -             && depth < param_max_ssa_name_query_depth
> -             && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> -                                                 strict_overflow_p, depth));
> +      {
> +       /* For integral types, querry the global range if possible. */

query

> +       if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> +         {
> +           value_range vr;
> +           if (get_global_range_query ()->range_of_expr (vr, t)
> +               && !vr.varying_p () && !vr.undefined_p ())
> +             {
> +               /* If the range is nonnegative, return true. */
> +               if (vr.nonnegative_p ())
> +                 return true;
> +
> +               /* If the range is non-positive, then return false. */
> +               if (vr.nonpositive_p ())
> +                 return false;

That's testing for <= 0, nonnegative for >= 0.  This means when
vr.nonpositive_p () the value could still be zero (and nonnegative),
possibly be figured out by the recursion below.

Since we don't have negative_p () do we want to test
nonpositive_p () && nonzero_p () instead?

OK with that change.

Richard.

> +             }
> +         }
> +       /* Limit the depth of recursion to avoid quadratic behavior.
> +          This is expected to catch almost all occurrences in practice.
> +          If this code misses important cases that unbounded recursion
> +          would not, passes that need this information could be revised
> +          to provide it through dataflow propagation.  */
> +       return (!name_registered_for_update_p (t)
> +               && depth < param_max_ssa_name_query_depth
> +               && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> +                                                   strict_overflow_p, depth));
> +      }
>
>      default:
>        return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE (t));
> diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c b/gcc/testsuite/gcc.dg/pr80776-1.c
> index b9bce62d982..f3d47aeda36 100644
> --- a/gcc/testsuite/gcc.dg/pr80776-1.c
> +++ b/gcc/testsuite/gcc.dg/pr80776-1.c
> @@ -18,14 +18,14 @@ Foo (void)
>    if (! (0 <= i && i <= 999999))
>      __builtin_unreachable ();
>
> -  /* Legacy evrp sets the range of i to [0, MAX] *before* the first conditional,
> +  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
>       and to [0,999999] *before* the second conditional.  This is because both
> -     evrp and VRP use trickery to set global ranges when this particular use of
> +     vrp use trickery to set global ranges when this particular use of
>       a __builtin_unreachable is in play (see uses of
>       assert_unreachable_fallthru_edge_p).
>
> -     Setting these ranges at the definition site, causes VRP to remove the
> +     Setting these ranges at the definition site, causes other passes to remove the
>       unreachable code altogether, leaving the following sprintf unguarded.  This
>       causes the bogus warning below.  */
> -  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
> +  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } */
>  }
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> new file mode 100644
> index 00000000000..4e5421ed4d4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* PR tree-optimization/111959 */
> +
> +int divbypow2(int a, int b)
> +{
> +  if (a & ~0xff) __builtin_unreachable();
> +  return a / (1<<b);
> +}
> +
> +/* divbypow2 should be able to optimize to just a/b as a is known to be always positive. */
> +/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
> +/* { dg-final { scan-tree-dump-not " << " "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " >> " 1 "optimized" } } */
> +
> --
> 2.39.3
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]
  2023-10-26  9:29 ` Richard Biener
@ 2023-10-26 16:41   ` Mikael Morin
  2023-10-26 18:30   ` Andrew Pinski
  1 sibling, 0 replies; 5+ messages in thread
From: Mikael Morin @ 2023-10-26 16:41 UTC (permalink / raw)
  To: Richard Biener, Andrew Pinski; +Cc: gcc-patches

Le 26/10/2023 à 11:29, Richard Biener a écrit :
> On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski <pinskia@gmail.com> wrote:
>> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
>> index 40767736389..2a2a90230f5 100644
>> --- a/gcc/fold-const.cc
>> +++ b/gcc/fold-const.cc
>> @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool *strict_overflow_p, int depth)
>>         return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
>>
>>       case SSA_NAME:
>> -      /* Limit the depth of recursion to avoid quadratic behavior.
>> -        This is expected to catch almost all occurrences in practice.
>> -        If this code misses important cases that unbounded recursion
>> -        would not, passes that need this information could be revised
>> -        to provide it through dataflow propagation.  */
>> -      return (!name_registered_for_update_p (t)
>> -             && depth < param_max_ssa_name_query_depth
>> -             && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
>> -                                                 strict_overflow_p, depth));
>> +      {
>> +       /* For integral types, querry the global range if possible. */
> 
> query
> 
>> +       if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
>> +         {
>> +           value_range vr;
>> +           if (get_global_range_query ()->range_of_expr (vr, t)
>> +               && !vr.varying_p () && !vr.undefined_p ())
>> +             {
>> +               /* If the range is nonnegative, return true. */
>> +               if (vr.nonnegative_p ())
>> +                 return true;
>> +
>> +               /* If the range is non-positive, then return false. */
>> +               if (vr.nonpositive_p ())
>> +                 return false;
> 
> That's testing for <= 0, nonnegative for >= 0.  This means when
> vr.nonpositive_p () the value could still be zero (and nonnegative),
> possibly be figured out by the recursion below.
> 
> Since we don't have negative_p () do we want to test
> nonpositive_p () && nonzero_p () instead?
> 

Maybe !contains_zero_p () instead of nonzero_p () ?

nonzero_p seems to check that the range is exactly the "all but zero" 
range as visible in the implementation:

   inline bool
   irange::nonzero_p () const
   {
     if (undefined_p ())
       return false;

     wide_int zero = wi::zero (TYPE_PRECISION (type ()));
     return *this == int_range<2> (type (), zero, zero, VR_ANTI_RANGE);
   }


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]
  2023-10-26  9:29 ` Richard Biener
  2023-10-26 16:41   ` Mikael Morin
@ 2023-10-26 18:30   ` Andrew Pinski
  2023-10-27 13:15     ` Richard Biener
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Pinski @ 2023-10-26 18:30 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Thu, Oct 26, 2023 at 2:29 AM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski <pinskia@gmail.com> wrote:
> >
> > I noticed we were missing optimizing `a / (1 << b)` when
> > we know that a is nonnegative but only due to ranger information.
> > This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> > for SSA_NAME.
> > I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating
> > point nor to use the local ranger since I am not 100% sure it is safe where
> > all of the uses tree_expr_nonnegative_p would be safe.
> >
> > Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> > global ranges from __builtin_unreachable. It just happened to be optimized
> > before due to global ranges not being used as much.
> >
> > Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> >         PR tree-optimization/111959
> >
> > gcc/ChangeLog:
> >
> >         * fold-const.cc (tree_single_nonnegative_warnv_p): Use
> >         the global range to see if the SSA_NAME was nonnegative.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/forwprop-42.c: New test.
> >         * gcc.dg/pr80776-1.c: xfail and update comment.
> > ---
> >  gcc/fold-const.cc                           | 36 +++++++++++++++------
> >  gcc/testsuite/gcc.dg/pr80776-1.c            |  8 ++---
> >  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +++++++++
> >  3 files changed, 46 insertions(+), 13 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> >
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index 40767736389..2a2a90230f5 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool *strict_overflow_p, int depth)
> >        return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
> >
> >      case SSA_NAME:
> > -      /* Limit the depth of recursion to avoid quadratic behavior.
> > -        This is expected to catch almost all occurrences in practice.
> > -        If this code misses important cases that unbounded recursion
> > -        would not, passes that need this information could be revised
> > -        to provide it through dataflow propagation.  */
> > -      return (!name_registered_for_update_p (t)
> > -             && depth < param_max_ssa_name_query_depth
> > -             && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > -                                                 strict_overflow_p, depth));
> > +      {
> > +       /* For integral types, querry the global range if possible. */
>
> query
>
> > +       if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> > +         {
> > +           value_range vr;
> > +           if (get_global_range_query ()->range_of_expr (vr, t)
> > +               && !vr.varying_p () && !vr.undefined_p ())
> > +             {
> > +               /* If the range is nonnegative, return true. */
> > +               if (vr.nonnegative_p ())
> > +                 return true;
> > +
> > +               /* If the range is non-positive, then return false. */
> > +               if (vr.nonpositive_p ())
> > +                 return false;
>
> That's testing for <= 0, nonnegative for >= 0.  This means when
> vr.nonpositive_p () the value could still be zero (and nonnegative),
> possibly be figured out by the recursion below.
>
> Since we don't have negative_p () do we want to test
> nonpositive_p () && nonzero_p () instead?

I was thinking about that when I was writing the patch.
If the ranger figured out the value was zero, nonnegative_p would have
returned true.
So while yes nonpositive_p() would return true but we already checked
nonnegative_p beforehand and the nonzero_p would not matter.
Now the question is if after nonnegative_p we check if the range could
contain 0 still is that worth the recursion. The whole idea of
returning false was to remove the need from recursion as much.

Thanks,
Andrew


>
> OK with that change.
>
> Richard.
>
> > +             }
> > +         }
> > +       /* Limit the depth of recursion to avoid quadratic behavior.
> > +          This is expected to catch almost all occurrences in practice.
> > +          If this code misses important cases that unbounded recursion
> > +          would not, passes that need this information could be revised
> > +          to provide it through dataflow propagation.  */
> > +       return (!name_registered_for_update_p (t)
> > +               && depth < param_max_ssa_name_query_depth
> > +               && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > +                                                   strict_overflow_p, depth));
> > +      }
> >
> >      default:
> >        return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE (t));
> > diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c b/gcc/testsuite/gcc.dg/pr80776-1.c
> > index b9bce62d982..f3d47aeda36 100644
> > --- a/gcc/testsuite/gcc.dg/pr80776-1.c
> > +++ b/gcc/testsuite/gcc.dg/pr80776-1.c
> > @@ -18,14 +18,14 @@ Foo (void)
> >    if (! (0 <= i && i <= 999999))
> >      __builtin_unreachable ();
> >
> > -  /* Legacy evrp sets the range of i to [0, MAX] *before* the first conditional,
> > +  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
> >       and to [0,999999] *before* the second conditional.  This is because both
> > -     evrp and VRP use trickery to set global ranges when this particular use of
> > +     vrp use trickery to set global ranges when this particular use of
> >       a __builtin_unreachable is in play (see uses of
> >       assert_unreachable_fallthru_edge_p).
> >
> > -     Setting these ranges at the definition site, causes VRP to remove the
> > +     Setting these ranges at the definition site, causes other passes to remove the
> >       unreachable code altogether, leaving the following sprintf unguarded.  This
> >       causes the bogus warning below.  */
> > -  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
> > +  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } */
> >  }
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > new file mode 100644
> > index 00000000000..4e5421ed4d4
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > @@ -0,0 +1,15 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > +/* PR tree-optimization/111959 */
> > +
> > +int divbypow2(int a, int b)
> > +{
> > +  if (a & ~0xff) __builtin_unreachable();
> > +  return a / (1<<b);
> > +}
> > +
> > +/* divbypow2 should be able to optimize to just a/b as a is known to be always positive. */
> > +/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
> > +/* { dg-final { scan-tree-dump-not " << " "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times " >> " 1 "optimized" } } */
> > +
> > --
> > 2.39.3
> >

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959]
  2023-10-26 18:30   ` Andrew Pinski
@ 2023-10-27 13:15     ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2023-10-27 13:15 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: gcc-patches

On Thu, Oct 26, 2023 at 8:30 PM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Thu, Oct 26, 2023 at 2:29 AM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Wed, Oct 25, 2023 at 5:51 AM Andrew Pinski <pinskia@gmail.com> wrote:
> > >
> > > I noticed we were missing optimizing `a / (1 << b)` when
> > > we know that a is nonnegative but only due to ranger information.
> > > This adds the use of the global ranger to tree_single_nonnegative_warnv_p
> > > for SSA_NAME.
> > > I didn't extend tree_single_nonnegative_warnv_p to use the ranger for floating
> > > point nor to use the local ranger since I am not 100% sure it is safe where
> > > all of the uses tree_expr_nonnegative_p would be safe.
> > >
> > > Note pr80776-1.c testcase fails again due to vrp's bad handling of setting
> > > global ranges from __builtin_unreachable. It just happened to be optimized
> > > before due to global ranges not being used as much.
> > >
> > > Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > >
> > >         PR tree-optimization/111959
> > >
> > > gcc/ChangeLog:
> > >
> > >         * fold-const.cc (tree_single_nonnegative_warnv_p): Use
> > >         the global range to see if the SSA_NAME was nonnegative.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.dg/tree-ssa/forwprop-42.c: New test.
> > >         * gcc.dg/pr80776-1.c: xfail and update comment.
> > > ---
> > >  gcc/fold-const.cc                           | 36 +++++++++++++++------
> > >  gcc/testsuite/gcc.dg/pr80776-1.c            |  8 ++---
> > >  gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c | 15 +++++++++
> > >  3 files changed, 46 insertions(+), 13 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > >
> > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > > index 40767736389..2a2a90230f5 100644
> > > --- a/gcc/fold-const.cc
> > > +++ b/gcc/fold-const.cc
> > > @@ -15047,15 +15047,33 @@ tree_single_nonnegative_warnv_p (tree t, bool *strict_overflow_p, int depth)
> > >        return RECURSE (TREE_OPERAND (t, 1)) && RECURSE (TREE_OPERAND (t, 2));
> > >
> > >      case SSA_NAME:
> > > -      /* Limit the depth of recursion to avoid quadratic behavior.
> > > -        This is expected to catch almost all occurrences in practice.
> > > -        If this code misses important cases that unbounded recursion
> > > -        would not, passes that need this information could be revised
> > > -        to provide it through dataflow propagation.  */
> > > -      return (!name_registered_for_update_p (t)
> > > -             && depth < param_max_ssa_name_query_depth
> > > -             && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > > -                                                 strict_overflow_p, depth));
> > > +      {
> > > +       /* For integral types, querry the global range if possible. */
> >
> > query
> >
> > > +       if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
> > > +         {
> > > +           value_range vr;
> > > +           if (get_global_range_query ()->range_of_expr (vr, t)
> > > +               && !vr.varying_p () && !vr.undefined_p ())
> > > +             {
> > > +               /* If the range is nonnegative, return true. */
> > > +               if (vr.nonnegative_p ())
> > > +                 return true;
> > > +
> > > +               /* If the range is non-positive, then return false. */
> > > +               if (vr.nonpositive_p ())
> > > +                 return false;
> >
> > That's testing for <= 0, nonnegative for >= 0.  This means when
> > vr.nonpositive_p () the value could still be zero (and nonnegative),
> > possibly be figured out by the recursion below.
> >
> > Since we don't have negative_p () do we want to test
> > nonpositive_p () && nonzero_p () instead?
>
> I was thinking about that when I was writing the patch.
> If the ranger figured out the value was zero, nonnegative_p would have
> returned true.
> So while yes nonpositive_p() would return true but we already checked
> nonnegative_p beforehand and the nonzero_p would not matter.
> Now the question is if after nonnegative_p we check if the range could
> contain 0 still is that worth the recursion.

Yes, that was the point.

> The whole idea of
> returning false was to remove the need from recursion as much.

Well, specifically the exact zero case is something the code is reasonably
good at.  If we want to remove recursion as much as possible we'd
remove it completely.

Maybe you can do some statistics how many hits the recursion yields
after the vr.nonnegative_p () out (but without the nonpositive_p one)?

Richard.

> Thanks,
> Andrew
>
>
> >
> > OK with that change.
> >
> > Richard.
> >
> > > +             }
> > > +         }
> > > +       /* Limit the depth of recursion to avoid quadratic behavior.
> > > +          This is expected to catch almost all occurrences in practice.
> > > +          If this code misses important cases that unbounded recursion
> > > +          would not, passes that need this information could be revised
> > > +          to provide it through dataflow propagation.  */
> > > +       return (!name_registered_for_update_p (t)
> > > +               && depth < param_max_ssa_name_query_depth
> > > +               && gimple_stmt_nonnegative_warnv_p (SSA_NAME_DEF_STMT (t),
> > > +                                                   strict_overflow_p, depth));
> > > +      }
> > >
> > >      default:
> > >        return tree_simple_nonnegative_warnv_p (TREE_CODE (t), TREE_TYPE (t));
> > > diff --git a/gcc/testsuite/gcc.dg/pr80776-1.c b/gcc/testsuite/gcc.dg/pr80776-1.c
> > > index b9bce62d982..f3d47aeda36 100644
> > > --- a/gcc/testsuite/gcc.dg/pr80776-1.c
> > > +++ b/gcc/testsuite/gcc.dg/pr80776-1.c
> > > @@ -18,14 +18,14 @@ Foo (void)
> > >    if (! (0 <= i && i <= 999999))
> > >      __builtin_unreachable ();
> > >
> > > -  /* Legacy evrp sets the range of i to [0, MAX] *before* the first conditional,
> > > +  /* vrp1 sets the range of i to [0, MAX] *before* the first conditional,
> > >       and to [0,999999] *before* the second conditional.  This is because both
> > > -     evrp and VRP use trickery to set global ranges when this particular use of
> > > +     vrp use trickery to set global ranges when this particular use of
> > >       a __builtin_unreachable is in play (see uses of
> > >       assert_unreachable_fallthru_edge_p).
> > >
> > > -     Setting these ranges at the definition site, causes VRP to remove the
> > > +     Setting these ranges at the definition site, causes other passes to remove the
> > >       unreachable code altogether, leaving the following sprintf unguarded.  This
> > >       causes the bogus warning below.  */
> > > -  sprintf (number, "%d", i); /* { dg-bogus "writing" "" } */
> > > +  sprintf (number, "%d", i); /* { dg-bogus "writing" "" { xfail *-*-* } } */
> > >  }
> > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > > new file mode 100644
> > > index 00000000000..4e5421ed4d4
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-42.c
> > > @@ -0,0 +1,15 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O2 -fdump-tree-optimized" } */
> > > +/* PR tree-optimization/111959 */
> > > +
> > > +int divbypow2(int a, int b)
> > > +{
> > > +  if (a & ~0xff) __builtin_unreachable();
> > > +  return a / (1<<b);
> > > +}
> > > +
> > > +/* divbypow2 should be able to optimize to just a/b as a is known to be always positive. */
> > > +/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-not " << " "optimized" } } */
> > > +/* { dg-final { scan-tree-dump-times " >> " 1 "optimized" } } */
> > > +
> > > --
> > > 2.39.3
> > >

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-10-27 13:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-25  3:50 [PATCH] Improve tree_expr_nonnegative_p by using the ranger [PR111959] Andrew Pinski
2023-10-26  9:29 ` Richard Biener
2023-10-26 16:41   ` Mikael Morin
2023-10-26 18:30   ` Andrew Pinski
2023-10-27 13:15     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).