public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] PR tree-optimization/108359 - Utilize op1 == op2 when invoking range-ops folding.
@ 2023-01-13 21:23 Andrew MacLeod
  2023-01-13 21:54 ` Jakub Jelinek
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew MacLeod @ 2023-01-13 21:23 UTC (permalink / raw)
  To: gcc-patches; +Cc: hernandez, aldy

[-- Attachment #1: Type: text/plain, Size: 1231 bytes --]

fold_range() already invokes wi_fold_in_parts to try to get more refined 
information. If the subranges are quite small, it will do each 
individual calculation and combine the results.

x * y with x = [1,3] and y = [1,3]  is broken down and we calculate each 
possibility and we end up with [1,4][6,6][9,9] instead of [1,9]

We limit this as the time is between quadratic to exponential depending 
on the number of elements in x and y.

If we also check the relation and determine that x == y, we don't need 
to worry about that growth as this process is linear.  The above case 
will be broken down to just  1*1, 2*2 and 3*3, resulting in a range of 
[1, 1][4,4][9,9].

  In the testcase, it happens to be the right_shift operation, but this 
solution is generic and applies to all range-op operations. I added a 
testcase which checks >>, *, + and %.

I also arbitrarily chose 8 elements as the limit for breaking down 
individual operations.  The overall compile time change for this is 
negligible.

Although this is a regression fix, it will affect all operations where x 
== y, which is where my initial hesitancy arose.

Regardless, bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK 
for trunk?

Andrew




[-- Attachment #2: 0002-Utilize-op1-op2-when-invoking-range-ops-folding.patch --]
[-- Type: text/x-patch, Size: 5107 bytes --]

From fd50dabc626cea1886ebb517ca24c8b8f199c3aa Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacleod@redhat.com>
Date: Wed, 11 Jan 2023 18:12:51 -0500
Subject: [PATCH 2/3] Utilize op1 == op2 when invoking range-ops folding.

If there exists an equivalence relationship between op1 and op2,
any binary operation can be broken into individual operations and
unioned if there are sufficently few elements in the set.

	PR tree-optimization/108359
	gcc/
	* range-op.cc (range_operator::wi_fold_in_parts_equiv): New.
	(range_operator::fold_range): If op1 is equivalent to op2 then
	invoke new fold_in_parts_equiv to operate on sub-components.
	* range-op.h (wi_fold_in_parts_equiv): New prototype.

	gcc/testsuite/
	* gcc.dg/pr108359.c: New.
---
 gcc/range-op.cc                 | 50 +++++++++++++++++++++++++++++++++
 gcc/range-op.h                  |  5 ++++
 gcc/testsuite/gcc.dg/pr108359.c | 50 +++++++++++++++++++++++++++++++++
 3 files changed, 105 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr108359.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index ec75e07bc8a..2cb2c1344f1 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -160,6 +160,36 @@ range_operator::wi_fold (irange &r, tree type,
   r.set_varying (type);
 }
 
+// Call wi_fold when both op1 and op2 are equivalent. Further split small
+// subranges into constants.  This can provide better precision.
+// For x + y,  when x == y with a range of [0,4] instead of [0, 8] produce
+// [0,0][2, 2][4,4][6, 6][8, 8]
+
+void
+range_operator::wi_fold_in_parts_equiv (irange &r, tree type,
+					const wide_int &lh_lb,
+					const wide_int &lh_ub) const
+{
+  int_range_max tmp;
+  widest_int lh_range = wi::sub (widest_int::from (lh_ub, TYPE_SIGN (type)),
+				 widest_int::from (lh_lb, TYPE_SIGN (type)));
+  // if there are 1 to 8 values in the LH range, split them up.
+  r.set_undefined ();
+  if (lh_range >= 0 && lh_range <= 7)
+    {
+      unsigned x;
+      for (x = 0; x <= lh_range; x++)
+	{
+	  wide_int val = lh_lb + x;
+	  wi_fold (tmp, type, val, val, val, val);
+	  r.union_ (tmp);
+	}
+    }
+  // Otherwise just call wi_fold.
+  else
+    wi_fold (r, type, lh_lb, lh_ub, lh_lb, lh_ub);
+}
+
 // Call wi_fold, except further split small subranges into constants.
 // This can provide better precision. For something   8 >> [0,1]
 // Instead of [8, 16], we will produce [8,8][16,16]
@@ -234,6 +264,26 @@ range_operator::fold_range (irange &r, tree type,
   unsigned num_lh = lh.num_pairs ();
   unsigned num_rh = rh.num_pairs ();
 
+  // If op1 and op2 are equivalences, then we don't need a complete cross
+  // product, just pairs of matching elements.
+  if (relation_equiv_p (rel) && (lh == rh))
+    {
+      int_range_max tmp;
+      r.set_undefined ();
+      for (unsigned x = 0; x < num_lh; ++x)
+	{
+	  wide_int lh_lb = lh.lower_bound (x);
+	  wide_int lh_ub = lh.upper_bound (x);
+	  wi_fold_in_parts_equiv (tmp, type, lh_lb, lh_ub);
+	  r.union_ (tmp);
+	  if (r.varying_p ())
+	    break;
+	}
+      op1_op2_relation_effect (r, type, lh, rh, rel);
+      update_known_bitmask (r, m_code, lh, rh);
+      return true;
+    }
+
   // If both ranges are single pairs, fold directly into the result range.
   // If the number of subranges grows too high, produce a summary result as the
   // loop becomes exponential with little benefit.  See PR 103821.
diff --git a/gcc/range-op.h b/gcc/range-op.h
index b7b8a3b9473..998aeedb0d9 100644
--- a/gcc/range-op.h
+++ b/gcc/range-op.h
@@ -109,6 +109,11 @@ protected:
 			 const wide_int &rh_lb,
 			 const wide_int &rh_ub) const;
 
+  // Called by fold range to split small subranges into parts when op1 == op2
+  void wi_fold_in_parts_equiv (irange &r, tree type,
+			       const wide_int &lb,
+			       const wide_int &ub) const;
+
   // Tree code of the range operator or ERROR_MARK if unknown.
   tree_code m_code;
 };
diff --git a/gcc/testsuite/gcc.dg/pr108359.c b/gcc/testsuite/gcc.dg/pr108359.c
new file mode 100644
index 00000000000..00fd2de6dc7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108359.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* PR test case.  */
+int b = 10;
+int c;
+char e;
+void foo();
+static char(a)(char f, char g) { return f && g == 1 ? 0 : f % g; }
+short(d)(short f, short g) { return f * g; }
+int main() {
+  short h;
+  int i;
+  unsigned j;
+  h = d(b && c, 5);
+  j = h;
+  i = a(h, 237);
+  unsigned k = i;
+  e = i < 0 || k >= 32 ? 0 : i >> k;
+  if (e) {
+    c = 0;
+    foo();
+  }
+}
+
+
+/* Also Check that small ranges are broken down and optimized properly
+   in the egneric case. This function should always return 0.  */
+
+int otherfunc (int x, int z) {
+  if (x < 0 || x > 6 )
+    return 0;
+
+  if (x == z)
+    {
+    if (x >> z > 0)
+      return 1;
+    if (x * z > 26 && x * z < 35)
+      return 1;
+    if (x + z == 5)
+      return 1;
+    if ((x + z) % 2 == 1)
+      return 1;
+    }
+  return 0;
+}
+
+
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "return 1" "optimized" } } */
-- 
2.38.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-01-16  8:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-13 21:23 [PATCH] PR tree-optimization/108359 - Utilize op1 == op2 when invoking range-ops folding Andrew MacLeod
2023-01-13 21:54 ` Jakub Jelinek
2023-01-13 22:07   ` Andrew MacLeod
2023-01-16  7:19     ` Richard Biener
2023-01-16  7:32       ` Aldy Hernandez
2023-01-16  8:32       ` Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).