public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
@ 2021-11-09  4:11 Navid Rahimi
  2021-11-09  4:24 ` Navid Rahimi
  2021-11-09 10:36 ` Richard Biener
  0 siblings, 2 replies; 7+ messages in thread
From: Navid Rahimi @ 2021-11-09  4:11 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 582 bytes --]

Hi GCC community,

This patch will add the missed pattern described in bug 102232 [1] to the match.pd. The testcase will test whether the multiplication and division has been removed from the code or not. The correctness proof for this pattern is here [2] in case anyone is curious.

PR tree-optimization/102232
      * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.
      * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.


1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102232
2) https://alive2.llvm.org/ce/z/2VScjD

Best wishes,
Navid.

[-- Attachment #2: 0001-PR-tree-optimization-102232.patch --]
[-- Type: application/octet-stream, Size: 2454 bytes --]

From 20dcbad29f4d41cce09ac799e9b5093167f3aaa0 Mon Sep 17 00:00:00 2001
From: Navid Rahimi <navidrahimi@microsoft.com>
Date: Mon, 8 Nov 2021 13:57:19 -0800
Subject: [PATCH] PR tree-optimization/102232

	* match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.
	* gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.
---
 gcc/match.pd                             |  7 ++++
 gcc/testsuite/gcc.dg/tree-ssa/pr102232.c | 52 ++++++++++++++++++++++++
 2 files changed, 59 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102232.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 71cf6f9df0a..37c01e79d97 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1295,6 +1295,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (bit_xor (bit_ior:c (bit_not @0) @1) (bit_ior:c @0 (bit_not @1)))
  (bit_xor @0 @1))
 
+/* x * (1 + y / x) - y -> x - y % x */
+(simplify
+ (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1)
+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+      && types_match (@0, @1))
+  (minus @0 (trunc_mod @1 @0))))
+
 /* ((x & y) - (x | y)) - 1 -> ~(x ^ y) */
 (simplify
  (plus (nop_convert1? (minus@2 (nop_convert2? (bit_and:c @0 @1))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c
new file mode 100644
index 00000000000..63fe37c0936
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c
@@ -0,0 +1,52 @@
+/* PR tree-optimization/102232 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+int __attribute__ ((noipa)) foo (int a, int b)
+{
+  return b * (1 + a / b) - a;
+}
+
+int
+main (void)
+{
+  // few randomly generated test cases
+  if (foo (71856034, 238) != 212)
+    {
+      return 1;
+    }
+  if (foo (71856034, 10909) != 1549)
+    {
+      return 1;
+    }
+  if (foo (20350, 1744) != 578)
+    {
+      return 1;
+    }
+  if (foo (444813, 88563) != 86565)
+    {
+      return 1;
+    }
+  if (foo (112237, 63004) != 13771)
+    {
+      return 1;
+    }
+  if (foo (68268386, 787116) != 210706)
+    {
+      return 1;
+    }
+  if (foo (-444813, 88563) != 90561)
+    {
+      return 1;
+    }
+  if (foo (-68268386, 787116) != 1363526)
+    {
+      return 1;
+    }
+
+  return 0;
+}
+
+/* Verify that multiplication and division has been removed.  */
+/* { dg-final { scan-tree-dump-not " \\* " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
\ No newline at end of file
-- 
2.25.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
  2021-11-09  4:11 [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd Navid Rahimi
@ 2021-11-09  4:24 ` Navid Rahimi
  2021-11-09 10:36 ` Richard Biener
  1 sibling, 0 replies; 7+ messages in thread
From: Navid Rahimi @ 2021-11-09  4:24 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1941 bytes --]

Sorry, a last minute update via my editor changed the MIME type to application/octet-stream. You can find the patch with correct MIME (text/x-diff) attached to this email.

(Same file and content, fixed MIME).

Best wishes,
Navid.

________________________________________
From: Gcc-patches <gcc-patches-bounces+navidrahimi=microsoft.com@gcc.gnu.org> on behalf of Navid Rahimi via Gcc-patches <gcc-patches@gcc.gnu.org>
Sent: Monday, November 8, 2021 20:11
To: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd

Hi GCC community,

This patch will add the missed pattern described in bug 102232 [1] to the match.pd. The testcase will test whether the multiplication and division has been removed from the code or not. The correctness proof for this pattern is here [2] in case anyone is curious.

PR tree-optimization/102232
      * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.
      * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.


1) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D102232&amp;data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cded0a3258b594204129608d9a3371763%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720279311390932%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=AaXdpyn%2BNQ2BizC6FILJfL3B8WV84v5FuLuU1d8MJxU%3D&amp;reserved=0
2) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Falive2.llvm.org%2Fce%2Fz%2F2VScjD&amp;data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cded0a3258b594204129608d9a3371763%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720279311400927%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=3zM4awjl2LxhsfMd1VAOjH4JQ9621Q7Mg61gEB2Su%2Bk%3D&amp;reserved=0

Best wishes,
Navid.

[-- Attachment #2: 0001-PR-tree-optimization-102232.patch --]
[-- Type: application/octet-stream, Size: 2454 bytes --]

From 7c2abb0eab05766ab879066b000c13de827e3b3d Mon Sep 17 00:00:00 2001
From: Navid Rahimi <navidrahimi@microsoft.com>
Date: Mon, 8 Nov 2021 13:57:19 -0800
Subject: [PATCH] PR tree-optimization/102232

	* match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.
	* gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.
---
 gcc/match.pd                             |  7 ++++
 gcc/testsuite/gcc.dg/tree-ssa/pr102232.c | 52 ++++++++++++++++++++++++
 2 files changed, 59 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102232.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 71cf6f9df0a..37c01e79d97 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1295,6 +1295,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (bit_xor (bit_ior:c (bit_not @0) @1) (bit_ior:c @0 (bit_not @1)))
  (bit_xor @0 @1))
 
+/* x * (1 + y / x) - y -> x - y % x */
+(simplify
+ (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1)
+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+      && types_match (@0, @1))
+  (minus @0 (trunc_mod @1 @0))))
+
 /* ((x & y) - (x | y)) - 1 -> ~(x ^ y) */
 (simplify
  (plus (nop_convert1? (minus@2 (nop_convert2? (bit_and:c @0 @1))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c
new file mode 100644
index 00000000000..e7485cf24e9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c
@@ -0,0 +1,52 @@
+/* PR tree-optimization/102232 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+int __attribute__ ((noipa)) foo (int a, int b)
+{
+  return b * (1 + a / b) - a;
+}
+
+int
+main (void)
+{
+  // few randomly generated test cases
+  if (foo (71856034, 238) != 212)
+    {
+      return 1;
+    }
+  if (foo (71856034, 10909) != 1549)
+    {
+      return 1;
+    }
+  if (foo (20350, 1744) != 578)
+    {
+      return 1;
+    }
+  if (foo (444813, 88563) != 86565)
+    {
+      return 1;
+    }
+  if (foo (112237, 63004) != 13771)
+    {
+      return 1;
+    }
+  if (foo (68268386, 787116) != 210706)
+    {
+      return 1;
+    }
+  if (foo (-444813, 88563) != 90561)
+    {
+      return 1;
+    }
+  if (foo (-68268386, 787116) != 1363526)
+    {
+      return 1;
+    }
+
+  return 0;
+}
+
+/* Verify that multiplication and division has been removed.  */
+/* { dg-final { scan-tree-dump-not " \\* " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
\ No newline at end of file
-- 
2.25.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
  2021-11-09  4:11 [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd Navid Rahimi
  2021-11-09  4:24 ` Navid Rahimi
@ 2021-11-09 10:36 ` Richard Biener
  2021-11-09 16:25   ` [EXTERNAL] " Navid Rahimi
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Biener @ 2021-11-09 10:36 UTC (permalink / raw)
  To: Navid Rahimi; +Cc: gcc-patches

On Tue, Nov 9, 2021 at 5:12 AM Navid Rahimi via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi GCC community,
>
> This patch will add the missed pattern described in bug 102232 [1] to the match.pd. The testcase will test whether the multiplication and division has been removed from the code or not. The correctness proof for this pattern is here [2] in case anyone is curious.
>
> PR tree-optimization/102232
>       * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.

+/* x * (1 + y / x) - y -> x - y % x */
+(simplify
+ (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1)

the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
constants come last - you can then remove the 'c'

+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))

you can use INTEGRAL_TYPE_P (type).

+      && types_match (@0, @1))

this test is not necessary

+  (minus @0 (trunc_mod @1 @0))))

But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
it looks like a conflict with the x * (1 + b) <-> x + x * b transform
(fold_plusminus_mult)?  That said, the special case of one
definitely makes the result cheaper (one less operation).

Please move the pattern next to the most relatest which I think is

/* X - (X / Y) * Y is the same as X % Y.  */
(simplify
 (minus (convert1? @0) (convert2? (mult:c (trunc_div @@0 @@1) @1)))
 (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
  (convert (trunc_mod @0 @1))))

+int
+main (void)
+{
+  // few randomly generated test cases
+  if (foo (71856034, 238) != 212)
+    {
+      return 1;

the return value of 1 is an unreliable way to fail, please instead call
__builtin_abort ();

+/* { dg-options "-O3 -fdump-tree-optimized" } */

do we really need -O3 for this?  I think that -O should be enough here?

Thanks,
Richard.

>       * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.
>
>
> 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102232
> 2) https://alive2.llvm.org/ce/z/2VScjD
>
> Best wishes,
> Navid.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
  2021-11-09 10:36 ` Richard Biener
@ 2021-11-09 16:25   ` Navid Rahimi
  2021-11-10  8:35     ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Navid Rahimi @ 2021-11-09 16:25 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5318 bytes --]

Hi Richard,

Thank you so much for your detailed feedback. I am attaching another version of the patch which does include all the changes you mentioned.

Bellow you can see my response to your feedbacks:

> the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
> constants come last - you can then remove the 'c'
Fixed. I was not aware of the canonical order.

> you can use INTEGRAL_TYPE_P (type).
Fixed. Didn't know about "type" either.

> this test is not necessary
Fixed.

> But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
> it looks like a conflict with the x * (1 + b) <-> x + x * b transform
> (fold_plusminus_mult)?  That said, the special case of one
> definitely makes the result cheaper (one less operation).
For this special case, it does remove operation indeed. But I was not able to reproduce it for any other constant [1]. If it was possible to do it with other constants I would've changed the pattern to have be more general like "x * (C + y/x) - y -> C*x - y % x". Basically anything other than 1 wasn't a win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to me when there is a "- y" at the end "x * (1 + b)", there is opportunity to optimize. Without that "- y" I was not able to make any operation more performant. Either direction, looks like same amount of computation.

1) https://compiler-explorer.com/z/dWsq7Tzf4

> Please move the pattern next to the most relatest which I think is
Fixed.

> the return value of 1 is an unreliable way to fail, please instead call
> __builtin_abort ();
Fixed.

> do we really need -O3 for this?  I think that -O should be enough here?
We don't specifically need that. But I realized that the optimization can happen in two different level at the compiler. It seems if you spread the computation over multiple statement like:
  int c = a/b;
  int y = b * (1 + c);
  return y - a;

instead of :
  return b * (1 + a / b) - a;

Then you have to have at least -O1 to have it optimized. Granted, I am not doing that in the testcase. In the new patch I am changing it to -O. Let me know if you have any suggestions.

Best wishes,
Navid.

________________________________________
From: Richard Biener <richard.guenther@gmail.com>
Sent: Tuesday, November 9, 2021 02:36
To: Navid Rahimi
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd

On Tue, Nov 9, 2021 at 5:12 AM Navid Rahimi via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi GCC community,
>
> This patch will add the missed pattern described in bug 102232 [1] to the match.pd. The testcase will test whether the multiplication and division has been removed from the code or not. The correctness proof for this pattern is here [2] in case anyone is curious.
>
> PR tree-optimization/102232
>       * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.

+/* x * (1 + y / x) - y -> x - y % x */
+(simplify
+ (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1)

the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
constants come last - you can then remove the 'c'

+ (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))

you can use INTEGRAL_TYPE_P (type).

+      && types_match (@0, @1))

this test is not necessary

+  (minus @0 (trunc_mod @1 @0))))

But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
it looks like a conflict with the x * (1 + b) <-> x + x * b transform
(fold_plusminus_mult)?  That said, the special case of one
definitely makes the result cheaper (one less operation).

Please move the pattern next to the most relatest which I think is

/* X - (X / Y) * Y is the same as X % Y.  */
(simplify
 (minus (convert1? @0) (convert2? (mult:c (trunc_div @@0 @@1) @1)))
 (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
  (convert (trunc_mod @0 @1))))

+int
+main (void)
+{
+  // few randomly generated test cases
+  if (foo (71856034, 238) != 212)
+    {
+      return 1;

the return value of 1 is an unreliable way to fail, please instead call
__builtin_abort ();

+/* { dg-options "-O3 -fdump-tree-optimized" } */

do we really need -O3 for this?  I think that -O should be enough here?

Thanks,
Richard.

>       * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.
>
>
> 1) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D102232&amp;data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cbc89643c6a14411d11ac08d9a36ce282%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720510334697749%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=Br7B1ShCT4ynLwypyM29LqaNF9GBJF3%2BIR6sZrDRTKM%3D&amp;reserved=0
> 2) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Falive2.llvm.org%2Fce%2Fz%2F2VScjD&amp;data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cbc89643c6a14411d11ac08d9a36ce282%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720510334707741%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=FH0%2BwEHmaGjyclNFD2HrKMXtXDscPAlc9uUs3p6UMkw%3D&amp;reserved=0
>
> Best wishes,
> Navid.

[-- Attachment #2: 0001-PR-tree-optimization-102232.patch --]
[-- Type: application/octet-stream, Size: 2563 bytes --]

From 3d7355450f018898e50c4a5c9c1951c93b8224ae Mon Sep 17 00:00:00 2001
From: Navid Rahimi <navidrahimi@microsoft.com>
Date: Mon, 8 Nov 2021 13:57:19 -0800
Subject: [PATCH] PR tree-optimization/102232

	* match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.
	* gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.
---
 gcc/match.pd                             |  6 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr102232.c | 52 ++++++++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102232.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 71cf6f9df0a..c91fff46936 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -608,6 +608,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
   (convert (trunc_mod @0 @1))))
 
+/* x * (1 + y / x) - y -> x - y % x */
+(simplify
+ (minus (mult:cs @0 (plus:s (trunc_div:s @1 @0) integer_onep)) @1)
+ (if (INTEGRAL_TYPE_P (type))
+  (minus @0 (trunc_mod @1 @0))))
+
 /* Optimize TRUNC_MOD_EXPR by a power of two into a BIT_AND_EXPR,
    i.e. "X % C" into "X & (C - 1)", if X and C are positive.
    Also optimize A % (C << N)  where C is a power of 2,
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c
new file mode 100644
index 00000000000..62bca6922ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102232.c
@@ -0,0 +1,52 @@
+/* PR tree-optimization/102232 */
+/* { dg-do run } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int __attribute__ ((noipa)) foo (int a, int b)
+{
+  return b * (1 + a / b) - a;
+}
+
+int
+main (void)
+{
+  // few randomly generated test cases
+  if (foo (71856034, 238) != 212)
+    {
+      __builtin_abort ();
+    }
+  if (foo (71856034, 10909) != 1549)
+    {
+      __builtin_abort ();
+    }
+  if (foo (20350, 1744) != 578)
+    {
+      __builtin_abort ();
+    }
+  if (foo (444813, 88563) != 86565)
+    {
+      __builtin_abort ();
+    }
+  if (foo (112237, 63004) != 13771)
+    {
+      __builtin_abort ();
+    }
+  if (foo (68268386, 787116) != 210706)
+    {
+      __builtin_abort ();
+    }
+  if (foo (-444813, 88563) != 90561)
+    {
+      __builtin_abort ();
+    }
+  if (foo (-68268386, 787116) != 1363526)
+    {
+      __builtin_abort ();
+    }
+
+  return 0;
+}
+
+/* Verify that multiplication and division has been removed.  */
+/* { dg-final { scan-tree-dump-not " \\* " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " / " "optimized" } } */
\ No newline at end of file
-- 
2.25.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
  2021-11-09 16:25   ` [EXTERNAL] " Navid Rahimi
@ 2021-11-10  8:35     ` Richard Biener
  2021-11-23  3:09       ` Jeff Law
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2021-11-10  8:35 UTC (permalink / raw)
  To: Navid Rahimi; +Cc: gcc-patches

On Tue, Nov 9, 2021 at 5:25 PM Navid Rahimi <navidrahimi@microsoft.com> wrote:
>
> Hi Richard,
>
> Thank you so much for your detailed feedback. I am attaching another version of the patch which does include all the changes you mentioned.
>
> Bellow you can see my response to your feedbacks:
>
> > the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
> > constants come last - you can then remove the 'c'
> Fixed. I was not aware of the canonical order.
>
> > you can use INTEGRAL_TYPE_P (type).
> Fixed. Didn't know about "type" either.
>
> > this test is not necessary
> Fixed.
>
> > But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
> > it looks like a conflict with the x * (1 + b) <-> x + x * b transform
> > (fold_plusminus_mult)?  That said, the special case of one
> > definitely makes the result cheaper (one less operation).
> For this special case, it does remove operation indeed. But I was not able to reproduce it for any other constant [1]. If it was possible to do it with other constants I would've changed the pattern to have be more general like "x * (C + y/x) - y -> C*x - y % x". Basically anything other than 1 wasn't a win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to me when there is a "- y" at the end "x * (1 + b)", there is opportunity to optimize. Without that "- y" I was not able to make any operation more performant. Either direction, looks like same amount of computation.
>
> 1) https://compiler-explorer.com/z/dWsq7Tzf4
>
> > Please move the pattern next to the most relatest which I think is
> Fixed.
>
> > the return value of 1 is an unreliable way to fail, please instead call
> > __builtin_abort ();
> Fixed.
>
> > do we really need -O3 for this?  I think that -O should be enough here?
> We don't specifically need that. But I realized that the optimization can happen in two different level at the compiler. It seems if you spread the computation over multiple statement like:
>   int c = a/b;
>   int y = b * (1 + c);
>   return y - a;
>
> instead of :
>   return b * (1 + a / b) - a;
>
> Then you have to have at least -O1 to have it optimized. Granted, I am not doing that in the testcase. In the new patch I am changing it to -O. Let me know if you have any suggestions.

-O is fine, generally at -O0 we shouldn't expect such transforms to
happen (but they still do, of course).

The patch looks OK now.

Thanks,
Richard.

>
> Best wishes,
> Navid.
>
> ________________________________________
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Tuesday, November 9, 2021 02:36
> To: Navid Rahimi
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
>
> On Tue, Nov 9, 2021 at 5:12 AM Navid Rahimi via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi GCC community,
> >
> > This patch will add the missed pattern described in bug 102232 [1] to the match.pd. The testcase will test whether the multiplication and division has been removed from the code or not. The correctness proof for this pattern is here [2] in case anyone is curious.
> >
> > PR tree-optimization/102232
> >       * match.pd (x * (1 + y / x) - y) -> (x - y % x): New optimization.
>
> +/* x * (1 + y / x) - y -> x - y % x */
> +(simplify
> + (minus (mult:cs @0 (plus:cs integer_onep (trunc_div:s @1 @0))) @1)
>
> the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
> constants come last - you can then remove the 'c'
>
> + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>
> you can use INTEGRAL_TYPE_P (type).
>
> +      && types_match (@0, @1))
>
> this test is not necessary
>
> +  (minus @0 (trunc_mod @1 @0))))
>
> But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
> it looks like a conflict with the x * (1 + b) <-> x + x * b transform
> (fold_plusminus_mult)?  That said, the special case of one
> definitely makes the result cheaper (one less operation).
>
> Please move the pattern next to the most relatest which I think is
>
> /* X - (X / Y) * Y is the same as X % Y.  */
> (simplify
>  (minus (convert1? @0) (convert2? (mult:c (trunc_div @@0 @@1) @1)))
>  (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>   (convert (trunc_mod @0 @1))))
>
> +int
> +main (void)
> +{
> +  // few randomly generated test cases
> +  if (foo (71856034, 238) != 212)
> +    {
> +      return 1;
>
> the return value of 1 is an unreliable way to fail, please instead call
> __builtin_abort ();
>
> +/* { dg-options "-O3 -fdump-tree-optimized" } */
>
> do we really need -O3 for this?  I think that -O should be enough here?
>
> Thanks,
> Richard.
>
> >       * gcc.dg/tree-ssa/pr102232.c: testcase for this optimization.
> >
> >
> > 1) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D102232&amp;data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cbc89643c6a14411d11ac08d9a36ce282%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720510334697749%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=Br7B1ShCT4ynLwypyM29LqaNF9GBJF3%2BIR6sZrDRTKM%3D&amp;reserved=0
> > 2) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Falive2.llvm.org%2Fce%2Fz%2F2VScjD&amp;data=04%7C01%7Cnavidrahimi%40microsoft.com%7Cbc89643c6a14411d11ac08d9a36ce282%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637720510334707741%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=FH0%2BwEHmaGjyclNFD2HrKMXtXDscPAlc9uUs3p6UMkw%3D&amp;reserved=0
> >
> > Best wishes,
> > Navid.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
  2021-11-10  8:35     ` Richard Biener
@ 2021-11-23  3:09       ` Jeff Law
  2021-11-23  3:10         ` Navid Rahimi
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Law @ 2021-11-23  3:09 UTC (permalink / raw)
  To: Richard Biener, Navid Rahimi; +Cc: gcc-patches



On 11/10/2021 1:35 AM, Richard Biener via Gcc-patches wrote:
> On Tue, Nov 9, 2021 at 5:25 PM Navid Rahimi <navidrahimi@microsoft.com> wrote:
>> Hi Richard,
>>
>> Thank you so much for your detailed feedback. I am attaching another version of the patch which does include all the changes you mentioned.
>>
>> Bellow you can see my response to your feedbacks:
>>
>>> the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
>>> constants come last - you can then remove the 'c'
>> Fixed. I was not aware of the canonical order.
>>
>>> you can use INTEGRAL_TYPE_P (type).
>> Fixed. Didn't know about "type" either.
>>
>>> this test is not necessary
>> Fixed.
>>
>>> But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
>>> it looks like a conflict with the x * (1 + b) <-> x + x * b transform
>>> (fold_plusminus_mult)?  That said, the special case of one
>>> definitely makes the result cheaper (one less operation).
>> For this special case, it does remove operation indeed. But I was not able to reproduce it for any other constant [1]. If it was possible to do it with other constants I would've changed the pattern to have be more general like "x * (C + y/x) - y -> C*x - y % x". Basically anything other than 1 wasn't a win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to me when there is a "- y" at the end "x * (1 + b)", there is opportunity to optimize. Without that "- y" I was not able to make any operation more performant. Either direction, looks like same amount of computation.
>>
>> 1) https://compiler-explorer.com/z/dWsq7Tzf4
>>
>>> Please move the pattern next to the most relatest which I think is
>> Fixed.
>>
>>> the return value of 1 is an unreliable way to fail, please instead call
>>> __builtin_abort ();
>> Fixed.
>>
>>> do we really need -O3 for this?  I think that -O should be enough here?
>> We don't specifically need that. But I realized that the optimization can happen in two different level at the compiler. It seems if you spread the computation over multiple statement like:
>>    int c = a/b;
>>    int y = b * (1 + c);
>>    return y - a;
>>
>> instead of :
>>    return b * (1 + a / b) - a;
>>
>> Then you have to have at least -O1 to have it optimized. Granted, I am not doing that in the testcase. In the new patch I am changing it to -O. Let me know if you have any suggestions.
> -O is fine, generally at -O0 we shouldn't expect such transforms to
> happen (but they still do, of course).
>
> The patch looks OK now.
I've pushed this to the trunk for Navid.

jeff


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd
  2021-11-23  3:09       ` Jeff Law
@ 2021-11-23  3:10         ` Navid Rahimi
  0 siblings, 0 replies; 7+ messages in thread
From: Navid Rahimi @ 2021-11-23  3:10 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: gcc-patches

Thanks Jeff for this too.

Best wishes,
Navid.

________________________________________
From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Monday, November 22, 2021 19:09
To: Richard Biener; Navid Rahimi
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd

[You don't often get email from jeffreyalaw@gmail.com. Learn why this is important at http://aka.ms/LearnAboutSenderIdentification.]

On 11/10/2021 1:35 AM, Richard Biener via Gcc-patches wrote:
> On Tue, Nov 9, 2021 at 5:25 PM Navid Rahimi <navidrahimi@microsoft.com> wrote:
>> Hi Richard,
>>
>> Thank you so much for your detailed feedback. I am attaching another version of the patch which does include all the changes you mentioned.
>>
>> Bellow you can see my response to your feedbacks:
>>
>>> the canonical order of the plus is (plus:s (trunc_div ...) integer_onep) as
>>> constants come last - you can then remove the 'c'
>> Fixed. I was not aware of the canonical order.
>>
>>> you can use INTEGRAL_TYPE_P (type).
>> Fixed. Didn't know about "type" either.
>>
>>> this test is not necessary
>> Fixed.
>>
>>> But should we also optimize x * (2 + y/x) - y -> 2*x - y % x?  So
>>> it looks like a conflict with the x * (1 + b) <-> x + x * b transform
>>> (fold_plusminus_mult)?  That said, the special case of one
>>> definitely makes the result cheaper (one less operation).
>> For this special case, it does remove operation indeed. But I was not able to reproduce it for any other constant [1]. If it was possible to do it with other constants I would've changed the pattern to have be more general like "x * (C + y/x) - y -> C*x - y % x". Basically anything other than 1 wasn't a win. Regarding the "x * (1 + b) <-> x + x * b" transformation, it appears to me when there is a "- y" at the end "x * (1 + b)", there is opportunity to optimize. Without that "- y" I was not able to make any operation more performant. Either direction, looks like same amount of computation.
>>
>> 1) https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcompiler-explorer.com%2Fz%2FdWsq7Tzf4&amp;data=04%7C01%7Cnavidrahimi%40microsoft.com%7C1ddd355d86ee4a5d645808d9ae2e9e73%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637732337521412439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=K4ecZcigYkh8%2BSiqDegRxm72gkL%2FnsbuDDp5nsM3ZqA%3D&amp;reserved=0
>>
>>> Please move the pattern next to the most relatest which I think is
>> Fixed.
>>
>>> the return value of 1 is an unreliable way to fail, please instead call
>>> __builtin_abort ();
>> Fixed.
>>
>>> do we really need -O3 for this?  I think that -O should be enough here?
>> We don't specifically need that. But I realized that the optimization can happen in two different level at the compiler. It seems if you spread the computation over multiple statement like:
>>    int c = a/b;
>>    int y = b * (1 + c);
>>    return y - a;
>>
>> instead of :
>>    return b * (1 + a / b) - a;
>>
>> Then you have to have at least -O1 to have it optimized. Granted, I am not doing that in the testcase. In the new patch I am changing it to -O. Let me know if you have any suggestions.
> -O is fine, generally at -O0 we shouldn't expect such transforms to
> happen (but they still do, of course).
>
> The patch looks OK now.
I've pushed this to the trunk for Navid.

jeff


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-11-23  3:10 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-09  4:11 [PATCH] PR tree-optimization/102232 Adding a missing pattern to match.pd Navid Rahimi
2021-11-09  4:24 ` Navid Rahimi
2021-11-09 10:36 ` Richard Biener
2021-11-09 16:25   ` [EXTERNAL] " Navid Rahimi
2021-11-10  8:35     ` Richard Biener
2021-11-23  3:09       ` Jeff Law
2021-11-23  3:10         ` Navid Rahimi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).