[PATCH v3] match.pd: Simplify 1 / X for integer X [PR95424]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH v3] match.pd: Simplify 1 / X for integer X [PR95424]
@ 2022-01-19 18:42 Zhao Wei Liew
  2022-01-28 18:38 ` Jeff Law
  0 siblings, 1 reply; 27+ messages in thread
From: Zhao Wei Liew @ 2022-01-19 18:42 UTC (permalink / raw)
  To: GCC Patches; +Cc: Jakub Jelinek, Richard Biener

This patch implements an optimization for the following C++ code:

int f(int x) {
    return 1 / x;
}

int f(unsigned int x) {
    return 1 / x;
}

Before this patch, x86-64 gcc -std=c++20 -O3 produces the following assembly:

f(int):
    xor edx, edx
    mov eax, 1
    idiv edi
    ret
f(unsigned int):
    xor edx, edx
    mov eax, 1
    div edi
    ret

In comparison, clang++ -std=c++20 -O3 produces the following assembly:

f(int):
    lea ecx, [rdi + 1]
    xor eax, eax
    cmp ecx, 3
    cmovb eax, edi
    ret
f(unsigned int):
    xor eax, eax
    cmp edi, 1
    sete al
    ret

Clang's output is more efficient as it avoids expensive div operations.

With this patch, GCC now produces the following assembly:

f(int):
    lea eax, [rdi + 1]
    cmp eax, 2
    mov eax, 0
    cmovbe eax, edi
    ret
f(unsigned int):
    xor eax, eax
    cmp edi, 1
    sete al
    ret

which is virtually identical to Clang's assembly output. Any slight differences
in the output for f(int) is possibly related to a different missed optimization.

v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587751.html
Changes from v2:
1. Refactor from using a switch statement to using the built-in
if-else statement.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587634.html
Changes from v1:
1. Refactor common if conditions.
2. Use build_[minus_]one_cst (type) to get -1/1 of the correct type.
3. Match only for TRUNC_DIV_EXPR and TYPE_PRECISION (type) > 1.

gcc/ChangeLog:

	* match.pd: Simplify 1 / X where X is an integer.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/divide-6.c: New test.
	* gcc.dg/tree-ssa/divide-7.c: New test.
---
 gcc/match.pd                             | 13 +++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/divide-6.c |  9 +++++++++
 gcc/testsuite/gcc.dg/tree-ssa/divide-7.c |  9 +++++++++
 3 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-6.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-7.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 84c9b918041ee..4cd692c863d0c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -432,6 +432,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && TYPE_UNSIGNED (type))
   (trunc_div @0 @1)))

+ /* 1 / X -> X == 1 for unsigned integer X.
+    1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
+    But not for 1 / 0 so that we can get proper warnings and errors,
+    and not for 1-bit integers as they are edge cases better handled
elsewhere. */
+(simplify
+  (trunc_div integer_onep@0 @1)
+  (if (INTEGRAL_TYPE_P (type) && !integer_zerop (@1) &&
TYPE_PRECISION (type) > 1)
+    (if (TYPE_UNSIGNED (type))
+      (eq @1 { build_one_cst (type); })
+      (with { tree utype = unsigned_type_for (type); }
+        (cond (le (plus (convert:utype @1) { build_one_cst (utype);
}) { build_int_cst (utype, 2); })
+          @1 { build_zero_cst (type); })))))
+
 /* Combine two successive divisions.  Note that combining ceil_div
    and floor_div is trickier and combining round_div even more so.  */
 (for div (trunc_div exact_div)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-6.c
b/gcc/testsuite/gcc.dg/tree-ssa/divide-6.c
new file mode 100644
index 0000000000000..a9fc4c04058c6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-6.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+unsigned int f(unsigned int x) {
+  return 1 / x;
+}
+
+/* { dg-final { scan-tree-dump-not "1 / x_..D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump "x_..D. == 1;" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-7.c
b/gcc/testsuite/gcc.dg/tree-ssa/divide-7.c
new file mode 100644
index 0000000000000..285279af7c210
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-7.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(int x) {
+  return 1 / x;
+}
+
+/* { dg-final { scan-tree-dump-not "1 / x_..D.;" "optimized" } } */
+/* { dg-final { scan-tree-dump ".. <= 2 ? x_..D. : 0;" "optimized" } } */

-- 
2.17.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3] match.pd: Simplify 1 / X for integer X [PR95424]
  2022-01-19 18:42 [PATCH v3] match.pd: Simplify 1 / X for integer X [PR95424] Zhao Wei Liew
@ 2022-01-28 18:38 ` Jeff Law
  2022-01-29 16:23   ` [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280] Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Jeff Law @ 2022-01-28 18:38 UTC (permalink / raw)
  To: Zhao Wei Liew, GCC Patches; +Cc: Jakub Jelinek



On 1/19/2022 11:42 AM, Zhao Wei Liew via Gcc-patches wrote:
> This patch implements an optimization for the following C++ code:
>
> int f(int x) {
>      return 1 / x;
> }
>
> int f(unsigned int x) {
>      return 1 / x;
> }
>
> Before this patch, x86-64 gcc -std=c++20 -O3 produces the following assembly:
>
> f(int):
>      xor edx, edx
>      mov eax, 1
>      idiv edi
>      ret
> f(unsigned int):
>      xor edx, edx
>      mov eax, 1
>      div edi
>      ret
>
> In comparison, clang++ -std=c++20 -O3 produces the following assembly:
>
> f(int):
>      lea ecx, [rdi + 1]
>      xor eax, eax
>      cmp ecx, 3
>      cmovb eax, edi
>      ret
> f(unsigned int):
>      xor eax, eax
>      cmp edi, 1
>      sete al
>      ret
>
> Clang's output is more efficient as it avoids expensive div operations.
>
> With this patch, GCC now produces the following assembly:
>
> f(int):
>      lea eax, [rdi + 1]
>      cmp eax, 2
>      mov eax, 0
>      cmovbe eax, edi
>      ret
> f(unsigned int):
>      xor eax, eax
>      cmp edi, 1
>      sete al
>      ret
>
> which is virtually identical to Clang's assembly output. Any slight differences
> in the output for f(int) is possibly related to a different missed optimization.
>
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587751.html
> Changes from v2:
> 1. Refactor from using a switch statement to using the built-in
> if-else statement.
>
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587634.html
> Changes from v1:
> 1. Refactor common if conditions.
> 2. Use build_[minus_]one_cst (type) to get -1/1 of the correct type.
> 3. Match only for TRUNC_DIV_EXPR and TYPE_PRECISION (type) > 1.
>
> gcc/ChangeLog:
>
> 	* match.pd: Simplify 1 / X where X is an integer.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/tree-ssa/divide-6.c: New test.
> 	* gcc.dg/tree-ssa/divide-7.c: New test.
Thanks.  Given the original submission and most of the review work was 
done prior to stage3 closing, I went ahead and installed this on the trunk.
jeff


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-28 18:38 ` Jeff Law
@ 2022-01-29 16:23   ` Jakub Jelinek
  2022-01-29 16:48     ` Jeff Law
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Jakub Jelinek @ 2022-01-29 16:23 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: Zhao Wei Liew, GCC Patches

Hi!

On Fri, Jan 28, 2022 at 11:38:23AM -0700, Jeff Law wrote:
> Thanks.  Given the original submission and most of the review work was done
> prior to stage3 closing, I went ahead and installed this on the trunk.

Unfortunately this breaks quite a lot of things.
The main problem is that GIMPLE allows EQ_EXPR etc. only with BOOLEAN_TYPE
or with TYPE_PRECISION == 1 integral type (or vector boolean).
Violating this causes verification failures in tree-cfg.cc in some cases,
in other cases wrong-code issues because before it is verified we e.g.
transform
1U / x
into
x == 1U
and later into
x (because we assume that == type must be one of the above cases and
when it is the same type as the type of the first operand, for boolean-ish
cases it should be equivalent).

Fixed by changing that
(eq @1 { build_one_cst (type); })
into
(convert (eq:boolean_type_node @1 { build_one_cst (type); }))
Note, I'm not 100% sure if :boolean_type_node is required in that case,
I see some spots in match.pd that look exactly like this, while there is
e.g. (convert (le ...)) that supposedly does the right thing too.
The signed integer 1/X case doesn't need changes changes, for
(cond (le ...) ...)
le gets correctly boolean_type_node and cond should use type.
I've also reformatted it, some lines were too long, match.pd uses
indentation by 1 column instead of 2 etc.

Bootstrapped/regtested on powerpc64le-linux and tested on the testcases
on x86_64-linux, ok for trunk?

2022-01-29  Jakub Jelinek  <jakub@redhat.com>
	    Andrew Pinski  <apinski@marvell.com>

	PR tree-optimization/104279
	PR tree-optimization/104280
	PR tree-optimization/104281
	* match.pd (1 / X -> X == 1 for unsigned X): Build eq with
	boolean_type_node and convert to type.  Formatting fixes.

	* gcc.dg/torture/pr104279.c: New test.
	* gcc.dg/torture/pr104280.c: New test.
	* gcc.dg/torture/pr104281.c: New test.

--- gcc/match.pd.jj	2022-01-29 11:11:39.316628007 +0100
+++ gcc/match.pd	2022-01-29 12:19:46.662096678 +0100
@@ -435,18 +435,22 @@ (define_operator_list SYNC_FETCH_AND_AND
        && TYPE_UNSIGNED (type))
    (trunc_divmod @0 @1))))
 
- /* 1 / X -> X == 1 for unsigned integer X.
-    1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
-    But not for 1 / 0 so that we can get proper warnings and errors,
-    and not for 1-bit integers as they are edge cases better handled elsewhere. */
+/* 1 / X -> X == 1 for unsigned integer X.
+   1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
+   But not for 1 / 0 so that we can get proper warnings and errors,
+   and not for 1-bit integers as they are edge cases better handled
+   elsewhere.  */
 (simplify
-  (trunc_div integer_onep@0 @1)
-  (if (INTEGRAL_TYPE_P (type) && !integer_zerop (@1) && TYPE_PRECISION (type) > 1)
-    (if (TYPE_UNSIGNED (type))
-      (eq @1 { build_one_cst (type); })
-      (with { tree utype = unsigned_type_for (type); }
-        (cond (le (plus (convert:utype @1) { build_one_cst (utype); }) { build_int_cst (utype, 2); })
-          @1 { build_zero_cst (type); })))))
+ (trunc_div integer_onep@0 @1)
+ (if (INTEGRAL_TYPE_P (type)
+      && !integer_zerop (@1)
+      && TYPE_PRECISION (type) > 1)
+  (if (TYPE_UNSIGNED (type))
+   (convert (eq:boolean_type_node @1 { build_one_cst (type); }))
+   (with { tree utype = unsigned_type_for (type); }
+    (cond (le (plus (convert:utype @1) { build_one_cst (utype); })
+	      { build_int_cst (utype, 2); })
+     @1 { build_zero_cst (type); })))))
 
 /* Combine two successive divisions.  Note that combining ceil_div
    and floor_div is trickier and combining round_div even more so.  */
--- gcc/testsuite/gcc.dg/torture/pr104279.c.jj	2022-01-29 12:25:36.388174312 +0100
+++ gcc/testsuite/gcc.dg/torture/pr104279.c	2022-01-29 12:25:53.206937588 +0100
@@ -0,0 +1,12 @@
+/* PR tree-optimization/104279 */
+/* { dg-do compile } */
+
+unsigned a, b;
+
+int
+main ()
+{
+  b = ~(0 || ~0);
+  a = ~b / ~a;
+  return 0;
+}
--- gcc/testsuite/gcc.dg/torture/pr104280.c.jj	2022-01-29 12:24:36.190021595 +0100
+++ gcc/testsuite/gcc.dg/torture/pr104280.c	2022-01-29 12:25:28.681282783 +0100
@@ -0,0 +1,16 @@
+/* PR tree-optimization/104280 */
+/* { dg-do run } */
+
+int
+foo (unsigned b, int c)
+{
+  return b / c;
+}
+
+int
+main ()
+{
+  if (foo (1, 2) != 0)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/torture/pr104281.c.jj	2022-01-29 12:27:29.840577473 +0100
+++ gcc/testsuite/gcc.dg/torture/pr104281.c	2022-01-29 12:27:24.526652267 +0100
@@ -0,0 +1,22 @@
+/* PR tree-optimization/104281 */
+/* { dg-do run } */
+
+unsigned a = 1;
+int b, c = 2;
+long d;
+
+int
+main ()
+{
+  while (1)
+    {
+      int m = a;
+    L:
+      a = ~(-(m || b & d));
+      b = ((1 ^ a) / c);
+      if (b)
+	goto L;
+      break;
+    }
+  return 0;
+}


	Jakub


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-29 16:23   ` [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280] Jakub Jelinek
@ 2022-01-29 16:48     ` Jeff Law
  2022-01-30  2:28       ` Zhao Wei Liew
  2022-01-31  8:16     ` Eric Botcazou
  2022-01-31  8:28     ` Richard Biener
  2 siblings, 1 reply; 27+ messages in thread
From: Jeff Law @ 2022-01-29 16:48 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener; +Cc: Zhao Wei Liew, GCC Patches



On 1/29/2022 9:23 AM, Jakub Jelinek wrote:
> Hi!
>
> On Fri, Jan 28, 2022 at 11:38:23AM -0700, Jeff Law wrote:
>> Thanks.  Given the original submission and most of the review work was done
>> prior to stage3 closing, I went ahead and installed this on the trunk.
> Unfortunately this breaks quite a lot of things.
> The main problem is that GIMPLE allows EQ_EXPR etc. only with BOOLEAN_TYPE
> or with TYPE_PRECISION == 1 integral type (or vector boolean).
> Violating this causes verification failures in tree-cfg.cc in some cases,
> in other cases wrong-code issues because before it is verified we e.g.
> transform
> 1U / x
> into
> x == 1U
> and later into
> x (because we assume that == type must be one of the above cases and
> when it is the same type as the type of the first operand, for boolean-ish
> cases it should be equivalent).
>
> Fixed by changing that
> (eq @1 { build_one_cst (type); })
> into
> (convert (eq:boolean_type_node @1 { build_one_cst (type); }))
> Note, I'm not 100% sure if :boolean_type_node is required in that case,
> I see some spots in match.pd that look exactly like this, while there is
> e.g. (convert (le ...)) that supposedly does the right thing too.
> The signed integer 1/X case doesn't need changes changes, for
> (cond (le ...) ...)
> le gets correctly boolean_type_node and cond should use type.
> I've also reformatted it, some lines were too long, match.pd uses
> indentation by 1 column instead of 2 etc.
>
> Bootstrapped/regtested on powerpc64le-linux and tested on the testcases
> on x86_64-linux, ok for trunk?
>
> 2022-01-29  Jakub Jelinek  <jakub@redhat.com>
> 	    Andrew Pinski  <apinski@marvell.com>
>
> 	PR tree-optimization/104279
> 	PR tree-optimization/104280
> 	PR tree-optimization/104281
> 	* match.pd (1 / X -> X == 1 for unsigned X): Build eq with
> 	boolean_type_node and convert to type.  Formatting fixes.
>
> 	* gcc.dg/torture/pr104279.c: New test.
> 	* gcc.dg/torture/pr104280.c: New test.
> 	* gcc.dg/torture/pr104281.c: New test.
Ugh.  Sorry for the breakage.  THat makes me wonder how the original 
patch was tested at all.

I think both forms will DTRT (eq:boolean_type_node ...) vs (convert (le 
...)) .

OK for the trunk as well as the testcase fix.

Jeff



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-29 16:48     ` Jeff Law
@ 2022-01-30  2:28       ` Zhao Wei Liew
  2022-01-30  2:38         ` Andrew Pinski
  0 siblings, 1 reply; 27+ messages in thread
From: Zhao Wei Liew @ 2022-01-30  2:28 UTC (permalink / raw)
  To: Jeff Law, Jakub Jelinek; +Cc: Richard Biener, GCC Patches

Sincere apologies for the issues. I wasn't aware of the need for a cast but
after reading the PRs, I understand that now. On the other hand, the
incorrect test case was simply a major oversight on my part.

I'll be sure to be more careful next time.

Thanks for the fixes!

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-30  2:28       ` Zhao Wei Liew
@ 2022-01-30  2:38         ` Andrew Pinski
  2022-01-31  8:33           ` Richard Biener
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2022-01-30  2:38 UTC (permalink / raw)
  To: Zhao Wei Liew; +Cc: Jeff Law, Jakub Jelinek, GCC Patches, Richard Biener

On Sat, Jan 29, 2022 at 6:30 PM Zhao Wei Liew via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Sincere apologies for the issues. I wasn't aware of the need for a cast but
> after reading the PRs, I understand that now. On the other hand, the
> incorrect test case was simply a major oversight on my part.
>
> I'll be sure to be more careful next time.

Well I think gimple-match should have caught the cast issue really. I
filed https://gcc.gnu.org/PR104286 for that. I hope to submit a patch
for stage 1 to catch that so it will be harder to happen in the first
place.

Thanks,
Andrew

>
> Thanks for the fixes!

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-29 16:23   ` [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280] Jakub Jelinek
  2022-01-29 16:48     ` Jeff Law
@ 2022-01-31  8:16     ` Eric Botcazou
  2022-02-02 22:58       ` Andrew Pinski
  2022-01-31  8:28     ` Richard Biener
  2 siblings, 1 reply; 27+ messages in thread
From: Eric Botcazou @ 2022-01-31  8:16 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jeff Law, Richard Biener, gcc-patches, Zhao Wei Liew, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 422 bytes --]

> Unfortunately this breaks quite a lot of things.

Right, for example in Ada where we now happily turn a division by zero, which 
should raise an exception with -gnatp, into nonsense.  Do we really need this 
rather useless optimization in GCC?  Blindly mimicing LLVM is not a reason...

I have installed the attached testcase, which now fails because of the change.

	* gnat.dg/div_zero.adb: New test.

-- 
Eric Botcazou

[-- Attachment #2: div_zero.adb --]
[-- Type: text/x-adasrc, Size: 471 bytes --]

-- { dg-do run }

-- This test requires architecture- and OS-specific support code for unwinding
-- through signal frames (typically located in *-unwind.h) to pass.  Feel free
-- to disable it if this code hasn't been implemented yet.

procedure Div_Zero is

  pragma Suppress (All_Checks);

  function Zero return Integer is
  begin
    return 0;
  end;

  D : Integer := Zero;

begin
  D := 1 / D;
  raise Program_Error;
exception
  when Constraint_Error => null;
end;

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-29 16:23   ` [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280] Jakub Jelinek
  2022-01-29 16:48     ` Jeff Law
  2022-01-31  8:16     ` Eric Botcazou
@ 2022-01-31  8:28     ` Richard Biener
  2 siblings, 0 replies; 27+ messages in thread
From: Richard Biener @ 2022-01-31  8:28 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jeff Law, Zhao Wei Liew, GCC Patches

On Sat, 29 Jan 2022, Jakub Jelinek wrote:

> Hi!
> 
> On Fri, Jan 28, 2022 at 11:38:23AM -0700, Jeff Law wrote:
> > Thanks.  Given the original submission and most of the review work was done
> > prior to stage3 closing, I went ahead and installed this on the trunk.
> 
> Unfortunately this breaks quite a lot of things.
> The main problem is that GIMPLE allows EQ_EXPR etc. only with BOOLEAN_TYPE
> or with TYPE_PRECISION == 1 integral type (or vector boolean).
> Violating this causes verification failures in tree-cfg.cc in some cases,
> in other cases wrong-code issues because before it is verified we e.g.
> transform
> 1U / x
> into
> x == 1U
> and later into
> x (because we assume that == type must be one of the above cases and
> when it is the same type as the type of the first operand, for boolean-ish
> cases it should be equivalent).
> 
> Fixed by changing that
> (eq @1 { build_one_cst (type); })
> into
> (convert (eq:boolean_type_node @1 { build_one_cst (type); }))
> Note, I'm not 100% sure if :boolean_type_node is required in that case,

It shouldn't be necessary.  Unfortunately it's hard for genmatch to
statically detect this case as bogus ...

Thanks for spotting and fixing this.

Richard.

> I see some spots in match.pd that look exactly like this, while there is
> e.g. (convert (le ...)) that supposedly does the right thing too.
> The signed integer 1/X case doesn't need changes changes, for
> (cond (le ...) ...)
> le gets correctly boolean_type_node and cond should use type.
> I've also reformatted it, some lines were too long, match.pd uses
> indentation by 1 column instead of 2 etc.
> 
> Bootstrapped/regtested on powerpc64le-linux and tested on the testcases
> on x86_64-linux, ok for trunk?
> 
> 2022-01-29  Jakub Jelinek  <jakub@redhat.com>
> 	    Andrew Pinski  <apinski@marvell.com>
> 
> 	PR tree-optimization/104279
> 	PR tree-optimization/104280
> 	PR tree-optimization/104281
> 	* match.pd (1 / X -> X == 1 for unsigned X): Build eq with
> 	boolean_type_node and convert to type.  Formatting fixes.
> 
> 	* gcc.dg/torture/pr104279.c: New test.
> 	* gcc.dg/torture/pr104280.c: New test.
> 	* gcc.dg/torture/pr104281.c: New test.
> 
> --- gcc/match.pd.jj	2022-01-29 11:11:39.316628007 +0100
> +++ gcc/match.pd	2022-01-29 12:19:46.662096678 +0100
> @@ -435,18 +435,22 @@ (define_operator_list SYNC_FETCH_AND_AND
>         && TYPE_UNSIGNED (type))
>     (trunc_divmod @0 @1))))
>  
> - /* 1 / X -> X == 1 for unsigned integer X.
> -    1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
> -    But not for 1 / 0 so that we can get proper warnings and errors,
> -    and not for 1-bit integers as they are edge cases better handled elsewhere. */
> +/* 1 / X -> X == 1 for unsigned integer X.
> +   1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
> +   But not for 1 / 0 so that we can get proper warnings and errors,
> +   and not for 1-bit integers as they are edge cases better handled
> +   elsewhere.  */
>  (simplify
> -  (trunc_div integer_onep@0 @1)
> -  (if (INTEGRAL_TYPE_P (type) && !integer_zerop (@1) && TYPE_PRECISION (type) > 1)
> -    (if (TYPE_UNSIGNED (type))
> -      (eq @1 { build_one_cst (type); })
> -      (with { tree utype = unsigned_type_for (type); }
> -        (cond (le (plus (convert:utype @1) { build_one_cst (utype); }) { build_int_cst (utype, 2); })
> -          @1 { build_zero_cst (type); })))))
> + (trunc_div integer_onep@0 @1)
> + (if (INTEGRAL_TYPE_P (type)
> +      && !integer_zerop (@1)
> +      && TYPE_PRECISION (type) > 1)
> +  (if (TYPE_UNSIGNED (type))
> +   (convert (eq:boolean_type_node @1 { build_one_cst (type); }))
> +   (with { tree utype = unsigned_type_for (type); }
> +    (cond (le (plus (convert:utype @1) { build_one_cst (utype); })
> +	      { build_int_cst (utype, 2); })
> +     @1 { build_zero_cst (type); })))))
>  
>  /* Combine two successive divisions.  Note that combining ceil_div
>     and floor_div is trickier and combining round_div even more so.  */
> --- gcc/testsuite/gcc.dg/torture/pr104279.c.jj	2022-01-29 12:25:36.388174312 +0100
> +++ gcc/testsuite/gcc.dg/torture/pr104279.c	2022-01-29 12:25:53.206937588 +0100
> @@ -0,0 +1,12 @@
> +/* PR tree-optimization/104279 */
> +/* { dg-do compile } */
> +
> +unsigned a, b;
> +
> +int
> +main ()
> +{
> +  b = ~(0 || ~0);
> +  a = ~b / ~a;
> +  return 0;
> +}
> --- gcc/testsuite/gcc.dg/torture/pr104280.c.jj	2022-01-29 12:24:36.190021595 +0100
> +++ gcc/testsuite/gcc.dg/torture/pr104280.c	2022-01-29 12:25:28.681282783 +0100
> @@ -0,0 +1,16 @@
> +/* PR tree-optimization/104280 */
> +/* { dg-do run } */
> +
> +int
> +foo (unsigned b, int c)
> +{
> +  return b / c;
> +}
> +
> +int
> +main ()
> +{
> +  if (foo (1, 2) != 0)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.dg/torture/pr104281.c.jj	2022-01-29 12:27:29.840577473 +0100
> +++ gcc/testsuite/gcc.dg/torture/pr104281.c	2022-01-29 12:27:24.526652267 +0100
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/104281 */
> +/* { dg-do run } */
> +
> +unsigned a = 1;
> +int b, c = 2;
> +long d;
> +
> +int
> +main ()
> +{
> +  while (1)
> +    {
> +      int m = a;
> +    L:
> +      a = ~(-(m || b & d));
> +      b = ((1 ^ a) / c);
> +      if (b)
> +	goto L;
> +      break;
> +    }
> +  return 0;
> +}
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-30  2:38         ` Andrew Pinski
@ 2022-01-31  8:33           ` Richard Biener
  0 siblings, 0 replies; 27+ messages in thread
From: Richard Biener @ 2022-01-31  8:33 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Zhao Wei Liew, Jeff Law, Jakub Jelinek, GCC Patches

On Sat, 29 Jan 2022, Andrew Pinski wrote:

> On Sat, Jan 29, 2022 at 6:30 PM Zhao Wei Liew via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Sincere apologies for the issues. I wasn't aware of the need for a cast but
> > after reading the PRs, I understand that now. On the other hand, the
> > incorrect test case was simply a major oversight on my part.
> >
> > I'll be sure to be more careful next time.
> 
> Well I think gimple-match should have caught the cast issue really. I
> filed https://gcc.gnu.org/PR104286 for that. I hope to submit a patch
> for stage 1 to catch that so it will be harder to happen in the first
> place.

I suppose that we could change verify_gimple_* to receive gimple_match_op
(or simply exploded args) and we could then, with -fchecking, verify
each such built match-op (some passes also build "custom" ones they feed
into match-and-simplify).

Richard.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-01-31  8:16     ` Eric Botcazou
@ 2022-02-02 22:58       ` Andrew Pinski
  2022-02-02 23:01         ` Eric Botcazou
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Pinski @ 2022-02-02 22:58 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Jakub Jelinek, GCC Patches, Richard Biener, Zhao Wei Liew

On Mon, Jan 31, 2022 at 12:17 AM Eric Botcazou via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> > Unfortunately this breaks quite a lot of things.
>
> Right, for example in Ada where we now happily turn a division by zero, which
> should raise an exception with -gnatp, into nonsense.  Do we really need this
> rather useless optimization in GCC?  Blindly mimicing LLVM is not a reason...

Since I didn't see anyone responding to this problem, I filed PR
104356 to record the regression.
And yes this should be handled correctly.

Thanks,
Andrew Pinski

>
> I have installed the attached testcase, which now fails because of the change.
>
>         * gnat.dg/div_zero.adb: New test.
>
> --
> Eric Botcazou

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-02 22:58       ` Andrew Pinski
@ 2022-02-02 23:01         ` Eric Botcazou
  2022-02-03  7:16           ` Richard Biener
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Botcazou @ 2022-02-02 23:01 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: Jakub Jelinek, GCC Patches, Richard Biener, Zhao Wei Liew

> Since I didn't see anyone responding to this problem, I filed PR
> 104356 to record the regression.
> And yes this should be handled correctly.

Thanks.  Note that we have an example of this in libgcc/libgcc2.c too.

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-02 23:01         ` Eric Botcazou
@ 2022-02-03  7:16           ` Richard Biener
  2022-02-03  9:06             ` Eric Botcazou
  2022-02-04  9:53             ` Eric Botcazou
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Biener @ 2022-02-03  7:16 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Andrew Pinski, Jakub Jelinek, GCC Patches, Zhao Wei Liew

On Thu, 3 Feb 2022, Eric Botcazou wrote:

> > Since I didn't see anyone responding to this problem, I filed PR
> > 104356 to record the regression.
> > And yes this should be handled correctly.
> 
> Thanks.  Note that we have an example of this in libgcc/libgcc2.c too.

I assumed this was handled by the followups to the patch ...

Well, yes, we have to fix it.  I will share my thoughts when coming
along the bugreport.

Richard.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03  7:16           ` Richard Biener
@ 2022-02-03  9:06             ` Eric Botcazou
  2022-02-03  9:33               ` Jakub Jelinek
  2022-02-04  9:53             ` Eric Botcazou
  1 sibling, 1 reply; 27+ messages in thread
From: Eric Botcazou @ 2022-02-03  9:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jakub Jelinek, Zhao Wei Liew, GCC Patches

> Well, yes, we have to fix it.  I will share my thoughts when coming
> along the bugreport.

IMO we should simply scrap it, as it does not serve any useful purpose, breaks 
a very ancient and useful idiom and also introduces an artificial discrepancy 
between 1/X and 2/X for example.

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03  9:06             ` Eric Botcazou
@ 2022-02-03  9:33               ` Jakub Jelinek
  2022-02-03  9:40                 ` Richard Biener
  2022-02-03  9:49                 ` Eric Botcazou
  0 siblings, 2 replies; 27+ messages in thread
From: Jakub Jelinek @ 2022-02-03  9:33 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, gcc-patches, Zhao Wei Liew

On Thu, Feb 03, 2022 at 10:06:42AM +0100, Eric Botcazou wrote:
> > Well, yes, we have to fix it.  I will share my thoughts when coming
> > along the bugreport.
> 
> IMO we should simply scrap it, as it does not serve any useful purpose, breaks 
> a very ancient and useful idiom and also introduces an artificial discrepancy 
> between 1/X and 2/X for example.

That doesn't make sense.
The optimization can be useful, it doesn't have to be for user written
y = 1 / x; but can appear through inlining or some other optimizations, e.g.
jump threading and suddenly we have constant 1 in some bb, if it a never
executed at runtime block, it will be likely shorter because of the
optimization, if it is not, then it will be cheaper.
And, this is definitely not the first optimization that assumes undefined
behavior in integer division should not happen, lots of other optimizations
do the same thing.
E.g. consider
unsigned
foo (unsigned x, unsigned y)
{
  if (x >= 2)
    return 0;
  if (x == 1)
    y += 4;
  return y / x;
}
Here the ranger optimizes away the division, in GCC11 in vrp1 pass, in
GCC12 in evrp pass already.  Even in match.pd I vaguely remember one or two
optimizations where it also relied on division by zero being undefined.

	Jakub


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03  9:33               ` Jakub Jelinek
@ 2022-02-03  9:40                 ` Richard Biener
  2022-02-03  9:55                   ` Jakub Jelinek
  2022-02-03  9:49                 ` Eric Botcazou
  1 sibling, 1 reply; 27+ messages in thread
From: Richard Biener @ 2022-02-03  9:40 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Zhao Wei Liew

On Thu, 3 Feb 2022, Jakub Jelinek wrote:

> On Thu, Feb 03, 2022 at 10:06:42AM +0100, Eric Botcazou wrote:
> > > Well, yes, we have to fix it.  I will share my thoughts when coming
> > > along the bugreport.
> > 
> > IMO we should simply scrap it, as it does not serve any useful purpose, breaks 
> > a very ancient and useful idiom and also introduces an artificial discrepancy 
> > between 1/X and 2/X for example.
> 
> That doesn't make sense.
> The optimization can be useful, it doesn't have to be for user written
> y = 1 / x; but can appear through inlining or some other optimizations, e.g.
> jump threading and suddenly we have constant 1 in some bb, if it a never
> executed at runtime block, it will be likely shorter because of the
> optimization, if it is not, then it will be cheaper.
> And, this is definitely not the first optimization that assumes undefined
> behavior in integer division should not happen, lots of other optimizations
> do the same thing.
> E.g. consider
> unsigned
> foo (unsigned x, unsigned y)
> {
>   if (x >= 2)
>     return 0;
>   if (x == 1)
>     y += 4;
>   return y / x;
> }
> Here the ranger optimizes away the division, in GCC11 in vrp1 pass, in
> GCC12 in evrp pass already.  Even in match.pd I vaguely remember one or two
> optimizations where it also relied on division by zero being undefined.

Yes, we definitely have multiple of those cases.  I do think
preserving "an idiom", for example literal 0/0 or all x/0 might be
worth considering.  But I also think we have to sort out different
language standards requirements vs. the middle-end and whos
responsible for making sure we adhere here.

That said, for GCC 12 we can take shortcuts and one option is of
course to revert (but we don't have to rush).

oh, and even GCC 4.3 optimizes

int main()
{
  volatile int tem = 0/0;
}

to just

        movl    $0, -4(%rsp)

even with -fnon-call-exceptions -fexceptions.

Richard.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03  9:33               ` Jakub Jelinek
  2022-02-03  9:40                 ` Richard Biener
@ 2022-02-03  9:49                 ` Eric Botcazou
  1 sibling, 0 replies; 27+ messages in thread
From: Eric Botcazou @ 2022-02-03  9:49 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Zhao Wei Liew, gcc-patches, Richard Biener

> The optimization can be useful, it doesn't have to be for user written
> y = 1 / x; but can appear through inlining or some other optimizations, e.g.
> jump threading and suddenly we have constant 1 in some bb, if it a never
> executed at runtime block, it will be likely shorter because of the
> optimization, if it is not, then it will be cheaper.

The hundreds of people having worked on GCC for the past 30 years must have 
been stupid then, how come have they missed such a great optimization? ;-)

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03  9:40                 ` Richard Biener
@ 2022-02-03  9:55                   ` Jakub Jelinek
  2022-02-03 11:03                     ` Richard Biener
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2022-02-03  9:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Eric Botcazou, gcc-patches, Zhao Wei Liew

On Thu, Feb 03, 2022 at 10:40:10AM +0100, Richard Biener wrote:
> Yes, we definitely have multiple of those cases.  I do think
> preserving "an idiom", for example literal 0/0 or all x/0 might be
> worth considering.  But I also think we have to sort out different
> language standards requirements vs. the middle-end and whos
> responsible for making sure we adhere here.

I think we try to preserve literal 0/0 and x/0, including this
optimization which punts if the divisor is literal.  But, for literal
0/0 and x/0 we alsy emit -Wdiv-by-zero warning, maybe that was the
reason why libgcc2.c did it differently.

	Jakub


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03  9:55                   ` Jakub Jelinek
@ 2022-02-03 11:03                     ` Richard Biener
  2022-02-03 11:18                       ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Richard Biener @ 2022-02-03 11:03 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Zhao Wei Liew

On Thu, 3 Feb 2022, Jakub Jelinek wrote:

> On Thu, Feb 03, 2022 at 10:40:10AM +0100, Richard Biener wrote:
> > Yes, we definitely have multiple of those cases.  I do think
> > preserving "an idiom", for example literal 0/0 or all x/0 might be
> > worth considering.  But I also think we have to sort out different
> > language standards requirements vs. the middle-end and whos
> > responsible for making sure we adhere here.
> 
> I think we try to preserve literal 0/0 and x/0, including this
> optimization which punts if the divisor is literal.  But, for literal
> 0/0 and x/0 we alsy emit -Wdiv-by-zero warning, maybe that was the
> reason why libgcc2.c did it differently.

Sure, I bet the code is quite old since we very likely propagate
the equality and optimize the division away since a long time.
Now that #pragma GCC diagnostic is a thing we can probably write
literal 0/0 if we choose to preserve that until the end.

But as said, for the libgcc2.c case I'd simply remove all of it.

Richard.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03 11:03                     ` Richard Biener
@ 2022-02-03 11:18                       ` Jakub Jelinek
  2022-02-03 12:13                         ` Richard Biener
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2022-02-03 11:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: Eric Botcazou, gcc-patches, Zhao Wei Liew

On Thu, Feb 03, 2022 at 12:03:15PM +0100, Richard Biener wrote:
> But as said, for the libgcc2.c case I'd simply remove all of it.

I can't read RMS' mind, it is indeed UB, so we can do anything, but I bet
it was just a QoI attempt, when (most of the time) normal single-word
or smaller division for / 0 behaves certain way (SIGFPE with FPE_INTDIV,
or being ignored, or whatever else) that it would be nice if the multi-word
division behaved likewise.
On the platforms where it is SIGFPE with FPE_INTDIV, raising that would
help people figure out what's going on.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03 11:18                       ` Jakub Jelinek
@ 2022-02-03 12:13                         ` Richard Biener
  2022-02-03 12:22                           ` Jakub Jelinek
  2022-02-03 12:24                           ` Eric Botcazou
  0 siblings, 2 replies; 27+ messages in thread
From: Richard Biener @ 2022-02-03 12:13 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, gcc-patches, Zhao Wei Liew

On Thu, 3 Feb 2022, Jakub Jelinek wrote:

> On Thu, Feb 03, 2022 at 12:03:15PM +0100, Richard Biener wrote:
> > But as said, for the libgcc2.c case I'd simply remove all of it.
> 
> I can't read RMS' mind, it is indeed UB, so we can do anything, but I bet
> it was just a QoI attempt, when (most of the time) normal single-word
> or smaller division for / 0 behaves certain way (SIGFPE with FPE_INTDIV,
> or being ignored, or whatever else) that it would be nice if the multi-word
> division behaved likewise.
> On the platforms where it is SIGFPE with FPE_INTDIV, raising that would
> help people figure out what's going on.

Yes, I think the intent is clear.  The question is whether we should
re-instantiate the clear intent of preserving a literal / 0 as well
(for C, without -fnon-call-exceptions).

That said, I'm fine with the asms but they are ugly ;)

Richard.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03 12:13                         ` Richard Biener
@ 2022-02-03 12:22                           ` Jakub Jelinek
  2022-02-03 12:24                           ` Eric Botcazou
  1 sibling, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2022-02-03 12:22 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: Eric Botcazou, gcc-patches, Zhao Wei Liew

On Thu, Feb 03, 2022 at 01:13:29PM +0100, Richard Biener wrote:
> On Thu, 3 Feb 2022, Jakub Jelinek wrote:
> 
> > On Thu, Feb 03, 2022 at 12:03:15PM +0100, Richard Biener wrote:
> > > But as said, for the libgcc2.c case I'd simply remove all of it.
> > 
> > I can't read RMS' mind, it is indeed UB, so we can do anything, but I bet
> > it was just a QoI attempt, when (most of the time) normal single-word
> > or smaller division for / 0 behaves certain way (SIGFPE with FPE_INTDIV,
> > or being ignored, or whatever else) that it would be nice if the multi-word
> > division behaved likewise.
> > On the platforms where it is SIGFPE with FPE_INTDIV, raising that would
> > help people figure out what's going on.
> 
> Yes, I think the intent is clear.  The question is whether we should
> re-instantiate the clear intent of preserving a literal / 0 as well
> (for C, without -fnon-call-exceptions).

I think currently it is path isolation that breaks it.
Bet in a good intent that everything after it is UB and allowing the
stmts after it to be optimized away.
Unfortunately we don't (easily) know whether the division by zero
was literal already in the source or propagated through inlining etc.
or even jump threading.  Especially in the last case it is nice to
be able to emit as short code as possible, which __builtin_trap ()
serves nicely on many targets.
Emitting both a 1 / 0 insn followed by __builtin_trap () might work
with perhaps some RTL optimization to optimize away the trap if
the target is known to trap on division by zero.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03 12:13                         ` Richard Biener
  2022-02-03 12:22                           ` Jakub Jelinek
@ 2022-02-03 12:24                           ` Eric Botcazou
  1 sibling, 0 replies; 27+ messages in thread
From: Eric Botcazou @ 2022-02-03 12:24 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jakub Jelinek, gcc-patches, Zhao Wei Liew, gcc-patches

> Yes, I think the intent is clear.  The question is whether we should
> re-instantiate the clear intent of preserving a literal / 0 as well
> (for C, without -fnon-call-exceptions).

Note that the code is precisely compiled with -fnon-call-exceptions:

ifeq ($(LIB2_DIVMOD_EXCEPTION_FLAGS),)
# Provide default flags for compiling divmod functions, if they haven't been
# set already by a target-specific Makefile fragment.
LIB2_DIVMOD_EXCEPTION_FLAGS := -fexceptions -fnon-call-exceptions
endif

# Build LIB2_DIVMOD_FUNCS.
lib2-divmod-o = $(patsubst %,%$(objext),$(LIB2_DIVMOD_FUNCS))
$(lib2-divmod-o): %$(objext): $(srcdir)/libgcc2.c
	$(gcc_compile) -DL$* -c $< \
	  $(LIB2_DIVMOD_EXCEPTION_FLAGS) $(vis_hide)
libgcc-objects += $(lib2-divmod-o)

ifeq ($(enable_shared),yes)
lib2-divmod-s-o = $(patsubst %,%_s$(objext),$(LIB2_DIVMOD_FUNCS))
$(lib2-divmod-s-o): %_s$(objext): $(srcdir)/libgcc2.c
	$(gcc_s_compile) -DL$* -c $< \
	  $(LIB2_DIVMOD_EXCEPTION_FLAGS)
libgcc-s-objects += $(lib2-divmod-s-o)
endif

so it seems like we were about to reinvent the wheel here. ;-)

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-03  7:16           ` Richard Biener
  2022-02-03  9:06             ` Eric Botcazou
@ 2022-02-04  9:53             ` Eric Botcazou
  2022-02-04 10:07               ` Jakub Jelinek
  1 sibling, 1 reply; 27+ messages in thread
From: Eric Botcazou @ 2022-02-04  9:53 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jakub Jelinek, Zhao Wei Liew, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 329 bytes --]

> Well, yes, we have to fix it.

Here's the fix we agreed upon in the audit trail, OK for the mainline?

	PR tree-optimization/104356
	* match.pd (X / bool_range_Y is X): Add guard.
	(X / X is one): Likewise.
	(X / abs (X) is X < 0 ? -1 : 1): Likewise.
	(X / -X is -1): Likewise.
	(1 / X -> X == 1): Likewise.

-- 
Eric Botcazou

[-- Attachment #2: pr104356.diff --]
[-- Type: text/x-patch, Size: 1988 bytes --]

diff --git a/gcc/match.pd b/gcc/match.pd
index b942cb2930a..4b695db7a25 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -401,27 +401,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  /* X / bool_range_Y is X.  */ 
  (simplify
   (div @0 SSA_NAME@1)
-  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
+  (if (INTEGRAL_TYPE_P (type)
+       && ssa_name_has_boolean_range (@1)
+       && !flag_non_call_exceptions)
    @0))
  /* X / X is one.  */
  (simplify
   (div @0 @0)
   /* But not for 0 / 0 so that we can get the proper warnings and errors.
      And not for _Fract types where we can't build 1.  */
-  (if (!integer_zerop (@0) && !ALL_FRACT_MODE_P (TYPE_MODE (type)))
+  (if (!integer_zerop (@0)
+       && (!flag_non_call_exceptions || tree_expr_nonzero_p (@0))
+       && !ALL_FRACT_MODE_P (TYPE_MODE (type)))
    { build_one_cst (type); }))
  /* X / abs (X) is X < 0 ? -1 : 1.  */
  (simplify
    (div:C @0 (abs @0))
    (if (INTEGRAL_TYPE_P (type)
-	&& TYPE_OVERFLOW_UNDEFINED (type))
+	&& TYPE_OVERFLOW_UNDEFINED (type)
+	&& !integer_zerop (@0)
+	&& (!flag_non_call_exceptions || tree_expr_nonzero_p (@0)))
     (cond (lt @0 { build_zero_cst (type); })
           { build_minus_one_cst (type); } { build_one_cst (type); })))
  /* X / -X is -1.  */
  (simplify
    (div:C @0 (negate @0))
    (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
-	&& TYPE_OVERFLOW_UNDEFINED (type))
+	&& TYPE_OVERFLOW_UNDEFINED (type)
+	&& !integer_zerop (@0)
+	&& (!flag_non_call_exceptions || tree_expr_nonzero_p (@0)))
     { build_minus_one_cst (type); })))
 
 /* For unsigned integral types, FLOOR_DIV_EXPR is the same as
@@ -444,6 +452,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (trunc_div integer_onep@0 @1)
  (if (INTEGRAL_TYPE_P (type)
       && !integer_zerop (@1)
+      && (!flag_non_call_exceptions || tree_expr_nonzero_p (@1))
       && TYPE_PRECISION (type) > 1)
   (if (TYPE_UNSIGNED (type))
    (convert (eq:boolean_type_node @1 { build_one_cst (type); }))

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-04  9:53             ` Eric Botcazou
@ 2022-02-04 10:07               ` Jakub Jelinek
  2022-02-04 10:14                 ` Eric Botcazou
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2022-02-04 10:07 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, gcc-patches, Zhao Wei Liew

On Fri, Feb 04, 2022 at 10:53:22AM +0100, Eric Botcazou wrote:
> > Well, yes, we have to fix it.
> 
> Here's the fix we agreed upon in the audit trail, OK for the mainline?
> 
> 	PR tree-optimization/104356
> 	* match.pd (X / bool_range_Y is X): Add guard.
> 	(X / X is one): Likewise.
> 	(X / abs (X) is X < 0 ? -1 : 1): Likewise.
> 	(X / -X is -1): Likewise.
> 	(1 / X -> X == 1): Likewise.

Looks mostly good, just a few nits.

> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -401,27 +401,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   /* X / bool_range_Y is X.  */ 
>   (simplify
>    (div @0 SSA_NAME@1)
> -  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
> +  (if (INTEGRAL_TYPE_P (type)
> +       && ssa_name_has_boolean_range (@1)
> +       && !flag_non_call_exceptions)

ssa_name_has_boolean_range call is certainly more expensive than
!flag_non_call_exceptions check, can you swap those two?

> @@ -444,6 +452,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (trunc_div integer_onep@0 @1)
>   (if (INTEGRAL_TYPE_P (type)
>        && !integer_zerop (@1)
> +      && (!flag_non_call_exceptions || tree_expr_nonzero_p (@1))
>        && TYPE_PRECISION (type) > 1)

And similarly, TYPE_PRECISION (type) > 1 check is very cheap, can
it be done before the && !integer_zerop (@1) line?
I admit this one is already preexisting, but tree_expr_nonzero_p
can be quite expensive.

Otherwise LGTM.

	Jakub


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-04 10:07               ` Jakub Jelinek
@ 2022-02-04 10:14                 ` Eric Botcazou
  2022-02-04 10:26                   ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Eric Botcazou @ 2022-02-04 10:14 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches, Zhao Wei Liew

> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -401,27 +401,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > 
> >   /* X / bool_range_Y is X.  */
> >   (simplify
> >   
> >    (div @0 SSA_NAME@1)
> > 
> > -  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
> > +  (if (INTEGRAL_TYPE_P (type)
> > +       && ssa_name_has_boolean_range (@1)
> > +       && !flag_non_call_exceptions)
> 
> ssa_name_has_boolean_range call is certainly more expensive than
> !flag_non_call_exceptions check, can you swap those two?

But !flag_non_call_exceptions is (almost) always true for the C family of 
languages, so you're going to penalize them by doing this.

> And similarly, TYPE_PRECISION (type) > 1 check is very cheap, can
> it be done before the && !integer_zerop (@1) line?

Yes, it clearly belongs there.

-- 
Eric Botcazou




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-04 10:14                 ` Eric Botcazou
@ 2022-02-04 10:26                   ` Jakub Jelinek
  2022-02-04 10:29                     ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2022-02-04 10:26 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, gcc-patches, Zhao Wei Liew

On Fri, Feb 04, 2022 at 11:14:05AM +0100, Eric Botcazou wrote:
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -401,27 +401,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > 
> > >   /* X / bool_range_Y is X.  */
> > >   (simplify
> > >   
> > >    (div @0 SSA_NAME@1)
> > > 
> > > -  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
> > > +  (if (INTEGRAL_TYPE_P (type)
> > > +       && ssa_name_has_boolean_range (@1)
> > > +       && !flag_non_call_exceptions)
> > 
> > ssa_name_has_boolean_range call is certainly more expensive than
> > !flag_non_call_exceptions check, can you swap those two?
> 
> But !flag_non_call_exceptions is (almost) always true for the C family of 
> languages, so you're going to penalize them by doing this.

True, but much less so than the other order penalizing Ada/Go.

> > And similarly, TYPE_PRECISION (type) > 1 check is very cheap, can
> > it be done before the && !integer_zerop (@1) line?
> 
> Yes, it clearly belongs there.

	Jakub


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
  2022-02-04 10:26                   ` Jakub Jelinek
@ 2022-02-04 10:29                     ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2022-02-04 10:29 UTC (permalink / raw)
  To: Eric Botcazou, Zhao Wei Liew, gcc-patches, Richard Biener

On Fri, Feb 04, 2022 at 11:26:30AM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Fri, Feb 04, 2022 at 11:14:05AM +0100, Eric Botcazou wrote:
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -401,27 +401,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > > > 
> > > >   /* X / bool_range_Y is X.  */
> > > >   (simplify
> > > >   
> > > >    (div @0 SSA_NAME@1)
> > > > 
> > > > -  (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1))
> > > > +  (if (INTEGRAL_TYPE_P (type)
> > > > +       && ssa_name_has_boolean_range (@1)
> > > > +       && !flag_non_call_exceptions)
> > > 
> > > ssa_name_has_boolean_range call is certainly more expensive than
> > > !flag_non_call_exceptions check, can you swap those two?
> > 
> > But !flag_non_call_exceptions is (almost) always true for the C family of 
> > languages, so you're going to penalize them by doing this.
> 
> True, but much less so than the other order penalizing Ada/Go.
> 
> > > And similarly, TYPE_PRECISION (type) > 1 check is very cheap, can
> > > it be done before the && !integer_zerop (@1) line?
> > 
> > Yes, it clearly belongs there.

Anyway, not a big deal.

	Jakub


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2022-02-04 10:29 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-19 18:42 [PATCH v3] match.pd: Simplify 1 / X for integer X [PR95424] Zhao Wei Liew
2022-01-28 18:38 ` Jeff Law
2022-01-29 16:23   ` [PATCH] match.pd: Fix up 1 / X for unsigned X optimization [PR104280] Jakub Jelinek
2022-01-29 16:48     ` Jeff Law
2022-01-30  2:28       ` Zhao Wei Liew
2022-01-30  2:38         ` Andrew Pinski
2022-01-31  8:33           ` Richard Biener
2022-01-31  8:16     ` Eric Botcazou
2022-02-02 22:58       ` Andrew Pinski
2022-02-02 23:01         ` Eric Botcazou
2022-02-03  7:16           ` Richard Biener
2022-02-03  9:06             ` Eric Botcazou
2022-02-03  9:33               ` Jakub Jelinek
2022-02-03  9:40                 ` Richard Biener
2022-02-03  9:55                   ` Jakub Jelinek
2022-02-03 11:03                     ` Richard Biener
2022-02-03 11:18                       ` Jakub Jelinek
2022-02-03 12:13                         ` Richard Biener
2022-02-03 12:22                           ` Jakub Jelinek
2022-02-03 12:24                           ` Eric Botcazou
2022-02-03  9:49                 ` Eric Botcazou
2022-02-04  9:53             ` Eric Botcazou
2022-02-04 10:07               ` Jakub Jelinek
2022-02-04 10:14                 ` Eric Botcazou
2022-02-04 10:26                   ` Jakub Jelinek
2022-02-04 10:29                     ` Jakub Jelinek
2022-01-31  8:28     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).