public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Thomas Schwinge <thomas@codesourcery.com>
To: Aldy Hernandez <aldyh@redhat.com>, <gcc-patches@gcc.gnu.org>
Cc: Andrew MacLeod <amacleod@redhat.com>
Subject: Re: Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.))
Date: Fri, 21 Oct 2022 10:38:14 +0200	[thread overview]
Message-ID: <875ygdefm1.fsf@dem-tschwing-1.ger.mentorg.com> (raw)
In-Reply-To: <CAGm3qMV5_7hEED6_NKNAFaiE5dFXapsrRGEd_MAqNiSsF15nmw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4274 bytes --]

Hi!

On 2022-10-21T00:44:30+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
> On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
>> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?
>
> I see 7 different tests in this patch.  Did the 6 that pass, fail
> before my patch for PR107195 and are now working?   Cause unless
> that's the case, they shouldn't be in a test named pr107195-3.c, but
> somewhere else.

That's correct; I should've mentioned that I had verified this.  With the
code changes of commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0" reverted, we get:

    PASS: gcc.dg/tree-ssa/pr107195-3.c (test for excess errors)
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo1," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo2," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo3," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo4," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo5," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo6," 1

..., and in 'pr107195-3.c.196t.dom3' instead see two calls of each
'foo[...]' function.

That's with this...

> I see there's one XFAILed test in your patch

... XFAILed test case removed, see the attached
"Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]";
OK now to push that version?


> and this certainly
> doesn't look like something that has anything to do with the patch I
> submitted.  Perhaps you could open a PR with an enhancement request
> for this one?
>
> That being said...
>
> /* { dg-additional-options -O1 } */
> extern int
> __attribute__((const))
> foo4b (int);
>
> int f4b (unsigned int r)
> {
>   if (foo4b (r))
>     r *= 8U;
>
>   if ((r / 2U) & 2U)
>     r += foo4b (r);
>
>   return r;
> }
> /* { dg-final { scan-tree-dump-times {gimple_call <foo4b,} 1 dom3 {
> xfail *-*-* } } } */
>
> At -O2, this is something PRE is doing,  so GCC already handles this.
> However, you are suggesting this isn't handled at -O1 and should be??

My thinking was that this optimization does work for 'r >> 1', but it
doesn't work for 'r / 2'.

> None of the VRPs run at -O1 so ranger-vrp won't even get a chance.
> However, DOM runs at -O1 and it uses ranger to do simple copy
> propagation and some jump threading...so technically we could do
> something...
>
> DOM should be able to thread from the r *= 8U to the return because
> the nonzero mask (known zeros) after the multiplication is 0xfffffff8,
> which it could use to solve the second conditional as false.  This
> would leave us with:
>
> if (foo4b (r))
>   {
>     r *= 8U;
>    return r;
>   }
> else
>   {
>      if ((r / 2U) & 2U)
>        r += foo4b (r);
>   }
>
> ...which exposes the fact that the second call to foo4b() has the same
> "r" as the first one, so it could be folded.  I don't know whose job
> it is to notice that two const calls have the same arguments, but ISTM
> that if we thread the above correctly, someone should be able to clean
> this up.  No clue whether this happens at -O1.
>
> However... we're not threading this.  It looks like we're not keeping
> track of nonzero bits (known zeros) through the division.  The
> multiplication gives us 0xfffffff8 and we should be able to divide
> that by 2 and get 0x7ffffffc which solves the second conditional to 0.
>
> So...maybe DOM+ranger could set things up for another pass to clean this up?
>
> Either way, you could open an enhancement request, if anything to keep
> the nonzero mask up to date through the division.

I've thus filed <https://gcc.gnu.org/PR107342>
"Optimization opportunity where integer '/' corresponds to '>>'" for
continuing that investigation.


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Add-gcc.dg-tree-ssa-pr107195-3.c-PR107195.patch --]
[-- Type: text/x-diff, Size: 3059 bytes --]

From e55e8569201c482507550eb56ff16aa3bbb48676 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Mon, 17 Oct 2022 09:10:03 +0200
Subject: [PATCH] Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]

... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".

	PR tree-optimization/107195
	gcc/testsuite/
	* gcc.dg/tree-ssa/pr107195-3.c: New.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c | 112 +++++++++++++++++++++
 1 file changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
new file mode 100644
index 00000000000..eba4218b3c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
@@ -0,0 +1,112 @@
+/* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'.  */
+
+/* { dg-additional-options -O1 } */
+/* { dg-additional-options -fdump-tree-dom3-raw } */
+
+
+extern int
+__attribute__((const))
+foo1 (int);
+
+int f1 (int r)
+{
+  if (foo1 (r)) /* If this first 'if' holds...  */
+    r *= 2; /* ..., 'r' now has a zero-value lower-most bit...  */
+
+  if (r & 1) /* ..., so this second 'if' can never hold...  */
+    { /* ..., so this is unreachable.  */
+      /* In constrast, if the first 'if' does not hold ('foo1 (r) == 0'), the
+	 second 'if' may hold, but we know ('foo1' being 'const') that
+	 'foo1 (r) == 0', so don't have to re-evaluate it here: */
+      r += foo1 (r);
+    }
+
+  return r;
+}
+/* Thus, if optimizing, we only ever expect one call of 'foo1'.
+   { dg-final { scan-tree-dump-times {gimple_call <foo1,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo2 (int);
+
+int f2 (int r)
+{
+  if (foo2 (r))
+    r *= 8;
+
+  if (r & 7)
+    r += foo2 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo2,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo3 (int);
+
+int f3 (int r)
+{
+  if (foo3 (r))
+    r <<= 4;
+
+  if ((r & 64) && ((r & 8) || (r & 4) || (r & 2) || (r & 1)))
+    r += foo3 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo3,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo4 (int);
+
+int f4 (int r)
+{
+  if (foo4 (r))
+    r *= 8;
+
+  if ((r >> 1) & 2)
+    r += foo4 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo4,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo5 (int);
+
+int f5 (int r) /* Works for both 'signed' and 'unsigned'.  */
+{
+  if (foo5 (r))
+    r *= 2;
+
+  if ((r % 2) != 0)
+    r += foo5 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo5,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo6 (int);
+
+int f6 (unsigned int r) /* 'unsigned' is important here.  */
+{
+  if (foo6 (r))
+    r *= 2;
+
+  if ((r % 2) == 1)
+    r += foo6 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo6,} 1 dom3 } } */
-- 
2.25.1


  reply	other threads:[~2022-10-21  8:38 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-11  8:31 [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0 Aldy Hernandez
2022-10-17  7:43 ` Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.) Thomas Schwinge
2022-10-17 13:58   ` Aldy Hernandez
2022-10-17 14:46     ` Thomas Schwinge
2022-10-18  5:41       ` Aldy Hernandez
2022-10-20 11:38         ` Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.)) Thomas Schwinge
2022-10-20 12:23           ` Aldy Hernandez
2022-10-20 19:22             ` Thomas Schwinge
2022-10-20 22:44               ` Aldy Hernandez
2022-10-21  8:38                 ` Thomas Schwinge [this message]
2022-10-21  8:51                   ` Aldy Hernandez
2022-10-21  9:36   ` Restore 'libgomp.oacc-c-c++-common/nvptx-sese-1.c' SESE regions checking [PR107195, PR107344] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.) Thomas Schwinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=875ygdefm1.fsf@dem-tschwing-1.ger.mentorg.com \
    --to=thomas@codesourcery.com \
    --cc=aldyh@redhat.com \
    --cc=amacleod@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).