From: Thomas Schwinge <thomas@codesourcery.com>
To: Aldy Hernandez <aldyh@redhat.com>, <gcc-patches@gcc.gnu.org>
Cc: Andrew MacLeod <amacleod@redhat.com>
Subject: Re: Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.))
Date: Fri, 21 Oct 2022 10:38:14 +0200
Message-ID: <875ygdefm1.fsf@dem-tschwing-1.ger.mentorg.com>
In-Reply-To: <CAGm3qMV5_7hEED6_NKNAFaiE5dFXapsrRGEd_MAqNiSsF15nmw@mail.gmail.com>
Hi!
On 2022-10-21T00:44:30+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
> On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
>> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?
>
> I see 7 different tests in this patch. Did the 6 that pass, fail
> before my patch for PR107195 and are now working? Cause unless
> that's the case, they shouldn't be in a test named pr107195-3.c, but
> somewhere else.

That's correct; I should've mentioned that I had verified this. With the
code changes of commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0" reverted, we get:

PASS: gcc.dg/tree-ssa/pr107195-3.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo1," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo2," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo3," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo4," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo5," 1
FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo6," 1

..., and in 'pr107195-3.c.196t.dom3' we instead see two calls of each
'foo[...]' function.

That's with this...

> I see there's one XFAILed test in your patch

... XFAILed test case removed, see the attached
"Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]";
OK now to push that version?

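As a sketch of what the six checks expect (not part of the patch), the
single-call form of 'f1' can be written by hand and compared against the
original. The body of 'foo1', the counter 'foo1_calls', 'f1_single_call'
and 'self_check' are made up for illustration; in the test case 'foo1' is
only declared 'extern'.

```c
#include <assert.h>

/* Hypothetical stand-in for the test's 'extern' 'foo1'.  The result depends
   only on the argument; the call counter exists purely so that the number
   of calls becomes observable here.  */
static int foo1_calls;

static int
foo1 (int r)
{
  ++foo1_calls;
  return r & 4;
}

/* 'f1' as in the test case.  */
static int
f1 (int r)
{
  if (foo1 (r))
    r *= 2;

  if (r & 1)
    r += foo1 (r);

  return r;
}

/* Hand-written single-call form: if 'foo1 (r)' is non-zero, 'r * 2' is even
   and the second 'if' is dead; otherwise 'r' is unchanged and the second
   call would only add zero.  */
static int
f1_single_call (int r)
{
  if (foo1 (r))
    r *= 2;

  return r;
}

/* Both forms agree over a small range (kept small to avoid signed
   overflow in 'r * 2').  */
static void
self_check (void)
{
  for (int r = -1000; r <= 1000; ++r)
    assert (f1 (r) == f1_single_call (r));
}
```

With this stand-in, 'f1 (1)' bumps 'foo1_calls' twice when compiled without
optimization, whereas 'f1_single_call (1)' bumps it once; the latter is the
shape the 'scan-tree-dump-times' checks look for in the dom3 dump.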
> and this certainly
> doesn't look like something that has anything to do with the patch I
> submitted. Perhaps you could open a PR with an enhancement request
> for this one?
>
> That being said...
>
> /* { dg-additional-options -O1 } */
> extern int
> __attribute__((const))
> foo4b (int);
>
> int f4b (unsigned int r)
> {
>   if (foo4b (r))
>     r *= 8U;
>
>   if ((r / 2U) & 2U)
>     r += foo4b (r);
>
>   return r;
> }
> /* { dg-final { scan-tree-dump-times {gimple_call <foo4b,} 1 dom3 { xfail *-*-* } } } */
>
> At -O2, this is something PRE is doing, so GCC already handles this.
> However, you are suggesting this isn't handled at -O1 and should be??

My thinking was that this optimization does work for 'r >> 1', but it
doesn't work for 'r / 2'.

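To make the equivalence concrete, here is a minimal sketch (not part of the
patch) showing that for 'unsigned int' the two conditions compute the same
value, and that both are zero once 'r' has been multiplied by 8. The names
'cond_div', 'cond_shift' and 'self_check' are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* The condition from 'f4b', written with a division.  */
static unsigned int
cond_div (unsigned int r)
{
  return (r / 2U) & 2U;
}

/* The same condition written with a shift, as in 'f4'.  */
static unsigned int
cond_shift (unsigned int r)
{
  return (r >> 1) & 2U;
}

static void
self_check (void)
{
  const unsigned int samples[]
    = { 0U, 1U, 2U, 3U, 4U, 5U, 6U, 7U,
	UINT_MAX / 2U, UINT_MAX / 2U + 1U, UINT_MAX };

  for (unsigned int i = 0; i < sizeof samples / sizeof samples[0]; ++i)
    {
      unsigned int r = samples[i];

      /* Unsigned division by 2 and a right shift by 1 agree.  */
      assert (cond_div (r) == cond_shift (r));

      /* After 'r *= 8U', bits 0..2 are zero; halving moves that to
	 bits 0..1, so testing bit 1 always yields zero.  */
      assert (cond_div (r * 8U) == 0U);
      assert (cond_shift (r * 8U) == 0U);
    }
}
```

This only demonstrates that the values agree; it says nothing about which
pass ought to notice that.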
> None of the VRPs run at -O1 so ranger-vrp won't even get a chance.
> However, DOM runs at -O1 and it uses ranger to do simple copy
> propagation and some jump threading...so technically we could do
> something...
>
> DOM should be able to thread from the r *= 8U to the return because
> the nonzero mask (known zeros) after the multiplication is 0xfffffff8,
> which it could use to solve the second conditional as false. This
> would leave us with:
>
>   if (foo4b (r))
>     {
>       r *= 8U;
>       return r;
>     }
>   else
>     {
>       if ((r / 2U) & 2U)
>         r += foo4b (r);
>     }
>
> ...which exposes the fact that the second call to foo4b() has the same
> "r" as the first one, so it could be folded. I don't know whose job
> it is to notice that two const calls have the same arguments, but ISTM
> that if we thread the above correctly, someone should be able to clean
> this up. No clue whether this happens at -O1.
>
> However... we're not threading this. It looks like we're not keeping
> track of nonzero bits (known zeros) through the division. The
> multiplication gives us 0xfffffff8 and we should be able to divide
> that by 2 and get 0x7ffffffc which solves the second conditional to 0.
>
> So...maybe DOM+ranger could set things up for another pass to clean this up?
>
> Either way, you could open an enhancement request, if anything to keep
> the nonzero mask up to date through the division.

I've thus filed <https://gcc.gnu.org/PR107342>
"Optimization opportunity where integer '/' corresponds to '>>'" for
continuing that investigation.

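For the record, the nonzero-mask arithmetic quoted above can be sketched as
follows. This is an illustration under stated assumptions, not GCC's
implementation: the helpers 'mask_after_mul_pow2', 'mask_after_udiv_pow2',
'mask_after_and' and 'self_check' are hypothetical, operate on a plain
32-bit mask, and only cover unsigned multiplication and division by a power
of two.

```c
#include <assert.h>
#include <stdint.h>

/* A set bit in a nonzero mask means "this bit may be 1"; a clear bit means
   "this bit is known to be 0".  */

/* Unsigned multiplication by 2^log2 shifts the possibly-set bits left.  */
static uint32_t
mask_after_mul_pow2 (uint32_t mask, unsigned int log2)
{
  return mask << log2;
}

/* Unsigned division by 2^log2 is a logical right shift, so the mask can be
   shifted right the same way.  */
static uint32_t
mask_after_udiv_pow2 (uint32_t mask, unsigned int log2)
{
  return mask >> log2;
}

/* Bitwise AND with a constant can only keep bits set in both.  */
static uint32_t
mask_after_and (uint32_t mask, uint32_t constant)
{
  return mask & constant;
}

static void
self_check (void)
{
  uint32_t mask = UINT32_C (0xffffffff); /* Nothing known about 'r'.  */

  mask = mask_after_mul_pow2 (mask, 3); /* r *= 8U  */
  assert (mask == UINT32_C (0xfffffff8));

  mask = mask_after_udiv_pow2 (mask, 1); /* r / 2U  */
  assert (mask == UINT32_C (0x7ffffffc));

  /* (r / 2U) & 2U: no bit can be set, so the condition is always false.  */
  assert (mask_after_and (mask, UINT32_C (2)) == 0);
}
```

The two intermediate masks match the 0xfffffff8 and 0x7ffffffc values
mentioned in the quoted text.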
Regards
Thomas
[-- Attachment #2: 0001-Add-gcc.dg-tree-ssa-pr107195-3.c-PR107195.patch --]
[-- Type: text/x-diff, Size: 3059 bytes --]
From e55e8569201c482507550eb56ff16aa3bbb48676 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Mon, 17 Oct 2022 09:10:03 +0200
Subject: [PATCH] Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]
... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".
PR tree-optimization/107195
gcc/testsuite/
* gcc.dg/tree-ssa/pr107195-3.c: New.
---
gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c | 112 +++++++++++++++++++++
1 file changed, 112 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
new file mode 100644
index 00000000000..eba4218b3c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
@@ -0,0 +1,112 @@
+/* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'. */
+
+/* { dg-additional-options -O1 } */
+/* { dg-additional-options -fdump-tree-dom3-raw } */
+
+
+extern int
+__attribute__((const))
+foo1 (int);
+
+int f1 (int r)
+{
+  if (foo1 (r)) /* If this first 'if' holds... */
+    r *= 2; /* ..., 'r' now has a zero-value lower-most bit... */
+
+  if (r & 1) /* ..., so this second 'if' can never hold... */
+    { /* ..., so this is unreachable. */
+      /* In contrast, if the first 'if' does not hold ('foo1 (r) == 0'), the
+         second 'if' may hold, but we know ('foo1' being 'const') that
+         'foo1 (r) == 0', so don't have to re-evaluate it here: */
+      r += foo1 (r);
+    }
+
+  return r;
+}
+/* Thus, if optimizing, we only ever expect one call of 'foo1'.
+ { dg-final { scan-tree-dump-times {gimple_call <foo1,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo2 (int);
+
+int f2 (int r)
+{
+  if (foo2 (r))
+    r *= 8;
+
+  if (r & 7)
+    r += foo2 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo2,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo3 (int);
+
+int f3 (int r)
+{
+  if (foo3 (r))
+    r <<= 4;
+
+  if ((r & 64) && ((r & 8) || (r & 4) || (r & 2) || (r & 1)))
+    r += foo3 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo3,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo4 (int);
+
+int f4 (int r)
+{
+  if (foo4 (r))
+    r *= 8;
+
+  if ((r >> 1) & 2)
+    r += foo4 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo4,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo5 (int);
+
+int f5 (int r) /* Works for both 'signed' and 'unsigned'. */
+{
+  if (foo5 (r))
+    r *= 2;
+
+  if ((r % 2) != 0)
+    r += foo5 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo5,} 1 dom3 } } */
+
+
+extern int
+__attribute__((const))
+foo6 (int);
+
+int f6 (unsigned int r) /* 'unsigned' is important here. */
+{
+  if (foo6 (r))
+    r *= 2;
+
+  if ((r % 2) == 1)
+    r += foo6 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call <foo6,} 1 dom3 } } */
--
2.25.1