From: Thomas Schwinge
To: Aldy Hernandez
CC: Andrew MacLeod
Subject: Re: Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.))
References: <20221011083137.336470-1-aldyh@redhat.com> <878rlej3o6.fsf@euler.schwinge.homeip.net> <87o7uafqyf.fsf@dem-tschwing-1.ger.mentorg.com> <87y1taencs.fsf@dem-tschwing-1.ger.mentorg.com> <87mt9qe1wf.fsf@dem-tschwing-1.ger.mentorg.com>
Date: Fri, 21 Oct 2022 10:38:14 +0200
Message-ID: <875ygdefm1.fsf@dem-tschwing-1.ger.mentorg.com>

Hi!

On 2022-10-21T00:44:30+0200, Aldy Hernandez wrote:
> On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge wrote:
>> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?
>
> I see 7 different tests in this patch. Did the 6 that pass, fail
> before my patch for PR107195 and are now working? Cause unless
> that's the case, they shouldn't be in a test named pr107195-3.c, but
> somewhere else.

That's correct; I should've mentioned that I had verified this. With the
code changes of commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0" reverted, we get:

    PASS: gcc.dg/tree-ssa/pr107195-3.c (test for excess errors)
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call

> I see there's one XFAILed test in your patch ...

XFAILed test case removed, see the attached
"Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]"; OK now to push that
version?

> and this certainly
> doesn't look like something that has anything to do with the patch I
> submitted. Perhaps you could open a PR with an enhancement request
> for this one?
>
> That being said...
>
> /* { dg-additional-options -O1 } */
> extern int
> __attribute__((const))
> foo4b (int);
>
> int f4b (unsigned int r)
> {
>   if (foo4b (r))
>     r *= 8U;
>
>   if ((r / 2U) & 2U)
>     r += foo4b (r);
>
>   return r;
> }
> /* { dg-final { scan-tree-dump-times {gimple_call xfail *-*-* } } } */
>
> At -O2, this is something PRE is doing, so GCC already handles this.
> However, you are suggesting this isn't handled at -O1 and should be??

My thinking was that this optimization does work for 'r >> 1', but it
doesn't work for 'r / 2'.

> None of the VRPs run at -O1 so ranger-vrp won't even get a chance.
> However, DOM runs at -O1 and it uses ranger to do simple copy
> propagation and some jump threading...so technically we could do
> something...
>
> DOM should be able to thread from the r *= 8U to the return because
> the nonzero mask (known zeros) after the multiplication is 0xfffffff8,
> which it could use to solve the second conditional as false. This
> would leave us with:
>
> if (foo4b (r))
>   {
>     r *= 8U;
>     return r;
>   }
> else
>   {
>     if ((r / 2U) & 2U)
>       r += foo4b (r);
>   }
>
> ...which exposes the fact that the second call to foo4b() has the same
> "r" as the first one, so it could be folded. I don't know whose job
> it is to notice that two const calls have the same arguments, but ISTM
> that if we thread the above correctly, someone should be able to clean
> this up. No clue whether this happens at -O1.
>
> However... we're not threading this. It looks like we're not keeping
> track of nonzero bits (known zeros) through the division. The
> multiplication gives us 0xfffffff8 and we should be able to divide
> that by 2 and get 0x7ffffffc which solves the second conditional to 0.
>
> So...maybe DOM+ranger could set things up for another pass to clean this up?
>
> Either way, you could open an enhancement request, if anything to keep
> the nonzero mask up to date through the division.

I've thus filed "Optimization opportunity where integer '/' corresponds
to '>>'" for continuing that investigation.

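To illustrate the mask arithmetic discussed above, here is a minimal
standalone C sketch. It is not GCC code: the helper names
('mask_after_mul8', 'mask_after_div2', 'check_f4b_masks',
'count_counterexamples') are hypothetical, and the "possibly nonzero bits"
mask is modeled as a plain 'uint32_t', assuming a 32-bit 'unsigned int'.
It propagates the mask through 'r *= 8U' and 'r / 2U', and cross-checks the
conclusion against sampled values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC API): bits that may be nonzero after an
   unsigned multiplication by 8, which is a left shift by 3.  */
uint32_t
mask_after_mul8 (uint32_t mask)
{
  return mask << 3;
}

/* Hypothetical helper (not a GCC API): bits that may be nonzero after an
   unsigned division by 2, which is a logical right shift by 1.  */
uint32_t
mask_after_div2 (uint32_t mask)
{
  return mask >> 1;
}

/* Walk the 'foo4b (r) != 0' path of 'f4b' using masks only.  */
void
check_f4b_masks (void)
{
  uint32_t mask = UINT32_MAX;      /* Nothing is known about 'r'.  */

  mask = mask_after_mul8 (mask);   /* After 'r *= 8U'.  */
  assert (mask == 0xfffffff8U);

  mask = mask_after_div2 (mask);   /* After 'r / 2U'.  */
  assert (mask == 0x7ffffffcU);

  /* Bit 1 is known to be zero, so '(r / 2U) & 2U' is always false here.  */
  assert ((mask & 2U) == 0);
}

/* Cross-check on sampled values: count cases where 'm / 2U' differs from
   'm >> 1', or where '(m / 2U) & 2U' is nonzero, for 'm = r * 8U'.  */
unsigned
count_counterexamples (void)
{
  unsigned bad = 0;

  for (uint32_t i = 0; i < 100000U; i++)
    {
      uint32_t r = i * 2654435761U;   /* Arbitrary spread of inputs.  */
      uint32_t m = r * 8U;

      if ((m / 2U) != (m >> 1))
        bad++;
      if ((m / 2U) & 2U)
        bad++;
    }

  return bad;
}
```

Calling 'check_f4b_masks ()' is silent on success, and
'count_counterexamples ()' returning 0 means that no sampled value
contradicts the mask-based conclusion. The sketch only models the
arithmetic; it says nothing about where in DOM or ranger such a propagation
through the division would belong.
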
Regards
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955

From e55e8569201c482507550eb56ff16aa3bbb48676 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Mon, 17 Oct 2022 09:10:03 +0200
Subject: [PATCH] Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]

... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".

	PR tree-optimization/107195
	gcc/testsuite/
	* gcc.dg/tree-ssa/pr107195-3.c: New.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c | 112 +++++++++++++++++++++
 1 file changed, 112 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
new file mode 100644
index 00000000000..eba4218b3c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
@@ -0,0 +1,112 @@
+/* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'. */
+
+/* { dg-additional-options -O1 } */
+/* { dg-additional-options -fdump-tree-dom3-raw } */
+
+
+extern int
+__attribute__((const))
+foo1 (int);
+
+int f1 (int r)
+{
+  if (foo1 (r)) /* If this first 'if' holds... */
+    r *= 2; /* ..., 'r' now has a zero-value lower-most bit... */
+
+  if (r & 1) /* ..., so this second 'if' can never hold... */
+    { /* ..., so this is unreachable. */
+      /* In contrast, if the first 'if' does not hold ('foo1 (r) == 0'), the
+         second 'if' may hold, but we know ('foo1' being 'const') that
+         'foo1 (r) == 0', so don't have to re-evaluate it here: */
+      r += foo1 (r);
+    }
+
+  return r;
+}
+/* Thus, if optimizing, we only ever expect one call of 'foo1'.
+   { dg-final { scan-tree-dump-times {gimple_call > 1) & 2)
+    r += foo4 (r);
+
+  return r;
+}
+/* { dg-final { scan-tree-dump-times {gimple_call