public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Aldy Hernandez <aldyh@redhat.com>
To: Thomas Schwinge <thomas@codesourcery.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.)
Date: Mon, 17 Oct 2022 15:58:47 +0200	[thread overview]
Message-ID: <CAGm3qMWQLevXbN+6W81vZ+BMd-upW6LBouM+ENn8Mpe9X7+ZNg@mail.gmail.com> (raw)
In-Reply-To: <878rlej3o6.fsf@euler.schwinge.homeip.net>

On Mon, Oct 17, 2022 at 9:44 AM Thomas Schwinge <thomas@codesourcery.com> wrote:
>
> Hi!
>
> On 2022-10-11T10:31:37+0200, Aldy Hernandez via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > When solving 0 = _15 & 1, we calculate _15 as:
> >
> >       [irange] int [-INF, -2][0, +INF] NONZERO 0xfffffffe
> >
> > The known value of _15 is [0, 1] NONZERO 0x1 which is intersected with
> > the above, yielding:
> >
> >       [0, 1] NONZERO 0x0
> >
> > This eventually gets copied to a _Bool [0, 1] NONZERO 0x0.
> >
> > This is problematic because here we have a bool which is zero, but
> > returns false for irange::zero_p, since the latter does not look at
> > nonzero bits.  This causes logical_combine to assume the range is
> > not-zero, and all hell breaks loose.
> >
> > I think we should just normalize a nonzero mask of 0 to [0, 0] at
> > creation, thus avoiding all this.
>
> 1. This commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
> "[PR107195] Set range to zero when nonzero mask is 0" broke a GCC/nvptx
> offloading test case:
>
>     UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
>     PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
>     PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
>     [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}"
>
> Same for C++.
>
> I'll later send a patch (for the test case!) to fix that up.
>
> 2. Looking into this, I found that this
> commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
> "[PR107195] Set range to zero when nonzero mask is 0" actually enables a
> code transformation/optimization that GCC apparently has not been doing
> before!  I've tried to capture that in the attached
> "Add 'c-c++-common/torture/pr107195-1.c' [PR107195]".

Nice.

>
> Will you please verify that one?  In its current '#if 1' configuration,
> it's all-PASS after commit
> r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
> "[PR107195] Set range to zero when nonzero mask is 0", whereas before, we
> get two calls to 'foo', because GCC apparently didnn't understand the
> relation (optimization opportunity) between 'r *= 2;' and the subsequent
> 'if (r & 1)'.

Yeah, that looks correct.  We keep better track of nonzero masks.

>
> I've left in the other '#if' variants in case you'd like to experiment
> with these, but would otherwise clean that up before pushing.
>
> Where does one put such a test case?
>
> Should the file be named 'pr107195' or something else?

The aforementioned patch already has:

            * gcc.dg/tree-ssa/pr107195-1.c: New test.
            * gcc.dg/tree-ssa/pr107195-2.c: New test.

So I would just add a pr107195-3.c test.

>
> Do we scan 'optimized', or an earlier dump?
>
> At '-O1', the actual code transformation is visible already in the 'dom2'
> dump:
>
>        <bb 3> [local count: 536870913]:
>        gimple_assign <mult_expr, r_7, r_6(D), 2, NULL>
>     +  gimple_assign <bit_and_expr, _11, r_7, 1, NULL>
>     +  goto <bb 6>; [100.00%]
>
>     -  <bb 4> [local count: 1073741824]:
>     -  # gimple_phi <r_4, r_6(D)(2), r_7(3)>
>     +  <bb 4> [local count: 536870912]:
>     +  # gimple_phi <r_4, r_6(D)(2)>
>        gimple_assign <bit_and_expr, _2, r_4, 1, NULL>
>        gimple_cond <ne_expr, _2, 0, NULL, NULL>
>     -    goto <bb 5>; [50.00%]
>     +    goto <bb 5>; [100.00%]
>        else
>     -    goto <bb 6>; [50.00%]
>     +    goto <bb 6>; [0.00%]
>
>        <bb 5> [local count: 536870913]:
>        gimple_call <foo, _3, r_4>
>        gimple_assign <plus_expr, r_8, _3, r_4, NULL>
>
>        <bb 6> [local count: 1073741824]:
>     -  # gimple_phi <r_5, r_4(4), r_8(5)>
>     +  # gimple_phi <r_5, r_4(4), r_8(5), r_7(3)>
>        gimple_return <r_5>
>
> And, the actual "avoid second call 'foo'" optimization is visiable
> starting 'dom3':
>
>        <bb 3> [local count: 536870913]:
>        gimple_assign <mult_expr, r_7, r_6(D), 2, NULL>
>     +  goto <bb 6>; [100.00%]
>
>     -  <bb 4> [local count: 1073741824]:
>     -  # gimple_phi <r_4, r_6(D)(2), r_7(3)>
>     -  gimple_assign <bit_and_expr, _2, r_4, 1, NULL>
>     +  <bb 4> [local count: 536870912]:
>     +  gimple_assign <bit_and_expr, _2, r_6(D), 1, NULL>
>        gimple_cond <ne_expr, _2, 0, NULL, NULL>
>     -    goto <bb 5>; [50.00%]
>     +    goto <bb 5>; [100.00%]
>        else
>     -    goto <bb 6>; [50.00%]
>     +    goto <bb 6>; [0.00%]
>
>        <bb 5> [local count: 536870913]:
>     -  gimple_call <foo, _3, r_4>
>     -  gimple_assign <plus_expr, r_8, _3, r_4, NULL>
>     +  gimple_assign <integer_cst, _3, 0, NULL, NULL>
>     +  gimple_assign <ssa_name, r_8, r_6(D), NULL, NULL>
>
>        <bb 6> [local count: 1073741824]:
>     -  # gimple_phi <r_5, r_4(4), r_8(5)>
>     +  # gimple_phi <r_5, r_6(D)(4), r_6(D)(5), r_7(3)>
>        gimple_return <r_5>
>
> ..., but I don't know if either of those would be stable/appropriate to
> scan instead of 'optimized'?

IMO, either dom3 or optimized is fine.

Thanks.
Aldy


  reply	other threads:[~2022-10-17 13:59 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-11  8:31 [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0 Aldy Hernandez
2022-10-17  7:43 ` Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.) Thomas Schwinge
2022-10-17 13:58   ` Aldy Hernandez [this message]
2022-10-17 14:46     ` Thomas Schwinge
2022-10-18  5:41       ` Aldy Hernandez
2022-10-20 11:38         ` Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.)) Thomas Schwinge
2022-10-20 12:23           ` Aldy Hernandez
2022-10-20 19:22             ` Thomas Schwinge
2022-10-20 22:44               ` Aldy Hernandez
2022-10-21  8:38                 ` Thomas Schwinge
2022-10-21  8:51                   ` Aldy Hernandez
2022-10-21  9:36   ` Restore 'libgomp.oacc-c-c++-common/nvptx-sese-1.c' SESE regions checking [PR107195, PR107344] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.) Thomas Schwinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGm3qMWQLevXbN+6W81vZ+BMd-upW6LBouM+ENn8Mpe9X7+ZNg@mail.gmail.com \
    --to=aldyh@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=thomas@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).