From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.guenther@gmail.com>
Received: from mail-ej1-x62b.google.com (mail-ej1-x62b.google.com
 [IPv6:2a00:1450:4864:20::62b])
 by sourceware.org (Postfix) with ESMTPS id 7DCCD3853C03
 for <gcc-patches@gcc.gnu.org>; Wed, 28 Jul 2021 09:59:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7DCCD3853C03
Received: by mail-ej1-x62b.google.com with SMTP id e19so3612769ejs.9
 for <gcc-patches@gcc.gnu.org>; Wed, 28 Jul 2021 02:59:24 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc:content-transfer-encoding;
 bh=er1q4HYTe33sVmwGz/T/h2m+QqPPeYsNWiROF3EAjJE=;
 b=C79eH1H0vmXapvlJfqRvRX7dDMb95HBgZclBEZycxMRJtxW52Ss7FIu2I5KGcYtJBd
 OeArOjZQfOHulh9xu48xKEIZsdpHlgGCWfnIEIf7eD8fYJ5dLgYqfv4x+QB8nX8CrfQv
 PQYIf0cJ/MI8EXtCE2PyP9+XJBoHlvf8MYQyeR63/p6loo3Da6IkNNkYIk4NCC954vnm
 Kt4934t1fPK0FGW5hJ/MWKuPO5IMK0R5c9ei5DNmPZXHTU/4+o38wrbQmyqMhRqZNOQx
 4tOF/E1XHEaDaBd/GrmjE+jgMOR5mGWDkGh3g5Cde0slyTBua91crFimpEZDiKuweFMb
 Uedg==
X-Gm-Message-State: AOAM532JFFWs6TaHjbvaK8Ew3qlqmArR8tlZdN4VJgj72M+X7sBLn0Bz
 nz7jKinh8qL6s9sqGhnvKlUPmkl4jd5DBlTItFQ=
X-Google-Smtp-Source: ABdhPJzHnYV4/FHD0n3Ql8s3TrHmas4niMyhp+4mUVD3yNTx2kE8XlE+l0qO66n4AetXrwDQUBeYhc1jUC0xKtrqPtY=
X-Received: by 2002:a17:906:c834:: with SMTP id
 dd20mr19437846ejb.371.1627466363450; 
 Wed, 28 Jul 2021 02:59:23 -0700 (PDT)
MIME-Version: 1.0
References: <CY4PR2101MB0801A8FFC0954BA013219E76CC7C9@CY4PR2101MB0801.namprd21.prod.outlook.com>
 <CAFiYyc1NJXgedUqSzDquv0odhZdcfP1zepSnOP27DSEg3eDy3g@mail.gmail.com>
 <MWHPR2101MB0811F770CA794056CA891FD5CC3D9@MWHPR2101MB0811.namprd21.prod.outlook.com>
 <CAFiYyc3DV6XrnQuU3LywpasS6BWoxC9-+4CDoEFXWSynmF92ww@mail.gmail.com>
 <MWHPR2101MB08118007CBC33CDFA4BA09FBCC0F9@MWHPR2101MB0811.namprd21.prod.outlook.com>
 <CAFiYyc3BVkKOMiVGPFQVOqp+YEKPZKeN68nfe72gmbj6o_M25Q@mail.gmail.com>
 <1e409dd-c5a8-a59f-d7ba-86ae805ab54f@hippo.saclay.inria.fr>
 <CAFiYyc06F5td4+JYS4MorjDKKE9JBgO0nPiQsi2n3ogi+xc8TA@mail.gmail.com>
 <MWHPR2101MB08112E560629D93249766876CC039@MWHPR2101MB0811.namprd21.prod.outlook.com>
In-Reply-To: <MWHPR2101MB08112E560629D93249766876CC039@MWHPR2101MB0811.namprd21.prod.outlook.com>
From: Richard Biener <richard.guenther@gmail.com>
Date: Wed, 28 Jul 2021 11:59:12 +0200
Message-ID: <CAFiYyc0ypRj51YQ8N1iF0dpi3gm06fLKZsTMyWVDx=d159Fsnw@mail.gmail.com>
Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division
 followed by multiply [PR95176]
To: Victor Tong <vitong@microsoft.com>
Cc: Marc Glisse <marc.glisse@inria.fr>, 
 "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 28 Jul 2021 09:59:26 -0000

On Tue, Jun 29, 2021 at 1:10 AM Victor Tong <vitong@microsoft.com> wrote:
>
> Thanks Richard and Marc.
>
> I wrote the following test case to compare the outputs of fn1() and fn1No=
Opt() below with my extra pattern being applied. I tested the two functions=
 with all of the integers from INT_MIN to INT_MAX.
>
> long
> fn1 (int x)
> {
>   return 42L - (long)(42 - x);
> }
>
> #pragma GCC push_options
> #pragma GCC optimize ("O0")
> long
> fn1NoOpt (int x)
> {
>   volatile int y =3D (42 - x);
>   return 42L - (long)y;
> }
> #pragma GCC pop_options
>
> int main ()
> {
>         for (long i=3DINT_MIN; i<=3DINT_MAX;i++)
>         {
>                 auto valNoOpt =3D fn1NoOpt(i);
>                 auto valOpt =3D fn1(i);
>                 if (valNoOpt !=3D valOpt)
>                         printf("valOpt=3D%ld, valNoOpt=3D%ld\n", valOpt, =
valNoOpt);
>         }
>         return 0;
> }
>
> I saw that the return values of fn1() and fn1NoOpt() differed when the in=
put was between INT_MIN and INT_MIN+42 inclusive. When passing values in th=
is range to fn1NoOpt(), a signed overflow is triggered which causes the val=
ue to differ (undefined behavior). This seems to go in line with what Marc =
described and I think the transformation is correct in the scenario above. =
I do think that type casts that result in truncation (i.e. from a higher pr=
ecision to a lower one) or with unsigned types will result in an incorrect =
transformation so those scenarios need to be avoided.
>
> Given that the extra pattern I'm adding is taking advantage the undefined=
 behavior of signed integer overflow, I'm considering keeping the existing =
nop_convert pattern in place and adding a new pattern to cover these new ca=
ses. I'd also like to avoid touching nop_convert given that it's used in a =
number of other patterns.
>
> This is the pattern I have currently:
>
>   (simplify
>     (minus (convert1? @0) (convert2? (minus (convert3? @2) @1)))
>     (if (operand_equal_p(@0, @2, 0)

The operand_equal_p should be reflected by using @0 in place of @2.

>         && INTEGRAL_TYPE_P (type)
>         && TYPE_OVERFLOW_UNDEFINED(type)
>         && !TYPE_OVERFLOW_SANITIZED(type)
>         && INTEGRAL_TYPE_P (TREE_TYPE(@1))
>         && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@1))
>         && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@1))
>         && !TYPE_UNSIGNED (TREE_TYPE (@1))
>         && !TYPE_UNSIGNED (type)

please group checks on common argument.  I think a single test
on INTEGRAL_TYPE_P (type) is enough, it could be ANY_INTEGRAL_TYPE_P
to include vector and complex types.

>         && TYPE_PRECISION (TREE_TYPE (@1)) <=3D TYPE_PRECISION (type)
>         && INTEGRAL_TYPE_P (TREE_TYPE(@0))
>         && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0))

why is this testing TREE_TYPE (@0)?  The outer subtract is using 'type',
the inner subtract uses TREE_TYPE (@1) though you could place
a capture on the minus to make what you test more obvious, like

  (minus (convert1? @0) (convert2? (minus@3 (convert3? @2) @1)))

TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@3))

there's one set of checks too much I think.

>         && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))
>         && !TYPE_UNSIGNED (TREE_TYPE (@0))

we only ever have TYPE_OVERFLOW_UNDEFINED on singed types so
the !TYPE_UNSIGNED checks are superfluous.

>         && TYPE_PRECISION (TREE_TYPE (@0)) <=3D TYPE_PRECISION (type)
>         && TREE_TYPE(@1) =3D=3D TREE_TYPE(@2))

This check means that convert3 is never present since a MINUS requires
compatible types.

Sorry for the late reply.

Note the pattern will necessarily be partly redundant with the

  (simplify
   (minus (nop_convert1? (minus (nop_convert2? @0) @1)) @0)
   (if (!ANY_INTEGRAL_TYPE_P (type)
        || TYPE_OVERFLOW_WRAPS (type))
   (negate (view_convert @1))
   (view_convert (negate @1))))

pattern.  Once we'd "inline" nop_convert genmatch would complain
about this.

>     (convert @1)))
>
> Is there a more concise/better way of writing the pattern? I was looking =
for similar checks in match.pd and I couldn't find anything that I could le=
verage.
>
> I also kept my pattern to the specific scenario I'm seeing with the regre=
ssion to lower the risk of something breaking. I've limited @1 and @2 to ha=
ve the same type.
>
> I'm also in favor of adding/running computer verification to make sure th=
e transformation is legal. I've written some tests to verify that the patte=
rn is being applied in the right scenarios and not being applied in others,=
 but I think there are too many possibilities to manually write them all. I=
s there anything in GCC that can be used to verify that match.pd transforma=
tions are correct? I'm thinking of something like Alive https://github.com/=
AliveToolkit/alive2.
>
> Thanks,
> Victor
>
>
>
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Monday, June 21, 2021 12:08 AM
> To: Marc Glisse <marc.glisse@inria.fr>
> Cc: Victor Tong <vitong@microsoft.com>; gcc-patches@gcc.gnu.org <gcc-patc=
hes@gcc.gnu.org>
> Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division =
followed by multiply [PR95176]
>
> On Sat, Jun 19, 2021 at 7:05 PM Marc Glisse <marc.glisse@inria.fr> wrote:
> >
> > On Fri, 18 Jun 2021, Richard Biener wrote:
> >
> > >> Option 2: Add a new pattern to support scenarios that the existing n=
op_convert pattern bails out on.
> > >>
> > >> Existing pattern:
> > >>
> > >> (simplify
> > >>    (minus (nop_convert1? @0) (nop_convert2? (minus (nop_convert3? @@=
0) @1)))
> > >>    (view_convert @1))
> >
> > I tried to check with a program when
> >
> > T3 x;
> > T1 y;
> > (T2)x-(T2)((T1)x-y)
> >
> > can be safely replaced with
> >
> > (T2)y
> >
> > From the output, it looks like this is safe when T1 is at least as larg=
e
> > as T2. It is wrong when T1 is unsigned and smaller than T2. And when T1=
 is
> > signed and smaller than T2, it is ok if T3 is the same type as T1 (sign=
ed
> > then) or has strictly less precision (any sign), and not in other cases=
.
> >
> > Note that this is when signed implies undefined overflow and unsigned
> > implies wrapping, and I wouldn't put too much faith in this recently
> > dusted program. And it doesn't say how to write the match.pd pattern wi=
th
> > '?', "@@", disabling it if TYPE_OVERFLOW_SANITIZED, etc.
> >
> > Mostly, I wanted to say that if we are going to go handle more than
> > nop_convert for more than just 1 or 2 easy transformations, I think som=
e
> > kind of computer verification would be useful, it would save a lot of t=
ime
> > and headaches.
>
> True.  I wonder if auto-generating such tests from match.pd rules would
> be a good project to work on (GSoC?).  When there's define_match
> involved things get a little tricky, but then one item on the TODO list
> is "inlining" those at the use site (exploding the decision tree even mor=
e).
>
> Richard.
>
> > (I just check by brute force all possible precisions (from 1 to 6) and
> > signedness for T1, T2 and T3, all possible values for x and y, compute
> > the before and after formulas, accepting if there is UB before, rejecti=
ng
> > if there is UB after (and not before), and manually try to see a patter=
n
> > in the list of types that work)
> >
> > --
> > Marc Glisse