From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x62b.google.com (mail-ej1-x62b.google.com [IPv6:2a00:1450:4864:20::62b]) by sourceware.org (Postfix) with ESMTPS id 7DCCD3853C03 for ; Wed, 28 Jul 2021 09:59:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7DCCD3853C03 Received: by mail-ej1-x62b.google.com with SMTP id e19so3612769ejs.9 for ; Wed, 28 Jul 2021 02:59:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=er1q4HYTe33sVmwGz/T/h2m+QqPPeYsNWiROF3EAjJE=; b=C79eH1H0vmXapvlJfqRvRX7dDMb95HBgZclBEZycxMRJtxW52Ss7FIu2I5KGcYtJBd OeArOjZQfOHulh9xu48xKEIZsdpHlgGCWfnIEIf7eD8fYJ5dLgYqfv4x+QB8nX8CrfQv PQYIf0cJ/MI8EXtCE2PyP9+XJBoHlvf8MYQyeR63/p6loo3Da6IkNNkYIk4NCC954vnm Kt4934t1fPK0FGW5hJ/MWKuPO5IMK0R5c9ei5DNmPZXHTU/4+o38wrbQmyqMhRqZNOQx 4tOF/E1XHEaDaBd/GrmjE+jgMOR5mGWDkGh3g5Cde0slyTBua91crFimpEZDiKuweFMb Uedg== X-Gm-Message-State: AOAM532JFFWs6TaHjbvaK8Ew3qlqmArR8tlZdN4VJgj72M+X7sBLn0Bz nz7jKinh8qL6s9sqGhnvKlUPmkl4jd5DBlTItFQ= X-Google-Smtp-Source: ABdhPJzHnYV4/FHD0n3Ql8s3TrHmas4niMyhp+4mUVD3yNTx2kE8XlE+l0qO66n4AetXrwDQUBeYhc1jUC0xKtrqPtY= X-Received: by 2002:a17:906:c834:: with SMTP id dd20mr19437846ejb.371.1627466363450; Wed, 28 Jul 2021 02:59:23 -0700 (PDT) MIME-Version: 1.0 References: <1e409dd-c5a8-a59f-d7ba-86ae805ab54f@hippo.saclay.inria.fr> In-Reply-To: From: Richard Biener Date: Wed, 28 Jul 2021 11:59:12 +0200 Message-ID: Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176] To: Victor Tong Cc: Marc Glisse , "gcc-patches@gcc.gnu.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Jul 2021 09:59:26 -0000 On Tue, Jun 29, 2021 at 1:10 AM Victor Tong wrote: > > Thanks Richard and Marc. > > I wrote the following test case to compare the outputs of fn1() and fn1No= Opt() below with my extra pattern being applied. I tested the two functions= with all of the integers from INT_MIN to INT_MAX. > > long > fn1 (int x) > { > return 42L - (long)(42 - x); > } > > #pragma GCC push_options > #pragma GCC optimize ("O0") > long > fn1NoOpt (int x) > { > volatile int y =3D (42 - x); > return 42L - (long)y; > } > #pragma GCC pop_options > > int main () > { > for (long i=3DINT_MIN; i<=3DINT_MAX;i++) > { > auto valNoOpt =3D fn1NoOpt(i); > auto valOpt =3D fn1(i); > if (valNoOpt !=3D valOpt) > printf("valOpt=3D%ld, valNoOpt=3D%ld\n", valOpt, = valNoOpt); > } > return 0; > } > > I saw that the return values of fn1() and fn1NoOpt() differed when the in= put was between INT_MIN and INT_MIN+42 inclusive. When passing values in th= is range to fn1NoOpt(), a signed overflow is triggered which causes the val= ue to differ (undefined behavior). This seems to go in line with what Marc = described and I think the transformation is correct in the scenario above. = I do think that type casts that result in truncation (i.e. from a higher pr= ecision to a lower one) or with unsigned types will result in an incorrect = transformation so those scenarios need to be avoided. > > Given that the extra pattern I'm adding is taking advantage the undefined= behavior of signed integer overflow, I'm considering keeping the existing = nop_convert pattern in place and adding a new pattern to cover these new ca= ses. I'd also like to avoid touching nop_convert given that it's used in a = number of other patterns. > > This is the pattern I have currently: > > (simplify > (minus (convert1? @0) (convert2? (minus (convert3? @2) @1))) > (if (operand_equal_p(@0, @2, 0) The operand_equal_p should be reflected by using @0 in place of @2. > && INTEGRAL_TYPE_P (type) > && TYPE_OVERFLOW_UNDEFINED(type) > && !TYPE_OVERFLOW_SANITIZED(type) > && INTEGRAL_TYPE_P (TREE_TYPE(@1)) > && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@1)) > && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@1)) > && !TYPE_UNSIGNED (TREE_TYPE (@1)) > && !TYPE_UNSIGNED (type) please group checks on common argument. I think a single test on INTEGRAL_TYPE_P (type) is enough, it could be ANY_INTEGRAL_TYPE_P to include vector and complex types. > && TYPE_PRECISION (TREE_TYPE (@1)) <=3D TYPE_PRECISION (type) > && INTEGRAL_TYPE_P (TREE_TYPE(@0)) > && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0)) why is this testing TREE_TYPE (@0)? The outer subtract is using 'type', the inner subtract uses TREE_TYPE (@1) though you could place a capture on the minus to make what you test more obvious, like (minus (convert1? @0) (convert2? (minus@3 (convert3? @2) @1))) TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@3)) there's one set of checks too much I think. > && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0)) > && !TYPE_UNSIGNED (TREE_TYPE (@0)) we only ever have TYPE_OVERFLOW_UNDEFINED on singed types so the !TYPE_UNSIGNED checks are superfluous. > && TYPE_PRECISION (TREE_TYPE (@0)) <=3D TYPE_PRECISION (type) > && TREE_TYPE(@1) =3D=3D TREE_TYPE(@2)) This check means that convert3 is never present since a MINUS requires compatible types. Sorry for the late reply. Note the pattern will necessarily be partly redundant with the (simplify (minus (nop_convert1? (minus (nop_convert2? @0) @1)) @0) (if (!ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type)) (negate (view_convert @1)) (view_convert (negate @1)))) pattern. Once we'd "inline" nop_convert genmatch would complain about this. > (convert @1))) > > Is there a more concise/better way of writing the pattern? I was looking = for similar checks in match.pd and I couldn't find anything that I could le= verage. > > I also kept my pattern to the specific scenario I'm seeing with the regre= ssion to lower the risk of something breaking. I've limited @1 and @2 to ha= ve the same type. > > I'm also in favor of adding/running computer verification to make sure th= e transformation is legal. I've written some tests to verify that the patte= rn is being applied in the right scenarios and not being applied in others,= but I think there are too many possibilities to manually write them all. I= s there anything in GCC that can be used to verify that match.pd transforma= tions are correct? I'm thinking of something like Alive https://github.com/= AliveToolkit/alive2. > > Thanks, > Victor > > > > From: Richard Biener > Sent: Monday, June 21, 2021 12:08 AM > To: Marc Glisse > Cc: Victor Tong ; gcc-patches@gcc.gnu.org > Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division = followed by multiply [PR95176] > > On Sat, Jun 19, 2021 at 7:05 PM Marc Glisse wrote: > > > > On Fri, 18 Jun 2021, Richard Biener wrote: > > > > >> Option 2: Add a new pattern to support scenarios that the existing n= op_convert pattern bails out on. > > >> > > >> Existing pattern: > > >> > > >> (simplify > > >> (minus (nop_convert1? @0) (nop_convert2? (minus (nop_convert3? @@= 0) @1))) > > >> (view_convert @1)) > > > > I tried to check with a program when > > > > T3 x; > > T1 y; > > (T2)x-(T2)((T1)x-y) > > > > can be safely replaced with > > > > (T2)y > > > > From the output, it looks like this is safe when T1 is at least as larg= e > > as T2. It is wrong when T1 is unsigned and smaller than T2. And when T1= is > > signed and smaller than T2, it is ok if T3 is the same type as T1 (sign= ed > > then) or has strictly less precision (any sign), and not in other cases= . > > > > Note that this is when signed implies undefined overflow and unsigned > > implies wrapping, and I wouldn't put too much faith in this recently > > dusted program. And it doesn't say how to write the match.pd pattern wi= th > > '?', "@@", disabling it if TYPE_OVERFLOW_SANITIZED, etc. > > > > Mostly, I wanted to say that if we are going to go handle more than > > nop_convert for more than just 1 or 2 easy transformations, I think som= e > > kind of computer verification would be useful, it would save a lot of t= ime > > and headaches. > > True. I wonder if auto-generating such tests from match.pd rules would > be a good project to work on (GSoC?). When there's define_match > involved things get a little tricky, but then one item on the TODO list > is "inlining" those at the use site (exploding the decision tree even mor= e). > > Richard. > > > (I just check by brute force all possible precisions (from 1 to 6) and > > signedness for T1, T2 and T3, all possible values for x and y, compute > > the before and after formulas, accepting if there is UB before, rejecti= ng > > if there is UB after (and not before), and manually try to see a patter= n > > in the list of types that work) > > > > -- > > Marc Glisse