Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Victor Tong <vitong@microsoft.com>
To: Richard Biener <richard.guenther@gmail.com>,
	Marc Glisse <marc.glisse@inria.fr>,
	Victor Tong <vitong@microsoft.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]
Date: Mon, 19 Jul 2021 20:58:22 +0000	[thread overview]
Message-ID: <MWHPR2101MB081117A434FB24A771FE3DB1CCE19@MWHPR2101MB0811.namprd21.prod.outlook.com> (raw)
In-Reply-To: <MWHPR2101MB08112E560629D93249766876CC039@MWHPR2101MB0811.namprd21.prod.outlook.com>

Gentle ping.
________________________________
From: Gcc-patches <gcc-patches-bounces+vitong=microsoft.com@gcc.gnu.org> on behalf of Victor Tong via Gcc-patches <gcc-patches@gcc.gnu.org>
Sent: Monday, June 28, 2021 4:10 PM
To: Richard Biener <richard.guenther@gmail.com>; Marc Glisse <marc.glisse@inria.fr>
Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

Thanks Richard and Marc.

I wrote the following test case to compare the outputs of fn1() and fn1NoOpt() below with my extra pattern being applied. I tested the two functions with all of the integers from INT_MIN to INT_MAX.

long
fn1 (int x)
{
  return 42L - (long)(42 - x);
}

#pragma GCC push_options
#pragma GCC optimize ("O0")
long
fn1NoOpt (int x)
{
  volatile int y = (42 - x);
  return 42L - (long)y;
}
#pragma GCC pop_options

int main ()
{
        for (long i=INT_MIN; i<=INT_MAX;i++)
        {
                auto valNoOpt = fn1NoOpt(i);
                auto valOpt = fn1(i);
                if (valNoOpt != valOpt)
                        printf("valOpt=%ld, valNoOpt=%ld\n", valOpt, valNoOpt);
        }
        return 0;
}

I saw that the return values of fn1() and fn1NoOpt() differed when the input was between INT_MIN and INT_MIN+42 inclusive. When passing values in this range to fn1NoOpt(), a signed overflow is triggered which causes the value to differ (undefined behavior). This seems to go in line with what Marc described and I think the transformation is correct in the scenario above. I do think that type casts that result in truncation (i.e. from a higher precision to a lower one) or with unsigned types will result in an incorrect transformation so those scenarios need to be avoided.

Given that the extra pattern I'm adding is taking advantage the undefined behavior of signed integer overflow, I'm considering keeping the existing nop_convert pattern in place and adding a new pattern to cover these new cases. I'd also like to avoid touching nop_convert given that it's used in a number of other patterns.

This is the pattern I have currently:

  (simplify
    (minus (convert1? @0) (convert2? (minus (convert3? @2) @1)))
    (if (operand_equal_p(@0, @2, 0)
        && INTEGRAL_TYPE_P (type)
        && TYPE_OVERFLOW_UNDEFINED(type)
        && !TYPE_OVERFLOW_SANITIZED(type)
        && INTEGRAL_TYPE_P (TREE_TYPE(@1))
        && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@1))
        && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@1))
        && !TYPE_UNSIGNED (TREE_TYPE (@1))
        && !TYPE_UNSIGNED (type)
        && TYPE_PRECISION (TREE_TYPE (@1)) <= TYPE_PRECISION (type)
        && INTEGRAL_TYPE_P (TREE_TYPE(@0))
        && TYPE_OVERFLOW_UNDEFINED(TREE_TYPE(@0))
        && !TYPE_OVERFLOW_SANITIZED(TREE_TYPE(@0))
        && !TYPE_UNSIGNED (TREE_TYPE (@0))
        && TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
        && TREE_TYPE(@1) == TREE_TYPE(@2))
    (convert @1)))

Is there a more concise/better way of writing the pattern? I was looking for similar checks in match.pd and I couldn't find anything that I could leverage.

I also kept my pattern to the specific scenario I'm seeing with the regression to lower the risk of something breaking. I've limited @1 and @2 to have the same type.

I'm also in favor of adding/running computer verification to make sure the transformation is legal. I've written some tests to verify that the pattern is being applied in the right scenarios and not being applied in others, but I think there are too many possibilities to manually write them all. Is there anything in GCC that can be used to verify that match.pd transformations are correct? I'm thinking of something like Alive https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FAliveToolkit%2Falive2&amp;data=04%7C01%7Cvitong%40microsoft.com%7Cba7d8f9f9b774462148608d93a8a0471%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637605186726283785%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=ugx%2FTw58OPjLzamE9yqQThV5u4EfQ8JLrurnIy00AzQ%3D&amp;reserved=0.

Thanks,
Victor

From: Richard Biener <richard.guenther@gmail.com>
Sent: Monday, June 21, 2021 12:08 AM
To: Marc Glisse <marc.glisse@inria.fr>
Cc: Victor Tong <vitong@microsoft.com>; gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization: Optimize division followed by multiply [PR95176]

On Sat, Jun 19, 2021 at 7:05 PM Marc Glisse <marc.glisse@inria.fr> wrote:
>
> On Fri, 18 Jun 2021, Richard Biener wrote:
>
> >> Option 2: Add a new pattern to support scenarios that the existing nop_convert pattern bails out on.
> >>
> >> Existing pattern:
> >>
> >> (simplify
> >>    (minus (nop_convert1? @0) (nop_convert2? (minus (nop_convert3? @@0) @1)))
> >>    (view_convert @1))
>
> I tried to check with a program when
>
> T3 x;
> T1 y;
> (T2)x-(T2)((T1)x-y)
>
> can be safely replaced with
>
> (T2)y
>
> From the output, it looks like this is safe when T1 is at least as large
> as T2. It is wrong when T1 is unsigned and smaller than T2. And when T1 is
> signed and smaller than T2, it is ok if T3 is the same type as T1 (signed
> then) or has strictly less precision (any sign), and not in other cases.
>
> Note that this is when signed implies undefined overflow and unsigned
> implies wrapping, and I wouldn't put too much faith in this recently
> dusted program. And it doesn't say how to write the match.pd pattern with
> '?', "@@", disabling it if TYPE_OVERFLOW_SANITIZED, etc.
>
> Mostly, I wanted to say that if we are going to go handle more than
> nop_convert for more than just 1 or 2 easy transformations, I think some
> kind of computer verification would be useful, it would save a lot of time
> and headaches.

True.  I wonder if auto-generating such tests from match.pd rules would
be a good project to work on (GSoC?).  When there's define_match
involved things get a little tricky, but then one item on the TODO list
is "inlining" those at the use site (exploding the decision tree even more).

Richard.

> (I just check by brute force all possible precisions (from 1 to 6) and
> signedness for T1, T2 and T3, all possible values for x and y, compute
> the before and after formulas, accepting if there is UB before, rejecting
> if there is UB after (and not before), and manually try to see a pattern
> in the list of types that work)
>
> --
> Marc Glisse

next prev parent reply	other threads:[~2021-07-19 20:58 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-31 23:02 Victor Tong
2021-04-27  8:29 ` Richard Biener
2021-06-02 20:55   ` [EXTERNAL] " Victor Tong
2021-06-07  8:25     ` Richard Biener
2021-06-16 18:49       ` Victor Tong
2021-06-18  9:43         ` Richard Biener
2021-06-19 17:05           ` Marc Glisse
2021-06-21  7:08             ` Richard Biener
2021-06-28 23:10               ` Victor Tong
2021-07-19 20:58                 ` Victor Tong [this message]
2021-07-28  9:59                 ` Richard Biener
2021-08-06 21:17                   ` Victor Tong
2021-08-06 22:49                     ` Marc Glisse
2021-08-09  9:58                       ` Richard Biener
2021-08-23  0:44                         ` Victor Tong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MWHPR2101MB081117A434FB24A771FE3DB1CCE19@MWHPR2101MB0811.namprd21.prod.outlook.com \
    --to=vitong@microsoft.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=marc.glisse@inria.fr \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).