From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi1-x232.google.com (mail-oi1-x232.google.com [IPv6:2607:f8b0:4864:20::232]) by sourceware.org (Postfix) with ESMTPS id 4670C3858D3C for ; Sat, 9 Jul 2022 18:19:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4670C3858D3C Received: by mail-oi1-x232.google.com with SMTP id j193so2402380oih.0 for ; Sat, 09 Jul 2022 11:19:43 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:references:from:in-reply-to :content-transfer-encoding; bh=viyUEd9zBQh+OJdgiApldjYoG1hXfAMs/2BQP0qXfUM=; b=01hp9l8EpTvXZFVev/h1dzLt8EpHVim7UtPqLnTU9hUPwVfCgZfI8MsT2waO5SoMXE XVfmkp8IVkdonTp4emHtUixK43RWDlvFfaIN5MnzrAuVIMgP/j2c8yN6yTUUoigWfid+ yct7vn6x/JeRccJhQ+hDSb4Xg/84ckEK8hh0DCN3AKj3KMjEXjIt4F/Zvlib4IRBfcq+ CJQFR6tFQtyfx1wl6iPHCN6DcHZfUvJVJ+R07cx0AxT/ELCbQOyZEKNVp5WV2yY0dlCS O8Kw6oWVpxytPBU7T48PpE9ORehhCgeNEC1U7JADeHA4Qksvx6Mat8KTOXajsGx26XJD iNLw== X-Gm-Message-State: AJIora9wo8slFj6R0KKFNmbQ9HEO7N1xI+r8GXDW8wEYZg3Q9vUIAVnW gDV+A1V8RKjy9TVbqWNCEO7R9gHXqrs= X-Google-Smtp-Source: AGRyM1vbJbZMVChzfusxi/hr3XijO7NlBJk9umWXVAI6vzlNbPnK8UqcjFPxLBKvj08BrcPP8ESSCg== X-Received: by 2002:aca:908:0:b0:337:c93f:f213 with SMTP id 8-20020aca0908000000b00337c93ff213mr3216530oij.83.1657390782020; Sat, 09 Jul 2022 11:19:42 -0700 (PDT) Received: from [172.31.0.204] (c-73-63-24-84.hsd1.ut.comcast.net. [73.63.24.84]) by smtp.gmail.com with ESMTPSA id v25-20020a9d4e99000000b00616d713c062sm991201otk.28.2022.07.09.11.19.41 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 09 Jul 2022 11:19:41 -0700 (PDT) Message-ID: Date: Sat, 9 Jul 2022 12:19:40 -0600 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: Re: [PATCH][v2] tree-optimization: Fold (type)X / (type)Y [PR103855] Content-Language: en-US To: gcc-patches@gcc.gnu.org, Zhao Wei Liew References: From: Jeff Law In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Jul 2022 18:19:45 -0000 On 3/16/2022 6:49 PM, Zhao Wei Liew via Gcc-patches wrote: > Thanks for the detailed review. I have uploaded patch v2 based on the review. > > v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590604.html > Changes since v1: > 1. Add patterns for the cases CST / (T)x and (T)x / CST as well. Fix > test regressions caused by those patterns. > 2. Support signed integers except for the possible INT_MIN / -1 case. > 3. Support cases where X and Y have different precisions. > > On Tue, 22 Feb 2022 at 15:54, Richard Biener wrote: >> +/* (type)X / (type)Y -> (type)(X / Y) >> + when the resulting type is at least precise as the original types >> + and when all the types are unsigned integral types. */ >> +(simplify >> + (trunc_div (convert @0) (convert @1)) >> + (if (INTEGRAL_TYPE_P (type) >> + && INTEGRAL_TYPE_P (TREE_TYPE (@0)) >> + && INTEGRAL_TYPE_P (TREE_TYPE (@1)) >> + && TYPE_UNSIGNED (type) >> + && TYPE_UNSIGNED (TREE_TYPE (@0)) >> + && TYPE_UNSIGNED (TREE_TYPE (@1)) >> + && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)) >> + && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@0))) >> + (convert (trunc_div @0 @1)))) >> >> since you are requiring the types of @0 and @1 to match it's easier to write >> >> && types_match (TREE_TYPE(@0), TREE_TYPE (@1)) >> > Thanks. In the new patch, TREE_TYPE (@0) and TREE_TYPE (@1) can now > have different precisions, so I believe that I can't use types_match() > anymore. > >> that allows you to elide checks on either @0 or @1. I suppose the transform >> does not work for signed types because of the -INT_MIN / -1 overflow case? >> It might be possible to use expr_not_equal_to (@0, -INT_MIN) || >> expr_not_equal_to (@1, -1) >> (correctly written, lookup the existing examples in match.pd for the X >> % -Y transform) > That's right. I have made changes to support signed types except for > the INT_MIN / -1 case. > >> I'll note that as written the transform will not catch CST / (T)x or >> (T)x / CST since >> you'll not see conversions around constants. I'm not sure whether >> using (convert[12]? ...) >> or writing special patterns with INTEGER_CST operands is more convenient. >> There is int_fits_type_p to check whether a constant will fit in a >> type without truncation >> or sign change. > I see. I couldn't find an easy way to use (convert[12]? ...) in this > case so I added 2 other patterns for the CST / (T)x and (T)x / CST > cases. > >> When @0 and @1 do not have the same type there might still be a common type >> that can be used and is smaller than 'type', it might be as simple as using >> build_nonstandard_integer_type (MIN (@0-prec, @1-prec), 1 /*unsigned_p*/). > Thanks for the tip. I've made a similar change, but using either > TREE_TYPE (@0) or TREE_TYPE (@1) instead of > build_nonstandard_integer_type(). > >> In the past there have been attempts to more globally narrow operations using >> a new pass rather than using individual patterns. So for more complicated cases >> that might be the way to go. There's now also the ISEL pass which does >> pre-RTL expansion transforms that need some global context and for example >> can look at SSA uses. > I had wanted to do something similar to handle all integral > trunc_divs, but I'm not sure where/how to start. It seems out of my > league at this moment. I'll gladly explore it after this change > though! Just a couple general notes in case you want to poke further in this space. Earlier you mentioned you were trying to do some of this work in expand_divmod, but the operands had rtx types rather than tree types, thus you're unable to get at the range data. In expand_expr_divmod there's treeop0, treeop1.  So if they are SSA_NAMEs, then you can query for their range. Richi mentioned attempts to globally narrow during a new pass. That's a much broader chance, but has the potential to apply in a lot more places.    Narrowing early would potentially expose the narrowed statements to the vectorizer which could be a win. Narrowing late is likely a win too, but it likely only helps the expansion phase generate narrower operations.  This is obviously a larger project than just handling the division cases in match.pd. Tackling it is not (IMHO) a requirement to move forward. Finally, narrowing isn't always a win.  There have been micro-architectures were sub-word accesses are slower than word accesses.  I'm not too worried about it for this patch, but it could become more important if you were to look into a more general solution. As far as the patch is concerned.  At one point I was a bit worried that expr_not_equal_to wasn't right.  But after digging a bit deeper into its implementation, it should do the right thing here.  Note if you wanted to see how to get to underlying range data, you can find a simple example in that function. Your new testcase needs to limit itself to targets where integers are at least 32 bits wide.    Something like this at the top of the new test: /* { dg-require-effective-target int32plus } */ There are additional cases Andrew Pinski pointed out in BZ.  Does your new code handle all of them.  It's OK if it doesn't, but obviously not optimal.  Consider adding his additional testcases, potentially xfailed if they're not handled yet. You changed gcc.dg/ipa/pr91088.c and the ChangeLog indicates "fix a test regression".   Can you say a bit more about what's going on there.  It's likely find, but we want to make sure that your change hasn't compromised the test. Otherwise it looks pretty good. jeff