From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12832 invoked by alias); 29 May 2019 13:11:40 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 12824 invoked by uid 89); 29 May 2019 13:11:40 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mail-lj1-f176.google.com Received: from mail-lj1-f176.google.com (HELO mail-lj1-f176.google.com) (209.85.208.176) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 29 May 2019 13:11:38 +0000 Received: by mail-lj1-f176.google.com with SMTP id w1so2431958ljw.0 for ; Wed, 29 May 2019 06:11:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=BsqYNDDG6DZZ6vhLMetKeBfr3oFSzkjo6bv889JsbXM=; b=thrGBkXNsSg8LKcfbK0aVvHfmvfhs+99ISarwvV3qoIvdA/CTraRyEzFKt1G3ML3vr 4qI7K8S6nGkS08R+0A9VWefaAxipbjTsiThQBN3xOCtyNxnLZXKMxBKhtb6dYF0C2+TZ 6mO23+aiTRkrGGIHwwTjR7xOCSRZdFkyvszLzOGgD0VXnoa4XREI4mw5tOQm0pWq/0ur ZZ4O8n8LLduVu2fFyoq/iTehkAwQr7MFlEpPY35dPq2kP86/CZTOkkgWC3DqGHWjOnf0 zR0ar8a6OcdDnN9H9Gx6xpZHsYBsIKoAPOEM+NIqrI0oZq0ftd5PbxDdlfbwHyE99ATD xCPA== MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Wed, 29 May 2019 13:11:00 -0000 Message-ID: Subject: Re: On-Demand range technology [3/5] - The Prototype To: Jeff Law Cc: Andrew MacLeod , GCC , Aldy Hernandez Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2019-05/txt/msg00249.txt.bz2 On Tue, May 28, 2019 at 4:41 PM Jeff Law wrote: > > On 5/23/19 7:10 AM, Richard Biener wrote: > > On Thu, May 23, 2019 at 3:29 AM Andrew MacLeod wr= ote: > >> > >> There is a functioning prototype in branch =E2=80=9Cssa-range=E2=80=9D= which is a proof > >> of concept that the approach is functional as well as quick, and can be > >> used to answer questions which come up regarding what it can and can= =E2=80=99t > >> do. Our last merge was on April 13th, so it's fairly up to date. > >> > >> We have implemented a flexible range class (irange) which allows for > >> multiple subranges to represent a range, and which can be extended in > >> the future to types other than integral. We use this throughout, but it > >> could be replaced in the ranger with any similar API. Conversion > >> routines are also provided to convert from irange to value_range and > >> value_range to irange. > >> > >> A full set of tree_code range-op routines are implemented. We have > >> commoned as much code as possible with the existing VRP range extracti= on > >> code. Also, we have added additional code for calculating the other > >> operands from a known result in numerous cases. > >> > >> The code base in VRP has been modified (via a flag) to > >> - Work completely with the native value_range like it does today. > >> - Use irange and the range-ops component under the covers to > >> extract ranges. Requests in VRP are then converted from value_ranges to > >> irange, called into the range-op routines, and then converted back to > >> value_range for VRP/EVRP=E2=80=99s use. > >> - Do operations both ways and compare the results to make sure bo= th > >> agree on the range, and trap if they do not. > >> > >> The branch defaults to the compare and trap mode to ensure everything = is > >> working correctly. This has been our correctness model for statement > >> range extraction and was active during the Fedora package builds. The > >> only time we disabled it was to do performance runs vs RVRP, and were > >> looking at both branch and trunk times for EVRP and VRP. > >> > >> Of note, we drop all symbolics in ranges to VARYING on everything exce= pt > >> PLUS and MINUS, which we leave as native calculations if there are > >> symbolics present. More on symbolics later. > >> > >> A VRP like pass called RVRP has been implemented. > >> - The vr-values statement simplification code has been factored o= ut > >> to be range agnostic, meaning that these routines can operate on either > >> value_range or irange. Thus, we are using a common code base to perform > >> statement simplification as well. > >> - For complete compatibility with EVRP, the RVRP pass builds > >> dominators and instantiates the SCEV loop information so we have loop > >> range info available. RVRP does not need this info to run, but would > >> miss some of the test cases which depend on loop ranges. > >> - RVRP is set up to demonstrate it can process the IL in multiple > >> directions and bootstraps/passes all tests in all directions. > >> * Dominator order > >> * Post-dominator order > >> * BB1 thru BBn > >> * BBn thru BB1 > >> * branch-only mode where only branches at the end of each BB > >> are examined for folding opportunities > >> > >> 4 additional passes have been converted to use the ranger model: > >> - sprintf - removed the dominator building/walking > >> - warn alloca - replaced calls to get global ranges with calls th= at > >> now return context sensitive ranges. > >> - warn restrict - just replaced EVRP range calls with ranger call= s. > >> - backwards threader - enhanced to use contextual range informati= on > >> to make additional threading decisions. > >> > >> > >> Symbolic Ranges > >> > >> One big concern last year expressed was my decision to abolish symbolic > >> ranges. > >> > >> I continue to maintain that there is no need to track the range of x_2 > >> as [y_3 + 5, MAX] for x_2 =3D y_3 + 5. All you need to do is look at= the > >> definition of x_2, and the same information is exposed right there in > >> the IL. If one requires the symbolic information, the same on-demand > >> process could lookup that information as well. This in turn, makes the > >> code for ranges much simpler, easier to maintain, and less likely to > >> introduce bugs. > >> > >> We have found through our prototype that symbolics in ranges are not > >> nearly as prevalent as one would think. During the work to share a > >> common code base with VRP, we found that symbolic ranges are irrelevant > >> for everything except PLUS_EXPR and MINUS_EXPR. The shared code in our > >> branch drops symbolics to varying immediately for everything else, and > >> it has no impact on EVRP, or VRP or any tests we can find anywhere. > >> Furthermore, we never trapped while comparing ranges generated by VRP > >> versus generating them with range-ops which drops symbolics to varying. > >> > >> We tried modifying VRP such that we don=E2=80=99t even create symbolic > >> endpoints, but rather revert to VARYING always. We can find no test > >> case that fails because a range is not calculated properly due to > >> resolving these endpoints. > >> > >> There are a few that fail due to the symbolic being used to help track > >> relationals.. Ie > >> > >> x_2 =3D y_3 + 5 > >> If (x_2 > y_3) // can be folded since we know x_2 must be < y= _3 > >> > >> VRP generates a range for x of [ y_3+5, MAX ] and at various points > >> uses that to infer a relational or equivalence. Ie, it becomes easy = to > >> tell that the condition must always be true since the lower bound of t= he > >> range is y_3 + 5. > >> > >> I argue this is not a range question, but rather a different problem > >> which VRP has chosen to solve by piggybacking on the range > >> representation. This leads to complications/complexity when trying to > >> evaluate ranges because they must constantly be on the lookout for > >> symbolics. This information is then carried around for the life of the > >> pass, even if it is never used. It also forces any > >> relational/equivalency queries to be handled within the context of the > >> VRP pass. > >> > >> This aspect of symbolics would be handled by a relational/equivalence > >> processing engine that would be follow on work. Using the same basic > >> model as ranges, each tree code is taught to understand the relation > >> between its operands, and then we can answer equivalency and relational > >> accurately as well. It would be available for any pass to use > >> independent of ranges. I will expound upon that a bit in the future > >> directions section. > > > > While I agree that symbolic ranges are a complication and that most > > cases it currently handles are not "value-range" things I do not agree > > with the idea that we can simply remove handling them and deal > > with the fallout in some later point in the future. Eric may also be > > able to show you "real" symbolic value-range magic. > > > > Note that symbolic ranges are already restricted to PLUS_EXPR > > and MINUS_EXPR (and NEGATE_EXPR I think). There are > > also "symbolic" (non-integer constant) ranges like [&a, &a]. > > I've never seen [&a, &MEM[&a + 4]] but we wouldn't reject those > > I think. > > > > You may have noticed that EVRP does not use symbolic ranges. Btw, I probably misremembered this part - it is equivalences that EVRP doesn't use. equivalences are the "other half" of relations besides symbolic ranges. > > As already said I'd like to see VRP go but obstackles are > > symbolic ranges and jump-threading (with Jeff promising > > to handle the jump-threading part in the past). > Right. FWIW, one of the follow-on items to this work is Aldy's > improvements to backwards jump threading which utilizes the ranger > framework -- the primary purpose of that work is to eliminate the need > for jump threading in VRP. But without relations it won't catch most of the cases... Richard. > jeff