From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 56021 invoked by alias); 24 Jul 2019 17:00:15 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 55972 invoked by uid 89); 24 Jul 2019 17:00:11 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-5.0 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mail-wm1-f43.google.com Received: from mail-wm1-f43.google.com (HELO mail-wm1-f43.google.com) (209.85.128.43) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 24 Jul 2019 17:00:08 +0000 Received: by mail-wm1-f43.google.com with SMTP id l2so42256229wmg.0 for ; Wed, 24 Jul 2019 10:00:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:user-agent:in-reply-to:references:mime-version :content-transfer-encoding:subject:to:cc:from:message-id; bh=eEfwUkc8TkOfWikBfGY3C0IaMAciza0RbD45O+h3zac=; b=G1ReTo1JKtqzZMDx/imtfXF2BoKbSQUvzKterhMkLGNap1IrHWX9mFUOikFQ6o+AtR hAHGLVKYh2jv3yXHzPgsr2Dz7C9zhNz2x0cL/HXDE1Hp7JyWBkQ8rrTjHHXedViYIDWe yyBWX/HH2HB9yO3O1YginXfL9VMePrPUNrOsV+B6HSf1TJJDIUeFtjym4zrS7rzt8jaW i8EeKqwnMQJhsEbTRotK4HJg0fZIYVSKkKQlkIqh3RP2Jk/oXYggbxXyBjNCTIfeNlm8 QRu6R34rpBvu01qbW8UUQj/n7tcfTa6arq52VU3xVY9ceqjob2w5q6BhW5cV4zUCQ12R 8EXQ== Return-Path: Received: from [192.168.178.32] (x5f751d34.dyn.telefonica.de. [95.117.29.52]) by smtp.gmail.com with ESMTPSA id f12sm51034106wrg.5.2019.07.24.10.00.04 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 24 Jul 2019 10:00:05 -0700 (PDT) Date: Wed, 24 Jul 2019 17:06:00 -0000 User-Agent: K-9 Mail for Android In-Reply-To: <3ebd8a38-9478-7899-5067-b4cb220d2edf@redhat.com> References: <481033aa-6ecc-3bcb-c874-becee14c5605@redhat.com> <27d3db25-ea95-33f4-57fb-e9a4a40d7341@redhat.com> <1a0a328e-74bb-15cb-40fb-09944fe2b84c@redhat.com> <3ebd8a38-9478-7899-5067-b4cb220d2edf@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING To: Jeff Law CC: Andrew MacLeod ,Aldy Hernandez ,gcc-patches From: Richard Biener Message-ID: X-IsSubscribed: yes X-SW-Source: 2019-07/txt/msg01593.txt.bz2 On July 24, 2019 6:00:04 PM GMT+02:00, Jeff Law wrote: >On 7/23/19 3:42 AM, Richard Biener wrote: >> On Tue, Jul 23, 2019 at 1:59 AM Jeff Law wrote: >>> >>> On 7/16/19 12:37 PM, Andrew MacLeod wrote: >>>> On 7/9/19 5:56 AM, Richard Biener wrote: >>>>> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez >wrote: >>>>>> >>>>>> >>>>>> On 7/4/19 6:33 AM, Richard Biener wrote: >>>>>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez >wrote: >>>>>>>> On 7/3/19 7:08 AM, Richard Biener wrote: >>>>>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez > >>>>>>>>> wrote: >>>>>>>> How about we keep VARYING and UNDEFINED typeless until right >before we >>>>>>>> call into the ranger. At which point, we have can populate >min/max >>>>>>>> because we have the tree_code and the type handy. So right >before we >>>>>>>> call into the ranger do: >>>>>>>> >>>>>>>> if (varying_p ()) >>>>>>>> foo->set_varying(TYPE); >>>>>>>> >>>>>>>> This would avoid the type cache, and keep the ranger happy. >>>>>>> you cannot do set_varying on the static const range but instead >>>>>>> you'd do >>>>>>> >>>>>>> value_range tem (*foo); >>>>>>> if (varying_p ()) >>>>>>> tem->set_full_range (TYPE); >>>>>>> >>>>>>> which I think we already do in some places. Thus my question >_where_ >>>>>>> you actually need this. >>>>>> Basically, everywhere. By having a type for varying/undefined, >we don't >>>>>> have to special case anything. Sure, we could for example, >special case >>>>>> the invert operation for undefined / varying. And we could >special case >>>>>> everything dealing with ranges to handle varying and undefined, >but why? >>>>>> We could also pass a type argument everywhere, but that's just >ugly. >>>>>> However, I do understand your objection to the type cache. >>>>>> >>>>>> How about the attached approach? Set the type for >varying/undefined >>>>>> when we know it, while avoiding touching the CONST varying. Then >right >>>>>> before calling the ranger, pass down a new varying node with >min/max for >>>>>> any varyings that were still typeless until that point. >>>>>> >>>>>> I have taken care of never adding a set_varying() that was not >already >>>>>> there. Would this keep the const happy? >>>>>> >>>>>> Technically we don't need to set varying/undef types for every >instance >>>>>> in VRP, but we need it at least for the code that will be shared >with >>>>>> range-ops (extract_range_from_multiplicative_op, union, >intersect, etc). >>>>>> I just figured if we have the information, might as well set >it for >>>>>> consistency. >>>>>> >>>>>> If you like this approach, I can rebase the other patches that >depend on >>>>>> this one. >>>>> OK, so I went ant checked what you do for class irange which has >>>>> a type but no kind member (but constructors with a kind). It also >>>>> uses wide_int members for storage. For a pure integer constant >>>>> range representation this represents somewhat odd choices; I'd >>>>> have elided the m_type member completely here, it seems fully >>>>> redundant. Only range operations need to be carried out in a >>>>> specific type (what I was suggesting above). Even the precision >>>>> encoded in the wide_int members is redundant then (I'd have >>>>> expected widest_int here and trailing-wide-ints for optimizing >>>>> storage). >>>> >>>> What irange contains internally is a bit arbitrary. It's really an >API >>>> for managing a set of subranges. We could have used trees >internally >>>> just as easily, then we wouldnt need a type field. Since we were >doing >>>> lots of operations, rather than going back and forth from trees all >the >>>> time, we just used the underlying wide_int directly. we could >have >>>> fiddle farted around with HOST_WIDE_INT or whatever, but wide_int >is >>>> there, has all the operations, and it works fine. so thats what it >>>> currently is on the branch. >>>> >>>> We are treating a range object as a unique self contained object. >>>> Therefore, the range has a type so we know how to print it, and can >>>> confirm before any operation that the ranges being operated on are >>>> appropriately matched. We found and opened bugzillas over the past >>>> couple years for places where our code caught bugs because a range >was >>>> created and then operated on in a way that was not compatible with >>>> another range. I think there is a still an open one against ada(?) >>>> where the switch and case are different precision. >>>> >>>> From my point of view, a range object is similar to a tree node. A >tree >>>> node has the bits to indicate what the value is, but also >associates a >>>> type with those bits within the same object. This is less error >prone >>>> than passing around the bits and the type separately. As ranges >are >>>> starting to be used in many places outside of VRP, we should do the >same >>>> thing with ranges. WIth value_range it would actually be free >since >>>> there is already a tree for the bounds already which contains the >type. >>>> >>>> >>>> >>>> >>>> >>>>> to fold_range/op_range? The API also seems to be oddly >>>>> constrained to binary ops. Anyway, the way you build >>>>> the operator table requires an awful lot of global C++ ctor >>>>> invocations, sth we generally try to avoid. But I'm getting >>>>> into too many details here. >>>> >>>> Its "oddly constrained" because what you are looking at is just >the >>>> standardized unary/binary ops code.. ie the equivalent of >>>> extract_range_from_binary_expr() and >extract_range_from_unary_expr(). >>>> The other ops we care about have specialized requirements, like >PHIs >>>> and the arbitrary numbers of parameters in a call, or anything less >>>> common than one or two operands. You are just not seeing those >parts. >>>> >>>>> >>>>> So - to answer your question above, I'd like you to pass down >>>>> a type to operations. Because that's what is fundamentally >>>>> required - a range doesn't have a "type" and the current >>>>> value_range_base doesn't fall into the trap of needing one. >>>>> >>>>> Richard. >>>>> >>>> >>>> Why is having a type associated with the data a "trap"? Perhaps >the >>>> old VRP lattice idea didn't need a type with the UNDEFINED and >VARYING >>>> lattice values, but we're moving past lattice values and into a >realm >>>> where we have ranges as useful things outside of VRP, and trying to >>>> shoehorn lattice values does not seem appropriate anymore. >>>> >>>> I looked at implementing range-ops without a type in the range, and >we >>>> end up passing 2 parameters everywhere each time we do anything >with a >>>> range. This doubled the number of parameters in most routines, and >when >>>> we had chains of calls, we were simply passing the type along with >the >>>> range. It seems archaic to be constantly passing information >around >>>> instead of associating it with the range itself. Its just another >place >>>> for an error to creep in.. Aldy found a place where we were >creating >>>> varying nodes for floats IIRC.. the type checking in the range >>>> operations caught it precisely because the range returned wasn't >the >>>> type it was assumed to be. >>>> >>>> That said. I believe we can do away with the need for a type with >an >>>> 'UNDEFINED' range. That is not too onerous, and there doesn't >really >>>> seem to be too many ripple effect from have a typeless undefined >range. >>>> I think we can contain that, and prevents us from having to add a >hack >>>> to value_range for that situation. >>>> >>>> VARYING is another situation completely. We adopted the term >'varying' >>>> for ease of compatibility with VRP, but varying is simply a range >going >>>> from MIN to MAX for a type. Removing the type from that would then >>>> require we always pass a type in with every range which gets back >to >>>> doubling the number of of parameters everywhere for no good reason. >>>> >>>> If we standardize value_range so that MIN and MAX are set for >varying, >>>> then everything works very smoothly, we can make value_range and >irange >>>> interchangeable and facilitate getting the range ops code into >trunk. >>>> >>>> It seems like the only reason we cant do this right now is the >CONST >>>> varying nodes that are returned from get_value_range(). >>>> Looking at that routine, it seems there are only 2 cases where that >can >>>> be returned >>>> 1) we ask for an ssa-name beyond the end of the local ssa-name >vector >>>> 2) once values_propagated is set to true and an ssa-name has no >entry. >>>> >>>> Both seem pretty straightforward to fix... >>>> 1) if we ask for an ssa-Name beyond the end, simply reallocate the >>>> vector to be big enough. I doubt this will trigger a lot, and if >we >>>> initially allocate it with room for an extra 10% names it'll >probably >>>> never trigger. Or pick whatever number seems appropriate. >>>> 2) if values_propagated is true, simply allocate a node for the >>>> ssa-name, set it to varying and return it. THIs accomplishes the >same >>>> thing. >>>> >>>> the number of additional nodes we allocate will be pretty minimal, >and >>>> it will never exceed the number of ssa-names. Then we never have to >>>> worry about having CONST set for a varying node either. I see >places >>>> where there is special processing to avoid calling set_varying=20 >because >>>> we dont know if the vbalue_range in hand is in constant memory and >would >>>> cause the compiler to trap. This seems like a really bad situation, >and >>>> it would eliminate that issue. We can also then eliminate the >places >>>> where VARYING is expanded to min/max for a given type.. instead we >can >>>> just pick up min/max directly. It seems much cleaner overall. >>>> >>>> Something like the simple attached patch would resolve that issue, >and >>>> remove any lurking concerns/bugs with the CONST code. >>>> >>>> Then we can associate a type with varying, canonicalize it >consistently, >>>> and set MIN/MAX appropriately. This will give us full >>>> interchangeability between irange and value_range, establish a >>>> solid/consistent API, and we can move on to the next thing :-) >>>> >>>> Does this not seem reasonable? >>> One might even claim that this patch in and of itself is a step >forward >>> independent of all the other work going on. >>> >>> THe first time I saw that code when I copied it into vr-values I >thought >>> it was ripe for fixing, but never got to it. As it stands, it's a >hack, >>> no more, no less and it's a hack that we can easily remove and do >>> something better. >>=20 >> It's not a hack - it's an optimization and a _sanitization_. Iff you >> want to remove it then >> make it return NULL (no range) and make all callers deal with that. >We can do better here. Callers shouldn't have to cope with the fact >that SSA_NAMEs and thus ranges can be created on the fly (gimple-fold) >and they should be able to get a valid range object that is no >different >than any other range object. NULL is no better than the constant >varying node. > >If you really think we need some kind of special case here for a >client, >can you please elaborate which client and why? Clearly it's something >you feel strongly about and I'm open to the possibility there's a case >I'm not aware of. > > > >>=20 >> That we return a pointer to read-only storage makes sure callers do >not >> change the value-range. That's better than the alternative below >which >> is the appropriate "fix" for the existing code if you want to get rid >of the >> statically allocated constant varying and the "hack" of CONST_CASTing >it. >But I'd claim that if callers are required not to change these ranges, >then the callers are fundamentally broken. I'm not sure what the >"sanitization" is really buying you here. Can you point to something >specific? > >>=20 >> But you lose the sanitizing that nobody can change it and the >> changed info leaks to other SSA vars. >>=20 >> As said, fix all callers to deal with NULL. >>=20 >> But I argue the current code is exactly optimal and safe. >ANd I'd argue that it's just plain broken and that the sanitization >you're referring to points to something broken elsewhere, higher up in >the callers. Another option is to make get_value_range return by value and the only way = to change the lattice to call an appropriate set function. I think we alrea= dy do the latter in all cases (but we use get_value_range in the setter) an= d returning by reference is just eliding the copy.=20 Richard.=20 > >jeff