From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 23043 invoked by alias); 23 Jul 2019 00:00:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 22953 invoked by uid 89); 23 Jul 2019 00:00:02 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 23 Jul 2019 00:00:00 +0000 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E971C87648; Mon, 22 Jul 2019 23:59:58 +0000 (UTC) Received: from localhost.localdomain (ovpn-112-9.rdu2.redhat.com [10.10.112.9]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5CB6F19C6A; Mon, 22 Jul 2019 23:59:57 +0000 (UTC) Subject: Re: [range-ops] patch 01/04: types for VR_UNDEFINED and VR_VARYING To: Andrew MacLeod , Richard Biener , Aldy Hernandez Cc: gcc-patches References: <481033aa-6ecc-3bcb-c874-becee14c5605@redhat.com> <27d3db25-ea95-33f4-57fb-e9a4a40d7341@redhat.com> From: Jeff Law Openpgp: preference=signencrypt Message-ID: <1a0a328e-74bb-15cb-40fb-09944fe2b84c@redhat.com> Date: Tue, 23 Jul 2019 00:19:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: <27d3db25-ea95-33f4-57fb-e9a4a40d7341@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-IsSubscribed: yes X-SW-Source: 2019-07/txt/msg01467.txt.bz2 On 7/16/19 12:37 PM, Andrew MacLeod wrote: > On 7/9/19 5:56 AM, Richard Biener wrote: >> On Tue, Jul 9, 2019 at 9:28 AM Aldy Hernandez wrote: >>> >>> >>> On 7/4/19 6:33 AM, Richard Biener wrote: >>>> On Wed, Jul 3, 2019 at 2:17 PM Aldy Hernandez wrote: >>>>> On 7/3/19 7:08 AM, Richard Biener wrote: >>>>>> On Wed, Jul 3, 2019 at 11:19 AM Aldy Hernandez >>>>>> wrote: >>>>> How about we keep VARYING and UNDEFINED typeless until right before we >>>>> call into the ranger.  At which point, we have can populate min/max >>>>> because we have the tree_code and the type handy.  So right before we >>>>> call into the ranger do: >>>>> >>>>>           if (varying_p ()) >>>>>             foo->set_varying(TYPE); >>>>> >>>>> This would avoid the type cache, and keep the ranger happy. >>>> you cannot do set_varying on the static const range but instead >>>> you'd do >>>> >>>>     value_range tem (*foo); >>>>     if (varying_p ()) >>>>      tem->set_full_range (TYPE); >>>> >>>> which I think we already do in some places.  Thus my question _where_ >>>> you actually need this. >>> Basically, everywhere.  By having a type for varying/undefined, we don't >>> have to special case anything.  Sure, we could for example, special case >>> the invert operation for undefined / varying.  And we could special case >>> everything dealing with ranges to handle varying and undefined, but why? >>>    We could also pass a type argument everywhere, but that's just ugly. >>> However, I do understand your objection to the type cache. >>> >>> How about the attached approach?  Set the type for varying/undefined >>> when we know it, while avoiding touching the CONST varying.  Then right >>> before calling the ranger, pass down a new varying node with min/max for >>> any varyings that were still typeless until that point. >>> >>> I have taken care of never adding a set_varying() that was not already >>> there.  Would this keep the const happy? >>> >>> Technically we don't need to set varying/undef types for every instance >>> in VRP, but we need it at least for the code that will be shared with >>> range-ops (extract_range_from_multiplicative_op, union, intersect, etc). >>>    I just figured if we have the information, might as well set it for >>> consistency. >>> >>> If you like this approach, I can rebase the other patches that depend on >>> this one. >> OK, so I went ant checked what you do for class irange which has >> a type but no kind member (but constructors with a kind).  It also >> uses wide_int members for storage.  For a pure integer constant >> range representation this represents somewhat odd choices;  I'd >> have elided the m_type member completely here, it seems fully >> redundant.  Only range operations need to be carried out in a >> specific type (what I was suggesting above).  Even the precision >> encoded in the wide_int members is redundant then (I'd have >> expected widest_int here and trailing-wide-ints for optimizing >> storage). > > What irange contains internally is a bit arbitrary.  It's really an API > for managing a set of subranges.  We could have used trees internally > just as easily, then we wouldnt need a type field. Since we were doing > lots of operations, rather than going back and forth from trees all the > time, we just used the underlying wide_int directly.   we could have > fiddle farted around with HOST_WIDE_INT or whatever, but wide_int is > there, has all the operations, and it works fine. so thats what it > currently is on the branch. > > We are treating a range object as a unique self contained object. > Therefore, the range has a type so we know how to print it, and can > confirm before any operation that the ranges being operated on are > appropriately matched.  We found and opened bugzillas over the past > couple years for places where our code caught bugs because a range was > created and then operated on in a way that was not compatible with > another range.  I think there is a still an open one against ada(?) > where the switch and case  are different precision. > > From my point of view, a range object is similar to a tree node. A tree > node has the bits to indicate what the value is, but also associates a > type with those bits within the same object.  This is less error prone > than passing around the bits and the type separately.  As ranges are > starting to be used in many places outside of VRP, we should do the same > thing with ranges.  WIth value_range it would actually be free since > there is already a tree for the bounds already which contains the type. > > > > > >> to fold_range/op_range?  The API also seems to be oddly >> constrained to binary ops.  Anyway, the way you build >> the operator table requires an awful lot of global C++ ctor >> invocations, sth we generally try to avoid.  But I'm getting >> into too many details here. > > Its "oddly constrained" because what you are looking at  is just the > standardized unary/binary ops code.. ie the equivalent of > extract_range_from_binary_expr() and extract_range_from_unary_expr().   > The other ops we care about have specialized requirements, like PHIs  > and the arbitrary numbers of parameters in a call, or anything less > common than one or two operands.    You are just not seeing those parts. > >> >> So - to answer your question above, I'd like you to pass down >> a type to operations.  Because that's what is fundamentally >> required - a range doesn't have a "type" and the current >> value_range_base doesn't fall into the trap of needing one. >> >> Richard. >> > > Why is having a type associated with the data a "trap"?   Perhaps the > old VRP lattice idea didn't need a type with the UNDEFINED and VARYING > lattice values, but we're moving past lattice values and into a realm > where we have ranges as useful things outside of VRP, and trying to > shoehorn lattice values does not seem appropriate anymore. > > I looked at implementing range-ops without a type in the range, and we > end up passing 2 parameters everywhere each time we do anything with a > range.  This doubled the number of parameters in most routines, and when > we had chains of calls, we were simply passing the type along with the > range.  It seems archaic to be constantly passing information around > instead of associating it with the range itself.  Its just another place > for an error to creep in.. Aldy found a place where we were creating > varying nodes for floats IIRC.. the type checking in the range > operations caught it precisely because the range returned wasn't the > type it was assumed to be. > > That said. I believe we can do away with the need for a type with an > 'UNDEFINED' range.  That is not too onerous, and there doesn't really > seem to be too many ripple effect from have a typeless undefined range. > I think we can contain that, and prevents us from having to add a hack > to value_range for that situation. > > VARYING  is another situation completely.  We adopted the term 'varying' > for ease of compatibility with VRP, but varying is simply a range going > from MIN to MAX for a type.  Removing the type from that would then > require we always pass a type in with every range which gets back to  > doubling the number of of parameters everywhere for no good reason. > > If we standardize value_range so that MIN and MAX are set for varying, > then everything works very smoothly, we can make value_range and irange > interchangeable and facilitate getting the range ops code into trunk. > > It seems like the only reason we cant do this right now is the CONST > varying nodes that are returned from get_value_range(). > Looking at that routine, it seems there are only 2 cases where that can > be returned >  1) we ask for an ssa-name beyond the end of the local ssa-name vector >  2) once values_propagated is set  to true and an ssa-name has no entry. > > Both seem pretty straightforward to fix... > 1)  if we ask for an ssa-Name beyond the end, simply reallocate the > vector to be big enough.   I doubt this will trigger a lot, and if we > initially allocate it with room for an extra 10% names it'll probably > never trigger.  Or pick whatever number seems appropriate. > 2) if values_propagated is true, simply allocate a node for the > ssa-name,  set it to varying and return it. THIs accomplishes the same > thing. > > the number of additional nodes we allocate will be pretty minimal, and > it will never exceed the number of ssa-names. Then we never have to > worry about having CONST set for a varying node either. I see places > where there is special processing to avoid calling set_varying  because > we dont know if the vbalue_range in hand is in constant memory and would > cause the compiler to trap. This seems like a really bad situation, and > it would eliminate that issue.  We can also then eliminate the places > where VARYING is expanded to min/max for a given type.. instead we can > just pick up min/max directly.    It seems much cleaner overall. > > Something like the simple attached patch would resolve that issue, and > remove any lurking concerns/bugs with the CONST code. > > Then we can associate a type with varying, canonicalize it consistently, > and set MIN/MAX appropriately.   This will give us full > interchangeability between irange and value_range, establish a > solid/consistent API,  and we can move on to the next thing :-) > > Does this not seem reasonable? One might even claim that this patch in and of itself is a step forward independent of all the other work going on. THe first time I saw that code when I copied it into vr-values I thought it was ripe for fixing, but never got to it. As it stands, it's a hack, no more, no less and it's a hack that we can easily remove and do something better. Jeff