From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-454290-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 128622 invoked by alias); 23 May 2017 19:26:08 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 127349 invoked by uid 89); 23 May 2017 19:26:07 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-6.4 required=5.0 tests=AWL,BAYES_00,FREEMAIL_FROM,GIT_PATCH_1,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=ham version=3.3.2 spammy=Fire, excited, prone
X-HELO: mail-qk0-f180.google.com
Received: from mail-qk0-f180.google.com (HELO mail-qk0-f180.google.com) (209.85.220.180) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 23 May 2017 19:26:05 +0000
Received: by mail-qk0-f180.google.com with SMTP id u75so138132764qka.3        for <gcc-patches@gcc.gnu.org>; Tue, 23 May 2017 12:26:08 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=1e100.net; s=20161025;        h=x-gm-message-state:subject:to:references:cc:from:message-id:date         :user-agent:mime-version:in-reply-to:content-transfer-encoding;        bh=B7UJNaK9r0wZ/rgRMI3l96jL7disewdLZ8WMKo0OZYg=;        b=YygjJh6XoPDKqlC9tKeRif9S6nAOTaSrJto4N1GbftTLDpbX+d5jNqn4dRcu+9gwaR         q1NBB0l9hfr5Mbpd11C3+vz27PvX7QI9EFPJI/j6dMwY8Z3k45KP9ua3fe7EU/t2qDGg         9rIKytCOZDjAEOgwFMihBU1ppj5JLe8c7Qp05ibkNWB6lF7d1GHubbPvNzMLOInR0dd0         DNL4HMiyev1WsPThJwxLpEeHtGVeTpWKjoh/jUFvkZyfJiolHJkfWqon/tc/dLv2oft9         6IkfC9Z5HqpwAcDpIcEdEk+X03DRi/kSVyYGjKBvLKLrv1SrZr3zTv4ZR8y7Dg3b4rLb         9m3Q==
X-Gm-Message-State: AODbwcDNT261KotDzfROQ2JQR/eBm54tG85OWuxwxHkEuDwp+fcuFC2w	7a8uCs3S6PgswM8E
X-Received: by 10.55.103.203 with SMTP id b194mr26665995qkc.55.1495567566637;        Tue, 23 May 2017 12:26:06 -0700 (PDT)
Received: from localhost.localdomain (75-166-125-146.hlrn.qwest.net. [75.166.125.146])        by smtp.gmail.com with ESMTPSA id s7sm1063240qtb.35.2017.05.23.12.26.05        (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);        Tue, 23 May 2017 12:26:06 -0700 (PDT)
Subject: Re: SSA range class and removal of VR_ANTI_RANGEs
To: Aldy Hernandez <aldyh@redhat.com>, Richard Biener <richard.guenther@gmail.com>
References: <dbad7708-d495-402c-c9d8-626a9c8202b9@redhat.com>
Cc: Andrew MacLeod <amacleod@redhat.com>, gcc-patches <gcc-patches@gcc.gnu.org>
From: Martin Sebor <msebor@gmail.com>
Message-ID: <34686d1e-c2c9-bc10-b967-4f651d8de193@gmail.com>
Date: Tue, 23 May 2017 19:37:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <dbad7708-d495-402c-c9d8-626a9c8202b9@redhat.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
X-SW-Source: 2017-05/txt/msg01800.txt.bz2

On 05/23/2017 04:48 AM, Aldy Hernandez wrote:
> Howdy!
>
> For Andrew's upcoming on-demand range work I have created a range class
> for use in his engine.  Currently, the range class is only for SSA
> integers, but there is no reason why we couldn't adapt this for RTL or
> non-integer types at a later time.
>
> The class can live outside of his work, as can be demonstrated by the
> attached patch.  With it, I was able to rewrite the post-VRP range
> information to use this class and get rid of VR_ANTI_RANGE throughout
> the compiler.  A VR_ANTI_RANGE of ~[5,10] becomes [-MIN,4][11,+MAX].
>
> The internal structure of VRP is unchanged.  I have only changed the
> outward facing structures.
>
> A good example of how much cleaner using an actual range rather than
> VR_ANTI_RANGEs is the change to get_size_range() in the attached patch.
> We remove an assortment of contortions into:
>
> +  /* Remove negative numbers from the range.  */
> +  irange positives;
> +  range_positives (&positives, exptype, allow_zero ? 0 : 1);
> +  if (positives.Intersect (*ir))
> +    {
> +      /* Remove the unknown parts of a multi-range.
> +        This will transform [5,10][20,MAX] into [5,10].  */
> +      if (positives.num_ranges () > 1
> +         && positives.upper_bound () == wide_int (TYPE_MAX_VALUE
> (exptype)))
> +       positives.remove_range (positives.num_ranges () - 1);
> +
> +      range[0] = wide_int_to_tree (exptype, positives.lower_bound ());
> +      range[1] = wide_int_to_tree (exptype, positives.upper_bound ());
> +    }
> +  else
> +    {
> +      /* If removing the negative numbers didn't give us anything
> +        back, the entire range was negative. Leave things as they
> +        were, and let the caller sort it out.  */
> +      range[0] = wide_int_to_tree (exptype, min);
> +      range[1] = wide_int_to_tree (exptype, max);
>      }
>
> A few notes:
>
> 1. There are still two uses of VR_ANTI_RANGEs left after this patch:
> ip-cp.c and ipa-prop.c.  I didn't look too hard at these two passes, as
> they are using `struct value_range' which should only have been used
> *before* VRP finishes.  I can work on this as a follow-up.
>
> 2. I have not included ChangeLog entries, as I would hate to write them,
> and have the API or significant things change from under me as part of
> the review.  I can certainly cobble the ChangeLog entries as soon as/if
> we agree that this is an acceptable approach.
>
> 3. There are lots of unit tests to maintain sanity.  The cast ones in
> particular will definitely need refinement as Andrew's work stresses the
> class.
>
> The attached patch passes bootstrap and tests.
>
> I have also benchmarked an assortment of .ii files (from a previous
> bootstrap) with no noticeable difference in -O2 compilation time.  So, I
> would prefer to tackle any micro-optimizations either as a follow-up or
> if it actually becomes noticeable--. Yeah, yeah, I know.  Wishful
> thinking ;-).
>
>
> Fire away!
> Aldy
>
> p.s. Please refer any comments as to why we need the class, or why we
> need on-demand range information to Andrew :-P.

I'm pretty excited about this work (as you know :) so just a few
longish comments on/suggestions for the design of the class to
make it ever so slightly easier to work with.  Since YMMV, feel
free to disregard what you don't like.

+#define MAX_RANGES	6
+

This seems like an implementation detail that clients of the class
shouldn't know about or have access to.  Suggest to hide that from
the interface (e.g., by making a private member of irange).

+typedef class irange *irange_p;

FWIW, I find pointer typedefs more trouble than worth.  They
obscure the fact that they are pointers and cannot be made const
by adding the const qualifier.  E.g., this:

   void foo (const irange_p);

doesn't declare foo to take a pointer to a const range as one
might hope but rather a const pointer to a non-const range.

+enum irange_type { RANGE_PLAIN, RANGE_INVERT };

Consider making this a member and renaming it to something like
'kind' to avoid giving the impression that it's somehow a type
that's related to class irange.  Making it a member also avoids
the need to prefix the enumerators with RANGE_, and makes it
convenient to refer to PLAIN and INVERT using the dot and arrow
operators (in addition to irange::PLAIN).

+
+class GTY(()) irange
+{
+ private:
+  bool overflow;
+  size_t n;

Suggest to use a more descriptive name for the member.

+  void prepend (wide_int x, wide_int y);
+  void append (wide_int x, wide_int y);
+  void remove (unsigned i, unsigned j);
+  void canonicalize ();
+  /* This is stupid.  These two should be private, but the GTY
+     machinery can't look inside an irange.  */
+ public:
+  tree type;
+  wide_int bounds[MAX_RANGES];
+
+public:
+  irange () { type = NULL_TREE; n = 0; }
+  irange (tree t);

Should the range be implicitly convertible from any tree or would
it better if it required explicit conversion (i.e., should the ctor
be declared explicit)?  The usual rule of thumb is to err on the
side of safety to avoid accidentally creating objects of the class).

+  irange (tree t, wide_int lbound, wide_int ubound,
+	  irange_type rt = RANGE_PLAIN);
+  irange (const irange &r);
+
+  void set_range (tree t);
+  void set_range (tree t, wide_int lbound, wide_int ubound,
+		  irange_type rt = RANGE_PLAIN);
+
+  bool overflow_p () { return overflow && !TYPE_OVERFLOW_WRAPS (type);

Suggest to declare this function const.

}
+  void set_overflow () { overflow = true; }
+  void clear_overflow () { overflow = false; }
+
+  unsigned num_ranges () { return n / 2; }
+  wide_int lower_bound () const { return bounds[0]; }
+  wide_int lower_bound (unsigned index);

The implementation does this:

+  gcc_assert (n != 0 && index <= n/2);

Why is n required to be non-zero when overload just above has
well-defined behavior for the same index?  That seems needlessly
error prone (could cause ICEs), and will require callers to use
different algorithms for accessing the first lower bound versus
those with non-zero indices.  Suggest to define a single inline
form the function instead (taking an index, possibly with
a default argument).  That way the same algorithm can be used
to access any element.

+  wide_int upper_bound () const { return bounds[n - 1]; }
+  wide_int upper_bound (unsigned index);

Suggest to declare these five functions const.

+
+  void remove_range (unsigned i) { remove (i, i + 1); }

Since there is a remove() function, suggest to rename the one
argument form to remove() to avoid having to remember which is
which when calling it.

+  void clear () { n = 0; }
+  void clear (tree t) { type = t; n = 0; overflow = false; }
+  bool empty_p () const { return !n; }


+  bool range_for_type_p () const;
+  bool simple_range_p () const { return n == 2; }
+
+  void dump ();
+  void dump (pretty_printer *pp);
+  void dump (FILE *);

Suggest to declare these three const.

+
+  bool valid_p ();

Suggest to make it const.

+  void cast (tree type);
+  bool contains_p (wide_int element);
+  bool contains_p (tree);
+
+  tree get_type () { return type; }

Suggest to declare these three functions const (even if they call
functions that aren't const, as long as they don't modify any members
they should be const; it's better to cast the constness away in the
implementation than let "const-incorrect" implementation details leak
into the interface).

+
+  irange& operator= (const irange &r);
+  irange& operator= (tree t);
+
+  bool operator== (const irange &r) const;
+  bool operator!= (const irange &r) const { return !(*this == r); }

Suggest to define equality and inequality as non-members for
symmetry.  Since there is an implicit conversion from tree to
irange, this is valid:

   void f (irange ir, tree t)
   {
     ir == t;
   }

but with operator==() as a member, this isn't

   void f (irange ir, tree t)
   {
     t == ir;
   }

making the operator asymmetric (the opposite is usually expected).

+
+  void Union (wide_int x, wide_int y);
+  bool Union (const irange &r);
+  bool Union (const irange &r1, const irange &r2);
+
+  // THIS = THIS ^ [X,Y].  Return TRUE if result is non-empty.
+  bool Intersect (wide_int x, wide_int y, bool readonly = false);
+  // THIS = THIS ^ R.  Return TRUE if result is non-empty.
+  // THIS = R1 ^ R2.  Return TRUE if result is non-empty.
+  bool Intersect (const irange &r1, const irange &r2, bool readonly = 
false);
+  // Return TRUE if THIS ^ R will be non-empty.
+  bool Intersect_p (const irange &r)
+    { return Intersect (r, /*readonly=*/true); }

I would suggest the following changes to Union, Intersect, and Not:

1) Define all three members without the readonly argument and
    returning irange& (i.e., *this).  The return value can be
    used wherever irange& is expected, and the is_empty() member
    function can be called on it to obtain the same result.  E.g.,
    Intersect A with B, storing the result in A:

    irange A, B;
    if (A.Intersect (B).is_empty ()) { ... }

2) Add non-members like so:

   irange range_union (const irange &lhs, const irange &rhs)
   {
     return irange (lhs).Union (rhs);
   }

   and find out if the union of A or B is empty without modifying
   either argument:

   irange A, B;
   if (range_union (A, B).is_empty ()) { ... }

+
+  bool Not ();
+  bool Not (const irange& r);

Given that this function computes the inverted range of *this,
should it be called something like Invert instead?

Martin

PS For the names, have you considered using the bitwise operators?
I.e., operator| for union, operator& for intersection, and operator~
for not (aka invert).