Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Andrew MacLeod <amacleod@redhat.com>
To: Tamar Christina <Tamar.Christina@arm.com>,
	Richard Biener <rguenther@suse.de>,
	Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Tamar Christina via Gcc-patches <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>, "jlaw@ventanamicro.com" <jlaw@ventanamicro.com>
Subject: Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]
Date: Wed, 15 Feb 2023 12:50:56 -0500	[thread overview]
Message-ID: <77142b9b-7af8-eb04-e596-6dd2f97aff9a@redhat.com> (raw)
In-Reply-To: <VI1PR08MB5325B267F86B3AB4637C2F55FFA39@VI1PR08MB5325.eurprd08.prod.outlook.com>

[-- Attachment #1: Type: text/plain, Size: 6043 bytes --]


On 2/15/23 12:13, Tamar Christina wrote:
>> On 2/15/23 07:51, Tamar Christina wrote:
>>
Thanks, lots of useful context there.


> This second pattern replaces the above into:
>
>    _6 = _3 +w level_14(D);
>    _7 = _6 / 255;
>    _8 = (unsigned char) _7;
>
> Thus removing the need to promote before the addition.  What I'm working on
> is an optimization for division.  So I am after what the range of _6 is. oprnd0 in my
> example is the 1rst operand of the division.
>
> I need to know the range of_6 because based on the range we can optimize this
> division into something much more efficient.
>
>> ----  IF that is all true, then I would suggest one of 2 possible routes.
>> 1) we add WIDEN_PLUS_EXPR to range-ops.  THIs involves writing
>> fold_range() for it whereby it would create a range of a type double the
>> precision of _3, then take the 2 ranges for op1 and op2, cast them to this new
>> type and add them.
>>
> Right, so I guess none of the widening operations are currently there.  Can you
> point me in the right direction of where I need to add them?

sure, details below


>> 2) manually doing the same thing.   BUt if you are goignto manually do it, we
>> might as well put that same code into fold_range then the entire ecosystem
>> will benefit.
>>
>> Once the operation can be performed in range ops, you can cast the new
>> range back to the type of _3 and see if its fully represented. ie
>>
>> int_range_max r1, r2
>> if (ranger.range_of_stmt (r1, stmt))
>>     {
>>       r2 = r1;
>>       r2.cast (TREE_TYPE (_3));
>>       r2.cast (TREE_TYPE (patt_27));
>>       if (r1 == r2)
>>         // No info was lost casting back and forth, so r1 must fit into type of _3
>>
>> That should work for within the IL.  And if you want to do the same thing
>> outside of the IL, you have to come up with the values you want to use for
>> op1 and op2, replace the ranger query with a direct range-opfold:
>>
>> range_op_handler handler (WIDEN_PLUS_EXPR, TREE_TYPE (patt_27)); if
>> (handler && handler->fold_range (r1, range_of__3, range_of_level_15))
>>     {
>>       // same casting song and dance
>>
>>
> Just for my own understanding, does the fold_range here update the information
> in the IL? Or is it just for this computation? So when I hit this pattern again it
> recomputes it?

fold_range does not update anything.  It just performs the calculation, 
and passes like VRP etc are responsible for if, and when, that is 
reflected in some way/transformation in the IL. The IL is primarily used 
for context to look back and try to determine the range of the inputs to 
the statement.   Thats why, if you arent using an expression in the IL, 
you need to provide the ranges yourself.   BY default, you end up with 
the full range for the type, ie VARYING.  but if ranger can detertmine 
through branches and such that its something different, it will. ie, so 
if you case is preceeded by

if (_3 < 20 && level_15< 20)
   //  the range of _3 will be [0, 19] and _15 will be [0, 19], and th 
addition will end up with a range of [0, 38]

In your case, I see the ranges are the range of the 8 bit type: irange] 
int [0, 255] NONZERO 0xff



>> If you dont want to go thru this process, in theory, you could try simply
>> adding _3 and level_15 in their own precision, and if max/min aren't +INF/-
>> INF then you can probably assume there is no overflow?
>> in which case, all you do is the path you are on above for within a stmt should
>> work:
>>
>> 	  gimple_ranger ranger;
>> 	  int_range_max r0, r1, def;
>> 	  /* Check that no overflow will occur.  If we don't have range
>> 	     information we can't perform the optimization.  */
>> 	  if (ranger.range_of_expr (r0, oprnd0, stmt) &&
>> ranger.range_of_expr (r1,oprnd1, stmt)
>> 	    {
>> 	      range_op_handler handler (PLUS_EXPR, TREE_TYPE (_3));
>> 	      if (handler && handler->fold_range (def, r0, r1))so I would expect a skeleton to be
>> 		// examine def.upper_bound() and def.lower_bound()
>>
>> Am I grasping some of the issue here?
> You are, and this was helpful.  I would imagine that Richard wouldn't accept me
> to do it locally though.  So I guess if it's safe to do for this PR fix, I can add the basic
> widening operations to ranger-ops if you can show me where.
>

all the range-op integer code is in gcc/range-op.cc.  As this is a basic 
binary operation, you should be able to get away with implementing a 
single routine,  wi_fold () which adds 2 wide int bounds  together and 
returns a result.  THis si the implelemntaion for operator_plus.

void
operator_plus::wi_fold (irange &r, tree type,
                         const wide_int &lh_lb, const wide_int &lh_ub,
                         const wide_int &rh_lb, const wide_int &rh_ub) const
{
   wi::overflow_type ov_lb, ov_ub;
   signop s = TYPE_SIGN (type);
   wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
   wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);
   value_range_with_overflow (r, type, new_lb, new_ub, ov_lb, ov_ub);
}


you shouldn't have to do any of the overflow stuff at the end, just take 
the 2 sets of wide int, double their precision to start, add them 
together (it cant possible overflow right) and then return an 
int_range<2> with those bounds...
ie

void
operator_plus::wi_fold (irange &r, tree type,
                         const wide_int &lh_lb, const wide_int &lh_ub,
                         const wide_int &rh_lb, const wide_int &rh_ub) const
{
   wi::overflow_type ov_lb, ov_ub;
   signop s = TYPE_SIGN (type);

   // Do whatever wideint magic is required to do this adds in higher 
precision
   wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
   wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);

   r = int_range<2> (type, new_lb, new_ub);
}


The operator needs to be registered, I've attached the skeleton for it.  
you should just have to finish implementing wi_fold().

in theory :-)


[-- Attachment #2: tam.diff --]
[-- Type: text/x-patch, Size: 1227 bytes --]

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 5c67bce6d3a..c425c496c25 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1730,6 +1730,29 @@ operator_minus::op2_range (irange &r, tree type,
   return fold_range (r, type, op1, lhs);
 }
 
+class operator_widen_plus : public range_operator
+{
+public:
+  virtual void wi_fold (irange &r, tree type,
+			const wide_int &lh_lb,
+			const wide_int &lh_ub,
+			const wide_int &rh_lb,
+			const wide_int &rh_ub) const;
+} op_widen_plus;
+
+void
+operator_widen_plus::wi_fold (irange &r, tree type,
+			      const wide_int &lh_lb,
+			      const wide_int &lh_ub,
+			      const wide_int &rh_lb,
+			      const wide_int &rh_ub) const
+{
+  wi::overflow_type ov_lb, ov_ub;
+  signop s = TYPE_SIGN (type);
+  wide_int new_lb = wi::add (lh_lb, rh_lb, s, &ov_lb);
+  wide_int new_ub = wi::add (lh_ub, rh_ub, s, &ov_ub);
+  r = int_range<2> (type, new_lb, new_ub);
+}
 
 class operator_pointer_diff : public range_operator
 {
@@ -4505,6 +4528,7 @@ integral_table::integral_table ()
   set (ABSU_EXPR, op_absu);
   set (NEGATE_EXPR, op_negate);
   set (ADDR_EXPR, op_addr);
+  set (WIDEN_PLUS_EXPR, op_widen_plus);
 }
 
 // Instantiate a range op table for pointer operations.

next prev parent reply	other threads:[~2023-02-15 17:51 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-09 17:16 Tamar Christina
2023-02-09 17:22 ` [PATCH 2/2]AArch64 Update div-bitmask to implement new optab instead of target hook [PR108583] Tamar Christina
2023-02-10 10:35   ` Tamar Christina
2023-02-10 14:10   ` Richard Sandiford
2023-02-10 10:34 ` [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583] Tamar Christina
2023-02-10 13:13 ` Richard Biener
2023-02-10 13:36 ` Richard Sandiford
2023-02-10 13:52   ` Richard Biener
2023-02-10 14:13   ` Tamar Christina
2023-02-10 14:30     ` Richard Sandiford
2023-02-10 14:54       ` Tamar Christina
2023-02-27 11:09       ` Tamar Christina
2023-02-27 12:11         ` Richard Sandiford
2023-02-27 12:14           ` Tamar Christina
2023-02-27 21:33             ` Richard Sandiford
2023-02-27 22:10               ` Tamar Christina
2023-02-28 11:08                 ` Richard Sandiford
2023-02-28 11:12                   ` Tamar Christina
2023-02-28 12:03                     ` Richard Sandiford
2023-03-01 11:30                       ` Richard Biener
2023-02-10 15:56     ` Richard Sandiford
2023-02-10 16:09       ` Tamar Christina
2023-02-10 16:25         ` Richard Sandiford
2023-02-10 16:33           ` Tamar Christina
2023-02-10 16:57             ` Richard Sandiford
2023-02-10 17:01               ` Richard Sandiford
2023-02-10 17:14               ` Tamar Christina
2023-02-10 18:12                 ` Richard Sandiford
2023-02-10 18:34                   ` Richard Biener
2023-02-10 20:58                     ` Andrew MacLeod
2023-02-13  9:54                       ` Tamar Christina
2023-02-15 12:51                         ` Tamar Christina
2023-02-15 16:05                           ` Andrew MacLeod
2023-02-15 17:13                             ` Tamar Christina
2023-02-15 17:50                               ` Andrew MacLeod [this message]
2023-02-15 18:42                                 ` Andrew MacLeod
2023-02-22 12:51                                   ` Tamar Christina
2023-02-22 16:41                                   ` Andrew MacLeod
2023-02-22 18:03                                     ` Tamar Christina
2023-02-22 18:33                                       ` Andrew MacLeod
2023-02-23  8:36                                         ` Tamar Christina
2023-02-23 16:39                                           ` Andrew MacLeod
2023-02-23 16:56                                             ` Tamar Christina
2023-03-01 16:57                                             ` Andrew Carlotti
2023-03-01 18:16                                               ` Tamar Christina
2023-02-22 13:06                                 ` Tamar Christina
2023-02-22 15:19                                   ` Andrew MacLeod

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=77142b9b-7af8-eb04-e596-6dd2f97aff9a@redhat.com \
    --to=amacleod@redhat.com \
    --cc=Richard.Sandiford@arm.com \
    --cc=Tamar.Christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jlaw@ventanamicro.com \
    --cc=nd@arm.com \
    --cc=rguenther@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).