Re: [RFC][IPA-VRP] Early VRP Implementation

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Biener <richard.guenther@gmail.com>
To: kugan <kugan.vivekanandarajah@linaro.org>
Cc: Andrew Pinski <pinskia@gmail.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
		Jan Hubicka <hubicka@ucw.cz>, Martin Jambor <mjambor@suse.cz>
Subject: Re: [RFC][IPA-VRP] Early VRP Implementation
Date: Tue, 26 Jul 2016 13:37:00 -0000	[thread overview]
Message-ID: <CAFiYyc0b3J7a6kB0VoLWGG9_qJ6eK2NuGshQfDpewsipkUh_7g@mail.gmail.com> (raw)
In-Reply-To: <19ff8188-aed7-0f9e-cc0b-0603698787ff@linaro.org>

On Tue, Jul 26, 2016 at 2:27 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> Hi Riachard,
>
> Thanks for the review. Here is an updated patch with comments below.
>
>> +/* Restore/Pop all the old VRs maintained in the cond_stack.  */
>> +
>> +void evrp_dom_walker::finalize_dom_walker ()
>> +{
>> +  while (!cond_stack.is_empty ())
>> +    {
>> +      tree var = cond_stack.last ().second;
>> +      pop_value_range (var);
>> +      cond_stack.pop ();
>> +    }
>>
>> why is this necessary at all?  Looks like a waste of time (and the
>> stack should be empty here
>> anyway).  I think you leak cond_stack as well, I suppose using
>> auto_vec might work here,
>> you have to check.
>
> Done.
>
>>
>>>>
>>>> +         /* Discover VR when condition is true.  */
>>>> +         if (te == e
>>>> +             && !TREE_OVERFLOW_P (op0)
>>>> +             && !TREE_OVERFLOW_P (op1))
>>>
>>>
>>>
>>> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
>>> set_value_range otherwise:
>>>
>>> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
>>>                   && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
>>> (max)));
>>
>>
>> Ok, instead make sure to remove the overflow flag via
>>
>>   if (TREE_OVERFLOW_P (op0))
>>     op0 = drop_tree_overflow (op0);
>
>  ...
> Done.
>
>>
>> it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and
>> FALSE_VALUE
>> cases and only invert the tree comparison for the false edge.
>
> Done.
>
>>
>> +             tree cond = build2 (code, boolean_type_node, op0, op1);
>> +             extract_range_for_var_from_comparison_expr (&vr, op0, cond);
>>
>> no wasteful tree building here please.  Instead of cond pass in code,
>> op0 and op1
>> as separate arguments.
>
> Done.
>
>>
>> +         /* If we found any usable VR, set the VR to ssa_name and create
>> a
>> +            PUSH old value in the cond_stack with the old VR.  */
>> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>> +           {
>> +             value_range *new_vr = vrp_value_range_pool.allocate ();
>> +             memset (new_vr, 0, sizeof (*new_vr));
>> +             *new_vr = vr;
>> +             cond_stack.safe_push (std::make_pair (bb, op0));
>>
>> the memset looks redundant given you copy-assing from vr anyway.
>
> Done.
>
>>
>> You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
>> both i_1 and x_2 might have ranges that are useful.
>
> I will address this once the basic structure is in place if that is OK with
> you.

Sure.  Just note it down somewhere.

>>
>> push and pop_value_range should be methods in the dom walker class
>> and vrp_stack and cond_stack should be a single stack.  You can omit
>> basic_block if you use a "magic" NULL_TREE var as marker (simply
>> push a NULL_TREE, NULL pair at the end of before_dom_children).
>>
> Done.
>
>
>>>>
>>>> you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE
>>>>
>>>> why do you need those TREE_OVERFLOW checks?
>>>>
>>>> +             tree cond = build2 (code, boolean_type_node, op0, op1);
>>>> +             tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
>>>> +             extract_range_from_assert (&vr, a);
>>>>
>>>> so I was hoping that the "refactoring" patch in the series would expose
>>>> a
>>>> more
>>>> useful interface than extract_range_from_assert ... namely one that can
>>>> extract a range from the comparison directly and does not require
>>>> building
>>>> a scratch ASSERT_EXPR.
>>>
>>>
>>>
>>> I have split extract_range_from_assert into
>>> extract_range_for_var_from_comparison_expr and using it now. Is that
>>> better?
>>
>>
>> See above.
>>
>>>>
>>>>
>>>> +         /* If we found any usable VR, set the VR to ssa_name and
>>>> create
>>>> a
>>>> +            restore point in the cond_stack with the  old VR. */
>>>> +         if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>>>> +           {
>>>> +             value_range *new_vr = XCNEW (value_range);
>>>> +             *new_vr = vr;
>>>> +             cond_stack.safe_push (std::make_pair (bb,
>>>> +                                                   std::make_pair (op0,
>>>> +
>>>> old_vr)));
>>>> +             change_value_range (op0, new_vr);
>>>>
>>>> I don't like 'change_value_range' as interface, please integrate that
>>>> into
>>>> a push/pop_value_range style interface instead.
>>>
>>>
>>>
>>> Tried using push/pop interface.
>>>>
>>>>
>>>>
>>>> +       vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
>>>> +    }
>>>> +
>>>> +  return NULL;
>>>>
>>>> you should return taken_edge_p (misnamed as it isn't a pointer) and take
>>>> advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
>>>> defer this as a followup improvement).
>>>
>>>
>>>
>>> I have added a TODO to this effect and will comeback to it.
>>>>
>>>>
>>>>
>>>> Note that the advantage of a DOM-based VRP is that backtracking is easy
>>>> to implement (you don't do that yet).  That is, after DEF got a (better)
>>>> value-range you can simply re-visit all the defs of its uses (and
>>>> recursively).
>>>> I think you have to be careful with stmts that might prematurely leave a
>>>> BB
>>>> though (like via EH).  So sth for a followup as well.
>>>
>>>
>>>
>>> Is this OK now. Bootstrapped and regression tested on x86_64-linux  with
>>> no
>>> new regressions.
>>
>>
>> Better, still not perfect.
>>
>> I'm also arguing on the pass placement now - you put it where it may make
>> sense
>> for IPA VRP (but then IPA VRP could simply do this in its analysis phase)
>> but
>> not so much where it makes sense during the early pipeline.  I think it
>> makes
>> sense after FRE.
>>
>> Looking at how you finalize I see several issues with simply re-using
>> vrp_finalize.
>>
>> First of all the loop doing the set_range_info will turn up with
>> nothing as you've
>> popped off all value-ranges from the lattice.  Same for
>> substitute-and-fold (or
>> jump-threading).
>
> I am not sure understanding you. I am poping the value ranges for scope when
> we leave them. Other value ranges which lives in all the regions will
> remain.

Ah, yeah.  Sorry for the confusion.

>>
>> What you instead need to do with the DOM scheme is at the point you
>> call vrp_visit_stmt do sth like
>>
>>      vrp_visit_stmt (....);
>>      if (fold_stmt (&gsi, follow_single_use_edges))
>>        update_stmt ();
>
> I have added this. I think this will be good as we would be able to optimize
> with value ranges that is valid within the scope.

Yes, it also improves memory access locality and thus compile-time I think.

>>      tree def = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_DEF);
>>      if (def
>>          && INTEGRAL_TYPE_P (TREE_TYPE (def))
>>          && (TREE_CODE (vr_value[i]->min) == INTEGER_CST)
>>           && (TREE_CODE (vr_value[i]->max) == INTEGER_CST)
>>           && (vr_value[i]->type == VR_RANGE
>>               || vr_value[i]->type == VR_ANTI_RANGE))
>>         set_range_info (name, vr_value[i]->type, vr_value[i]->min,
>>                         vr_value[i]->max);
>>
> I am not sure if this is needed. I dont know if my push/pop value range
> implementation is not what you wanted. If you still prefer, I am happy to
> add this.

See above, also if you fold at this point names used should have their
ranges updated so match.pd patterns can utilize them.

I see you are still calling substitute_and_fold though so if you want to
continue doing that (for now) then you don't need any of the above.
If you stop doing that then you need to supply vrp_valueize instead of
follow_single_use_edges to fold_stmt, otherwise you won't get any
constants propagated.

I'd prefer to not call substitute_and_fold as that is a function of the
SSA propagator.

>
>> thus please split vrp_finalize into two parts, one of it with the memory
>> freeing which is the one you'd call.
>>
>> Note that EVRP is not enabled by default at any optimization level
>> it seems so bootstrap / test of it would be useless (did you only
>> test with the full series?  I never looked at the IPA VRP part)
>>
> I have:
>
> +ftree-evrp
> +Common Report Var(flag_tree_early_vrp) Init(1) Optimization
> +Perform Early Value Range Propagation on trees.
>
> I would like to run this only for -O2 and above but for now I am using this
> to test.

There is an array in opts.c, default_options_table, where you can set
defaults based on the optimization level.  If you want to turn it on
at -O2 that would match the setting for -ftree-vrp in which case I would
rather not add another option.  For debugging one can use
-fdisable-tree-evrp1 for example.

+value_range *evrp_dom_walker::pop_value_range (const_tree var)
+{

coding style nit: newline missing after return type.

It seems that in your pop_value_range you assume you only pop one
range per BB - while that's likely true at the moment it will be a limitation
in the future.  You want to pop ranges until you hit the NULL marker
in after_dom_children and unconditionally push a NULL marker.

For example to match current VRPs behavior on say

   i_2 = (int) j_3;
   if (i_2 < 0)
     ...

which can register an assert for j_3 when i_2 < 0 is true we'd do that
by re-simulating DEFs of uses we figured out new ranges of (and all
their uses).  All those ranges would be temporary as well, thus they'd
need to be pushed/popped.  In my quick prototype this was done
using a worklist seeded by the names we can derive a range from from
conditionals and "SSA propagating" from it.  Note that for this
the generic vrp_visit_stmt cannot be re-used as it doesn't push/pop,
factoring out the lattice update is what is needed here.

+/* Visit the basic blocks in the dominance order and set the Value Ranges (VR)
+   for SSA_NAMEs in the scope.  Use this VR to discover more VRs.  Restore the
+   old VR once the scope is exited.  */
+
+static bool
+evrp_visit_phi_node_local (gphi *phi)
+{
+  size_t i;
+  tree lhs = PHI_RESULT (phi);
+  value_range vr_result = VR_INITIALIZER;
+  bool first = true;
+  int edges;
+
+  edges = 0;
+  for (i = 0; i < gimple_phi_num_args (phi); i++)
+    {
+      edge e = gimple_phi_arg_edge (phi, i);
+      tree arg = PHI_ARG_DEF (phi, i);
+      value_range vr_arg = VR_INITIALIZER;
+      ++edges;
+
+      /* If there is a back-edge, set the result to VARYING.  */
+      if (e->flags & (EDGE_DFS_BACK | EDGE_COMPLEX))
+       {
+         set_value_range_to_varying (&vr_result);
+         break;
+       }
...
+      /* If any of the RHS value is VARYING, set the result to VARYING.  */
+      if ((vr_arg.type != VR_RANGE)
+         && (vr_arg.type != VR_ANTI_RANGE))
+       {
+         set_value_range_to_varying (&vr_result);
+         break;
+       }

this shows that you need to start conservative for a DOM based VRP,
thus with all lattice values initialized to VARYING (but undefined SSA
names of course still can be UNDEFINED) rather than UNDEFINED.

+      if (TREE_CODE (arg) == SSA_NAME)
+       vr_arg = *(get_value_range (arg));
+      else
+       set_value_range_to_varying (&vr_arg);

err - what about constants?  When you initialize the lattice properly
you should be able to re-use vrp_visit_phi_node (maybe split out
its head to avoid using SCEV or the iteration limitation).

Btw, you don't want to call vrp_initialize in evrp either, this is setting
SSA propagator state which you do not want to do.  Please factor
out lattice allocation/deallocation instead.  I see that might require
really factoring vrp_visit_stmt into a function that omits updating
the lattice and just returns a range for the single DEF.

That said, a good refactoring is to split the SSA propagator callback
implementations (vrp_visit_stmt and vrp_visit_phi_node) into workers
returning a value range and the callback that does the update_value_range
call plus returing a SSA propgator state.  You can then re-use the worker.

Thanks,
Richard.

> I  have tested the last set of patch separately.
>
> I will do more testing on this patch based on your feedback. Does this look
> better?
>
> Thanks,
> Kugan
>
>

next prev parent reply	other threads:[~2016-07-26 13:37 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-15  4:41 [RFC][IPA-VRP] IPA " kugan
2016-07-15  4:42 ` [RFC][IPA-VRP] Disable setting param of __builtin_constant_p to null kugan
2016-07-15  8:43   ` Jan Hubicka
2016-07-25  6:59     ` kugan
2016-07-25 10:02       ` Richard Biener
2016-07-15  4:43 ` [RFC][IPA-VRP] Check for POINTER_TYPE_P before accessing SSA_NAME_PTR_INFO in tree-inline kugan
2016-07-15  4:47   ` Andrew Pinski
2016-07-15  7:03     ` Jakub Jelinek
2016-07-15  7:03     ` kugan
2016-07-15  7:32   ` Richard Biener
2016-07-15  4:44 ` [RFC][IPA-VRP] Re-factor tree-vrp to factor out common code kugan
2016-07-15  4:47   ` [RFC][IPA-VRP] Add support for IPA VRP in ipa-cp/ipa-prop kugan
2016-07-15 12:23     ` Martin Jambor
2016-07-19  8:22       ` kugan
2016-07-19 21:27         ` kugan
2016-07-21 12:54           ` Jan Hubicka
2016-08-30  5:21             ` Kugan Vivekanandarajah
2016-08-30 18:12               ` Prathamesh Kulkarni
2016-08-30 21:10                 ` kugan
2016-09-02 12:31               ` Jan Hubicka
2016-07-17 13:24     ` Prathamesh Kulkarni
2016-07-22 12:27   ` [RFC][IPA-VRP] Re-factor tree-vrp to factor out common code kugan
2016-07-22 12:49     ` Richard Biener
2016-07-22 14:34       ` kugan
2016-07-23 10:12         ` kugan
2016-08-16  8:09           ` kugan
2016-08-16 11:56             ` Richard Biener
2016-08-16 22:20               ` kugan
2016-08-17  2:50                 ` kugan
2016-08-17 13:46                   ` Richard Biener
2016-07-15  4:45 ` [RFC][IPA-VRP] Early VRP Implementation kugan
2016-07-15  4:52   ` Andrew Pinski
2016-07-15  7:08     ` kugan
2016-07-15  7:28       ` Andrew Pinski
2016-07-15  7:33         ` kugan
2016-07-18 11:51           ` Richard Biener
2016-07-22 12:10             ` kugan
2016-07-25 11:18               ` Richard Biener
2016-07-26 12:27                 ` kugan
2016-07-26 13:37                   ` Richard Biener [this message]
2016-07-28  7:36                     ` kugan
2016-07-28 11:34                       ` Richard Biener
2016-08-03  1:17                         ` kugan
2016-08-12 10:43                           ` Richard Biener
2016-08-16  7:39                             ` [RFC][IPA-VRP] splits out the update_value_range calls from vrp_visit_stmt kugan
2016-08-16 10:58                               ` Richard Biener
2016-08-17  2:27                                 ` kugan
2016-08-17 13:44                                   ` Richard Biener
2016-08-16  7:45                             ` [RFC][IPA-VRP] Early VRP Implementation kugan
2016-08-19 11:41                               ` Richard Biener
2016-08-23  2:12                                 ` Kugan Vivekanandarajah
2016-09-02  8:11                                   ` Kugan Vivekanandarajah
2016-09-14 12:11                                   ` Richard Biener
2016-09-14 21:47                                     ` Jan Hubicka
2016-09-15  7:23                                       ` Richard Biener
2016-09-15 14:57                                         ` Jeff Law
2016-09-16  8:59                                           ` Richard Biener
2016-09-16  6:37                                     ` kugan
2016-09-16 10:26                                       ` Richard Biener
2016-09-18 23:40                                         ` kugan
2016-09-19 13:30                                           ` Richard Biener
2016-09-20  5:48                                             ` kugan
2016-07-19 16:19     ` Jeff Law
2016-07-19 18:35       ` Richard Biener
2016-07-19 20:14         ` Jeff Law
2016-07-15  4:47 ` [RFC][IPA-VRP] Teach tree-vrp to use the VR set in params kugan
2016-07-18 11:33   ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc0b3J7a6kB0VoLWGG9_qJ6eK2NuGshQfDpewsipkUh_7g@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hubicka@ucw.cz \
    --cc=kugan.vivekanandarajah@linaro.org \
    --cc=mjambor@suse.cz \
    --cc=pinskia@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).