public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* On-Demand range technology [5/5] - Looking to the future.
@ 2019-05-23  1:30 Andrew MacLeod
  2019-05-23 14:07 ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew MacLeod @ 2019-05-23  1:30 UTC (permalink / raw)
  To: GCC; +Cc: Jeff Law, Aldy Hernandez

A primary goal of this approach is to try to pull the various aspects of 
VRP apart and make them individually viable so they can be used at 
appropriate places as needed.  The various components of VRP were 
identified as:
     - Ranges
     - Relational queries
     - Equivalencies
     - Bitmask tracking
     - Symbolic range endpoint resolution

This prototype implementation tackles only the range component. It makes 
ranges easily accessible from anywhere in the compiler.

I have plans for addressing these components within the same model, but 
maintaining their independence.  This should make maintenance easier, 
less error prone, and ultimately be far more flexible as other passes 
can utilize whichever aspects they need.


Symbolic range endpoint resolution
--------------------------------------------------

I touched on this in the prototype section, and maintain that the only 
requirement for symbols falls under the equivalence and relational tracking.
     x_2 = y_3 + 5
     If (x_2 > y_3)
Ranges themselves in VRP are eventually resolved to a constant for 
export to the global range table.  At this point, whatever range is 
known for the symbolic is substituted into the value_range, and any 
expression resolved to come up with the final non-symbolic range.
     X_2 = [y_3 + 5, MAX]
If y_3 evaluates to [20, 30], then x_2 is resolved as [25, MAX].

The ranger does this by default on the fly due to its nature. When the 
range of x_2 is requested the first time, it evaluates y_3 , comes up 
with the same [20, 30] range for y_3, and evaluates it to [25,max] 
immediately.

The facility is there to reevaluate the range if the range of y_3 
changes, but to this point it has not been needed. Typically it involves 
y_3 derived in some way from a back edge, and also being derived by yet 
another ssa-name from a different back edge. So, not super common.    
However, I do plan to get to this eventually to enable those 
re-calculations. For the protype, it has not been deemed critical since 
EVRP doesn't even do back edges.

Equivalencies and other Relationals
--------------------------------------------------

The relationship between ssa names are the primary use of symbols in 
ranges today, but the actual property of relations and equivalencies has 
little to do with ranges.

I propose that we utilize the exact same model as the range operation 
database to track relationships. Both equivalencies and relationals can 
be combined as “==” and “!=” is merely another relation.   Each tree 
code has a query to ask for the relationship between any of its 
operands. Ie:
     y_2 = x_3
     j_4 = y_2 + 6
     If (j_4 > x_3)
Knowing the ranges of j_4 and x_3 don’t really help resolve the 
condition.  If x_3 is varying, or even a non-constant, we know nothing 
at all, at least from a range perspective.

Applying the same calculation model the ranger uses from a relational 
point of view, range-ops can be given a relational interface in which 
each tree code can evaluate the relation between its operands.   A copy 
would return “==” for the relation between the LHS and op1, so we’d have 
the relation y_2 == x_3

Operator plus would look at its operands, and be able to indicate J_4 < 
Y_2 because operand 2 is a positive constant.

The branch is the one we care about, and a query would be made for the 
relation between j_4 and x_3.  By combining the relations that feed it, 
we’d get the j_4 < (y_2 == x_3), and the relational result would be j_4 
< x_3.  When applied to (j_4 > x_3) the result is false.

So the relational query would be able to answer the question without 
ever looking at a range, although if a range is available, it may help 
refine the answer.  The lookup process is identical to the way ranges 
are currently handled, which means the same query infrastructure can be 
leveraged and used independently or in concert with ranges.

This would also benefit from not carrying information around unless it 
is requested/required. Currently all equivalences must be stored in case 
we need to know if there is an equivalency. Just like with ranges, this 
model would have no need to even look at an equivalency unless there is 
an actual need to know.

Clearly there is work to do, but this has a lot of potential as a follow 
up to the range work since it uses the same infrastructure. Any pass 
could cheaply ask about the equivalence/relation between any 2 ssa_names.


Bitmask tracking
------------------------
Not to sound like a broken record, but the exact same process can also 
be applied to bitmasks.
     X_2 = y_1 | 0x01
     If (x_2 == 2)    // can never be true

Bitwise operators, as well as other operators like *, /, shifts, etc can 
calculate bitmasks  in exactly the same way ranges are calculated. I 
also considered adding them as an element of the range class, but that 
would complicate the range class, and I maintain that keeping this all 
independant is better from both a maintainability and correctness point 
of view.

If the bitmask becomes part of the range, then we will have to deal with 
the interactions between the two whenever one changes.. Ie if the range 
is [0,45] and we OR  it with 0xF00  what is the resulting range?   We 
don’t care if the only use if to then check a bit,  but it matters a lot 
if we check to see if its < 44.

If the two are kept separate, we will only calculate the range if we 
care (ie we see (x_2 < 44).  If we see a bit check, then we will only 
look back to see what bits might be set.  I would also add that this 
would give us an easy ability to check for bits that are known 0 as well 
as bits that are known 1.

If both are available, then the combination of the 2 could be applied 
together to answer a query if so desired.

Putting it all together.
------------------------------
All these components of current VRP would now be available, but 
maintained, tested, and available anywhere either independently or 
together in whatever combination desired.  They all utilize the same 
basic query engine, so a unifying pass like VRP can track/query all 3 as 
needed.  But ONLY as needed saving overall computation time

Arbitrarily complex situations like
     If (b_4 > 7)                  // b_4 range [8, MAX]
     X_2 = b_4 & 0x0E    // x_2 has range [8, 14], known 0s’ 0xF1
     Y_4 = x_2 + 3           // y_4 has range [11, 17], known 0’s 0xE0,  
known 1’s 0x01, Rel y_4 < x_2
     Z_5 = y_4                // Rel   z_5 == y_4
     If ((z_5 & 0x01) && z_5 < 20)

Could solve the condition as always being TRUE  with little effort 
because each of the simple building blocks combine to work together.

*blink*.  The less than obvious piece here would be teaching the bitmask 
routine for operator PLUS_EXPR that adding a number with trailing 1’s 
(0x03) to a bitmask with trailing 0;s will fill some of those  known 0’s 
with known 1’s.   Missed opportunities are usually as simple as 
enhancing the evaluation routine for an op-code.   This will then be 
applied everywhere it is encountered as its just a basic property of 
PLUS and how it affects bitmasks.

This aspect of all calculations being driven from the opcode and 
combined generically without special casing at  a higher level is both 
very powerful and less prone to produce errors. Our initial experiences  
involved debugging a lot of ranges because they didn’t look right… but 
it would inevitably turn out that a sequence of statements and 
conditions ended up determining an unexpected range, we just couldn’t 
understand from looking at it how it was arrived at.

Comments and feedback always welcome!
Thanks
Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: On-Demand range technology [5/5] - Looking to the future.
  2019-05-23  1:30 On-Demand range technology [5/5] - Looking to the future Andrew MacLeod
@ 2019-05-23 14:07 ` Richard Biener
  2019-05-24 15:51   ` Andrew MacLeod
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2019-05-23 14:07 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: GCC, Jeff Law, Aldy Hernandez

On Thu, May 23, 2019 at 3:30 AM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> A primary goal of this approach is to try to pull the various aspects of
> VRP apart and make them individually viable so they can be used at
> appropriate places as needed.  The various components of VRP were
> identified as:
>      - Ranges
>      - Relational queries
>      - Equivalencies
>      - Bitmask tracking
>      - Symbolic range endpoint resolution
>
> This prototype implementation tackles only the range component. It makes
> ranges easily accessible from anywhere in the compiler.
>
> I have plans for addressing these components within the same model, but
> maintaining their independence.  This should make maintenance easier,
> less error prone, and ultimately be far more flexible as other passes
> can utilize whichever aspects they need.
>
>
> Symbolic range endpoint resolution
> --------------------------------------------------
>
> I touched on this in the prototype section, and maintain that the only
> requirement for symbols falls under the equivalence and relational tracking.
>      x_2 = y_3 + 5
>      If (x_2 > y_3)
> Ranges themselves in VRP are eventually resolved to a constant for
> export to the global range table.  At this point, whatever range is
> known for the symbolic is substituted into the value_range, and any
> expression resolved to come up with the final non-symbolic range.
>      X_2 = [y_3 + 5, MAX]
> If y_3 evaluates to [20, 30], then x_2 is resolved as [25, MAX].
>
> The ranger does this by default on the fly due to its nature. When the
> range of x_2 is requested the first time, it evaluates y_3 , comes up
> with the same [20, 30] range for y_3, and evaluates it to [25,max]
> immediately.
>
> The facility is there to reevaluate the range if the range of y_3
> changes, but to this point it has not been needed. Typically it involves
> y_3 derived in some way from a back edge, and also being derived by yet
> another ssa-name from a different back edge. So, not super common.
> However, I do plan to get to this eventually to enable those
> re-calculations. For the protype, it has not been deemed critical since
> EVRP doesn't even do back edges.
>
> Equivalencies and other Relationals
> --------------------------------------------------
>
> The relationship between ssa names are the primary use of symbols in
> ranges today, but the actual property of relations and equivalencies has
> little to do with ranges.
>
> I propose that we utilize the exact same model as the range operation
> database to track relationships. Both equivalencies and relationals can
> be combined as “==” and “!=” is merely another relation.   Each tree
> code has a query to ask for the relationship between any of its
> operands. Ie:
>      y_2 = x_3
>      j_4 = y_2 + 6
>      If (j_4 > x_3)
> Knowing the ranges of j_4 and x_3 don’t really help resolve the
> condition.  If x_3 is varying, or even a non-constant, we know nothing
> at all, at least from a range perspective.
>
> Applying the same calculation model the ranger uses from a relational
> point of view, range-ops can be given a relational interface in which
> each tree code can evaluate the relation between its operands.   A copy
> would return “==” for the relation between the LHS and op1, so we’d have
> the relation y_2 == x_3
>
> Operator plus would look at its operands, and be able to indicate J_4 <
> Y_2 because operand 2 is a positive constant.
>
> The branch is the one we care about, and a query would be made for the
> relation between j_4 and x_3.  By combining the relations that feed it,
> we’d get the j_4 < (y_2 == x_3), and the relational result would be j_4
> < x_3.  When applied to (j_4 > x_3) the result is false.
>
> So the relational query would be able to answer the question without
> ever looking at a range, although if a range is available, it may help
> refine the answer.  The lookup process is identical to the way ranges
> are currently handled, which means the same query infrastructure can be
> leveraged and used independently or in concert with ranges.
>
> This would also benefit from not carrying information around unless it
> is requested/required. Currently all equivalences must be stored in case
> we need to know if there is an equivalency. Just like with ranges, this
> model would have no need to even look at an equivalency unless there is
> an actual need to know.
>
> Clearly there is work to do, but this has a lot of potential as a follow
> up to the range work since it uses the same infrastructure. Any pass
> could cheaply ask about the equivalence/relation between any 2 ssa_names.
>
>
> Bitmask tracking
> ------------------------
> Not to sound like a broken record, but the exact same process can also
> be applied to bitmasks.
>      X_2 = y_1 | 0x01
>      If (x_2 == 2)    // can never be true
>
> Bitwise operators, as well as other operators like *, /, shifts, etc can
> calculate bitmasks  in exactly the same way ranges are calculated. I
> also considered adding them as an element of the range class, but that
> would complicate the range class, and I maintain that keeping this all
> independant is better from both a maintainability and correctness point
> of view.
>
> If the bitmask becomes part of the range, then we will have to deal with
> the interactions between the two whenever one changes.. Ie if the range
> is [0,45] and we OR  it with 0xF00  what is the resulting range?   We
> don’t care if the only use if to then check a bit,  but it matters a lot
> if we check to see if its < 44.
>
> If the two are kept separate, we will only calculate the range if we
> care (ie we see (x_2 < 44).  If we see a bit check, then we will only
> look back to see what bits might be set.  I would also add that this
> would give us an easy ability to check for bits that are known 0 as well
> as bits that are known 1.
>
> If both are available, then the combination of the 2 could be applied
> together to answer a query if so desired.
>
> Putting it all together.
> ------------------------------
> All these components of current VRP would now be available, but
> maintained, tested, and available anywhere either independently or
> together in whatever combination desired.  They all utilize the same
> basic query engine, so a unifying pass like VRP can track/query all 3 as
> needed.  But ONLY as needed saving overall computation time
>
> Arbitrarily complex situations like
>      If (b_4 > 7)                  // b_4 range [8, MAX]
>      X_2 = b_4 & 0x0E    // x_2 has range [8, 14], known 0s’ 0xF1
>      Y_4 = x_2 + 3           // y_4 has range [11, 17], known 0’s 0xE0,
> known 1’s 0x01, Rel y_4 < x_2
>      Z_5 = y_4                // Rel   z_5 == y_4
>      If ((z_5 & 0x01) && z_5 < 20)
>
> Could solve the condition as always being TRUE  with little effort
> because each of the simple building blocks combine to work together.
>
> *blink*.  The less than obvious piece here would be teaching the bitmask
> routine for operator PLUS_EXPR that adding a number with trailing 1’s
> (0x03) to a bitmask with trailing 0;s will fill some of those  known 0’s
> with known 1’s.   Missed opportunities are usually as simple as
> enhancing the evaluation routine for an op-code.   This will then be
> applied everywhere it is encountered as its just a basic property of
> PLUS and how it affects bitmasks.
>
> This aspect of all calculations being driven from the opcode and
> combined generically without special casing at  a higher level is both
> very powerful and less prone to produce errors. Our initial experiences
> involved debugging a lot of ranges because they didn’t look right… but
> it would inevitably turn out that a sequence of statements and
> conditions ended up determining an unexpected range, we just couldn’t
> understand from looking at it how it was arrived at.

Sounds like those primitive workers for ranges, relations and bitmasks
should be factored out where they are not already as a prerequesite,
not as some future project?  Note the bitmask ones are already
(just residing in tree-ssa-ccp.c), likewise the wide-int based range
ones are (mostly I think).  Relations are missing unfortunately
and passes like tree-ssa-uninit.c would benefit from such machinery
(or rather contain another copy of such machinery).

Of course factoring always comes with commoning because usually
in GCC you find at least two implementations of everything.

Richard.

> Comments and feedback always welcome!
> Thanks
> Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: On-Demand range technology [5/5] - Looking to the future.
  2019-05-23 14:07 ` Richard Biener
@ 2019-05-24 15:51   ` Andrew MacLeod
  2019-05-27 13:13     ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew MacLeod @ 2019-05-24 15:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC, Jeff Law, Aldy Hernandez

On 5/23/19 10:07 AM, Richard Biener wrote:
> On Thu, May 23, 2019 at 3:30 AM Andrew MacLeod <amacleod@redhat.com> wrote:
>>
>> This aspect of all calculations being driven from the opcode and
>> combined generically without special casing at  a higher level is both
>> very powerful and less prone to produce errors. Our initial experiences
>> involved debugging a lot of ranges because they didn’t look right… but
>> it would inevitably turn out that a sequence of statements and
>> conditions ended up determining an unexpected range, we just couldn’t
>> understand from looking at it how it was arrived at.
> Sounds like those primitive workers for ranges, relations and bitmasks
> should be factored out where they are not already as a prerequesite,
> not as some future project?  Note the bitmask ones are already
> (just residing in tree-ssa-ccp.c), likewise the wide-int based range
> ones are (mostly I think).  Relations are missing unfortunately
> and passes like tree-ssa-uninit.c would benefit from such machinery
> (or rather contain another copy of such machinery).
>
> Of course factoring always comes with commoning because usually
> in GCC you find at least two implementations of everything.
>
> Richard.
>
>
Yes, this would indeed be the case, I also consider the relations 
functionality as required, or to at least demonstrate a viable path 
forward that is being worked on. The bitmask operations are currently 
only used in full VRP (?) but they probably aren't a lot of work, I'd 
put them at a lower priority until VRP is slated to go.

I would really like to see a single implementation for each of these 
things centrally maintained and handled generically.   At this point I 
don't think it was  realistic to proceed any further until we get some 
sort of agreement on a path forward to base it on.  we have found trying 
to maintain a common base with VRP just for ranges to be a fair amount 
of work on the side, so adding other things on top of that seems 
nightmarish :-)


So let me see if I can summarize everything to this point.

1 * The range-ops infrastructure to solve for other operands coupled 
with the windback calculations within a block is desirable to replace 
the assert mechanism.
2 * common rangeops-like primitives for relational and bitmask  also 
seems desirable.
3 * You are not convinced the on demand component can/should replace the 
EVRP dominator walk.

Does that about sum it up?


1 - The way everything is structured, both range-ops and the 
gori-computational component are independent of the on-demand engine. 
There is reason to believe those can be integrated with E/VRP as we are 
sort of doing that under the covers now.
      what function does register_edge_assert() perform?  is it 
producing ranges just for outgoing edges for a block? and then you apply 
whatever range you currently have to that and this gives you a range 
going into the block on the other end of the edge?   If so, that 
functionality maps pretty close to what gori-computes provides.  It has 
a single API which returns whatever range restrictions an edge puts on 
an ssa-name.

there would have to be some tweaking to work with symbolics since we 
hate those and they aren't currently handled.


2 - implementing range-ops like routines for relations and eventually 
bitmasks should not be too difficult. The process is pretty 
straightforward.  I will need to work on the relation tracking/querying, 
but that was pretty much in plan as the next task anyway.   I had pushed 
bitmasks way down the list as "eventually" since I think that is only in 
full VRP and a lower priority until considering VRP for removal.


3 - Give me a few days to mull over some details on how some of this may 
all coexist and I'll get back to you next week.

Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: On-Demand range technology [5/5] - Looking to the future.
  2019-05-24 15:51   ` Andrew MacLeod
@ 2019-05-27 13:13     ` Richard Biener
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Biener @ 2019-05-27 13:13 UTC (permalink / raw)
  To: Andrew MacLeod; +Cc: GCC, Jeff Law, Aldy Hernandez

On Fri, May 24, 2019 at 5:51 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 5/23/19 10:07 AM, Richard Biener wrote:
> > On Thu, May 23, 2019 at 3:30 AM Andrew MacLeod <amacleod@redhat.com> wrote:
> >>
> >> This aspect of all calculations being driven from the opcode and
> >> combined generically without special casing at  a higher level is both
> >> very powerful and less prone to produce errors. Our initial experiences
> >> involved debugging a lot of ranges because they didn’t look right… but
> >> it would inevitably turn out that a sequence of statements and
> >> conditions ended up determining an unexpected range, we just couldn’t
> >> understand from looking at it how it was arrived at.
> > Sounds like those primitive workers for ranges, relations and bitmasks
> > should be factored out where they are not already as a prerequesite,
> > not as some future project?  Note the bitmask ones are already
> > (just residing in tree-ssa-ccp.c), likewise the wide-int based range
> > ones are (mostly I think).  Relations are missing unfortunately
> > and passes like tree-ssa-uninit.c would benefit from such machinery
> > (or rather contain another copy of such machinery).
> >
> > Of course factoring always comes with commoning because usually
> > in GCC you find at least two implementations of everything.
> >
> > Richard.
> >
> >
> Yes, this would indeed be the case, I also consider the relations
> functionality as required, or to at least demonstrate a viable path
> forward that is being worked on. The bitmask operations are currently
> only used in full VRP (?) but they probably aren't a lot of work, I'd
> put them at a lower priority until VRP is slated to go.
>
> I would really like to see a single implementation for each of these
> things centrally maintained and handled generically.   At this point I
> don't think it was  realistic to proceed any further until we get some
> sort of agreement on a path forward to base it on.  we have found trying
> to maintain a common base with VRP just for ranges to be a fair amount
> of work on the side, so adding other things on top of that seems
> nightmarish :-)
>
>
> So let me see if I can summarize everything to this point.
>
> 1 * The range-ops infrastructure to solve for other operands coupled
> with the windback calculations within a block is desirable to replace
> the assert mechanism.
> 2 * common rangeops-like primitives for relational and bitmask  also
> seems desirable.
> 3 * You are not convinced the on demand component can/should replace the
> EVRP dominator walk.
>
> Does that about sum it up?

Yes.  For 1 EVRP already doesn't use asserts and I dislike VRP mostly
because of the asserts.  EVRP doesn't have symbolic ranges so it would
benefit from range-ops as well.

Yes, I'm not convinced that replacing a simple forward walk on the
dominators is to be replaced by using whatever on-demand infrastructure
we have on each and every stmt.  But in the end the difference should be
to just not using the GORI cache but the workers themselves which
should speed up things and remove the caching memory overhead.

The current ranges stuff got heavily penaltized by refactoring it into
C++, splitting over a dozen files and using virtual function calls, so
comparing speed is unfair within the current sturcture.

> 1 - The way everything is structured, both range-ops and the
> gori-computational component are independent of the on-demand engine.
> There is reason to believe those can be integrated with E/VRP as we are
> sort of doing that under the covers now.
>       what function does register_edge_assert() perform?  is it
> producing ranges just for outgoing edges for a block?
> and then you apply
> whatever range you currently have to that and this gives you a range
> going into the block on the other end of the edge?   If so, that
> functionality maps pretty close to what gori-computes provides.  It has
> a single API which returns whatever range restrictions an edge puts on
> an ssa-name.

It's split into two steps but basically doing what you say.  One part
is register_edge_assert_for which does common analysis for
building ASSERT_EXPRs for VRP and building temporary ranges for
EVRP.  Then EVRP has try_find_new_range which for one of the
"asserts" computes the new value-range [on a specific BB entry edge].

>
> there would have to be some tweaking to work with symbolics since we
> hate those and they aren't currently handled.
>
>
> 2 - implementing range-ops like routines for relations and eventually
> bitmasks should not be too difficult. The process is pretty
> straightforward.

As said it already exists for bitmasks, see tree-ssa-ccp.c:bit_value_*

>  I will need to work on the relation tracking/querying,
> but that was pretty much in plan as the next task anyway.   I had pushed
> bitmasks way down the list as "eventually" since I think that is only in
> full VRP and a lower priority until considering VRP for removal.

Note that VRP doesn't really track bitmasks IIRC, it merely uses
those recorded on SSA names to prune ranges.

Richard.

>
> 3 - Give me a few days to mull over some details on how some of this may
> all coexist and I'll get back to you next week.
>
> Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-27 13:13 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-23  1:30 On-Demand range technology [5/5] - Looking to the future Andrew MacLeod
2019-05-23 14:07 ` Richard Biener
2019-05-24 15:51   ` Andrew MacLeod
2019-05-27 13:13     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).