From: Andrew MacLeod <amacleod@redhat.com>
To: GCC <gcc@gcc.gnu.org>
Cc: Jeff Law <law@redhat.com>, Aldy Hernandez <aldyh@redhat.com>
Subject: On-Demand range technology [5/5] - Looking to the future.
Date: Thu, 23 May 2019 01:30:00 -0000 [thread overview]
Message-ID: <47283c00-4da8-31fa-232c-1a127ad95284@redhat.com> (raw)
A primary goal of this approach is to try to pull the various aspects of
VRP apart and make them individually viable so they can be used at
appropriate places as needed. The various components of VRP were
identified as:
   - Ranges
   - Relational queries
   - Equivalencies
   - Bitmask tracking
   - Symbolic range endpoint resolution
This prototype implementation tackles only the range component. It makes
ranges easily accessible from anywhere in the compiler.
I have plans for addressing these components within the same model, but
maintaining their independence. This should make maintenance easier,
less error prone, and ultimately be far more flexible as other passes
can utilize whichever aspects they need.
Symbolic range endpoint resolution
--------------------------------------------------
I touched on this in the prototype section, and maintain that the only
requirement for symbols falls under the equivalence and relational tracking.
   x_2 = y_3 + 5
   If (x_2 > y_3)
Ranges themselves in VRP are eventually resolved to a constant for
export to the global range table. At this point, whatever range is
known for the symbolic is substituted into the value_range, and any
expression resolved to come up with the final non-symbolic range.
   X_2 = [y_3 + 5, MAX]
If y_3 evaluates to [20, 30], then x_2 is resolved as [25, MAX].
The ranger does this by default on the fly due to its nature. When the
range of x_2 is requested the first time, it evaluates y_3 , comes up
with the same [20, 30] range for y_3, and evaluates it to [25,max]
immediately.
The facility is there to reevaluate the range if the range of y_3
changes, but to this point it has not been needed. Typically it involves
y_3 derived in some way from a back edge, and also being derived by yet
another ssa-name from a different back edge. So, not super common.  Â
However, I do plan to get to this eventually to enable those
re-calculations. For the protype, it has not been deemed critical since
EVRP doesn't even do back edges.
Equivalencies and other Relationals
--------------------------------------------------
The relationship between ssa names are the primary use of symbols in
ranges today, but the actual property of relations and equivalencies has
little to do with ranges.
I propose that we utilize the exact same model as the range operation
database to track relationships. Both equivalencies and relationals can
be combined as â==â and â!=â is merely another relation.  Each tree
code has a query to ask for the relationship between any of its
operands. Ie:
   y_2 = x_3
   j_4 = y_2 + 6
   If (j_4 > x_3)
Knowing the ranges of j_4 and x_3 donât really help resolve the
condition. If x_3 is varying, or even a non-constant, we know nothing
at all, at least from a range perspective.
Applying the same calculation model the ranger uses from a relational
point of view, range-ops can be given a relational interface in which
each tree code can evaluate the relation between its operands.  A copy
would return â==â for the relation between the LHS and op1, so weâd have
the relation y_2 == x_3
Operator plus would look at its operands, and be able to indicate J_4 <
Y_2 because operand 2 is a positive constant.
The branch is the one we care about, and a query would be made for the
relation between j_4 and x_3. By combining the relations that feed it,
weâd get the j_4 < (y_2 == x_3), and the relational result would be j_4
< x_3. When applied to (j_4 > x_3) the result is false.
So the relational query would be able to answer the question without
ever looking at a range, although if a range is available, it may help
refine the answer. The lookup process is identical to the way ranges
are currently handled, which means the same query infrastructure can be
leveraged and used independently or in concert with ranges.
This would also benefit from not carrying information around unless it
is requested/required. Currently all equivalences must be stored in case
we need to know if there is an equivalency. Just like with ranges, this
model would have no need to even look at an equivalency unless there is
an actual need to know.
Clearly there is work to do, but this has a lot of potential as a follow
up to the range work since it uses the same infrastructure. Any pass
could cheaply ask about the equivalence/relation between any 2 ssa_names.
Bitmask tracking
------------------------
Not to sound like a broken record, but the exact same process can also
be applied to bitmasks.
   X_2 = y_1 | 0x01
   If (x_2 == 2)   // can never be true
Bitwise operators, as well as other operators like *, /, shifts, etc can
calculate bitmasks in exactly the same way ranges are calculated. I
also considered adding them as an element of the range class, but that
would complicate the range class, and I maintain that keeping this all
independant is better from both a maintainability and correctness point
of view.
If the bitmask becomes part of the range, then we will have to deal with
the interactions between the two whenever one changes.. Ie if the range
is [0,45] and we OR it with 0xF00 what is the resulting range?  We
donât care if the only use if to then check a bit, but it matters a lot
if we check to see if its < 44.
If the two are kept separate, we will only calculate the range if we
care (ie we see (x_2 < 44). If we see a bit check, then we will only
look back to see what bits might be set. I would also add that this
would give us an easy ability to check for bits that are known 0 as well
as bits that are known 1.
If both are available, then the combination of the 2 could be applied
together to answer a query if so desired.
Putting it all together.
------------------------------
All these components of current VRP would now be available, but
maintained, tested, and available anywhere either independently or
together in whatever combination desired. They all utilize the same
basic query engine, so a unifying pass like VRP can track/query all 3 as
needed. But ONLY as needed saving overall computation time
Arbitrarily complex situations like
   If (b_4 > 7)                 // b_4 range [8, MAX]
   X_2 = b_4 & 0x0E   // x_2 has range [8, 14], known 0sâ 0xF1
   Y_4 = x_2 + 3          // y_4 has range [11, 17], known 0âs 0xE0,Â
known 1âs 0x01, Rel y_4 < x_2
   Z_5 = y_4              // Rel  z_5 == y_4
   If ((z_5 & 0x01) && z_5 < 20)
Could solve the condition as always being TRUEÂ with little effort
because each of the simple building blocks combine to work together.
*blink*. The less than obvious piece here would be teaching the bitmask
routine for operator PLUS_EXPR that adding a number with trailing 1âs
(0x03) to a bitmask with trailing 0;s will fill some of those known 0âs
with known 1âs.  Missed opportunities are usually as simple as
enhancing the evaluation routine for an op-code.  This will then be
applied everywhere it is encountered as its just a basic property of
PLUS and how it affects bitmasks.
This aspect of all calculations being driven from the opcode and
combined generically without special casing at a higher level is both
very powerful and less prone to produce errors. Our initial experiencesÂ
involved debugging a lot of ranges because they didnât look right⦠but
it would inevitably turn out that a sequence of statements and
conditions ended up determining an unexpected range, we just couldnât
understand from looking at it how it was arrived at.
Comments and feedback always welcome!
Thanks
Andrew
next reply other threads:[~2019-05-23 1:30 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-23 1:30 Andrew MacLeod [this message]
2019-05-23 14:07 ` Richard Biener
2019-05-24 15:51 ` Andrew MacLeod
2019-05-27 13:13 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=47283c00-4da8-31fa-232c-1a127ad95284@redhat.com \
--to=amacleod@redhat.com \
--cc=aldyh@redhat.com \
--cc=gcc@gcc.gnu.org \
--cc=law@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).