The origin of scalar evolutions?

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* The origin of scalar evolutions?
@ 2005-07-04 16:55 Robert van Engelen
  2005-07-05  3:36 ` Daniel Berlin
  2005-07-13  6:23 ` Sebastian Pop
  0 siblings, 2 replies; 4+ messages in thread
From: Robert van Engelen @ 2005-07-04 16:55 UTC (permalink / raw)
  To: gcc

Hi,

I am interested in the recent work in gcc 4.0 with respect to "scalar 
evolutions". The students in my compiler laboratory studied the 
algorithm implemented in gcc 4.0. We are considering extending this 
work based on our experience [4] building a similar framework for 
symbolic analysis with chains of recurrences for SUIF and Polaris 
(since we introduced this approach in 2001 [1]).

However, after reading the paper on "Fast Recognition of Scalar 
Evolutions on Three-Address SSA Code", we have a couple of questions.

1. After reading the paper, we concluded that the scalar evolutions are 
actually a restricted polynomial form of chains of recurrences by 
Bachmann and Zima [6,8]. Is this true? Or is there an essential 
difference with multi-variate chains of recurrences [7]? Does the 
"scalar evolutions" name suggest something else beyond the recurrence 
forms. Are we missing a crucial point?

2. What is the difference between the SSA-based algorithm described in 
the paper and the G-SSA form proposed in 1995 by Tu and Padua [9]. The 
G-SSA algorithms compute recurrence relations from which chains of 
recurrences can be easily constructed using the construction algorithm 
presented in [1].

3. Do you compute recurrence forms for pointer variables? If so, what 
is the difference compared to the method described in our 2001 paper 
[2]?

4. The "peeled" forms are described as "wrap around" forms in the paper 
but do not seem to handle true wrap around variable. How do you handle 
wrap around variables, i.e. those that have the form:
k = 9;
for (i=0; i < 10; i++)
{ a[k] = ...
   k = i;
}
Can you please explain?

5. Do you plan on implementing dependence testing and range analysis on 
chains of recurrences as described in [3,4,5]?

With respect to our first question, we felt that the use of a new name 
for an existing technique is confusing. For example, by calling a 
"matrix" from linear algebra a "scalar square" we do not change its 
semantics. The (algebraic) operations on matrices are still the same. 
So are the operations on scalar evolutions the same as those on chains 
of recurrences. Janson's classic "History of Art" has apt advice: "It 
has always been easier to invent new labels than to create a movement 
in art that truly deserves a new name."

References

[1]	Robert A. van Engelen, "Efficient Symbolic Analysis for Optimizing 
Compilers", in proceedings of the International Conference on Compiler 
Construction, ETAPS 2001, LNCS 2027, pages 118-132

[2]	Robert A. van Engelen and Kyle A. Gallivan, "An Efficient Algorithm 
for Pointer-to-Array Access Conversion for Compiling and Optimizing DSP 
Applications", in proceedings of the 2001 International Workshop on 
Innovative Architectures for Future Generation High-Performance 
Processors and Systems (IWIA 2001), 18-19 January 2001, Maui, Hawaii, 
pages 80-89.

[3]	Robert van Engelen, Johnnie Birch, and Kyle Gallivan, "Array Data 
Dependence Testing with the Chains of Recurrences Algebra", in the 
proceedings of the IEEE International Workshop on Innovative 
Architectures for Future Generation High-Performance Processors and 
Systems (IWIA), January 2004, pages 70-81.

[4]	Robert van Engelen, Johnnie Birch, Yixin Shou, Burt Walsh, and Kyle 
Gallivan, "A Unified Framework for Nonlinear Dependence Testing and 
Symbolic Analysis", in the proceedings of the ACM International 
Conference on Supercomputing (ICS), 2004, pages 106-115.

[5]	Johnnie Birch, Robert van Engelen, and Kyle Gallivan, "Value Range 
Analysis of Conditionally Updated Variables and Pointers", in the 
proceedings of Compilers for Parallel Computing (CPC), 2004, pages 
265-276.

[6]	O. Bachmann, P.S. Wang and E.V. Zima, "Chains of Recurrences - a 
Method to Expedite the Evaluation of Closed-Form Functions",in 
proceedings of the International Symposium on Symbolic and Algebraic 
Computing (ISSAC), 1994, pages 242-249.

[7]	O. Bachmann, "Chains of Recurrences for Functions of Two Variables 
and Their Application to Surface Plotting", in "Human Interaction for 
Symbolic Computation, 1996.

[8]	E.V. Zima, "On computational properties of chains of recurrences",
in proceedings of the 2001 International Symposium on Symbolic and 
Algebraic Computation, pages 345.

[9]	P. Tu and D. Padua, "Gated SSA-Based Demand-Driven Symbolic 
Analysis for Parallelizing Compilers", in proceedings of the ACM 
International Conference on Supercomputing (ICS), 1995, pages 414-423.

- Robert van Engelen

Robert van Engelen: Associate Professor, Computer Science Department
Florida State University, 162 J. Love Bldg., Tallahassee, FL32306-4530
Offices: 162LOV/471DSL, (850)644-9661/645-0309, Fax: (850)644-0058
Email: engelen@cs.fsu.edu, URL: http://www.cs.fsu.edu/~engelen

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: The origin of scalar evolutions?
  2005-07-04 16:55 The origin of scalar evolutions? Robert van Engelen
@ 2005-07-05  3:36 ` Daniel Berlin
  2005-07-13  6:23 ` Sebastian Pop
  1 sibling, 0 replies; 4+ messages in thread
From: Daniel Berlin @ 2005-07-05  3:36 UTC (permalink / raw)
  To: Sebastian Pop, Robert van Engelen; +Cc: gcc

On Mon, 2005-07-04 at 12:31 -0400, Robert van Engelen wrote:
> Hi,
> 
> I am interested in the recent work in gcc 4.0 with respect to "scalar 
> evolutions". The students in my compiler laboratory studied the 
> algorithm implemented in gcc 4.0. We are considering extending this 
> work based on our experience [4] building a similar framework for 
> symbolic analysis with chains of recurrences for SUIF and Polaris 
> (since we introduced this approach in 2001 [1]).

(I've snipped the paper references because i've read all these papers in
the past, and i'm pretty sure Sebastian has as well :P)
That would be great!  We'd be glad to work with you to extend this and
contribute the work to gcc. (We being Sebastian and I.  Together we are
responsible for maintenance of the chains of recurrence/scalar
evolutions code and the data dependence analysis)

I should note that we have significantly advanced data dependence work
going on based on chrecs on the autovect-branch of gcc cvs (it's just a
convenient place for the work, most of the autovectorization
improvements from that code tree have already been put into 4.1)

In addition, you probably want to look at the current code, as it is
changing as we speak as the final "non-bugfixing" phase of gcc 4.1
closes in the next few days.

Things like range analysis using the results of the analysis are also in
4.1.

You can get a snapshot of the 4.1 code from
ftp://gcc.gnu.org/pub/gcc/snapshots

This should include the Value Range Propagation that uses the results,
but won't yet  include the access function information for recurrences.

> 
> However, after reading the paper on "Fast Recognition of Scalar 
> Evolutions on Three-Address SSA Code", we have a couple of questions.
> 

> 1. After reading the paper, we concluded that the scalar evolutions are 
> actually a restricted polynomial form of chains of recurrences by 
> Bachmann and Zima [6,8]. Is this true?

Yes.

Or rather, the scalar evolution code uses chains of recurrences as an
internal representation.
See tree-chrec.c

"
  This file implements operations on chains of recurrences.  Chains
  of recurrences are used for modeling evolution functions of scalar
  variables.
"

>  Or is there an essential 
> difference with multi-variate chains of recurrences [7]? 
I don't believe so, but you'll have to ask Sebastian (who i have
copied).

> Does the 
> "scalar evolutions" name suggest something else beyond the recurrence 
> forms. 

Not really, they are evolution functions for scalar variables,
represented used chains of recurrences.

> Are we missing a crucial point?

Dunno.  I should note that that we properly give credit where it is due:

"   The notation used in this file is called "chains of recurrences",
   and has been proposed by Eugene Zima, Robert Van Engelen, and
   others for describing induction variables in programs. "

(from tree-scalar-evolution.c)
:)

Sebastian devised the notation, and nobody argued with him.  
> 
> 2. What is the difference between the SSA-based algorithm described in 
> the paper and the G-SSA form proposed in 1995 by Tu and Padua [9]. The 
> G-SSA algorithms compute recurrence relations from which chains of 
> recurrences can be easily constructed using the construction algorithm 
> presented in [1].

Sebastian can cover this.
> 
> 3. Do you compute recurrence forms for pointer variables? If so, what 
> is the difference compared to the method described in our 2001 paper 
> [2]?

Yes to the first question,   I don't believe anyone implemented the
method discussed in your paper.  One thing to note is that we have no
direct array reference operation that operates on pointers at the
moment.  So they really are just treated as pointers, and  all data
dependence on pointers is very simple (mainly because we have not
implemented the symbolic differencing necessary).

You can see this basic work (originally from the autovectorization
branch of gcc cvs) in patch at 
http://gcc.gnu.org/ml/gcc-patches/2005-06/msg02057.html

(and the followup patch)
This work was only recently submitted to gcc 4.1, and i am about to
approve it for gcc 4.1, so it should appear in the gcc 4.1 snapshots in
a few weeks or so.

> 
> 4. The "peeled" forms are described as "wrap around" forms in the paper 
> but do not seem to handle true wrap around variable. How do you handle 
> wrap around variables, i.e. those that have the form:
> k = 9;
> for (i=0; i < 10; i++)
> { a[k] = ...
>    k = i;
> }
> Can you please explain?

We used to handle true wraparounds, but don't anymore.  This work can be
found in a branch of the gcc source tree if you would like to look at
it. It was too expensive, and didn't seem to buy us anything over what
we handle now for real code.  If there comes a time this isn't true,
we'll probably bring it back (IE we are eminently pragmatic about this
stuff.  Or  at least, Sebastian and I am)

> 
> 5. Do you plan on implementing dependence testing and range analysis on 
> chains of recurrences as described in [3,4,5]?

We have simple ZIV, SIV, and MIV dependence testing based on this work.
See tree-data-ref.c in 4.0.
This is used by, e.g., the linear loop transform code (in a simple way)
in tree-loop-linear, and will be used in other places as well as more
high level loop transforms are implemented

On the autovect branch, we have improved begun work on improved data
dependence testing including simple improvements to the current tests,
the omega test, and even more general polyhedron dependence testing.

I've submitted work to enable direct array reference operations on
pointers when the user wrote them way (and in the future, i will
probably implement pointer->array recovery).

Range analysis using SCEV info is used in the value range propagation
pass of gcc 4.1.

I should note that one thing that most papers i've read on chrecs don't
cover but end up very important in practice are casts between signed and
unsigned, overflow behavior, etc.

--Dan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: The origin of scalar evolutions?
  2005-07-04 16:55 The origin of scalar evolutions? Robert van Engelen
  2005-07-05  3:36 ` Daniel Berlin
@ 2005-07-13  6:23 ` Sebastian Pop
  2005-07-15 14:59   ` Robert van Engelen
  1 sibling, 1 reply; 4+ messages in thread
From: Sebastian Pop @ 2005-07-13  6:23 UTC (permalink / raw)
  To: Robert van Engelen; +Cc: gcc

Robert van Engelen wrote:
> 
> 1. After reading the paper, we concluded that the scalar evolutions are 
> actually a restricted polynomial form of chains of recurrences by 
> Bachmann and Zima [6,8]. Is this true? Or is there an essential 
> difference with multi-variate chains of recurrences [7]? Does the 
> "scalar evolutions" name suggest something else beyond the recurrence 
> forms. Are we missing a crucial point?
> 

a = {1, +, a}
b = {3, +, [1, 2]}
c = {-, +, +}
d = {abstract_value, +, abstract_value}

are not chains of recurrences.  More generally chains of recurrences
use integer or real coefficients, whereas coefficients of scalar
evolutions are over any abstract domain.

a is a mixer (exponential), b is an envelope (interval coefficients),
c is a monotonic evolution (coefficients in {+, -}).  The name of
scalar evolutions is not that far from the name "monotonic evolution"
used by Peng Wu, and it seemed appropriate to be used for the general
concept (you're free to disagree with my choice for using this name).

> With respect to our first question, we felt that the use of a new name 
> for an existing technique is confusing. For example, by calling a 
> "matrix" from linear algebra a "scalar square" we do not change its 
> semantics. The (algebraic) operations on matrices are still the same. 
> So are the operations on scalar evolutions the same as those on chains 
> of recurrences. Janson's classic "History of Art" has apt advice: "It 
> has always been easier to invent new labels than to create a movement 
> in art that truly deserves a new name."

Do you know why in natural languages we have so many synonyms?  Well,
because otherwise poets would compose very poor texts.  Poets like to
play with the phonetics of the words because they can express their
moods in the phonemes.  Restricting the number of words that point to
the same (or close) semantics would restrict the ability of writers.
Anyway, this is way off topic for this mailing list.

> 2. What is the difference between the SSA-based algorithm described in 
> the paper and the G-SSA form proposed in 1995 by Tu and Padua [9]. 

Not the same.  You're rebuilding a higher level tree from the
gimple-ssa form, but then you use abstract domains for representing
some of the difficult evolutions.  Analyzers are like poetry: there
always will be room for something new because they are not comparable;
they just fill a missing topic.

> 
> 3. Do you compute recurrence forms for pointer variables? If so, what 
> is the difference compared to the method described in our 2001 paper 
> [2]?

Daniel has already replied to this one and many other questions
(thanks Daniel).

> 
> 4. The "peeled" forms are described as "wrap around" forms in the paper 
> but do not seem to handle true wrap around variable. How do you handle 
> wrap around variables, i.e. those that have the form:
> k = 9;
> for (i=0; i < 10; i++)
> { a[k] = ...
>   k = i;
> }
> Can you please explain?
> 

In SSA form the previous program is written:

loop_1
  c = phi (0, d)
  d = c + 1
  e = phi (9, c)
  a[e] = ...
endloop_1

so you have the following evolutions:

c = {0, +, 1}_1
d = {1, +, 1}_1
e = (9, {0, +, 1}_1)_1

e is a wrap around, whose first value when entering in the loop is 9,
followed in the next iterations by 0, 1, 2, ...

As Daniel has pointed out, this extension sleeps in the lno-branch.
We will consider this extension useful for mainline only the day when
some optimizer will show a net improvement that justifies the extra
burden of maintaining the code.  

I also have decided to restrict the polynomials to a degree less or
equal than 2 (affine evolutions) because all the other constructs are
just pure nonsense, and not used by any optimizer or other analyzer.
It's too bad that I have not restricted the analyzer earlier based on
the suggestions from Zdenek Dvorak.

Sebastian

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: The origin of scalar evolutions?
  2005-07-13  6:23 ` Sebastian Pop
@ 2005-07-15 14:59   ` Robert van Engelen
  0 siblings, 0 replies; 4+ messages in thread
From: Robert van Engelen @ 2005-07-15 14:59 UTC (permalink / raw)
  To: Daniel Berlin, Sebastian Pop; +Cc: gcc

Hi Sebastian,

Thank you for your quick reply. I have a few more comments though, and 
I hope you can take a look at these.

>> 1. After reading the paper, we concluded that the scalar evolutions 
>> are
>> actually a restricted polynomial form of chains of recurrences by
>> Bachmann and Zima [6,8]. Is this true? Or is there an essential
>> difference with multi-variate chains of recurrences [7]? Does the
>> "scalar evolutions" name suggest something else beyond the recurrence
>> forms. Are we missing a crucial point?
>>
>
> a = {1, +, a}
> b = {3, +, [1, 2]}
> c = {-, +, +}
> d = {abstract_value, +, abstract_value}
>
> are not chains of recurrences.  More generally chains of recurrences
> use integer or real coefficients, whereas coefficients of scalar
> evolutions are over any abstract domain.

I disagree. The chains of recurrences algebra is applicable to any 
semi-ring, as long as the objects adhere to the properties, based on 
the fact that chains of recurrences are strongly related to finite 
differencing, which are known to work for functions over commutative 
rings. You could use the algebra on matrices for example. There has 
been a lot of similar work on differencing techniques for various 
domains. Wouldn't it be prudent to prove you have a commutative 
(semi-)ring when dealing with abstract values? In our chains of 
recurrences approach we impose restrictions on symbolic coefficients to 
ensure the validity of the representation. For example, {i,+,1}_i is 
not valid.

> a is a mixer (exponential), b is an envelope (interval coefficients),
> c is a monotonic evolution (coefficients in {+, -}).  The name of
> scalar evolutions is not that far from the name "monotonic evolution"
> used by Peng Wu, and it seemed appropriate to be used for the general
> concept (you're free to disagree with my choice for using this name).

The fact remains that scalar evolutions are the same as chains of 
recurrences based on my statement above.

>> 2. What is the difference between the SSA-based algorithm described in
>> the paper and the G-SSA form proposed in 1995 by Tu and Padua [9].
>
> Not the same.  You're rebuilding a higher level tree from the
> gimple-ssa form, but then you use abstract domains for representing
> some of the difficult evolutions.  Analyzers are like poetry: there
> always will be room for something new because they are not comparable;
> they just fill a missing topic.

But that is my whole point: if you have a (slightly) different code 
representation (with the same semantics), I can think of another 
(slightly) different code representation and modify an existing 
algorithm to do the same job on the new representation. The basic 
principles of the approach won't change though.

The show that there is no inherent difference in the approach, consider 
the IV problem:

k = 0
DO
   k = k + 1
   j = k
   k = j + 1

Using our algorithm published in 2001 [1,2] by analyzing the code 
bottom-up we get:

k = 0
DO
   k = k + 1     => (step 3)  k = k + 2 => {0,+,2}
   j = k         => (step 2)  k = k + 1
   k = j + 1     => (step 1)  k = j + 1

Note that we don't need to substitute the variables to obtain the CR, 
following links is sufficient. It is just more convenient to use 
replacement to visualize the algorithm.

With an SSA form, we get (again using the same algorithm):

k1 = 0
DO
   k2 = phi(k1, k4) => (step 4)  k4 = phi(0, k4) + 2 => {0,+,2}
   k3 = k2 + 1      => (step 3)  k4 = k2 + 2
   j1 = k3          => (step 2)  k4 = k3 + 1
   k4 = j1 + 1      => (step 1)  k4 = j1 + 1

This algorithm is applicable to affine, polynomial, and exponential 
series. Maybe I am losing my mind here, but I don't see the difference 
when comparing the scalar evolutions approach on SSA forms to the 
earlier CR approach.

> I also have decided to restrict the polynomials to a degree less or
> equal than 2 (affine evolutions) because all the other constructs are
> just pure nonsense, and not used by any optimizer or other analyzer.
> It's too bad that I have not restricted the analyzer earlier based on
> the suggestions from Zdenek Dvorak.

If you restrict the degree to affine, you recreate the induction 
variable representations discussed in the Red Dragon book (offset + 
stride) modulo the use of intervals, of course. But there are codes 
that use quadratic forms. It is true that auto-vectorization wouldn't 
be easy to achieve with non-affine forms thereby rendering the use of 
higher-order forms useless.

- Robert van Engelen

Robert van Engelen: Associate Professor, Computer Science Department
Florida State University, 162 J. Love Bldg., Tallahassee, FL32306-4530
Offices: 162LOV/471DSL, (850)644-9661/645-0309, Fax: (850)644-0058
Email: engelen@cs.fsu.edu, URL: http://www.cs.fsu.edu/~engelen

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2005-07-15 14:59 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-07-04 16:55 The origin of scalar evolutions? Robert van Engelen
2005-07-05  3:36 ` Daniel Berlin
2005-07-13  6:23 ` Sebastian Pop
2005-07-15 14:59   ` Robert van Engelen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).