public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* [graphite] Loop tiling
@ 2008-06-09 16:59 Tobias Grosser
  2008-06-09 17:25 ` Sebastian Pop
  0 siblings, 1 reply; 5+ messages in thread
From: Tobias Grosser @ 2008-06-09 16:59 UTC (permalink / raw)
  To: Sebastian Pop
  Cc: Cedric Bastoul, GCC Patches, Albert Cohen, Sebastian Pop,
	Konrad Trifunovic, louis-noel.pouchet, Cédric Bastoul,
	Karthik A, Harle Christophe, Adrien ELICHE

Hi Sebastian, Hi Cedric,

since Thursday I think about how to implement loop tiling in graphite.

Lets start with a simple example:

for (i=0;i<=10;i++) {
  for (j=0;j<=20;j++) {
    S1 ;
  }
}

# eq/in i  j  n  1
    1   1  0  0  0   # i >= 0
    1  -1  0  0  10  # i <= 10
    1   0  1  0  0   # j >= 0
    1   0 -1  0  20  # j <= 20

What I would like to generate is:

for (p1=0;p1<=10;p1+=2) {
  for (p2=0;p2<=20;p2+=2) {
    for (i=p1;i<=min(10,p1+1);i++) {
      for (j=p2;j<=min(20,p2+1);j++) {
        S1 ;
      }
    }
  }
}


So there seem to be different ways to make cloog generate this code:

Solution 1: Change the cloog domain matrix.
===========================================	
	What to do:
	
	- Add two more dimensions.
	- Copy the restrictions of i and j to p1 and p2.
	- Add the new restrictions to connect i and j.

The domain for S1:
     # eq/in p1 p2 i  j  n  1
         1   1  0  0  0  0  0   # p1 >= 0
         1  -1  0  0  0  0  10  # p1 <= 10
         1   0  1  0  0  0  0   # p2 >= 0
         1   0 -1  0  0  0  20  # p2 <= 20
         1  -1  0  1  0  0  0   # i >= p1
         1   1  0 -1  0  0  1   # i <= p1 + 1
         1   0 -1  0  1  0  0   # j >= p2
         1   0  1  0 -1  0  1   # j <= p2 + 1
         1   0  0  1  0  0  0   # i >= 0  
         1   0  0  0  1  0  0   # i <= 10
         1   0  0 -1  0  0  10  # j >= 0
         1   0  0  0 -1  0  20  # j <= 20

What I get:


for (p1=0;p1<=10;p1++) {
  for (p2=0;p2<=20;p2++) {
    for (i=p1;i<=min(10,p1+1);i++) {
      for (j=p2;j<=min(20,p2+1);j++) {
        S1 ;
      }
    }
  }
}

The problem:

This code is not correct, as we iterate over all values of p1, but we
only should iterate over the even values.
I have no idea how to restrict p1 only to even values. But this seems to
be the only way to change "p1++" to "p1+=2".

Solution 2: Use scattering function:
====================================

To create this code I used the cloog scattering functions and the
original unmodified domain:

for (p1=0;p1<=5;p1++) {
  for (p2=0;p2<=10;p2++) {
    for (i=max(2*p1,0);i<=min(10,2*p1+1);i++) {
      for (j=max(2*p2,0);j<=min(20,2*p2+1);j++) {
        S1 ;
      }
    }
  }
}

Scattering:
     # eq/in p1 p2 p3 p4 p5 p6 p7 p8  i  j  n  1
         1    2  0  0  0  0  0  0  0 -1  0  0  1     # 2 * p1 >= i 
         1   -2  0  0  0  0  0  0  0  1  0  0  0     # 2 * p1 <= i + 1
         1    0  2  0  0  0  0  0  0  0 -1  0  1     # 2 * p2 >= j
         1    0 -2  0  0  0  0  0  0  0  1  0  0     # 2 * pr <= j + 1
         0    0  0  1  0  0  0  0  0  0  0  0  0
         0    0  0  0  1  0  0  0  0  0  0  0  0     
         0    0  0  0  0  1  0  0  0  0  0  0  0    
         0    0  0  0  0  0  1  0  0  0  0  0  0     
         0    0  0  0  0  0  0  1  0  0  0  0  0     
         0    0  0  0  0  0  0  0  1  0  0  0  0    

Domain:
# eq/in i  j  n  1
    1   1  0  0  0   # i >= 0
    1  -1  0  0  10  # i <= 10
    1   0  1  0  0   # j >= 0
    1   0 -1  0  20  # j <= 20

This solution has some advantages:

1. The code is correct
2. It is very easy. Just add two lines to the scattering function.

But some problems:

1. It is not very efficient. As we have many multiplications in the
code.
Will later gcc optimization passes convert these multiplications to
additions?

2. I could not find any documentation about scattering functions, that
describe this way of using the scattering functions.

3. At the moment I have to investigate, if we can mix this approach with
the normal scattering to regenerate the original control flow.


Solution 3. Other ideas??
===============================
Has someone another idea, how to implement loop tiling?


Another graphite specific problem is, how we connect the real loop
variables to the cloog variables. Before loop tiling this was easy.
Loops of the same depth in the cloog output correspond to the loops of
the same depth in the original gimple loop nest. But know we have to add
some data structure, that connects the gimple loops, with the cloog
loops.

Thanks 
Tobi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [graphite] Loop tiling
  2008-06-09 16:59 [graphite] Loop tiling Tobias Grosser
@ 2008-06-09 17:25 ` Sebastian Pop
  2008-06-11  5:36   ` Albert Cohen
  2008-06-11  5:54   ` Cédric Bastoul
  0 siblings, 2 replies; 5+ messages in thread
From: Sebastian Pop @ 2008-06-09 17:25 UTC (permalink / raw)
  To: Tobias Grosser
  Cc: Cedric Bastoul, GCC Patches, Albert Cohen, Konrad Trifunovic,
	louis-noel.pouchet, Karthik A, Harle Christophe, Adrien ELICHE

Hi Tobi,

Thanks for working on this.

Solution 2 is the right one.

On Mon, Jun 9, 2008 at 11:58 AM, Tobias Grosser
<grosser@fim.uni-passau.de> wrote:
> 1. It is not very efficient. As we have many multiplications in the
> code.
> Will later gcc optimization passes convert these multiplications to
> additions?
>

Yes, you should not worry about scalar optimizations at all.

> 2. I could not find any documentation about scattering functions, that
> describe this way of using the scattering functions.
>

I think that the best place to look at this is:
page 4 of http://www.lri.fr/~girbal/publi/ics05_facilitating.ps

also have a look at the other reports from:
http://www.lri.fr/~girbal/site_wrapit/

Albert probably has some other pointers for reports that describe how
to do loop tiling.

> Another graphite specific problem is, how we connect the real loop
> variables to the cloog variables. Before loop tiling this was easy.
> Loops of the same depth in the cloog output correspond to the loops of
> the same depth in the original gimple loop nest. But know we have to add
> some data structure, that connects the gimple loops, with the cloog
> loops.
>

This was also the case before: we needed a map between the old
induction variable and the new ones.

Sebastian Pop
--
AMD - GNU Tools

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [graphite] Loop tiling
  2008-06-09 17:25 ` Sebastian Pop
@ 2008-06-11  5:36   ` Albert Cohen
  2008-06-11  5:54   ` Cédric Bastoul
  1 sibling, 0 replies; 5+ messages in thread
From: Albert Cohen @ 2008-06-11  5:36 UTC (permalink / raw)
  To: Sebastian Pop
  Cc: Tobias Grosser, Cedric Bastoul, GCC Patches, Konrad Trifunovic,
	louis-noel.pouchet, Karthik A, Harle Christophe, Adrien ELICHE

Hi all,

Yes, Sebastian is Right for now, and thanks for pointing out the references.

Yet in the longer term, we are working on CLooG extensions to facilitate
tiling with parametric sizes, multidomensional tiling, increased
scalability, no risks of integer overflows, and better quality of
generated code... :-).

The magic is mostly in a couple of papers by Radjopadhye and his
students, at PLDI'07 and SC'07. We are working on extending it to
affine-by-statement scheduling and imperfectly nested loops, and also to
 make it run within one single run of CLooG (see those papers for details).

More info at an upcoming phone meeting, or if Cedric wants to comment a
little more.

Albert

Sebastian Pop wrote:
> Hi Tobi,
> 
> Thanks for working on this.
> 
> Solution 2 is the right one.
> 
> On Mon, Jun 9, 2008 at 11:58 AM, Tobias Grosser
> <grosser@fim.uni-passau.de> wrote:
>> 1. It is not very efficient. As we have many multiplications in the
>> code.
>> Will later gcc optimization passes convert these multiplications to
>> additions?
>>
> 
> Yes, you should not worry about scalar optimizations at all.
> 
>> 2. I could not find any documentation about scattering functions, that
>> describe this way of using the scattering functions.
>>
> 
> I think that the best place to look at this is:
> page 4 of http://www.lri.fr/~girbal/publi/ics05_facilitating.ps
> 
> also have a look at the other reports from:
> http://www.lri.fr/~girbal/site_wrapit/
> 
> Albert probably has some other pointers for reports that describe how
> to do loop tiling.
> 
>> Another graphite specific problem is, how we connect the real loop
>> variables to the cloog variables. Before loop tiling this was easy.
>> Loops of the same depth in the cloog output correspond to the loops of
>> the same depth in the original gimple loop nest. But know we have to add
>> some data structure, that connects the gimple loops, with the cloog
>> loops.
>>
> 
> This was also the case before: we needed a map between the old
> induction variable and the new ones.
> 
> Sebastian Pop
> --
> AMD - GNU Tools

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [graphite] Loop tiling
  2008-06-09 17:25 ` Sebastian Pop
  2008-06-11  5:36   ` Albert Cohen
@ 2008-06-11  5:54   ` Cédric Bastoul
  2008-06-26 23:57     ` Konrad Trifunovic
  1 sibling, 1 reply; 5+ messages in thread
From: Cédric Bastoul @ 2008-06-11  5:54 UTC (permalink / raw)
  To: Sebastian Pop
  Cc: Tobias Grosser, GCC Patches, Albert Cohen, Konrad Trifunovic,
	louis-noel.pouchet, Karthik A, Harle Christophe, Adrien ELICHE

I started to write the following message two days ago but I wished to
send a .cloog which is not easy since I changed my laptop just before
the trip I'm still on ! I need to finish installing all the tools...
Here is the message, just a follow-up of Albert's one.

Ced.

---
Hi Tobi,
Sebastian gave you the right pointers, you should also have a look at
examples in the CLooG test directory called tilingX.cloog. However to
be honest I'm not fond of the second solution you suggested. The
reason is the scattering is supposed to only drive the way the
polyhedra are scanned. Adding new dimensions there is against this
(well, I designed scattering this way -allowing inequalities- because
I wished to do that, but I changed my mind and now I think it's
dirty...). The right solution to me is acting both on the domain and
in the scattering (in the end, something similar to your second
solution).

By the way we are incorporating directly inside CLooG the tiling
scheme described there
http://www.cs.colostate.edu/~ln/publications/pldi07.pdf (a student is
working on this at this moment).
---

On Mon, Jun 9, 2008 at 7:25 PM, Sebastian Pop <sebpop@gmail.com> wrote:
> Hi Tobi,
>
> Thanks for working on this.
>
> Solution 2 is the right one.
>
> On Mon, Jun 9, 2008 at 11:58 AM, Tobias Grosser
> <grosser@fim.uni-passau.de> wrote:
>> 1. It is not very efficient. As we have many multiplications in the
>> code.
>> Will later gcc optimization passes convert these multiplications to
>> additions?
>>
>
> Yes, you should not worry about scalar optimizations at all.
>
>> 2. I could not find any documentation about scattering functions, that
>> describe this way of using the scattering functions.
>>
>
> I think that the best place to look at this is:
> page 4 of http://www.lri.fr/~girbal/publi/ics05_facilitating.ps
>
> also have a look at the other reports from:
> http://www.lri.fr/~girbal/site_wrapit/
>
> Albert probably has some other pointers for reports that describe how
> to do loop tiling.
>
>> Another graphite specific problem is, how we connect the real loop
>> variables to the cloog variables. Before loop tiling this was easy.
>> Loops of the same depth in the cloog output correspond to the loops of
>> the same depth in the original gimple loop nest. But know we have to add
>> some data structure, that connects the gimple loops, with the cloog
>> loops.
>>
>
> This was also the case before: we needed a map between the old
> induction variable and the new ones.
>
> Sebastian Pop
> --
> AMD - GNU Tools
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [graphite] Loop tiling
  2008-06-11  5:54   ` Cédric Bastoul
@ 2008-06-26 23:57     ` Konrad Trifunovic
  0 siblings, 0 replies; 5+ messages in thread
From: Konrad Trifunovic @ 2008-06-26 23:57 UTC (permalink / raw)
  To: Cédric Bastoul
  Cc: Sebastian Pop, Tobias Grosser, GCC, Albert Cohen,
	louis-noel.pouchet, Karthik A, Sjodin, Jan, Adrien ELICHE

Hi,

some short note on coupling loop tiling and gloog code generator.
I will need to extend mapping of old induction variables <-> new
induction variables. Until now I assume preserved one-to-one
relationship. But, this does not allow many of the loop
transformations (like loop inversion). I will implement an
mechanism to express old induction variable as the function (gimple
expression) of new induction variables. Ideally it should
handle functional relationship of any complexity.
I also think this is necessary to enable code generation for tiled loops.
Let me work on this ;)

regards,
Konrad


2008/6/11 Cédric Bastoul <cedric.bastoul@inria.fr>:
> I started to write the following message two days ago but I wished to
> send a .cloog which is not easy since I changed my laptop just before
> the trip I'm still on ! I need to finish installing all the tools...
> Here is the message, just a follow-up of Albert's one.
>
> Ced.
>
> ---
> Hi Tobi,
> Sebastian gave you the right pointers, you should also have a look at
> examples in the CLooG test directory called tilingX.cloog. However to
> be honest I'm not fond of the second solution you suggested. The
> reason is the scattering is supposed to only drive the way the
> polyhedra are scanned. Adding new dimensions there is against this
> (well, I designed scattering this way -allowing inequalities- because
> I wished to do that, but I changed my mind and now I think it's
> dirty...). The right solution to me is acting both on the domain and
> in the scattering (in the end, something similar to your second
> solution).
>
> By the way we are incorporating directly inside CLooG the tiling
> scheme described there
> http://www.cs.colostate.edu/~ln/publications/pldi07.pdf (a student is
> working on this at this moment).
> ---
>
> On Mon, Jun 9, 2008 at 7:25 PM, Sebastian Pop <sebpop@gmail.com> wrote:
>> Hi Tobi,
>>
>> Thanks for working on this.
>>
>> Solution 2 is the right one.
>>
>> On Mon, Jun 9, 2008 at 11:58 AM, Tobias Grosser
>> <grosser@fim.uni-passau.de> wrote:
>>> 1. It is not very efficient. As we have many multiplications in the
>>> code.
>>> Will later gcc optimization passes convert these multiplications to
>>> additions?
>>>
>>
>> Yes, you should not worry about scalar optimizations at all.
>>
>>> 2. I could not find any documentation about scattering functions, that
>>> describe this way of using the scattering functions.
>>>
>>
>> I think that the best place to look at this is:
>> page 4 of http://www.lri.fr/~girbal/publi/ics05_facilitating.ps
>>
>> also have a look at the other reports from:
>> http://www.lri.fr/~girbal/site_wrapit/
>>
>> Albert probably has some other pointers for reports that describe how
>> to do loop tiling.
>>
>>> Another graphite specific problem is, how we connect the real loop
>>> variables to the cloog variables. Before loop tiling this was easy.
>>> Loops of the same depth in the cloog output correspond to the loops of
>>> the same depth in the original gimple loop nest. But know we have to add
>>> some data structure, that connects the gimple loops, with the cloog
>>> loops.
>>>
>>
>> This was also the case before: we needed a map between the old
>> induction variable and the new ones.
>>
>> Sebastian Pop
>> --
>> AMD - GNU Tools
>>
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-06-26 23:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-06-09 16:59 [graphite] Loop tiling Tobias Grosser
2008-06-09 17:25 ` Sebastian Pop
2008-06-11  5:36   ` Albert Cohen
2008-06-11  5:54   ` Cédric Bastoul
2008-06-26 23:57     ` Konrad Trifunovic

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).