public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Implementing an algorithm in place of gomp  'auto'
@ 2019-03-01 14:40 김규래
  2019-03-01 17:32 ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: 김규래 @ 2019-03-01 14:40 UTC (permalink / raw)
  To: gcc Mailing List

Hello everyone, 
I'm an EE student at Sogang University Korea.
I have recently submitted a paper on parallel loop scheduling algorithm and had to modify libgomp a bit in the process.
It is known that the...
 
 

/* For now map to schedule(static), later on we could play with feedback driven choice. */


... comment has been in place for quite a while.
I'm curious if experimentally implementing a non-OMP-standard algorithm for the auto policy would be interesting.
Specifically, one of the Tapering Self-Scheduling, Bold Self-Scheduling, Factoring Self-Scheduling [1] algorithms.
If that is the case, I would like to propose this project for GSoC 2019.
 
Great regards from Korea,
Ray Kim
[1] "OpenMP Loop Scheduling Revisited: Making a Case for More Schedules", IWOMP 2018, https://arxiv.org/abs/1809.03188

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Implementing an algorithm in place of gomp  'auto'
  2019-03-01 14:40 Implementing an algorithm in place of gomp 'auto' 김규래
@ 2019-03-01 17:32 ` Jakub Jelinek
  2019-03-01 17:40   ` 김규래
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-03-01 17:32 UTC (permalink / raw)
  To: 김규래; +Cc: gcc Mailing List

On Fri, Mar 01, 2019 at 11:40:50PM +0900, 김규래 wrote:
> Hello everyone, 
> I'm an EE student at Sogang University Korea.
> I have recently submitted a paper on parallel loop scheduling algorithm and had to modify libgomp a bit in the process.
> It is known that the...
>  
>  
> 
> /* For now map to schedule(static), later on we could play with feedback driven choice. */
> 
> 
> ... comment has been in place for quite a while.
> I'm curious if experimentally implementing a non-OMP-standard algorithm for the auto policy would be interesting.
> Specifically, one of the Tapering Self-Scheduling, Bold Self-Scheduling, Factoring Self-Scheduling [1] algorithms.
> If that is the case, I would like to propose this project for GSoC 2019.

For auto yes, if it makes good results, though it would be nice to use some
smarts on the compiler side to decide if the loop is a good candidate for
such scheduling, static scheduling has the major advantage that it doesn't
need to enter the runtime library (except for querying number of threads and
thread number, but those can be cached from earlier and barrier at the end
if not nowait) and for many common loops where the work in each iteration is
really constant it also gives best results.

Especially for the case when schedule clause is not present at all, we'd
need to do a very good job at the compiler side to use some dynamic
algorithm when it is likely it will be beneficial.  Unfortunately at least
right now the lowering to either static scheduling or a library call is done
quite early, before inlining etc., and for best decisions we'd need to
either defer it to far later, or be able to convert a library call back to
static scheduling.

Another option is to add further schedules as extensions (say starting with
gnu_ prefix or similar).

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Implementing an algorithm in place of gomp 'auto'
  2019-03-01 17:32 ` Jakub Jelinek
@ 2019-03-01 17:40   ` 김규래
  2019-03-01 17:46     ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: 김규래 @ 2019-03-01 17:40 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc Mailing List

Nice to meet you Jacob.
 
> Another option is to add further schedules as extensions (say starting with
> gnu_ prefix or similar). 
 
In this case, I believe that modifying the frontend would be necessary?
Last time I looked, it seemed that adding a new scheduling keyword would require quite some work.
Also, out of curiosity, is there any plan to add work stealing (affinity schedules) to gomp?
The clang implementation seem have work stealing.
 
Ray Kim 
 
-----Original Message-----
From: "Jakub Jelinek"<jakub@redhat.com>
To: "김규래"<msca8h@naver.com>;
Cc: "gcc Mailing List"<gcc@gcc.gnu.org>;
Sent: 2019-03-02 (토) 02:32:29 (GMT+09:00)
Subject: Re: Implementing an algorithm in place of gomp 'auto'
 
On Fri, Mar 01, 2019 at 11:40:50PM +0900, 김규래 wrote:
> Hello everyone,
> I'm an EE student at Sogang University Korea.
> I have recently submitted a paper on parallel loop scheduling algorithm and had to modify libgomp a bit in the process.
> It is known that the...
>  
>  
>
> /* For now map to schedule(static), later on we could play with feedback driven choice. */
>
>
> ... comment has been in place for quite a while.
> I'm curious if experimentally implementing a non-OMP-standard algorithm for the auto policy would be interesting.
> Specifically, one of the Tapering Self-Scheduling, Bold Self-Scheduling, Factoring Self-Scheduling [1] algorithms.
> If that is the case, I would like to propose this project for GSoC 2019.

For auto yes, if it makes good results, though it would be nice to use some
smarts on the compiler side to decide if the loop is a good candidate for
such scheduling, static scheduling has the major advantage that it doesn't
need to enter the runtime library (except for querying number of threads and
thread number, but those can be cached from earlier and barrier at the end
if not nowait) and for many common loops where the work in each iteration is
really constant it also gives best results.

Especially for the case when schedule clause is not present at all, we'd
need to do a very good job at the compiler side to use some dynamic
algorithm when it is likely it will be beneficial.  Unfortunately at least
right now the lowering to either static scheduling or a library call is done
quite early, before inlining etc., and for best decisions we'd need to
either defer it to far later, or be able to convert a library call back to
static scheduling.

Another option is to add further schedules as extensions (say starting with
gnu_ prefix or similar).

Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Implementing an algorithm in place of gomp 'auto'
  2019-03-01 17:40   ` 김규래
@ 2019-03-01 17:46     ` Jakub Jelinek
  2019-03-02 13:05       ` 김규래
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-03-01 17:46 UTC (permalink / raw)
  To: 김규래; +Cc: gcc Mailing List

On Sat, Mar 02, 2019 at 02:40:35AM +0900, 김규래 wrote:
> Nice to meet you Jacob.
>  
> > Another option is to add further schedules as extensions (say starting with
> > gnu_ prefix or similar). 
>  
> In this case, I believe that modifying the frontend would be necessary?

Yes.

> Last time I looked, it seemed that adding a new scheduling keyword would require quite some work.

Not that much.

> Also, out of curiosity, is there any plan to add work stealing (affinity schedules) to gomp?

It is on the wish list, but I'm afraid I won't have cycles for it in the
next year or two at least (once GCC 9 is released, I need to work on the
remaining OpenMP 5.0 features).  Of course if somebody implements it and submits
and it is of reasonable quality/performance, it will be accepted.

> The clang implementation seem have work stealing.

clang doesn't have its own runtime library, you mean the Intel OpenMP
library, right?

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Implementing an algorithm in place of gomp 'auto'
  2019-03-01 17:46     ` Jakub Jelinek
@ 2019-03-02 13:05       ` 김규래
  2019-03-02 16:48         ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: 김규래 @ 2019-03-02 13:05 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc Mailing List

> > The clang implementation seem have work stealing.

> clang doesn't have its own runtime library, you mean the Intel OpenMP
> library, right?
 
I guess so.
 
> > Also, out of curiosity, is there any plan to add work stealing (affinity schedules) to gomp? 
 
> It is on the wish list, but I'm afraid I won't have cycles for it in the
> next year or two at least (once GCC 9 is released, I need to work on the
> remaining OpenMP 5.0 features).  Of course if somebody implements it and submits
> and it is of reasonable quality/performance, it will be accepted. 
 
Implementing work stealing (WS) also sounds interesting to me.
Do you have any plan of how it should look like?
For static scheduling, I don't quite see how WS could be implemented since the control doesn't enter the OMP runtime.
Lastly, do you think the subjects we are discussing (Additional scheduling algorithms, doing something about auto, WS etc..) 
could make the cut for a GSoC 2019 project?
 
Ray Kim
 
-----Original Message-----
From: "Jakub Jelinek"<jakub@redhat.com>
To: "김규래"<msca8h@naver.com>;
Cc: "gcc Mailing List"<gcc@gcc.gnu.org>;
Sent: 2019-03-02 (토) 02:46:14 (GMT+09:00)
Subject: Re: Implementing an algorithm in place of gomp 'auto'
 
On Sat, Mar 02, 2019 at 02:40:35AM +0900, 김규래 wrote:
> Nice to meet you Jacob.
>  
> > Another option is to add further schedules as extensions (say starting with
> > gnu_ prefix or similar).
>  
> In this case, I believe that modifying the frontend would be necessary?

Yes.

> Last time I looked, it seemed that adding a new scheduling keyword would require quite some work.

Not that much.

> Also, out of curiosity, is there any plan to add work stealing (affinity schedules) to gomp?

It is on the wish list, but I'm afraid I won't have cycles for it in the
next year or two at least (once GCC 9 is released, I need to work on the
remaining OpenMP 5.0 features).  Of course if somebody implements it and submits
and it is of reasonable quality/performance, it will be accepted.

> The clang implementation seem have work stealing.

clang doesn't have its own runtime library, you mean the Intel OpenMP
library, right?

Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Implementing an algorithm in place of gomp 'auto'
  2019-03-02 13:05       ` 김규래
@ 2019-03-02 16:48         ` Jakub Jelinek
  2019-03-02 17:16           ` 김규래
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-03-02 16:48 UTC (permalink / raw)
  To: 김규래; +Cc: gcc Mailing List

On Sat, Mar 02, 2019 at 10:05:25PM +0900, 김규래 wrote:
> > It is on the wish list, but I'm afraid I won't have cycles for it in the
> > next year or two at least (once GCC 9 is released, I need to work on the
> > remaining OpenMP 5.0 features).  Of course if somebody implements it and submits
> > and it is of reasonable quality/performance, it will be accepted. 
>  
> Implementing work stealing (WS) also sounds interesting to me.

If you mean work stealing for worksharing loop scheduling, then it would
need to be yet another non-standard schedule, the current schedules (except
for auto or no schedule clause) don't allow it.

If you mean work stealing for task scheduling, then that is more important.

> Do you have any plan of how it should look like?
> For static scheduling, I don't quite see how WS could be implemented since the control doesn't enter the OMP runtime.

Sure, explicit schedule(static, N) or schedule(static) is quite well defined
and shouldn't go into the OMP runtime.
I was talking about no schedule clause at all, then the spec says it is
implementation defined what scheduling is used.

> Lastly, do you think the subjects we are discussing (Additional scheduling algorithms, doing something about auto, WS etc..) 
> could make the cut for a GSoC 2019 project?

I think so.  If you'd be interested in task scheduling, even better, but if
not, I think enough work can be done on worksharing loop scheduling.

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Implementing an algorithm in place of gomp 'auto'
  2019-03-02 16:48         ` Jakub Jelinek
@ 2019-03-02 17:16           ` 김규래
  0 siblings, 0 replies; 7+ messages in thread
From: 김규래 @ 2019-03-02 17:16 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc Mailing List

> > > It is on the wish list, but I'm afraid I won't have cycles for it in the
> > > next year or two at least (once GCC 9 is released, I need to work on the
> > > remaining OpenMP 5.0 features).  Of course if somebody implements it and submits
> > > and it is of reasonable quality/performance, it will be accepted. 
> >  
> > Implementing work stealing (WS) also sounds interesting to me.

> If you mean work stealing for worksharing loop scheduling, then it would
> need to be yet another non-standard schedule, the current schedules (except
> for auto or no schedule clause) don't allow it.

> If you mean work stealing for task scheduling, then that is more important.
 
> > Lastly, do you think the subjects we are discussing (Additional scheduling algorithms, doing something about auto, WS etc..) 
> > could make the cut for a GSoC 2019 project?

> I think so.  If you'd be interested in task scheduling, even better, but if
> not, I think enough work can be done on worksharing loop scheduling.
 
Great.
I think task scheduling is indeed more important.
I'll first take a look at the task system of gomp since I mostly worked on worksharing loop parallel stuff
and write a GSoC proposal when ready.
It will probably be about 1) implementing basic work stealing and then 2) maybe some locality-aware stuff.
Is there a standard benchmark used by gomp for task parallelism?
 
Ray Kim
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-03-02 17:16 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-01 14:40 Implementing an algorithm in place of gomp 'auto' 김규래
2019-03-01 17:32 ` Jakub Jelinek
2019-03-01 17:40   ` 김규래
2019-03-01 17:46     ` Jakub Jelinek
2019-03-02 13:05       ` 김규래
2019-03-02 16:48         ` Jakub Jelinek
2019-03-02 17:16           ` 김규래

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).