public inbox for gcc-patches@gcc.gnu.org
* [0/3] OpenACC reductions
From: Nathan Sidwell @ 2015-11-02 16:10 UTC
  To: Jakub Jelinek, GCC Patches; +Cc: Cesar Philippidis

Jakub,
the following patch series implements the reduction handling for OpenACC:

01-trunk-reductions-core-1102.patch  Core execution changes
02-trunk-reductions-ptx-1102.patch   PTX backend bits
03-trunk-reductions-tests-1102.patch Testcases


The reduction mechanism relies on a new internal builtin, IFN_GOACC_REDUCTION,
which is used in 4 different places.  If you recall, the loop partitioning is
managed with FORK and JOIN unique_fn markers.  The reductions go around these as
follows:

IFN_UNIQUE (HEAD_MARKER ...)
IFN_REDUCTION (SETUP ...)
IFN_UNIQUE (FORK ...)
IFN_REDUCTION (INIT ...)
IFN_UNIQUE (HEAD_MARKER)
<loop here>
IFN_UNIQUE (TAIL_MARKER ...)
IFN_REDUCTION (FINI ...)
IFN_UNIQUE (JOIN ...)
IFN_REDUCTION (TEARDOWN ...)
IFN_UNIQUE (TAIL_MARKER)


There's a quad of functions for each reduction variable of the loop.  If a loop 
is partitioned over multiple dimensions, there are additional quads for each 
dimension, surrounding the fork/join for that dimension.
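To make the "quad per reduction variable" point concrete, here is a small sketch that prints the sequence above for a loop partitioned over a single dimension with several reduction variables.  It is only an illustration, not code from the patch: the helper names and the var0, var1, ... labels are made up, and grouping all the variables' calls together at each of the four positions is an assumption.

```c
#include <stdio.h>

/* Print one IFN_UNIQUE marker line and count it as one call.  */
static int
emit_marker (FILE *out, const char *marker)
{
  fprintf (out, "IFN_UNIQUE (%s)\n", marker);
  return 1;
}

/* Print one reduction call of the given KIND per reduction variable.
   The varN names are hypothetical placeholders.  */
static int
emit_reductions (FILE *out, const char *kind, int nvars)
{
  for (int v = 0; v < nvars; v++)
    fprintf (out, "IFN_REDUCTION (%s, var%d ...)\n", kind, v);
  return nvars;
}

/* Print the head and tail sequences for a loop partitioned over one
   dimension, and return the number of internal calls printed.  */
static int
emit_loop_skeleton (FILE *out, int nvars)
{
  int calls = 0;

  calls += emit_marker (out, "HEAD_MARKER ...");
  calls += emit_reductions (out, "SETUP", nvars);
  calls += emit_marker (out, "FORK ...");
  calls += emit_reductions (out, "INIT", nvars);
  calls += emit_marker (out, "HEAD_MARKER");
  fputs ("<loop here>\n", out);
  calls += emit_marker (out, "TAIL_MARKER ...");
  calls += emit_reductions (out, "FINI", nvars);
  calls += emit_marker (out, "JOIN ...");
  calls += emit_reductions (out, "TEARDOWN", nvars);
  calls += emit_marker (out, "TAIL_MARKER");

  return calls;
}
```

Calling emit_loop_skeleton (stdout, 2) prints the six markers plus two calls at each of the SETUP, INIT, FINI and TEARDOWN positions, i.e. one quad per variable.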

All the reduction calls look similar and are:

V = REDUCTION (KIND, REF_TO_RES, LOCAL_VAR, LEVEL, OP, OFFSET)

REF_TO_RES is a pointer to a receiver object.  It is a null pointer constant if
there is no such object.
LOCAL_VAR is the executing thread's instance of the reduction variable.
LEVEL is the dimension across which this reduction is partitioned (gang, worker,
vector).  As with the head/tail markers, this assignment of level is deferred to
the target compiler.
OP is the reduction operator.
OFFSET is an offset into a hypothetical buffer allocated for all the reductions
of this particular loop.  It's a way of identifying which quad of reduction
calls applies to the same logical variable, and happens to be useful in some use
cases (I'll expand on that in the PTX fragment).

All these functions return a new value for the local variable.

When everything collapses to a single thread (i.e. on the host), the 
implementation of these functions is trivial.

SETUP
   - if REF_TO_RES is not a null pointer constant, return *REF_TO_RES, else
     return LOCAL_VAR (this is a compile-time check)
INIT & FINI
   - return LOCAL_VAR
TEARDOWN
   - if REF_TO_RES is not a null pointer constant, *REF_TO_RES = LOCAL_VAR;
     always return LOCAL_VAR
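
The single-thread rules above can be modelled in a few lines of C.  This is a minimal sketch under stated assumptions, not the code in the patch: host_reduction and sum_with_reduction are hypothetical names, the reduction variable is assumed to be an int with a + operator, and the null check is done at run time here whereas the description above makes it a compile-time check.

```c
#include <assert.h>
#include <stddef.h>

/* The four reduction call kinds named in the description.  */
enum reduction_kind { SETUP, INIT, FINI, TEARDOWN };

/* Hypothetical single-thread model of
     V = REDUCTION (KIND, REF_TO_RES, LOCAL_VAR, LEVEL, OP, OFFSET)
   LEVEL, OP and OFFSET are accepted but unused, because nothing is
   partitioned when everything collapses to one thread.  */
static int
host_reduction (enum reduction_kind kind, int *ref_to_res, int local_var,
                int level, int op, int offset)
{
  (void) level;
  (void) op;
  (void) offset;

  switch (kind)
    {
    case SETUP:
      /* Start from the receiver object's value when there is one.  */
      return ref_to_res ? *ref_to_res : local_var;

    case TEARDOWN:
      /* Publish the result to the receiver object when there is one.  */
      if (ref_to_res)
        *ref_to_res = local_var;
      return local_var;

    case INIT:
    case FINI:
    default:
      return local_var;
    }
}

/* Sum DATA[0..N) with the four calls placed around the loop in the
   order SETUP, INIT, <loop>, FINI, TEARDOWN.  RECEIVER may be NULL.  */
static int
sum_with_reduction (const int *data, size_t n, int *receiver, int start)
{
  assert (data != NULL || n == 0);

  int v = start;
  v = host_reduction (SETUP, receiver, v, 0, 0, 0);
  v = host_reduction (INIT, receiver, v, 0, 0, 0);
  for (size_t i = 0; i < n; i++)
    v += data[i];
  v = host_reduction (FINI, receiver, v, 0, 0, 0);
  v = host_reduction (TEARDOWN, receiver, v, 0, 0, 0);
  return v;
}
```

With a receiver object the loop accumulates onto the receiver's incoming value and writes the result back at TEARDOWN; with a null receiver the local variable simply passes through all four calls.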


Thread overview: 26+ messages
2015-11-02 16:10 [0/3] OpenACC reductions Nathan Sidwell
2015-10-18 23:20 ` [gomp4] fortran testcase Nathan Sidwell
2015-11-02 16:18 ` [1/3] OpenACC reductions Nathan Sidwell
2015-11-03 15:46   ` Jakub Jelinek
2015-11-03 16:02     ` Nathan Sidwell
2015-11-04 10:31       ` Jakub Jelinek
2015-11-04 13:58         ` Nathan Sidwell
2015-11-04 14:08           ` Jakub Jelinek
2015-11-04  9:59   ` Jakub Jelinek
2015-11-06 10:47   ` [gomp4] " Thomas Schwinge
2016-01-07  3:55     ` [gomp4] private reductions Cesar Philippidis
2016-01-07 16:53       ` Cesar Philippidis
2016-01-09  1:14       ` Cesar Philippidis
2016-01-11 12:10       ` Thomas Schwinge
2016-01-11 14:55         ` Cesar Philippidis
2021-08-09 11:37   ` [1/3] OpenACC reductions Thomas Schwinge
2015-11-02 16:35 ` [2/3] " Nathan Sidwell
2015-11-04 10:01   ` Jakub Jelinek
2015-11-04 13:57     ` Nathan Sidwell
2015-11-04 13:27   ` Bernd Schmidt
2015-11-04 14:09     ` Nathan Sidwell
2015-11-04 16:59     ` Nathan Sidwell
2015-11-06 10:48       ` [gomp4] " Thomas Schwinge
2015-11-02 16:38 ` [3/3] " Nathan Sidwell
2015-11-04 10:03   ` Jakub Jelinek
2015-11-06 10:49   ` [gomp4] " Thomas Schwinge
