public inbox for gcc-patches@gcc.gnu.org
* [gomp4 00/14] NVPTX: further porting
@ 2015-10-20 18:34 Alexander Monakov
From: Alexander Monakov @ 2015-10-20 18:34 UTC
  To: gcc-patches; +Cc: Jakub Jelinek, Dmitry Melnik

Hello,

This patch series moves the libgomp/nvptx port further along: it gets the
initial bits of parallel execution working and mostly unbreaks the testsuite.
Please have a look!  I'm interested in feedback, and would like to know
whether the series is suitable to become part of a branch.

The series ports enough of libgomp.c to get warp-level parallelism working
for OpenMP offloading.  The overall approach is as follows.

I've opted not to use dynamic parallelism.  It increases the hardware
requirement from sm_30 to sm_35, needs a library from CUDA Toolkit at link
time (libcudadevrt.a), and imposes overhead at run time.  The last point might
be moot if we don't manage to make libgomp's own overhead low, but still my
judgement is that a hard dependency on dynamic parallelism is problematic.

The plugin launches one (for now) thread block with 8 warps, which begin
executing a new function in libgomp, gomp_nvptx_main.  The warps form a
(pre-allocated) pool.  Warp 0 is responsible for initialization and final
cleanup, and proceeds to execute target region functions.  Other warps proceed
to gomp_thread_start.
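
To make the division of roles above concrete, here is a minimal host-side
sketch in plain C.  It is not code from the patches: the warp size of 32, the
flat thread index, and every name in it (warp_of_thread, role_of_warp,
count_warps_with_role, ROLE_MASTER, ROLE_WORKER) are assumptions made for
illustration.  It only models which warp a thread belongs to and which role
that warp takes.

```c
#include <assert.h>

/* Assumed launch geometry: one thread block of 8 warps, 32 threads per
   warp.  All names here are hypothetical, not taken from the patches.  */
enum
{
  WARP_SIZE = 32,
  WARPS_PER_BLOCK = 8,
  THREADS_PER_BLOCK = WARP_SIZE * WARPS_PER_BLOCK
};

/* ROLE_MASTER: initialization, target region functions, final cleanup.
   ROLE_WORKER: a pool member that would enter gomp_thread_start.  */
enum warp_role { ROLE_MASTER, ROLE_WORKER };

/* Map a flat thread index within the block to its warp number.  */
static int
warp_of_thread (int thread_index)
{
  assert (thread_index >= 0 && thread_index < THREADS_PER_BLOCK);
  return thread_index / WARP_SIZE;
}

/* Warp 0 is the master; every other warp is a pool worker.  */
static enum warp_role
role_of_warp (int warp)
{
  assert (warp >= 0 && warp < WARPS_PER_BLOCK);
  return warp == 0 ? ROLE_MASTER : ROLE_WORKER;
}

/* Count how many warps in the block take the given role.  */
static int
count_warps_with_role (enum warp_role role)
{
  int n = 0;
  for (int w = 0; w < WARPS_PER_BLOCK; w++)
    if (role_of_warp (w) == role)
      n++;
  return n;
}
```

Under these assumptions, a block of 8 warps yields one master warp and seven
pool workers.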

With these patches, the libgomp testsuite mostly passes.  The remaining
failures are as follows:

libgomp.c/target-{1,7,critical-1}.c: segfault in accelerator code

libgomp.c/thread-limit-2.c: fails to link because 'usleep' is unavailable on
NVPTX.  Note that the test does not run anything on the device, because the
target region has an 'if (0)' clause.

libgomp.c++/examples-4/declare_target-2.C: libgomp: Can't map target variables
(size mismatch).  Will investigate later.

libgomp.c++/target-1.C: same as libgomp.c/target-1.c, segfault on device.

I haven't run the libgomp/gfortran testsuite yet.  I'd like your input on
dealing with testsuite breaks (XFAIL?).

I have not rebased my private branch in a while, so context in
gcc/config/nvptx is probably out-of-date in places.

Yours,
Alexander


  nvptx: emit kernels for 'omp target entrypoint' only for OpenACC
  nvptx: emit pointers to OpenMP target region entry points
  nvptx: expand support for address spaces
  nvptx: fix output of _Bool global variables
  omp-low: set 'omp target entrypoint' only on entypoints
  omp-low: copy omp_data_o to shared memory on NVPTX
  libgomp nvptx plugin: launch target functions via gomp_nvptx_main
  libgomp nvptx: populate proc.c
  libgomp: provide barriers on NVPTX
  libgomp: arrange a team of pre-started threads via gomp_nvptx_main
  libgomp: avoid variable-length stack allocation in team.c
  libgomp: fixup error.c on nvptx
  libgomp: provide minimal GOMP_teams
  libgomp: use more generic implementations on nvptx

 gcc/config/nvptx/nvptx.c        |  78 +++++++++++++--
 gcc/omp-low.c                   |  58 +++++++++--
 libgomp/config/nvptx/alloc.c    |   0
 libgomp/config/nvptx/bar.c      | 210 ++++++++++++++++++++++++++++++++++++++++
 libgomp/config/nvptx/bar.h      | 129 +++++++++++++++++++++++-
 libgomp/config/nvptx/barrier.c  |   0
 libgomp/config/nvptx/critical.c |  57 -----------
 libgomp/config/nvptx/error.c    |   0
 libgomp/config/nvptx/iter.c     |   0
 libgomp/config/nvptx/iter_ull.c |   0
 libgomp/config/nvptx/loop.c     |   0
 libgomp/config/nvptx/loop_ull.c |   0
 libgomp/config/nvptx/ordered.c  |   0
 libgomp/config/nvptx/parallel.c |   0
 libgomp/config/nvptx/proc.c     |  40 ++++++++
 libgomp/config/nvptx/single.c   |   0
 libgomp/config/nvptx/target.c   |  39 ++++++++
 libgomp/config/nvptx/task.c     |   0
 libgomp/config/nvptx/team.c     |   0
 libgomp/config/nvptx/work.c     |   0
 libgomp/error.c                 |   5 +
 libgomp/libgomp.h               |  10 +-
 libgomp/plugin/plugin-nvptx.c   |  23 ++++-
 libgomp/task.c                  |   7 +-
 libgomp/team.c                  |  92 +++++++++++++++++-
 25 files changed, 664 insertions(+), 84 deletions(-)
 delete mode 100644 libgomp/config/nvptx/alloc.c
 delete mode 100644 libgomp/config/nvptx/barrier.c
 delete mode 100644 libgomp/config/nvptx/critical.c
 delete mode 100644 libgomp/config/nvptx/error.c
 delete mode 100644 libgomp/config/nvptx/iter.c
 delete mode 100644 libgomp/config/nvptx/iter_ull.c
 delete mode 100644 libgomp/config/nvptx/loop.c
 delete mode 100644 libgomp/config/nvptx/loop_ull.c
 delete mode 100644 libgomp/config/nvptx/ordered.c
 delete mode 100644 libgomp/config/nvptx/parallel.c
 delete mode 100644 libgomp/config/nvptx/single.c
 delete mode 100644 libgomp/config/nvptx/task.c
 delete mode 100644 libgomp/config/nvptx/team.c
 delete mode 100644 libgomp/config/nvptx/work.c



Thread overview: 99+ messages
2015-10-20 18:34 [gomp4 00/14] NVPTX: further porting Alexander Monakov
2015-10-20 18:34 ` [gomp4 06/14] omp-low: copy omp_data_o to shared memory on NVPTX Alexander Monakov
2015-10-21  0:07   ` Bernd Schmidt
2015-10-21  6:49     ` Alexander Monakov
2015-10-21  8:48   ` Jakub Jelinek
2015-10-21  9:09     ` Alexander Monakov
2015-10-21  9:24       ` Jakub Jelinek
2015-10-21 10:42       ` Bernd Schmidt
2015-10-21 14:06         ` Alexander Monakov
2015-11-03 14:25   ` Alexander Monakov
2015-11-06 14:00     ` Bernd Schmidt
2015-11-06 14:06       ` Jakub Jelinek
2015-11-10 10:39     ` Jakub Jelinek
2015-11-26  9:51       ` Jakub Jelinek
2015-10-20 18:34 ` [gomp4 07/14] libgomp nvptx plugin: launch target functions via gomp_nvptx_main Alexander Monakov
2015-10-20 21:12   ` Bernd Schmidt
2015-10-20 21:19     ` Alexander Monakov
2015-10-20 21:27       ` Bernd Schmidt
2015-10-21  9:07         ` Jakub Jelinek
2015-10-20 18:34 ` [gomp4 14/14] libgomp: use more generic implementations on nvptx Alexander Monakov
2015-10-21 10:17   ` Jakub Jelinek
2015-10-20 18:34 ` [gomp4 08/14] libgomp nvptx: populate proc.c Alexander Monakov
2015-10-21  9:15   ` Jakub Jelinek
2015-10-20 18:34 ` [gomp4 12/14] libgomp: fixup error.c on nvptx Alexander Monakov
2015-10-21 10:03   ` Jakub Jelinek
2015-10-20 18:34 ` [gomp4 04/14] nvptx: fix output of _Bool global variables Alexander Monakov
2015-10-20 20:51   ` Bernd Schmidt
2015-10-20 21:04     ` Alexander Monakov
2015-10-28 16:56       ` Alexander Monakov
2015-10-28 17:01         ` Bernd Schmidt
2015-10-28 17:38           ` Alexander Monakov
2015-10-28 17:39             ` Bernd Schmidt
2015-10-28 17:51               ` Alexander Monakov
2015-10-28 18:06                 ` Bernd Schmidt
2015-10-28 18:07                   ` Alexander Monakov
2015-10-28 18:33                     ` Bernd Schmidt
2015-10-28 19:37                       ` Alexander Monakov
2015-10-29 11:13                         ` Bernd Schmidt
2015-10-30 13:27                           ` Alexander Monakov
2015-10-30 13:38                             ` Bernd Schmidt
2015-10-20 18:34 ` [gomp4 01/14] nvptx: emit kernels for 'omp target entrypoint' only for OpenACC Alexander Monakov
2015-10-20 23:48   ` Bernd Schmidt
2015-10-21  5:40     ` Alexander Monakov
2015-10-21  8:11   ` Jakub Jelinek
2015-10-21  8:36     ` Alexander Monakov
2015-10-20 18:34 ` [gomp4 03/14] nvptx: expand support for address spaces Alexander Monakov
2015-10-20 20:56   ` Bernd Schmidt
2015-10-20 21:06     ` Alexander Monakov
2015-10-20 21:13       ` Bernd Schmidt
2015-10-20 21:41         ` Cesar Philippidis
2015-10-20 21:51           ` Bernd Schmidt
2015-10-20 18:34 ` [gomp4 11/14] libgomp: avoid variable-length stack allocation in team.c Alexander Monakov
2015-10-20 20:48   ` Bernd Schmidt
2015-10-20 21:41     ` Alexander Monakov
2015-10-20 21:46       ` Bernd Schmidt
2015-10-21  9:59   ` Jakub Jelinek
2015-10-20 18:34 ` [gomp4 05/14] omp-low: set 'omp target entrypoint' only on entypoints Alexander Monakov
2015-10-20 23:57   ` Bernd Schmidt
2015-10-21  8:20   ` Jakub Jelinek
2015-10-30 16:58     ` Alexander Monakov
2015-11-06 14:05       ` Bernd Schmidt
2015-11-06 14:08         ` Jakub Jelinek
2015-11-06 14:12           ` Bernd Schmidt
2015-11-06 17:16         ` Alexander Monakov
2015-10-20 18:52 ` [gomp4 10/14] libgomp: arrange a team of pre-started threads via gomp_nvptx_main Alexander Monakov
2015-10-21  9:49   ` Jakub Jelinek
2015-10-21 14:41     ` Alexander Monakov
2015-10-21 15:02       ` Jakub Jelinek
2015-10-20 18:52 ` [gomp4 13/14] libgomp: provide minimal GOMP_teams Alexander Monakov
2015-10-21 10:12   ` Jakub Jelinek
2015-10-20 18:53 ` [gomp4 09/14] libgomp: provide barriers on NVPTX Alexander Monakov
2015-10-20 20:56   ` Bernd Schmidt
2015-10-20 22:00     ` Alexander Monakov
2015-10-21  2:23       ` Bernd Schmidt
2015-10-21  9:39   ` Jakub Jelinek
2015-10-20 19:01 ` [gomp4 02/14] nvptx: emit pointers to OpenMP target region entry points Alexander Monakov
2015-10-21  7:55 ` [gomp4 00/14] NVPTX: further porting Martin Jambor
2015-10-21  8:56 ` Jakub Jelinek
2015-10-21  9:17   ` Alexander Monakov
2015-10-21  9:29     ` Jakub Jelinek
2015-10-28 17:22       ` Alexander Monakov
2015-10-29  8:54         ` Jakub Jelinek
2015-10-29 11:38           ` Alexander Monakov
2015-10-21 12:06 ` Bernd Schmidt
2015-10-21 15:48   ` Alexander Monakov
2015-10-21 16:10     ` Bernd Schmidt
2015-10-22  9:55     ` Jakub Jelinek
2015-10-22 16:42       ` Alexander Monakov
2015-10-22 17:16         ` Julian Brown
2015-10-22 18:19           ` Alexander Monakov
2015-10-22 17:17         ` Bernd Schmidt
2015-10-22 18:10           ` Alexander Monakov
2015-10-22 18:27             ` Bernd Schmidt
2015-10-22 19:28               ` Alexander Monakov
2015-10-23  8:23           ` Jakub Jelinek
2015-10-23  8:25           ` Jakub Jelinek
2015-10-23 10:24           ` Jakub Jelinek
2015-10-23 10:48             ` Bernd Schmidt
2015-10-23 17:36             ` Alexander Monakov
