public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/4] openacc: Async fixes
@ 2021-06-29 23:42 Julian Brown
  2021-06-29 23:42 ` [PATCH 1/4] openacc: Async fix for lib-94 testcase Julian Brown
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Julian Brown @ 2021-06-29 23:42 UTC
  To: gcc-patches; +Cc: Thomas Schwinge, Jakub Jelinek, Chung-Lin Tang

This patch series fixes several problems with the current async support
for OpenACC:

 - Asynchronous host-to-device copies invoked from within libgomp
   (target.c) could copy bad data to the target -- and the workaround
   for that currently used in the AMD GCN target plugin could lead to
   a different problem (a race condition).

 - The OpenACC profiling-interface implementation did not measure
   asynchronous operations properly.

 - Several test cases misuse OpenACC asynchronous support (more race
   conditions).
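
To make the first item concrete, here is a minimal sketch of how a deferred
copy can pick up bad data. It is plain C with no OpenACC or libgomp calls,
and it is not the code from target.c or the GCN plugin: struct pending_copy,
enqueue_by_reference, enqueue_by_value, drain and demo_copy are all
hypothetical names invented for the illustration. The assumption is only
that an asynchronous copy reads its host source some time after it was
queued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an asynchronous queue holding one request.
   The copy is recorded now and carried out later, in drain().  */
struct pending_copy
{
  int *dst;          /* Stand-in for the device-side destination.  */
  const int *src;    /* Host address, dereferenced only at drain time.  */
  int snapshot;      /* Private copy of the value, taken at enqueue time.  */
  int use_snapshot;  /* Nonzero if drain() should use the snapshot.  */
};

/* Queue a copy that remembers only the host address.  */
static void
enqueue_by_reference (struct pending_copy *q, int *dst, const int *src)
{
  assert (q != NULL && dst != NULL && src != NULL);
  q->dst = dst;
  q->src = src;
  q->snapshot = 0;
  q->use_snapshot = 0;
}

/* Queue a copy that captures the host value immediately.  */
static void
enqueue_by_value (struct pending_copy *q, int *dst, const int *src)
{
  assert (q != NULL && dst != NULL && src != NULL);
  q->dst = dst;
  q->src = NULL;
  q->snapshot = *src;
  q->use_snapshot = 1;
}

/* Carry out the queued copy.  */
static void
drain (struct pending_copy *q)
{
  *q->dst = q->use_snapshot ? q->snapshot : *q->src;
}

/* Queue a copy of a temporary, overwrite the temporary, then drain.
   Returns the value that arrived at the destination.  */
int
demo_copy (int take_snapshot)
{
  int device_word = 0;
  int host_tmp = 42;
  struct pending_copy q;

  if (take_snapshot)
    enqueue_by_value (&q, &device_word, &host_tmp);
  else
    enqueue_by_reference (&q, &device_word, &host_tmp);

  host_tmp = -1;  /* The host reuses the temporary before the copy runs.  */
  drain (&q);
  return device_word;
}
```

In this sketch demo_copy (0) delivers the overwritten value, while
demo_copy (1) delivers the value that was current when the copy was queued.
Taking a snapshot is only one possible remedy; waiting for the queue before
reusing the source is another, and the sketch does not say which one the
patches use.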

Further comments accompany the individual patches. Tested with offloading
to AMD GCN. OK for mainline?

Thanks,

Julian

Julian Brown (4):
  openacc: Async fix for lib-94 testcase
  openacc: Fix async bugs in several OpenACC test cases
  openacc: Fix asynchronous host-to-device copies in libgomp runtime
  openacc: Profiling-interface fixes for asynchronous operations

 libgomp/libgomp.h                             |   2 +-
 libgomp/oacc-host.c                           |   5 +-
 libgomp/oacc-mem.c                            |  36 +++-
 libgomp/oacc-parallel.c                       | 190 ++++++++++++++----
 libgomp/plugin/plugin-gcn.c                   |  20 +-
 libgomp/target.c                              | 111 ++++++----
 .../acc_prof-init-1.c                         |   5 +-
 .../acc_prof-parallel-1.c                     |  64 ++----
 .../libgomp.oacc-c-c++-common/deep-copy-10.c  |  14 +-
 .../libgomp.oacc-c-c++-common/lib-94.c        |   4 +-
 .../libgomp.oacc-fortran/lib-16-2.f90         |   5 +
 .../testsuite/libgomp.oacc-fortran/lib-16.f90 |   5 +
 12 files changed, 289 insertions(+), 172 deletions(-)

-- 
2.29.2


* [PATCH 0/4] openacc: Worker partitioning in the middle end
@ 2021-03-02 12:20 Julian Brown
  2021-03-02 12:20 ` [PATCH 2/4] openacc: Fix async bugs in several OpenACC test cases Julian Brown
  0 siblings, 1 reply; 13+ messages in thread
From: Julian Brown @ 2021-03-02 12:20 UTC
  To: gcc-patches
  Cc: Thomas Schwinge, Tobias Burnus, Kwok Cheung Yeung, Jakub Jelinek

This series contains updated parts of the patch series that was previously
sent upstream in November 2019:

  https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534547.html

The purpose of the series is to enable multiple workers for OpenACC
(workers being one of the dimensions of parallelism supported by the
standard) on targets such as AMD GCN. (NVPTX uses its own scheme for
supporting multiple workers, implemented mostly in the backend.)
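
For readers less familiar with the terminology, the following sketch shows
what worker-level parallelism looks like at the source level. It is a
generic OpenACC example, not code from this series: the function name
row_sums and the choice of num_workers(4) are assumptions made for
illustration. Built without -fopenacc, the pragmas are ignored and the
loops simply run sequentially on the host.

```c
#include <assert.h>

/* Sum each row of a rows x cols matrix stored row-major in M, writing
   one result per row to OUT.  The outer loop is distributed across
   gangs; the inner loop is distributed across the workers of each gang,
   with a reduction combining their partial sums.  */
void
row_sums (const int *m, int rows, int cols, long *out)
{
  assert (rows >= 0 && cols >= 0);

#pragma acc parallel loop gang num_workers(4) \
  copyin(m[0:rows * cols]) copyout(out[0:rows])
  for (int i = 0; i < rows; i++)
    {
      long s = 0;

#pragma acc loop worker reduction(+:s)
      for (int j = 0; j < cols; j++)
	s += m[i * cols + j];

      out[i] = s;
    }
}
```

Calling row_sums on a 2 x 3 matrix holding 1..6 yields row sums 6 and 15,
regardless of how many workers actually execute the inner loop.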

Tested with offloading to AMD GCN and (separately) to NVPTX.

Further commentary is provided alongside individual patches. I'm posting
these patches for review now, but I don't expect to commit them until
stage 1.

Thanks,

Julian

Julian Brown (4):
  openacc: Middle-end worker-partitioning support
  openacc: Fix async bugs in several OpenACC test cases
  amdgcn: Enable OpenACC worker partitioning for AMD GCN
  openacc: Reference-typed reduction and private variable rewriting

 gcc/Makefile.in                               |    1 +
 gcc/config/gcn/gcn-protos.h                   |    2 +-
 gcc/config/gcn/gcn-tree.c                     |    6 +-
 gcc/config/gcn/gcn.c                          |   23 +-
 gcc/config/gcn/gcn.opt                        |    5 -
 gcc/doc/tm.texi                               |   10 +
 gcc/doc/tm.texi.in                            |    4 +
 gcc/gimplify.c                                |  117 ++
 gcc/oacc-neuter-bcast.c                       | 1471 +++++++++++++++++
 gcc/oacc-neuter-bcast.h                       |   26 +
 gcc/omp-builtins.def                          |    8 +
 gcc/omp-low.c                                 |   47 +-
 gcc/omp-offload.c                             |  159 +-
 gcc/omp-offload.h                             |    1 +
 gcc/passes.def                                |    2 +
 gcc/target.def                                |   13 +
 gcc/targhooks.h                               |    1 +
 .../goacc/classify-kernels-unparallelized.c   |    8 +-
 .../c-c++-common/goacc/classify-kernels.c     |    8 +-
 .../c-c++-common/goacc/classify-parallel.c    |    8 +-
 .../c-c++-common/goacc/classify-routine.c     |    8 +-
 .../c-c++-common/goacc/classify-serial.c      |    8 +-
 .../gcc.dg/goacc/loop-processing-1.c          |    2 +-
 .../goacc/classify-kernels-unparallelized.f95 |    8 +-
 .../gfortran.dg/goacc/classify-kernels.f95    |    8 +-
 .../gfortran.dg/goacc/classify-parallel.f95   |    8 +-
 .../gfortran.dg/goacc/classify-routine.f95    |    8 +-
 .../gfortran.dg/goacc/classify-serial.f95     |    8 +-
 gcc/tree-core.h                               |    4 +-
 gcc/tree-pass.h                               |    2 +
 gcc/tree.c                                    |   11 +-
 gcc/tree.h                                    |    2 +
 libgomp/plugin/plugin-gcn.c                   |    4 +-
 .../libgomp.oacc-c++/privatized-ref-2.C       |   64 +
 .../libgomp.oacc-c++/privatized-ref-3.C       |   64 +
 .../libgomp.oacc-c-c++-common/deep-copy-10.c  |   14 +-
 .../loop-dim-default.c                        |   11 +-
 .../libgomp.oacc-c-c++-common/parallel-dims.c |   13 +-
 .../libgomp.oacc-fortran/lib-16-2.f90         |    5 +
 .../testsuite/libgomp.oacc-fortran/lib-16.f90 |    5 +
 .../libgomp.oacc-fortran/parallel-dims-aux.c  |    9 +-
 .../libgomp.oacc-fortran/privatized-ref-1.f95 |   71 +
 42 files changed, 2112 insertions(+), 145 deletions(-)
 create mode 100644 gcc/oacc-neuter-bcast.c
 create mode 100644 gcc/oacc-neuter-bcast.h
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-2.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-3.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-1.f95

-- 
2.29.2



end of thread, other threads:[~2023-03-10 15:23 UTC | newest]

Thread overview: 13+ messages
2021-06-29 23:42 [PATCH 0/4] openacc: Async fixes Julian Brown
2021-06-29 23:42 ` [PATCH 1/4] openacc: Async fix for lib-94 testcase Julian Brown
2021-06-29 23:42 ` [PATCH 2/4] openacc: Fix async bugs in several OpenACC test cases Julian Brown
2021-06-29 23:52   ` Julian Brown
2021-06-29 23:42 ` [PATCH 3/4] openacc: Fix asynchronous host-to-device copies in libgomp runtime Julian Brown
2021-07-27 10:01   ` Thomas Schwinge
2023-03-10 15:22     ` Allow libgomp 'cbuf' buffering with OpenACC 'async' for 'ephemeral' data (was: [PATCH 3/4] openacc: Fix asynchronous host-to-device copies in libgomp runtime) Thomas Schwinge
2021-06-29 23:42 ` [PATCH 4/4] openacc: Profiling-interface fixes for asynchronous operations Julian Brown
2021-06-30  8:28 ` [PATCH 0/4] openacc: Async fixes Thomas Schwinge
2021-06-30 10:40   ` Julian Brown
2021-07-02 13:51     ` Julian Brown
2023-03-10 11:38   ` Thomas Schwinge
  -- strict thread matches above, loose matches on Subject: below --
2021-03-02 12:20 [PATCH 0/4] openacc: Worker partitioning in the middle end Julian Brown
2021-03-02 12:20 ` [PATCH 2/4] openacc: Fix async bugs in several OpenACC test cases Julian Brown
