From: Frederik Harwath <frederik@codesourcery.com>
To: <gcc-patches@gcc.gnu.org>
Subject: [OG11][committed][PATCH 00/22] OpenACC "kernels" Improvements
Date: Wed, 17 Nov 2021 17:03:09 +0100
Message-ID: <20211117160330.20029-1-frederik@codesourcery.com>
Hi,
this patch series implements the rework of the OpenACC "kernels"
implementation that was announced at the GNU Tools Track of this
year's Linux Plumbers Conference; see
https://linuxplumbersconf.org/event/11/contributions/998/. The
central step is the commit titled "openacc: Use Graphite
for dependence analysis in \"kernels\" regions", whose commit message
contains further explanations.
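For readers less familiar with the construct, here is a rough illustration of the kind of code a "kernels" region dependence analysis has to reason about. It is a minimal sketch and not taken from the patches: the function name scale_and_prefix_sum and the data clauses are assumptions made up for this example. The first loop has no loop-carried dependence, the second one does, so only the first is a candidate for parallelization.

```c
#include <stddef.h>

/* Hypothetical example, not part of the patch series.
   Without -fopenacc the pragma is ignored and the code runs on the host. */
void scale_and_prefix_sum(float *restrict a, const float *restrict b, size_t n)
{
#pragma acc kernels copy(a[0:n]) copyin(b[0:n])
  {
    /* No loop-carried dependence: iteration i writes only a[i]
       and reads only b[i].  */
    for (size_t i = 0; i < n; i++)
      a[i] = 2.0f * b[i];

    /* Loop-carried dependence: iteration i reads a[i - 1], which the
       previous iteration has just written.  */
    for (size_t i = 1; i < n; i++)
      a[i] += a[i - 1];
  }
}
```

How a given compiler version treats each loop depends on its analysis and options; the example only shows the distinction the analysis has to draw.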
Best regards,
Frederik
PS: The commit series also includes a backport from master,
"00b98b6cac25 Add dg-final option-based target selectors", and two
trivial unrelated commits, "fa558c2a6664 Fix gimple_debug_cfg
declaration" and "35cdc94463fe Fix branch prediction dump message".
Andrew Stubbs (2):
openacc: Add data optimization pass
openacc: Add runtime alias checking for OpenACC kernels
Frederik Harwath (19):
openacc: Move pass_oacc_device_lower after pass_graphite
graphite: Extend SCoP detection dump output
graphite: Rename isl_id_for_ssa_name
graphite: Fix minor mistakes in comments
Fix branch prediction dump message
Move compute_alias_check_pairs to tree-data-ref.c
graphite: Add runtime alias checking
openacc: Use Graphite for dependence analysis in "kernels" regions
openacc: Add "can_be_parallel" flag info to "graph" dumps
openacc: Add further kernels tests
openacc: Remove unused partitioning in "kernels" regions
Add function for printing a single OMP_CLAUSE
openacc: Warn about "independent" "kernels" loops with data-dependences
openacc: Handle internal function calls in pass_lim
openacc: Disable pass_pre on outlined functions analyzed by Graphite
graphite: Tune parameters for OpenACC use
graphite: Adjust scop loop-nest choice
graphite: Accept loops without data references
openacc: Adjust test expectations to new "kernels" handling
Sandra Loosemore (1):
Fortran: delinearize multi-dimensional array accesses
gcc/Makefile.in | 2 +
gcc/cfgloop.c | 1 +
gcc/cfgloop.h | 6 +
gcc/cfgloopmanip.c | 1 +
gcc/common.opt | 9 +
gcc/config/nvptx/nvptx.c | 7 +
gcc/doc/gimple.texi | 2 +
gcc/doc/invoke.texi | 20 +-
gcc/doc/passes.texi | 6 +-
gcc/expr.c | 1 +
gcc/flag-types.h | 1 +
gcc/fortran/lang.opt | 4 +
gcc/fortran/trans-array.c | 321 ++++--
gcc/gimple-loop-interchange.cc | 2 +-
gcc/gimple-pretty-print.c | 3 +
gcc/gimple-walk.c | 15 +-
gcc/gimple-walk.h | 6 +
gcc/gimple.h | 7 +-
gcc/gimplify.c | 13 +-
gcc/graph.c | 35 +-
gcc/graphite-dependences.c | 220 +++-
gcc/graphite-isl-ast-to-gimple.c | 271 ++++-
gcc/graphite-oacc.c | 689 ++++++++++++
gcc/graphite-oacc.h | 55 +
gcc/graphite-optimize-isl.c | 42 +-
gcc/graphite-poly.c | 41 +-
gcc/graphite-scop-detection.c | 654 +++++++++--
gcc/graphite-sese-to-poly.c | 90 +-
gcc/graphite.c | 120 +-
gcc/graphite.h | 40 +-
gcc/internal-fn.c | 2 +
gcc/internal-fn.h | 4 +-
gcc/omp-data-optimize.cc | 951 ++++++++++++++++
gcc/omp-expand.c | 110 +-
gcc/omp-general.c | 23 +-
gcc/omp-general.h | 1 +
gcc/omp-low.c | 321 +++++-
gcc/omp-oacc-kernels-decompose.cc | 145 ++-
gcc/omp-offload.c | 1001 +++++++++++++----
gcc/omp-offload.h | 2 +
gcc/params.opt | 5 +-
gcc/passes.c | 42 +
gcc/passes.def | 47 +-
gcc/predict.c | 2 +-
gcc/sese.c | 25 +-
gcc/sese.h | 19 +
gcc/testsuite/c-c++-common/goacc/acc-icf.c | 4 +-
gcc/testsuite/c-c++-common/goacc/cache-3-1.c | 2 +-
...classify-kernels-unparallelized-graphite.c | 41 +
...lassify-kernels-unparallelized-parloops.c} | 12 +-
.../c-c++-common/goacc/classify-kernels.c | 27 +-
.../c-c++-common/goacc/classify-parallel.c | 8 +-
.../c-c++-common/goacc/classify-routine.c | 8 +-
.../c-c++-common/goacc/classify-serial.c | 12 +-
.../device-lowering-debug-optimization.c | 29 +
.../goacc/device-lowering-no-loops.c | 17 +
.../goacc/device-lowering-no-optimization.c | 30 +
.../c-c++-common/goacc/if-clause-2.c | 2 +-
.../goacc/kernels-decompose-1-parloops.c | 125 ++
.../c-c++-common/goacc/kernels-decompose-1.c | 31 +-
.../c-c++-common/goacc/kernels-decompose-2.c | 2 +-
.../goacc/kernels-decompose-ice-1.c | 5 +-
.../goacc/kernels-decompose-ice-2.c | 3 +-
.../goacc/kernels-loop-3-acc-loop.c | 2 +-
.../c-c++-common/goacc/kernels-loop-3.c | 2 +-
...duction.c => kernels-reduction-parloops.c} | 0
.../c-c++-common/goacc/loop-2-kernels.c | 20 +-
.../c-c++-common/goacc/loop-auto-reductions.c | 22 +
.../goacc/nested-reductions-2-parallel.c | 138 +++
...kernels-conditional-loop-independent_seq.c | 129 ---
...parallelism-1-kernels-loop-auto-parloops.c | 128 +++
.../note-parallelism-1-kernels-loop-auto.c | 104 +-
...rallelism-1-kernels-loop-independent_seq.c | 19 +-
.../goacc/note-parallelism-1-kernels-loops.c | 11 +-
...note-parallelism-1-kernels-straight-line.c | 11 +-
...e-parallelism-combined-kernels-loop-auto.c | 34 +-
...sm-combined-kernels-loop-independent_seq.c | 16 -
...kernels-conditional-loop-independent_seq.c | 38 +-
.../note-parallelism-kernels-loop-auto.c | 100 +-
...parallelism-kernels-loop-independent_seq.c | 27 +-
.../goacc/note-parallelism-kernels-loops-1.c | 61 +
.../note-parallelism-kernels-loops-parloops.c | 53 +
.../goacc/note-parallelism-kernels-loops.c | 39 +-
.../c-c++-common/goacc/omp_data_optimize-1.c | 677 +++++++++++
gcc/testsuite/c-c++-common/goacc/routine-1.c | 2 +-
.../goacc/routine-level-of-parallelism-2.c | 2 -
.../c-c++-common/goacc/routine-nohost-1.c | 4 +-
gcc/testsuite/c-c++-common/unroll-1.c | 8 +-
gcc/testsuite/c-c++-common/unroll-4.c | 4 +-
.../g++.dg/goacc/omp_data_optimize-1.C | 169 +++
.../gcc.dg/goacc/graphite-parameter-1.c | 21 +
.../gcc.dg/goacc/graphite-parameter-2.c | 23 +
.../gcc.dg/goacc/loop-processing-1.c | 7 +-
.../gcc.dg/goacc/nested-function-1.c | 3 +-
gcc/testsuite/gcc.dg/graphite/alias-1.c | 22 +
gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c | 6 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-2.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-3.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-4.c | 6 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-5.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c | 6 +-
gcc/testsuite/gcc.dg/tree-ssa/cunroll-1.c | 6 +-
gcc/testsuite/gcc.dg/tree-ssa/cunroll-3.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/cunroll-9.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/loop-38.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/pr21463.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/pr45427.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/pr61743-1.c | 2 +-
gcc/testsuite/gcc.dg/unroll-2.c | 2 +-
gcc/testsuite/gcc.dg/unroll-3.c | 4 +-
gcc/testsuite/gcc.dg/unroll-4.c | 4 +-
gcc/testsuite/gcc.dg/unroll-5.c | 4 +-
gcc/testsuite/gcc.dg/vect/bb-slp-59.c | 2 +-
gcc/testsuite/gcc.dg/vect/vect-profile-1.c | 2 +-
gcc/testsuite/gfortran.dg/assumed_type_2.f90 | 6 +-
...assify-kernels-unparallelized-parloops.f95 | 44 +
.../goacc/classify-kernels-unparallelized.f95 | 26 +-
.../gfortran.dg/goacc/classify-kernels.f95 | 26 +-
.../gfortran.dg/goacc/classify-parallel.f95 | 6 +-
.../gfortran.dg/goacc/classify-routine.f95 | 8 +-
.../gfortran.dg/goacc/classify-serial.f95 | 11 +-
.../gfortran.dg/goacc/common-block-3.f90 | 14 +-
.../gfortran.dg/goacc/gang-static.f95 | 14 +-
.../gfortran.dg/goacc/kernels-conversion.f95 | 52 +
.../goacc/kernels-decompose-1-parloops.f95 | 121 ++
.../gfortran.dg/goacc/kernels-decompose-1.f95 | 183 ++-
.../gfortran.dg/goacc/kernels-decompose-2.f95 | 112 +-
.../goacc/kernels-decompose-parloops-2.f95 | 154 +++
.../gfortran.dg/goacc/kernels-loop-2.f95 | 13 +-
.../gfortran.dg/goacc/kernels-loop-data-2.f95 | 13 +-
.../goacc/kernels-loop-data-parloops-2.f95 | 52 +
.../gfortran.dg/goacc/kernels-loop-inner.f95 | 6 +-
.../goacc/kernels-loop-parloops-2.f95 | 45 +
.../goacc/kernels-loop-parloops.f95 | 39 +
.../gfortran.dg/goacc/kernels-loop.f95 | 12 +-
.../gfortran.dg/goacc/kernels-reductions.f90 | 37 +
.../gfortran.dg/goacc/kernels-tree.f95 | 2 +-
.../gfortran.dg/goacc/loop-2-kernels.f95 | 22 +-
.../goacc/loop-auto-transfer-2.f90 | 45 +
.../goacc/loop-auto-transfer-3.f90 | 95 ++
.../goacc/loop-auto-transfer-4.f90 | 293 +++++
.../gfortran.dg/goacc/nested-function-1.f90 | 2 +
.../goacc/nested-reductions-2-parallel.f90 | 177 +++
.../gfortran.dg/goacc/omp_data_optimize-1.f90 | 588 ++++++++++
gcc/testsuite/gfortran.dg/goacc/pr72741.f90 | 8 +-
.../goacc/private-explicit-kernels-1.f95 | 13 +-
.../goacc/private-predetermined-kernels-1.f95 | 16 +-
.../goacc/routine-module-mod-1.f90 | 2 +-
gcc/testsuite/gfortran.dg/graphite/block-2.f | 9 +-
.../gfortran.dg/graphite/block-3.f90 | 1 -
.../gfortran.dg/graphite/block-4.f90 | 1 -
gcc/testsuite/gfortran.dg/graphite/id-9.f | 2 +-
.../gfortran.dg/inline_matmul_24.f90 | 2 +-
gcc/testsuite/gfortran.dg/no_arg_check_2.f90 | 6 +-
gcc/testsuite/gfortran.dg/pr32921.f | 2 +-
gcc/testsuite/gfortran.dg/reassoc_4.f | 2 +-
gcc/tree-chrec.c | 3 +
gcc/tree-data-ref.c | 107 +-
gcc/tree-data-ref.h | 3 +
gcc/tree-loop-distribution.c | 87 --
gcc/tree-parloops.c | 18 +-
gcc/tree-pass.h | 3 +
gcc/tree-pretty-print.c | 11 +
gcc/tree-pretty-print.h | 1 +
gcc/tree-scalar-evolution.c | 179 ++-
gcc/tree-scalar-evolution.h | 3 +
gcc/tree-ssa-dce.c | 14 +
gcc/tree-ssa-loop-im.c | 58 +-
gcc/tree-ssa-loop-ivcanon.c | 2 +
gcc/tree-ssa-loop-manip.h | 2 +-
gcc/tree-ssa-loop-niter.c | 6 +
gcc/tree-ssa-loop.c | 110 ++
gcc/tree-ssa-phiprop.c | 2 +
gcc/tree-ssa-pre.c | 17 +
.../acc_prof-kernels-1.c | 19 +-
.../kernels-decompose-1.c | 7 +-
.../libgomp.oacc-c-c++-common/parallel-dims.c | 34 +-
.../libgomp.oacc-c-c++-common/pr84955-1.c | 1 -
.../libgomp.oacc-c-c++-common/pr85381-2.c | 8 +-
.../libgomp.oacc-c-c++-common/pr85381-3.c | 3 -
.../libgomp.oacc-c-c++-common/pr85381-4.c | 4 +-
.../libgomp.oacc-c-c++-common/pr85486-2.c | 2 +-
.../libgomp.oacc-c-c++-common/pr85486-3.c | 2 +-
.../libgomp.oacc-c-c++-common/pr85486.c | 2 +-
.../runtime-alias-check-1.c | 79 ++
.../runtime-alias-check-2.c | 90 ++
.../vector-length-128-1.c | 3 +-
.../vector-length-128-2.c | 3 +-
.../vector-length-128-3.c | 3 +-
.../vector-length-128-4.c | 3 +-
.../vector-length-128-5.c | 3 +-
.../vector-length-128-6.c | 3 +-
.../vector-length-128-7.c | 3 +-
.../gangprivate-attrib-1.f90 | 5 +-
.../gangprivate-attrib-2.f90 | 3 +-
.../kernels-acc-loop-reduction-2.f90 | 12 +-
.../kernels-independent.f90 | 1 +
.../libgomp.oacc-fortran/kernels-loop-1.f90 | 1 +
.../libgomp.oacc-fortran/pr94358-1.f90 | 7 +-
201 files changed, 9403 insertions(+), 1524 deletions(-)
create mode 100644 gcc/graphite-oacc.c
create mode 100644 gcc/graphite-oacc.h
create mode 100644 gcc/omp-data-optimize.cc
create mode 100644 gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized-graphite.c
rename gcc/testsuite/c-c++-common/goacc/{classify-kernels-unparallelized.c => classify-kernels-unparallelized-parloops.c} (84%)
create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-debug-optimization.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-no-loops.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-no-optimization.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-decompose-1-parloops.c
rename gcc/testsuite/c-c++-common/goacc/{kernels-reduction.c => kernels-reduction-parloops.c} (100%)
create mode 100644 gcc/testsuite/c-c++-common/goacc/loop-auto-reductions.c
delete mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-conditional-loop-independent_seq.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-1-kernels-loop-auto-parloops.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops-1.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops-parloops.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/omp_data_optimize-1.c
create mode 100644 gcc/testsuite/g++.dg/goacc/omp_data_optimize-1.C
create mode 100644 gcc/testsuite/gcc.dg/goacc/graphite-parameter-1.c
create mode 100644 gcc/testsuite/gcc.dg/goacc/graphite-parameter-2.c
create mode 100644 gcc/testsuite/gcc.dg/graphite/alias-1.c
create mode 100644 gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95
create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-conversion.f95
create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-decompose-1-parloops.f95
create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-decompose-parloops-2.f95
create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-parloops-2.f95
create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-parloops-2.f95
create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-parloops.f95
create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-reductions.f90
create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-transfer-2.f90
create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-transfer-3.f90
create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-transfer-4.f90
create mode 100644 gcc/testsuite/gfortran.dg/goacc/omp_data_optimize-1.f90
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-1.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-2.c
--
2.33.0