public inbox for fortran@gcc.gnu.org
* [PATCH 00/40] OpenACC "kernels" Improvements
@ 2021-12-15 15:54 Frederik Harwath
  2021-12-15 15:54 ` [PATCH 03/40] Kernels loops annotation: Fortran Frederik Harwath
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC
  To: gcc-patches; +Cc: rguenther, fortran, matz, Catherine_Moore

Hi,
this patch series implements the rework of the OpenACC "kernels"
implementation that was announced at the GNU Tools Track of this
year's Linux Plumbers Conference; see
https://linuxplumbersconf.org/event/11/contributions/998/.  Versions
of the patches have also been committed to the devel/omp/gcc-11 branch
recently.

The patch series contains middle-end changes that modify the "kernels"
loop handling to use Graphite for dependence analysis of loops in
"kernels" regions, as well as new optimizations and adjustments to
existing optimizations to support this analysis. A central step is
contained in the commit titled "openacc: Use Graphite for dependence
analysis in \"kernels\" regions" whose commit message also contains
further explanations. There are also front end changes (cf. the
patches by Sandra Loosemore) that prepare the loops in "kernels"
regions for the middle-end processing and which lift various
restrictions on "kernels" regions.  I have included some dependencies
(the patches by Julian Brown) from the devel/omp/gcc-11 branch, which
will be re-submitted independently for review.
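
To illustrate the kind of distinction a dependence analysis of
"kernels" loops has to make, here is a minimal sketch.  It is not taken
from the patches; the function names vec_add and prefix_sum are made up
for illustration, and how the compiler actually treats each loop is an
assumption, not something stated in this message.  When built without
-fopenacc, the pragmas are ignored and both functions run sequentially.

```c
/* Hypothetical example: each iteration writes a distinct a[i] and only
   reads b[i] and c[i].  Provided the arrays do not overlap, there is no
   loop-carried dependence, so the iterations could run in parallel.  */
void vec_add(int n, float *a, const float *b, const float *c)
{
#pragma acc kernels copyout(a[0:n]) copyin(b[0:n], c[0:n])
  {
    for (int i = 0; i < n; i++)
      a[i] = b[i] + c[i];
  }
}

/* Hypothetical example: iteration i reads a[i - 1], which iteration
   i - 1 has just written.  This loop-carried dependence means the
   iterations cannot simply be distributed across parallel threads.  */
void prefix_sum(int n, float *a)
{
#pragma acc kernels copy(a[0:n])
  {
    for (int i = 1; i < n; i++)
      a[i] = a[i] + a[i - 1];
  }
}
```

Whether a and b in vec_add overlap cannot be known at compile time in
general, which is the kind of question a runtime alias check answers.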

I have bootstrapped the compiler on x86_64-linux-gnu and performed
comprehensive testing on a powerpc64le-linux-gnu target.  The patches
should apply cleanly on commit r12-4865 of the master branch.

I am aware that we cannot incorporate these patches into GCC at the
current development stage. I hope that we can discuss some of the
changes before they can be considered for inclusion in GCC during the
next stage 1.

Best regards,
Frederik


Andrew Stubbs (2):
  openacc: Add data optimization pass
  openacc: Add runtime alias checking for OpenACC kernels

Frederik Harwath (20):
  Fortran: Delinearize array accesses
  openacc: Move pass_oacc_device_lower after pass_graphite
  graphite: Extend SCoP detection dump output
  graphite: Rename isl_id_for_ssa_name
  graphite: Fix minor mistakes in comments
  Move compute_alias_check_pairs to tree-data-ref.c
  graphite: Add runtime alias checking
  openacc: Use Graphite for dependence analysis in "kernels" regions
  openacc: Add "can_be_parallel" flag info to "graph" dumps
  openacc: Remove unused partitioning in "kernels" regions
  Add function for printing a single OMP_CLAUSE
  openacc: Warn about "independent" "kernels" loops with
    data-dependences
  openacc: Handle internal function calls in pass_lim
  openacc: Disable pass_pre on outlined functions analyzed by Graphite
  graphite: Tune parameters for OpenACC use
  graphite: Adjust scop loop-nest choice
  graphite: Accept loops without data references
  openacc: Enable reduction variable localization for "kernels"
  openacc: Check type for references in reduction lowering
  openacc: Adjust testsuite to new "kernels" handling

Julian Brown (4):
  Reference reduction localization
  Fix tree check failure with reduction localization
  Use more appropriate var in localize_reductions call
  Handle references in OpenACC "private" clauses

Sandra Loosemore (12):
  Kernels loops annotation: C and C++.
  Add -fno-openacc-kernels-annotate-loops option to more testcases.
  Kernels loops annotation: Fortran.
  Additional Fortran testsuite fixes for kernels loops annotation pass.
  Fix bug in processing of array dimensions in data clauses.
  Add a "combined" flag for "acc kernels loop" etc directives.
  Annotate inner loops in "acc kernels loop" directives (C/C++).
  Annotate inner loops in "acc kernels loop" directives (Fortran).
  Permit calls to builtins and intrinsics in kernels loops.
  Fix patterns in Fortran tests for kernels loop annotation.
  Clean up loop variable extraction in OpenACC kernels loop annotation.
  Relax some restrictions on the loop bound in kernels loop annotation.

Tobias Burnus (2):
  Fix for is_gimple_reg vars to 'data kernels'
  openacc: fix privatization of by-reference arrays

 gcc/Makefile.in                               |   2 +
 gcc/c-family/c-common.h                       |   1 +
 gcc/c-family/c-omp.c                          | 915 +++++++++++++++--
 gcc/c-family/c.opt                            |   8 +
 gcc/c/c-decl.c                                |  28 +
 gcc/c/c-parser.c                              |   3 +
 gcc/cfgloop.c                                 |   1 +
 gcc/cfgloop.h                                 |   6 +
 gcc/cfgloopmanip.c                            |   1 +
 gcc/common.opt                                |   9 +
 gcc/config/nvptx/nvptx.c                      |   7 +
 gcc/cp/decl.c                                 |  44 +
 gcc/cp/parser.c                               |   3 +
 gcc/cp/semantics.c                            |   9 +
 gcc/doc/gimple.texi                           |   2 +
 gcc/doc/invoke.texi                           |  52 +-
 gcc/doc/passes.texi                           |   6 +-
 gcc/expr.c                                    |   1 +
 gcc/flag-types.h                              |   1 +
 gcc/fortran/gfortran.h                        |   1 +
 gcc/fortran/lang.opt                          |  12 +
 gcc/fortran/openmp.c                          | 415 ++++++++
 gcc/fortran/parse.c                           |   9 +
 gcc/fortran/trans-array.c                     | 321 ++++--
 gcc/fortran/trans-openmp.c                    |  34 +-
 gcc/gimple-loop-interchange.cc                |   2 +-
 gcc/gimple-pretty-print.c                     |   3 +
 gcc/gimple-walk.c                             |  15 +-
 gcc/gimple-walk.h                             |   6 +
 gcc/gimple.h                                  |   5 +
 gcc/gimplify.c                                | 117 +++
 gcc/graph.c                                   |  35 +-
 gcc/graphite-dependences.c                    | 220 ++--
 gcc/graphite-isl-ast-to-gimple.c              | 271 ++++-
 gcc/graphite-oacc.c                           | 688 +++++++++++++
 gcc/graphite-oacc.h                           |  55 +
 gcc/graphite-optimize-isl.c                   |  42 +-
 gcc/graphite-poly.c                           |  41 +-
 gcc/graphite-scop-detection.c                 | 651 ++++++++++--
 gcc/graphite-sese-to-poly.c                   |  90 +-
 gcc/graphite.c                                | 120 ++-
 gcc/graphite.h                                |  40 +-
 gcc/internal-fn.c                             |   4 +
 gcc/internal-fn.h                             |   4 +-
 gcc/omp-data-optimize.cc                      | 951 ++++++++++++++++++
 gcc/omp-expand.c                              | 102 +-
 gcc/omp-general.c                             |  23 +-
 gcc/omp-general.h                             |   1 +
 gcc/omp-low.c                                 | 439 ++++++--
 gcc/omp-oacc-kernels-decompose.cc             | 154 ++-
 gcc/omp-oacc-neuter-broadcast.cc              |   2 +
 gcc/omp-offload.c                             | 830 ++++++++++++---
 gcc/omp-offload.h                             |   2 +
 gcc/params.opt                                |   7 +-
 gcc/passes.c                                  |  42 +
 gcc/passes.def                                |  47 +-
 gcc/sese.c                                    |  25 +-
 gcc/sese.h                                    |  19 +
 .../c-c++-common/goacc-gomp/nesting-1.c       |  10 +-
 gcc/testsuite/c-c++-common/goacc/cache-3-1.c  |   2 +-
 .../goacc/classify-kernels-unparallelized.c   |  35 +-
 .../c-c++-common/goacc/classify-kernels.c     |  24 +-
 .../c-c++-common/goacc/classify-parallel.c    |   8 +-
 .../goacc/classify-routine-nohost.c           |  20 +-
 .../c-c++-common/goacc/classify-routine.c     |  22 +-
 .../c-c++-common/goacc/classify-serial.c      |   8 +-
 .../c-c++-common/goacc/combined-directives.c  |   2 +-
 .../device-lowering-debug-optimization.c      |  29 +
 .../goacc/device-lowering-no-loops.c          |  17 +
 .../goacc/device-lowering-no-optimization.c   |  30 +
 .../c-c++-common/goacc/if-clause-2.c          |   2 +-
 gcc/testsuite/c-c++-common/goacc/kernels-1.c  |  17 +-
 .../kernels-counter-var-redundant-load.c      |  19 +-
 .../kernels-counter-vars-function-scope.c     |  10 +-
 .../c-c++-common/goacc/kernels-decompose-1.c  |  31 +-
 .../c-c++-common/goacc/kernels-decompose-2.c  |  57 +-
 .../goacc/kernels-decompose-ice-1.c           |   7 +-
 .../goacc/kernels-decompose-ice-2.c           |   3 +-
 .../goacc/kernels-double-reduction-n.c        |   6 +-
 .../goacc/kernels-double-reduction.c          |   5 +-
 .../c-c++-common/goacc/kernels-loop-2.c       |  19 +-
 .../c-c++-common/goacc/kernels-loop-3.c       |   3 +
 .../goacc/kernels-loop-annotation-1.c         |  26 +
 .../goacc/kernels-loop-annotation-10.c        |  32 +
 .../goacc/kernels-loop-annotation-11.c        |  27 +
 .../goacc/kernels-loop-annotation-12.c        |  28 +
 .../goacc/kernels-loop-annotation-13.c        |  27 +
 .../goacc/kernels-loop-annotation-14.c        |  22 +
 .../goacc/kernels-loop-annotation-15.c        |  22 +
 .../goacc/kernels-loop-annotation-16.c        |  26 +
 .../goacc/kernels-loop-annotation-17.c        |  26 +
 .../goacc/kernels-loop-annotation-18.c        |  18 +
 .../goacc/kernels-loop-annotation-19.c        |  19 +
 .../goacc/kernels-loop-annotation-2.c         |  21 +
 .../goacc/kernels-loop-annotation-20.c        |  23 +
 .../goacc/kernels-loop-annotation-21.c        |  42 +
 .../goacc/kernels-loop-annotation-22.c        |  41 +
 .../goacc/kernels-loop-annotation-3.c         |  24 +
 .../goacc/kernels-loop-annotation-4.c         |  34 +
 .../goacc/kernels-loop-annotation-5.c         |  27 +
 .../goacc/kernels-loop-annotation-6.c         |  27 +
 .../goacc/kernels-loop-annotation-7.c         |  26 +
 .../goacc/kernels-loop-annotation-8.c         |  27 +
 .../goacc/kernels-loop-annotation-9.c         |  26 +
 .../c-c++-common/goacc/kernels-loop-data-2.c  |  17 +-
 .../goacc/kernels-loop-data-enter-exit-2.c    |  16 +-
 .../goacc/kernels-loop-data-enter-exit.c      |  17 +-
 .../goacc/kernels-loop-data-update.c          |  13 +-
 .../c-c++-common/goacc/kernels-loop-data.c    |  12 +-
 .../c-c++-common/goacc/kernels-loop-g.c       |  14 +-
 .../goacc/kernels-loop-mod-not-zero.c         |  10 +-
 .../c-c++-common/goacc/kernels-loop-n.c       |  10 +-
 .../c-c++-common/goacc/kernels-loop-nest.c    |  12 +-
 .../c-c++-common/goacc/kernels-loop.c         |  10 +-
 .../goacc/kernels-one-counter-var.c           |  12 +-
 .../kernels-parallel-loop-data-enter-exit.c   |  17 +-
 .../c-c++-common/goacc/kernels-reduction.c    |  10 +-
 .../c-c++-common/goacc/loop-2-kernels.c       |   6 +-
 .../c-c++-common/goacc/loop-auto-1.c          | 127 +--
 .../c-c++-common/goacc/loop-auto-2.c          |  37 +-
 .../c-c++-common/goacc/loop-auto-reductions.c |  22 +
 .../goacc/nested-reductions-2-parallel.c      | 138 +++
 .../goacc/note-parallelism-kernels-loops-1.c  |  61 ++
 .../note-parallelism-kernels-loops-parloops.c |  53 +
 .../c-c++-common/goacc/omp_data_optimize-1.c  | 677 +++++++++++++
 .../c-c++-common/goacc/routine-nohost-1.c     |   8 +-
 gcc/testsuite/c-c++-common/unroll-1.c         |   8 +-
 gcc/testsuite/c-c++-common/unroll-4.c         |   4 +-
 .../g++.dg/goacc/omp_data_optimize-1.C        | 169 ++++
 gcc/testsuite/g++.dg/goacc/template.C         |  18 +-
 .../gcc.dg/goacc/graphite-parameter-1.c       |  21 +
 .../gcc.dg/goacc/graphite-parameter-2.c       |  23 +
 .../gcc.dg/goacc/loop-processing-1.c          |   8 +-
 .../gcc.dg/goacc/nested-function-1.c          |   3 +-
 gcc/testsuite/gcc.dg/graphite/alias-1.c       |  22 +
 gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c    |   6 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-2.c    |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-3.c    |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-4.c    |   6 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-5.c    |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c    |   6 +-
 gcc/testsuite/gcc.dg/tree-ssa/cunroll-1.c     |   6 +-
 gcc/testsuite/gcc.dg/tree-ssa/cunroll-3.c     |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/cunroll-9.c     |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c      |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/loop-38.c       |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21463.c       |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr45427.c       |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr59597.c       |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61743-1.c     |   2 +-
 gcc/testsuite/gcc.dg/unroll-2.c               |   2 +-
 gcc/testsuite/gcc.dg/unroll-3.c               |   4 +-
 gcc/testsuite/gcc.dg/unroll-4.c               |   4 +-
 gcc/testsuite/gcc.dg/unroll-5.c               |   4 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-59.c         |   2 +-
 gcc/testsuite/gcc.dg/vect/vect-profile-1.c    |   2 +-
 gcc/testsuite/gfortran.dg/assumed_type_2.f90  |   6 +-
 .../gfortran.dg/directive_unroll_1.f90        |   2 +-
 .../gfortran.dg/directive_unroll_4.f90        |   2 +-
 ...assify-kernels-unparallelized-parloops.f95 |  44 +
 .../goacc/classify-kernels-unparallelized.f95 |  27 +-
 .../gfortran.dg/goacc/classify-kernels.f95    |  21 +-
 .../gfortran.dg/goacc/classify-parallel.f95   |   6 +-
 .../goacc/classify-routine-nohost.f95         |  18 +-
 .../gfortran.dg/goacc/classify-routine.f95    |  20 +-
 .../gfortran.dg/goacc/classify-serial.f95     |   8 +-
 .../gfortran.dg/goacc/combined-directives.f90 |  19 +-
 .../gfortran.dg/goacc/common-block-3.f90      |  17 +-
 .../gfortran.dg/goacc/gang-static.f95         |  14 +-
 .../gfortran.dg/goacc/kernels-conversion.f95  |  52 +
 .../gfortran.dg/goacc/kernels-decompose-1.f95 | 186 ++--
 .../gfortran.dg/goacc/kernels-decompose-2.f95 | 113 ++-
 .../gfortran.dg/goacc/kernels-loop-2.f95      |  10 +-
 .../goacc/kernels-loop-annotation-1.f95       |  33 +
 .../goacc/kernels-loop-annotation-10.f95      |  32 +
 .../goacc/kernels-loop-annotation-11.f95      |  34 +
 .../goacc/kernels-loop-annotation-12.f95      |  39 +
 .../goacc/kernels-loop-annotation-13.f95      |  38 +
 .../goacc/kernels-loop-annotation-14.f95      |  35 +
 .../goacc/kernels-loop-annotation-15.f95      |  35 +
 .../goacc/kernels-loop-annotation-16.f95      |  34 +
 .../goacc/kernels-loop-annotation-18.f95      |  28 +
 .../goacc/kernels-loop-annotation-19.f95      |  29 +
 .../goacc/kernels-loop-annotation-2.f95       |  32 +
 .../goacc/kernels-loop-annotation-20.f95      |  26 +
 .../goacc/kernels-loop-annotation-3.f95       |  33 +
 .../goacc/kernels-loop-annotation-4.f95       |  34 +
 .../goacc/kernels-loop-annotation-5.f95       |  35 +
 .../goacc/kernels-loop-annotation-6.f95       |  34 +
 .../goacc/kernels-loop-annotation-7.f95       |  48 +
 .../goacc/kernels-loop-annotation-8.f95       |  50 +
 .../goacc/kernels-loop-annotation-9.f95       |  34 +
 .../gfortran.dg/goacc/kernels-loop-data-2.f95 |  10 +-
 .../goacc/kernels-loop-data-enter-exit-2.f95  |  12 +-
 .../goacc/kernels-loop-data-enter-exit.f95    |  12 +-
 .../goacc/kernels-loop-data-update.f95        |  12 +-
 .../gfortran.dg/goacc/kernels-loop-data.f95   |  14 +-
 .../gfortran.dg/goacc/kernels-loop-inner.f95  |   6 +-
 .../gfortran.dg/goacc/kernels-loop-n.f95      |  13 +-
 .../gfortran.dg/goacc/kernels-loop.f95        |   9 +-
 .../kernels-parallel-loop-data-enter-exit.f95 |  12 +-
 .../gfortran.dg/goacc/kernels-reductions.f90  |  37 +
 .../gfortran.dg/goacc/kernels-tree.f95        |   2 +-
 .../gfortran.dg/goacc/loop-2-kernels.f95      |   6 +-
 .../goacc/loop-auto-transfer-2.f90            |  45 +
 .../goacc/loop-auto-transfer-3.f90            |  95 ++
 .../goacc/loop-auto-transfer-4.f90            | 293 ++++++
 .../gfortran.dg/goacc/nested-function-1.f90   |  12 +-
 .../goacc/nested-reductions-2-parallel.f90    | 177 ++++
 .../gfortran.dg/goacc/omp_data_optimize-1.f90 | 588 +++++++++++
 .../goacc/private-explicit-kernels-1.f95      |  20 +-
 .../goacc/private-predetermined-kernels-1.f95 |  23 +-
 .../goacc/privatization-1-compute-loop.f90    |   3 -
 .../goacc/routine-module-mod-1.f90            |   4 +-
 .../goacc/routine-multiple-directives-1.f90   |  32 +-
 .../gfortran.dg/gomp/affinity-clause-1.f90    |   2 +-
 gcc/testsuite/gfortran.dg/graphite/block-2.f  |   9 +-
 .../gfortran.dg/graphite/block-3.f90          |   2 +-
 .../gfortran.dg/graphite/block-4.f90          |   2 +-
 gcc/testsuite/gfortran.dg/graphite/id-9.f     |   2 +-
 .../gfortran.dg/inline_matmul_16.f90          |   2 +
 .../gfortran.dg/inline_matmul_24.f90          |   2 +-
 gcc/testsuite/gfortran.dg/no_arg_check_2.f90  |   6 +-
 gcc/testsuite/gfortran.dg/pr32921.f           |   2 +-
 gcc/testsuite/gfortran.dg/reassoc_4.f         |   2 +-
 .../gfortran.dg/vect/fast-math-mgrid-resid.f  |   1 +
 gcc/tree-chrec.c                              |   3 +
 gcc/tree-core.h                               |   4 +-
 gcc/tree-data-ref.c                           | 107 +-
 gcc/tree-data-ref.h                           |   3 +
 gcc/tree-loop-distribution.c                  |  87 --
 gcc/tree-parloops.c                           |  18 +-
 gcc/tree-pass.h                               |   3 +
 gcc/tree-pretty-print.c                       |  11 +
 gcc/tree-pretty-print.h                       |   1 +
 gcc/tree-scalar-evolution.c                   | 177 +++-
 gcc/tree-scalar-evolution.h                   |   3 +
 gcc/tree-ssa-dce.c                            |  23 +
 gcc/tree-ssa-loop-im.c                        |  57 +-
 gcc/tree-ssa-loop-ivcanon.c                   |   2 +
 gcc/tree-ssa-loop-manip.h                     |   2 +-
 gcc/tree-ssa-loop-niter.c                     |   6 +
 gcc/tree-ssa-loop.c                           | 110 ++
 gcc/tree-ssa-phiprop.c                        |   2 +
 gcc/tree-ssa-pre.c                            |  17 +
 gcc/tree.c                                    | 137 ++-
 gcc/tree.h                                    |   7 +
 .../libgomp.oacc-c++/privatized-ref-2.C       |  64 ++
 .../libgomp.oacc-c++/privatized-ref-3.C       |  64 ++
 .../acc_prof-kernels-1.c                      |  22 +-
 .../declare-vla-kernels-decompose-ice-1.c     |   4 -
 .../kernels-decompose-1.c                     |  10 +-
 .../kernels-private-vars-local-worker-1.c     |   6 +-
 .../kernels-private-vars-local-worker-2.c     |   6 +-
 .../kernels-private-vars-local-worker-3.c     |   6 +-
 .../kernels-private-vars-local-worker-4.c     |   8 +-
 .../kernels-private-vars-local-worker-5.c     |   6 +-
 .../kernels-private-vars-loop-gang-1.c        |   4 +-
 .../kernels-private-vars-loop-gang-2.c        |   4 +-
 .../kernels-private-vars-loop-gang-3.c        |   4 +-
 .../kernels-private-vars-loop-gang-4.c        |  15 +-
 .../kernels-private-vars-loop-gang-5.c        |  10 +-
 .../kernels-private-vars-loop-gang-6.c        |   4 +-
 .../kernels-private-vars-loop-vector-1.c      |   6 +-
 .../kernels-private-vars-loop-vector-2.c      |   6 +-
 .../kernels-private-vars-loop-worker-1.c      |   8 +-
 .../kernels-private-vars-loop-worker-2.c      |   6 +-
 .../kernels-private-vars-loop-worker-3.c      |   6 +-
 .../kernels-private-vars-loop-worker-4.c      |   6 +-
 .../kernels-private-vars-loop-worker-5.c      |   9 +-
 .../kernels-private-vars-loop-worker-6.c      |   6 +-
 .../kernels-private-vars-loop-worker-7.c      |   6 +-
 .../libgomp.oacc-c-c++-common/loop-auto-1.c   |  30 +-
 .../libgomp.oacc-c-c++-common/parallel-dims.c |  39 +-
 .../libgomp.oacc-c-c++-common/pr84955-1.c     |   1 -
 .../libgomp.oacc-c-c++-common/pr85381-2.c     |   8 +-
 .../libgomp.oacc-c-c++-common/pr85381-3.c     |   8 +-
 .../libgomp.oacc-c-c++-common/pr85381-4.c     |   4 +-
 .../libgomp.oacc-c-c++-common/pr85486-2.c     |   4 +-
 .../libgomp.oacc-c-c++-common/pr85486-3.c     |   4 +-
 .../libgomp.oacc-c-c++-common/pr85486.c       |   4 +-
 .../routine-nohost-1.c                        |   6 +-
 .../runtime-alias-check-1.c                   |  79 ++
 .../runtime-alias-check-2.c                   |  90 ++
 .../vector-length-128-1.c                     |   5 +-
 .../vector-length-128-2.c                     |   5 +-
 .../vector-length-128-3.c                     |   5 +-
 .../vector-length-128-4.c                     |   5 +-
 .../vector-length-128-5.c                     |   5 +-
 .../vector-length-128-6.c                     |   5 +-
 .../vector-length-128-7.c                     |   5 +-
 .../testsuite/libgomp.oacc-fortran/if-1.f90   |  32 +-
 .../kernels-acc-loop-reduction-2.f90          |  12 +-
 .../kernels-independent.f90                   |   1 +
 .../libgomp.oacc-fortran/kernels-loop-1.f90   |   1 +
 .../kernels-private-vars-loop-gang-1.f90      |   4 +-
 .../kernels-private-vars-loop-gang-2.f90      |   4 +-
 .../kernels-private-vars-loop-gang-3.f90      |   4 +-
 .../kernels-private-vars-loop-gang-6.f90      |   5 +-
 .../kernels-private-vars-loop-vector-1.f90    |   4 +-
 .../kernels-private-vars-loop-vector-2.f90    |  11 +-
 .../kernels-private-vars-loop-worker-1.f90    |   6 +-
 .../kernels-private-vars-loop-worker-2.f90    |   4 +-
 .../kernels-private-vars-loop-worker-3.f90    |   4 +-
 .../kernels-private-vars-loop-worker-4.f90    |   4 +-
 .../kernels-private-vars-loop-worker-5.f90    |   7 +-
 .../kernels-private-vars-loop-worker-6.f90    |   4 +-
 .../kernels-private-vars-loop-worker-7.f90    |   6 +-
 .../libgomp.oacc-fortran/optional-private.f90 |   2 -
 .../libgomp.oacc-fortran/pr94358-1.f90        |   7 +-
 .../libgomp.oacc-fortran/privatized-ref-1.f95 |  71 ++
 .../libgomp.oacc-fortran/routine-nohost-1.f90 |   4 +-
 313 files changed, 12131 insertions(+), 1729 deletions(-)
 create mode 100644 gcc/graphite-oacc.c
 create mode 100644 gcc/graphite-oacc.h
 create mode 100644 gcc/omp-data-optimize.cc
 create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-debug-optimization.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-no-loops.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-no-optimization.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/loop-auto-reductions.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops-parloops.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/omp_data_optimize-1.c
 create mode 100644 gcc/testsuite/g++.dg/goacc/omp_data_optimize-1.C
 create mode 100644 gcc/testsuite/gcc.dg/goacc/graphite-parameter-1.c
 create mode 100644 gcc/testsuite/gcc.dg/goacc/graphite-parameter-2.c
 create mode 100644 gcc/testsuite/gcc.dg/graphite/alias-1.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-conversion.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-reductions.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-transfer-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-transfer-3.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/loop-auto-transfer-4.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/omp_data_optimize-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-2.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-3.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/runtime-alias-check-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-1.f95

--
2.33.0

-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955


* [PATCH 03/40] Kernels loops annotation: Fortran.
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
@ 2021-12-15 15:54 ` Frederik Harwath
  2021-12-15 15:54 ` [PATCH 04/40] Additional Fortran testsuite fixes for kernels loops annotation pass Frederik Harwath
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC
  To: gcc-patches; +Cc: Sandra Loosemore, thomas, tobias, fortran, Gergö Barany

From: Sandra Loosemore <sandra@codesourcery.com>

This patch implements the Fortran support for adding "#pragma acc loop auto"
annotations to loops in OpenACC kernels regions.  It implements the same
-fopenacc-kernels-annotate-loops and -Wopenacc-kernels-annotate-loops options
that were previously added (and documented) for the C/C++ front ends.
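
As a rough illustration of what such an annotation corresponds to at
the source level, consider the following sketch.  It is written in C
because the same options exist for the C/C++ front ends; in Fortran the
directive is spelled "!$acc loop auto" and is placed before the DO
loop.  The function names are hypothetical, and which loops the pass
actually annotates is not specified in this message, so treat the
second function only as a hand-written equivalent under that
assumption.

```c
/* Hypothetical example: as the user wrote it, a plain loop inside a
   kernels region without any loop directive.  */
void scale_plain(int n, float *x, float s)
{
#pragma acc kernels copy(x[0:n])
  {
    for (int i = 0; i < n; i++)
      x[i] = s * x[i];
  }
}

/* Hypothetical example: the same loop with the annotation spelled out
   by hand.  "auto" leaves it to the compiler to decide whether and how
   the loop is parallelized.  */
void scale_annotated(int n, float *x, float s)
{
#pragma acc kernels copy(x[0:n])
  {
#pragma acc loop auto
    for (int i = 0; i < n; i++)
      x[i] = s * x[i];
  }
}
```

Both functions compute the same result; the annotation only changes
what the compiler is allowed to consider when scheduling the loop.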

Co-Authored-By: Gergö Barany <gergo@codesourcery.com>

gcc/fortran/
        * gfortran.h (gfc_oacc_annotate_loops_in_kernels_regions): Declare.
        * lang.opt (Wopenacc-kernels-annotate-loops): New.
        (fopenacc-kernels-annotate-loops): New.
        * openmp.c: Include options.h.
        (enum annotation_state, enum annotation_result): New.
        (check_code_for_invalid_calls): New.
        (check_expr_for_invalid_calls): New.
        (check_for_invalid_calls): New.
        (annotate_do_loop): New.
        (annotate_do_loops_in_kernels): New.
        (compute_goto_targets): New.
        (gfc_oacc_annotate_loops_in_kernels_regions): New.
        * parse.c (gfc_parse_file): Handle -fopenacc-kernels-annotate-loops.

gcc/testsuite/
        * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add
        -fno-openacc-kernels-annotate-loops option.
        * gfortran.dg/goacc/classify-kernels.f95: Likewise.
        * gfortran.dg/goacc/common-block-3.f90: Likewise.
        * gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop.f95: Likewise.
        * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95:
        Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-1.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-2.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-3.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-4.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-5.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-6.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-7.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-8.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-9.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-10.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-11.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-12.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-13.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-14.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-15.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-16.f95: New.
---
 gcc/fortran/gfortran.h                        |   1 +
 gcc/fortran/lang.opt                          |   8 +
 gcc/fortran/openmp.c                          | 364 ++++++++++++++++++
 gcc/fortran/parse.c                           |   9 +
 .../goacc/classify-kernels-unparallelized.f95 |   1 +
 .../gfortran.dg/goacc/classify-kernels.f95    |   1 +
 .../gfortran.dg/goacc/common-block-3.f90      |   1 +
 .../gfortran.dg/goacc/kernels-loop-2.f95      |   1 +
 .../goacc/kernels-loop-annotation-1.f95       |  33 ++
 .../goacc/kernels-loop-annotation-10.f95      |  32 ++
 .../goacc/kernels-loop-annotation-11.f95      |  34 ++
 .../goacc/kernels-loop-annotation-12.f95      |  39 ++
 .../goacc/kernels-loop-annotation-13.f95      |  38 ++
 .../goacc/kernels-loop-annotation-14.f95      |  35 ++
 .../goacc/kernels-loop-annotation-15.f95      |  35 ++
 .../goacc/kernels-loop-annotation-16.f95      |  34 ++
 .../goacc/kernels-loop-annotation-2.f95       |  32 ++
 .../goacc/kernels-loop-annotation-3.f95       |  33 ++
 .../goacc/kernels-loop-annotation-4.f95       |  34 ++
 .../goacc/kernels-loop-annotation-5.f95       |  35 ++
 .../goacc/kernels-loop-annotation-6.f95       |  34 ++
 .../goacc/kernels-loop-annotation-7.f95       |  48 +++
 .../goacc/kernels-loop-annotation-8.f95       |  50 +++
 .../goacc/kernels-loop-annotation-9.f95       |  34 ++
 .../gfortran.dg/goacc/kernels-loop-data-2.f95 |   1 +
 .../goacc/kernels-loop-data-enter-exit-2.f95  |   1 +
 .../goacc/kernels-loop-data-enter-exit.f95    |   1 +
 .../goacc/kernels-loop-data-update.f95        |   1 +
 .../gfortran.dg/goacc/kernels-loop-data.f95   |   1 +
 .../gfortran.dg/goacc/kernels-loop-n.f95      |   1 +
 .../gfortran.dg/goacc/kernels-loop.f95        |   1 +
 .../kernels-parallel-loop-data-enter-exit.f95 |   1 +
 32 files changed, 974 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index f7662c59a5df..50db768ce0fc 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3545,6 +3545,7 @@ void gfc_resolve_oacc_declare (gfc_namespace *);
 void gfc_resolve_oacc_parallel_loop_blocks (gfc_code *, gfc_namespace *);
 void gfc_resolve_oacc_blocks (gfc_code *, gfc_namespace *);
 void gfc_resolve_oacc_routines (gfc_namespace *);
+void gfc_oacc_annotate_loops_in_kernels_regions (gfc_namespace *);

 /* expr.c */
 void gfc_free_actual_arglist (gfc_actual_arglist *);
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 6db01c736be1..a202c04c4a25 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -289,6 +289,10 @@ Wopenacc-parallelism
 Fortran
 ; Documented in C

+Wopenacc-kernels-annotate-loops
+Fortran
+; Documented in C
+
 Wopenmp-simd
 Fortran
 ; Documented in C
@@ -695,6 +699,10 @@ fopenacc-dim=
 Fortran LTO Joined Var(flag_openacc_dims)
 ; Documented in C

+fopenacc-kernels-annotate-loops
+Fortran LTO Optimization
+; Documented in C
+
 fopenmp
 Fortran LTO
 ; Documented in C
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index dcf22ac2c2f3..243b5e0a9ac6 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic.h"
 #include "gomp-constants.h"
 #include "target-memory.h"  /* For gfc_encode_character.  */
+#include "options.h"

 /* Match an end of OpenMP directive.  End of OpenMP directive is optional
    whitespace, followed by '\n' or comment '!'.  */
@@ -9090,3 +9091,366 @@ gfc_resolve_omp_udrs (gfc_symtree *st)
   for (omp_udr = st->n.omp_udr; omp_udr; omp_udr = omp_udr->next)
     gfc_resolve_omp_udr (omp_udr);
 }
+
+
+/* The following functions implement automatic recognition and annotation of
+   DO loops in OpenACC kernels regions.  Inside a kernels region, a nest of
+   DO loops that contains no explicitly annotated OpenACC loops, no EXIT or
+   GOTO statements, no I/O statements, and no calls other than to OpenACC
+   routines gets an automatic "acc loop auto" annotation on each loop.
+   This feature is controlled by flag_openacc_kernels_annotate_loops.  */
+
+
+/* State of the annotation traversal for DO loops in kernels regions.  */
+enum annotation_state {
+  as_outer,
+  as_in_kernels_region,
+  as_in_kernels_loop,
+  as_in_kernels_inner_loop
+};
+
+/* Return status of annotation traversal.  */
+enum annotation_result {
+  ar_ok,
+  ar_invalid_loop,
+  ar_invalid_nest
+};
+
+/* Code walk function for check_for_invalid_calls.  */
+
+static int
+check_code_for_invalid_calls (gfc_code **codep, int *walk_subtrees,
+                             void *data ATTRIBUTE_UNUSED)
+{
+  gfc_code *code = *codep;
+  switch (code->op)
+    {
+    case EXEC_CALL:
+      /* Calls to OpenACC routines are permitted.  */
+      if (code->resolved_sym
+         && (code->resolved_sym->attr.oacc_routine_lop
+             != OACC_ROUTINE_LOP_NONE))
+       return 0;
+      /* Else fall through.  */
+
+    case EXEC_CALL_PPC:
+    case EXEC_ASSIGN_CALL:
+      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+                  "Subroutine call at %L prevents annotation of loop nest",
+                  &code->loc);
+      *walk_subtrees = 0;
+      return 1;
+
+    default:
+      return 0;
+    }
+}
+
+/* Expr walk function for check_for_invalid_calls.  */
+
+static int
+check_expr_for_invalid_calls (gfc_expr **exprp, int *walk_subtrees,
+                             void *data ATTRIBUTE_UNUSED)
+{
+  gfc_expr *expr = *exprp;
+  switch (expr->expr_type)
+    {
+    case EXPR_FUNCTION:
+      if (expr->value.function.esym
+         && (expr->value.function.esym->attr.oacc_routine_lop
+             != OACC_ROUTINE_LOP_NONE))
+       return 0;
+      /* Else fall through.  */
+
+    case EXPR_COMPCALL:
+      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+                  "Function call at %L prevents annotation of loop nest",
+                  &expr->where);
+      *walk_subtrees = 0;
+      return 1;
+
+    default:
+      return 0;
+    }
+}
+
+/* Return TRUE if the DO loop CODE contains function or procedure
+   calls that ought to prohibit annotation.  This traversal is
+   separate from the main annotation tree walk because we need to walk
+   expressions as well as executable statements.  */
+
+static bool
+check_for_invalid_calls (gfc_code *code)
+{
+  gcc_assert (code->op == EXEC_DO);
+  return gfc_code_walker (&code, check_code_for_invalid_calls,
+                         check_expr_for_invalid_calls, NULL);
+}
+
+/* Annotate DO loop CODE with OpenACC "loop auto".  */
+
+static void
+annotate_do_loop (gfc_code *code, gfc_code *parent)
+{
+
+  /* A DO loop's body is another phony DO node whose next pointer starts
+     the actual body.  */
+  gcc_assert (code->op == EXEC_DO);
+  gcc_assert (code->block->op == EXEC_DO);
+
+  /* Build the "acc loop auto" annotation and add the loop as its
+     body.  */
+  gfc_omp_clauses *clauses = gfc_get_omp_clauses ();
+  clauses->par_auto = 1;
+  gfc_code *oacc_loop = gfc_get_code (EXEC_OACC_LOOP);
+  oacc_loop->block = gfc_get_code (EXEC_OACC_LOOP);
+  oacc_loop->block->next = code;
+  oacc_loop->ext.omp_clauses = clauses;
+  oacc_loop->loc = code->loc;
+  oacc_loop->block->loc = code->loc;
+
+  /* Splice the annotation into the place of the original loop.  */
+  if (parent->block == code)
+    parent->block = oacc_loop;
+  else
+    {
+      gfc_code *prev = parent->block;
+      while (prev != code && prev->next != code)
+       {
+         prev = prev->next;
+         gcc_assert (prev != NULL);
+       }
+      prev->next = oacc_loop;
+    }
+  oacc_loop->next = code->next;
+  code->next = NULL;
+}
+
+/* Recursively traverse CODE in block PARENT, finding OpenACC kernels
+   regions.  GOTO_TARGETS keeps track of statement labels that are
+   targets of gotos in the current function, while STATE keeps track
+   of the current context of the traversal.  If the traversal
+   encounters a DO loop inside a kernels region, annotate it with
+   OpenACC loop directives if appropriate.  Return the status of the
+   traversal.  */
+
+static enum annotation_result
+annotate_do_loops_in_kernels (gfc_code *code, gfc_code *parent,
+                             hash_set <gfc_st_label *> *goto_targets,
+                             annotation_state state)
+{
+  gfc_code *next_code = NULL;
+  enum annotation_result retval = ar_ok;
+
+  for ( ; code; code = next_code)
+    {
+      bool walk_block = true;
+      next_code = code->next;
+
+      if (state >= as_in_kernels_loop
+         && code->here && goto_targets->contains (code->here))
+       /* This statement has a label that is the target of a GOTO or some
+          other jump.  Do not try to sort out the details, just reject
+          this loop nest.  */
+       {
+         gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+                      "Possible control transfer to label at %L "
+                      "prevents annotation of loop nest",
+                      &code->loc);
+         return ar_invalid_nest;
+       }
+
+      switch (code->op)
+       {
+       case EXEC_OACC_KERNELS:
+         /* Enter kernels region.  */
+         annotate_do_loops_in_kernels (code->block->next, code,
+                                       goto_targets,
+                                       as_in_kernels_region);
+         walk_block = false;
+         break;
+
+       case EXEC_OACC_PARALLEL_LOOP:
+       case EXEC_OACC_PARALLEL:
+       case EXEC_OACC_KERNELS_LOOP:
+       case EXEC_OACC_LOOP:
+         /* Do not try to add automatic OpenACC annotations inside manually
+            annotated loops.  Presumably, the user avoided doing it on
+            purpose; for example, all available levels of parallelism may
+            have been used up.  */
+         if (state >= as_in_kernels_region)
+           {
+             gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+                          "Explicit loop annotation at %L "
+                          "prevents annotation of loop nest",
+                          &code->loc);
+             return ar_invalid_nest;
+           }
+         walk_block = false;
+         break;
+
+       case EXEC_DO:
+         if (state >= as_in_kernels_region)
+           {
+             /* A DO loop's body is another phony DO node whose next
+                pointer starts the actual body.  Skip the phony node.  */
+             gcc_assert (code->block->op == EXEC_DO);
+             enum annotation_result result
+               = annotate_do_loops_in_kernels (code->block->next, code,
+                                               goto_targets,
+                                               as_in_kernels_loop);
+             /* Check for function/procedure calls in the body of the
+                loop that would prevent parallelization.  Unlike in C/C++,
+                we do not have to check that there is no modification of
+                the loop variable or loop count since they are already
+                handled by the semantics of DO loops in the Fortran
+                language.  */
+             if (result != ar_invalid_nest && check_for_invalid_calls (code))
+               result = ar_invalid_nest;
+             if (result == ar_ok)
+               annotate_do_loop (code, parent);
+             else if (result == ar_invalid_nest
+                      && state >= as_in_kernels_loop)
+               /* The outer loop is invalid, too, so stop traversal.  */
+               return result;
+             walk_block = false;
+           }
+         break;
+
+       case EXEC_DO_WHILE:
+       case EXEC_DO_CONCURRENT:
+         /* Traverse the body in a special state to allow EXIT statements
+            from these loops.  */
+         if (state >= as_in_kernels_loop)
+           {
+             enum annotation_result result
+               = annotate_do_loops_in_kernels (code->block, code,
+                                               goto_targets,
+                                               as_in_kernels_inner_loop);
+             if (result == ar_invalid_nest)
+               return result;
+             else if (result != ar_ok)
+               retval = result;
+             walk_block = false;
+           }
+         break;
+
+       case EXEC_GOTO:
+       case EXEC_ARITHMETIC_IF:
+       case EXEC_STOP:
+       case EXEC_ERROR_STOP:
+         /* A jump that may leave this loop.  */
+         if (state >= as_in_kernels_loop)
+           {
+             gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+                          "Possible unstructured control flow at %L "
+                          "prevents annotation of loop nest",
+                          &code->loc);
+             return ar_invalid_nest;
+           }
+         break;
+
+       case EXEC_RETURN:
+         /* A return from a kernels region is diagnosed elsewhere as a
+            hard error, so no warning is needed here.  */
+         if (state >= as_in_kernels_loop)
+           return ar_invalid_nest;
+         break;
+
+       case EXEC_EXIT:
+         if (state == as_in_kernels_loop)
+           {
+             gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+                          "Exit at %L prevents annotation of loop",
+                          &code->loc);
+             retval = ar_invalid_loop;
+           }
+         break;
+
+       case EXEC_BACKSPACE:
+       case EXEC_CLOSE:
+       case EXEC_ENDFILE:
+       case EXEC_FLUSH:
+       case EXEC_INQUIRE:
+       case EXEC_OPEN:
+       case EXEC_READ:
+       case EXEC_REWIND:
+       case EXEC_WRITE:
+         /* Executing side-effecting I/O statements in parallel doesn't
+            make much sense.  If this is what users want, they can always
+            add explicit annotations on the loop nest.  */
+         if (state >= as_in_kernels_loop)
+           {
+             gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+                          "I/O statement at %L prevents annotation of loop",
+                          &code->loc);
+             return ar_invalid_nest;
+           }
+         break;
+
+       default:
+         break;
+       }
+
+      /* Visit nested statements, if any, returning early if we hit
+        any problems.  */
+      if (walk_block)
+       {
+         enum annotation_result result
+           = annotate_do_loops_in_kernels (code->block, code,
+                                           goto_targets, state);
+         if (result == ar_invalid_nest)
+           return result;
+         else if (result != ar_ok)
+           retval = result;
+       }
+    }
+  return retval;
+}
+
+/* Traverse CODE to find all the labels referenced by GOTO and similar
+   statements and store them in GOTO_TARGETS.  */
+
+static void
+compute_goto_targets (gfc_code *code, hash_set <gfc_st_label *> *goto_targets)
+{
+  for ( ; code; code = code->next)
+    {
+      switch (code->op)
+       {
+       case EXEC_GOTO:
+       case EXEC_LABEL_ASSIGN:
+         goto_targets->add (code->label1);
+         gcc_fallthrough ();
+
+       case EXEC_ARITHMETIC_IF:
+         goto_targets->add (code->label2);
+         goto_targets->add (code->label3);
+         gcc_fallthrough ();
+
+       default:
+         /* Visit nested statements, if any.  */
+         if (code->block != NULL)
+           compute_goto_targets (code->block, goto_targets);
+       }
+    }
+}
+
+/* Find DO loops in OpenACC kernels regions that do not have OpenACC
+   annotations but look like they might benefit from automatic
+   parallelization.  Add "acc loop auto" annotations for them.  Assumes
+   flag_openacc_kernels_annotate_loops is set.  */
+
+void
+gfc_oacc_annotate_loops_in_kernels_regions (gfc_namespace *ns)
+{
+  if (ns->proc_name)
+    {
+      hash_set <gfc_st_label *> goto_targets;
+      compute_goto_targets (ns->code, &goto_targets);
+      annotate_do_loops_in_kernels (ns->code, NULL, &goto_targets, as_outer);
+    }
+
+  for (ns = ns->contained; ns; ns = ns->sibling)
+    gfc_oacc_annotate_loops_in_kernels_regions (ns);
+}
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 12aa80ec45ca..04e9d2450b16 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -6912,6 +6912,15 @@ done:
   if (flag_c_prototypes || flag_c_prototypes_external)
     fprintf (stdout, "\n#ifdef __cplusplus\n}\n#endif\n");

+  /* Add annotations on loops in OpenACC kernels regions if requested.  This
+     is most easily done on this representation close to the source code.  */
+  if (flag_openacc && flag_openacc_kernels_annotate_loops)
+    {
+      gfc_current_ns = gfc_global_ns_list;
+      for (; gfc_current_ns; gfc_current_ns = gfc_current_ns->sibling)
+       gfc_oacc_annotate_loops_in_kernels_regions (gfc_current_ns);
+    }
+
   /* Do the translation.  */
   translate_all_program_units (gfc_global_ns_list);

diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
index 3fb48b321f2f..2ceae2088070 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
@@ -2,6 +2,7 @@
 ! OpenACC kernels.

 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fopt-info-optimized-omp" }
 ! { dg-additional-options "-fdump-tree-ompexp" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
index 6c8d298e236d..d061a241074b 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
@@ -2,6 +2,7 @@
 ! kernels.

 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fopt-info-optimized-omp" }
 ! { dg-additional-options "-fdump-tree-ompexp" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
index 5defe2ea85de..d2816c3e9364 100644
--- a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
@@ -1,4 +1,5 @@
 ! { dg-options "-fopenacc -fdump-tree-omplower" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }

 module consts
   integer, parameter :: n = 100
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
index ef53324dd2a0..63774ffb5aff 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
new file mode 100644
index 000000000000..41f6307dbb17
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
@@ -0,0 +1,33 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that all loops in the nest are annotated.
+
+subroutine f (a, b, c)
+  implicit none
+
+  real, intent (in), dimension(16,16) :: a
+  real, intent (in), dimension(16,16) :: b
+  real, intent (out), dimension(16,16) :: c
+
+  integer :: i, j, k
+  real :: t
+
+!$acc kernels copyin(a(1:16,1:16), b(1:16,1:16)) copyout(c(1:16,1:16))
+
+  do i = 1, 16
+    do j = 1, 16
+      t = 0
+      do k = 1, 16
+        t = t + a(i,k) * b(k,j)
+      end do
+      c(i,j) = t;
+    end do
+  end do
+
+!$acc end kernels
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop private\\(.\\) auto" 3 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
new file mode 100644
index 000000000000..f612c5beb963
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
@@ -0,0 +1,32 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a random goto in the body can't be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      go to 10  ! { dg-warning "Possible unstructured control flow" }
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+10  f = t
+
+!$acc end kernels
+
+end function f
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
new file mode 100644
index 000000000000..d51482e4685d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-std=legacy" }
+! { dg-do compile }
+
+! Test that a loop with a random label in the body cannot be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  goto 10
+
+  do i = 1, 16
+10  t = t + a(i) * b(i)  ! { dg-warning "Possible control transfer to label" }
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
new file mode 100644
index 000000000000..3c4956d70775
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
@@ -0,0 +1,39 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that in a situation with nested loops, a problem that prevents
+! annotation of the inner loop only still allows the outer loop to be
+! annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    do j = 1, 16
+      if (a(i) < 0 .or. b(j) < 0) then
+        exit  ! { dg-warning "Exit" }
+      else
+        t = t + a(i) * b(j)
+      end if
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
new file mode 100644
index 000000000000..3ec459f0a8df
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
@@ -0,0 +1,38 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that in a situation with nested loops, a problem that prevents
+! annotation of the outer loop only still allows the inner loop to be
+! annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0) then
+      exit  ! { dg-warning "Exit" }
+    end if
+    do j = 1, 16
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
new file mode 100644
index 000000000000..91f431cca432
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
@@ -0,0 +1,35 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that an explicit annotation on an outer loop suppresses annotation
+! of inner loops, and produces a diagnostic.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+!$acc loop seq  ! { dg-warning "Explicit loop annotation" }
+  do i = 1, 16
+    do j = 1, 16
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
new file mode 100644
index 000000000000..570c12d3ad70
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
@@ -0,0 +1,35 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that an explicit annotation on an inner loop suppresses annotation
+! of the outer loop, and produces a diagnostic.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    !$acc loop seq  ! { dg-warning "Explicit loop annotation" }
+    do j = 1, 16
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
new file mode 100644
index 000000000000..6e44a304b28b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that loops containing I/O statements can't be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    do j = 1, 16
+      print *, " i =", i, " j =", j  ! { dg-warning "I/O statement" }
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
new file mode 100644
index 000000000000..4624a05247d9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
@@ -0,0 +1,32 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a variable bound can be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (:) :: a, b
+
+  integer :: i, n
+  real :: t
+
+  t = 0.0
+  n = size (a)
+
+!$acc kernels
+
+  do i = 1, n
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
new file mode 100644
index 000000000000..daed8f7f6e9d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
@@ -0,0 +1,33 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a conditional in the body can be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) > 0 .and. b(i) > 0) then
+      t = t + a(i) * b(i)
+    end if
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
new file mode 100644
index 000000000000..0c4ad256b7eb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a case construct in the body can be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+!$acc kernels
+
+  do i = 1, 16
+    select case (i)
+      case (1)
+        t = a(i) * b(i)
+      case default
+        t = t + a(i) * b(i)
+    end select
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
new file mode 100644
index 000000000000..1c3f87eed6e4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
@@ -0,0 +1,35 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a cycle statement in the body can be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      cycle
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
new file mode 100644
index 000000000000..43173a70df24
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with an exit statement in the body cannot be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      exit     ! { dg-warning "Exit" }
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
new file mode 100644
index 000000000000..ec42213220e7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
@@ -0,0 +1,48 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a random function call in the body cannot
+! be annotated.
+
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  interface
+    function g (x)
+      real :: g
+      real, intent (in) :: x
+    end function g
+
+    subroutine h (x)
+      real, intent (in) :: x
+    end subroutine h
+  end interface
+
+  t = 0.0
+
+!$acc kernels
+  do i = 1, 16
+    t = t + g (a(i) * b(i))  ! { dg-warning "Function call" }
+  end do
+
+  do i = 1, 16
+    call h (t) ! { dg-warning "Subroutine call" }
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
new file mode 100644
index 000000000000..9188f70d9664
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
@@ -0,0 +1,50 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a call to a declared openacc function/subroutine
+! can be annotated.
+
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  interface
+    function g (x)
+      !$acc routine worker
+      real :: g
+      real, intent (in) :: x
+    end function g
+
+    subroutine h (x)
+      !$acc routine worker
+      real, intent (in) :: x
+    end subroutine h
+  end interface
+
+  t = 0.0
+
+!$acc kernels
+  do i = 1, 16
+    t = t + g (a(i) * b(i))
+  end do
+
+  do i = 1, 16
+    call h (t)
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop private\\(i\\) auto" 2 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95
new file mode 100644
index 000000000000..f5aa5a0f43b5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a return statement in the body gives a hard
+! error.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      f = 0.0
+      return   ! { dg-error "invalid branch" }
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
index 2f1dcd603a14..c1f6ef8df600 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
index 447e85d64483..313e3df7f63d 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
index 4edb2889b7b1..26671064ba27 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
index fc113e1f6602..d79ed796c366 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95
index 94522f586362..d8ef52af2e6a 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95
index b9c4aea074d7..6b7334144c87 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95
index 6dc7b2e0f28f..aadfcfc41448 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95
index 48c20b999423..0d45c5cf4338 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }

--
2.33.0

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

* [PATCH 04/40] Additional Fortran testsuite fixes for kernels loops annotation pass.
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
  2021-12-15 15:54 ` [PATCH 03/40] Kernels loops annotation: Fortran Frederik Harwath
@ 2021-12-15 15:54 ` Frederik Harwath
  2021-12-15 15:54 ` [PATCH 06/40] Add a "combined" flag for "acc kernels loop" etc directives Frederik Harwath
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Sandra Loosemore, thomas, tobias, fortran

From: Sandra Loosemore <sandra@codesourcery.com>

2020-03-27  Sandra Loosemore  <sandra@codesourcery.com>

        gcc/testsuite/
        * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust
        line numbering.
        * gfortran.dg/goacc/classify-kernels.f95: Likewise.
        * gfortran.dg/goacc/kernels-decompose-2.f95: Add
        -fno-openacc-kernels-annotate-loops.
---
 .../gfortran.dg/goacc/classify-kernels-unparallelized.f95    | 5 +++--
 gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95         | 5 +++--
 gcc/testsuite/gfortran.dg/goacc/kernels-decompose-2.f95      | 1 +
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
index 2ceae2088070..00aac9aa94ea 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
@@ -23,8 +23,9 @@ program main

   call setup(a, b)

-  !$acc kernels copyin (a(0:n-1), b(0:n-1)) copyout (c(0:n-1)) ! { dg-message "optimized: assigned OpenACC seq loop parallelism" }
-  do i = 0, n - 1
+  !$acc kernels copyin (a(0:n-1), b(0:n-1)) copyout (c(0:n-1))
+  do i = 0, n - 1 ! { dg-message "optimized: assigned OpenACC seq loop parallelism" }
+                  ! { dg-message "note: beginning .parloops. part in OpenACC .kernels. region" "" { target *-*-* } 24 }
      c(i) = a(f (i)) + b(f (i))
   end do
   !$acc end kernels
diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
index d061a241074b..ba815319abf2 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
@@ -19,8 +19,9 @@ program main

   call setup(a, b)

-  !$acc kernels copyin (a(0:n-1), b(0:n-1)) copyout (c(0:n-1)) ! { dg-message "optimized: assigned OpenACC gang loop parallelism" }
-  do i = 0, n - 1
+  !$acc kernels copyin (a(0:n-1), b(0:n-1)) copyout (c(0:n-1))
+  do i = 0, n - 1 ! { dg-message "optimized: assigned OpenACC gang loop parallelism" }
+                  ! { dg-message "beginning .parloops. part in OpenACC .kernels. region" "" { target *-*-* } 20 }
      c(i) = a(i) + b(i)
   end do
   !$acc end kernels
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-2.f95
index 238482b91a49..04c998d11dad 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-decompose-2.f95
@@ -1,5 +1,6 @@
 ! Test OpenACC 'kernels' construct decomposition.

+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fopt-info-omp-all" }
 ! { dg-additional-options "--param=openacc-kernels=decompose" }
 ! { dg-additional-options "-O2" } for 'parloops'.
--
2.33.0

* [PATCH 06/40] Add a "combined" flag for "acc kernels loop" etc directives.
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
  2021-12-15 15:54 ` [PATCH 03/40] Kernels loops annotation: Fortran Frederik Harwath
  2021-12-15 15:54 ` [PATCH 04/40] Additional Fortran testsuite fixes for kernels loops annotation pass Frederik Harwath
@ 2021-12-15 15:54 ` Frederik Harwath
  2021-12-15 15:54 ` [PATCH 08/40] Annotate inner loops in "acc kernels loop" directives (Fortran) Frederik Harwath
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC (permalink / raw)
  To: gcc-patches
  Cc: Sandra Loosemore, thomas, joseph, jason, nathan, tobias, fortran

From: Sandra Loosemore <sandra@codesourcery.com>

2020-08-19  Sandra Loosemore  <sandra@codesourcery.com>

        gcc/
        * tree.h (OACC_LOOP_COMBINED): New.

        gcc/c/
        * c-parser.c (c_parser_oacc_loop): Set OACC_LOOP_COMBINED.

        gcc/cp/
        * parser.c (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED.

        gcc/fortran/
        * trans-openmp.c (gfc_trans_omp_do): Add combined parameter,
        use it to set OACC_LOOP_COMBINED.  Update all call sites.
---
 gcc/c/c-parser.c           |  3 +++
 gcc/cp/parser.c            |  3 +++
 gcc/fortran/trans-openmp.c | 34 +++++++++++++++++++++-------------
 gcc/tree.h                 |  5 +++++
 4 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 80dd61d599ef..1258b48693de 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -17371,6 +17371,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
                    omp_clause_mask mask, tree *cclauses, bool *if_p)
 {
   bool is_parallel = ((mask >> PRAGMA_OACC_CLAUSE_REDUCTION) & 1) == 1;
+  bool is_combined = (cclauses != NULL);

   strcat (p_name, " loop");
   mask |= OACC_LOOP_CLAUSE_MASK;
@@ -17389,6 +17390,8 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
   tree block = c_begin_compound_stmt (true);
   tree stmt = c_parser_omp_for_loop (loc, parser, OACC_LOOP, clauses, NULL,
                                     if_p);
+  if (stmt && stmt != error_mark_node)
+    OACC_LOOP_COMBINED (stmt) = is_combined;
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 4c2075742d6a..c834d25b028f 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -44580,6 +44580,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
                     omp_clause_mask mask, tree *cclauses, bool *if_p)
 {
   bool is_parallel = ((mask >> PRAGMA_OACC_CLAUSE_REDUCTION) & 1) == 1;
+  bool is_combined = (cclauses != NULL);

   strcat (p_name, " loop");
   mask |= OACC_LOOP_CLAUSE_MASK;
@@ -44598,6 +44599,8 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
   tree block = begin_omp_structured_block ();
   int save = cp_parser_begin_omp_structured_block (parser);
   tree stmt = cp_parser_omp_for_loop (parser, OACC_LOOP, clauses, NULL, if_p);
+  if (stmt && stmt != error_mark_node)
+    OACC_LOOP_COMBINED (stmt) = is_combined;
   cp_parser_end_omp_structured_block (parser, save);
   add_stmt (finish_omp_structured_block (block));

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index e81c5588c53c..618e106791e5 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -4855,7 +4855,8 @@ typedef struct dovar_init_d {

 static tree
 gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
-                 gfc_omp_clauses *do_clauses, tree par_clauses)
+                 gfc_omp_clauses *do_clauses, tree par_clauses,
+                 bool combined)
 {
   gfc_se se;
   tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls;
@@ -5219,7 +5220,10 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
     case EXEC_OMP_DISTRIBUTE: stmt = make_node (OMP_DISTRIBUTE); break;
     case EXEC_OMP_LOOP: stmt = make_node (OMP_LOOP); break;
     case EXEC_OMP_TASKLOOP: stmt = make_node (OMP_TASKLOOP); break;
-    case EXEC_OACC_LOOP: stmt = make_node (OACC_LOOP); break;
+    case EXEC_OACC_LOOP:
+      stmt = make_node (OACC_LOOP);
+      OACC_LOOP_COMBINED (stmt) = combined;
+      break;
     default: gcc_unreachable ();
     }

@@ -5313,7 +5317,8 @@ gfc_trans_oacc_combined_directive (gfc_code *code)
     pblock = &block;
   else
     pushlevel ();
-  stmt = gfc_trans_omp_do (code, EXEC_OACC_LOOP, pblock, &loop_clauses, NULL);
+  stmt = gfc_trans_omp_do (code, EXEC_OACC_LOOP, pblock, &loop_clauses, NULL,
+                          true);
   protected_set_expr_location (stmt, loc);
   if (TREE_CODE (stmt) != BIND_EXPR)
     stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
@@ -6151,7 +6156,7 @@ gfc_trans_omp_do_simd (gfc_code *code, stmtblock_t *pblock,
     omp_do_clauses
       = gfc_trans_omp_clauses (&block, &clausesa[GFC_OMP_SPLIT_DO], code->loc);
   body = gfc_trans_omp_do (code, EXEC_OMP_SIMD, pblock ? pblock : &block,
-                          &clausesa[GFC_OMP_SPLIT_SIMD], omp_clauses);
+                          &clausesa[GFC_OMP_SPLIT_SIMD], omp_clauses, false);
   if (pblock == NULL)
     {
       if (TREE_CODE (body) != BIND_EXPR)
@@ -6209,7 +6214,7 @@ gfc_trans_omp_parallel_do (gfc_code *code, bool is_loop, stmtblock_t *pblock,
     }
   stmt = gfc_trans_omp_do (code, is_loop ? EXEC_OMP_LOOP : EXEC_OMP_DO,
                           new_pblock, &clausesa[GFC_OMP_SPLIT_DO],
-                          omp_clauses);
+                          omp_clauses, false);
   if (pblock == NULL)
     {
       if (TREE_CODE (stmt) != BIND_EXPR)
@@ -6496,7 +6501,8 @@ gfc_trans_omp_distribute (gfc_code *code, gfc_omp_clauses *clausesa)
     case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD:
     case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_SIMD, &block,
-                              &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE);
+                              &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE,
+                              false);
       if (TREE_CODE (stmt) != BIND_EXPR)
        stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
       else
@@ -6555,13 +6561,13 @@ gfc_trans_omp_teams (gfc_code *code, gfc_omp_clauses *clausesa,
     case EXEC_OMP_TEAMS_DISTRIBUTE:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_DISTRIBUTE, NULL,
                               &clausesa[GFC_OMP_SPLIT_DISTRIBUTE],
-                              NULL);
+                              NULL, false);
       break;
     case EXEC_OMP_TARGET_TEAMS_LOOP:
     case EXEC_OMP_TEAMS_LOOP:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_LOOP, NULL,
                               &clausesa[GFC_OMP_SPLIT_DO],
-                              NULL);
+                              NULL, false);
       break;
     default:
       stmt = gfc_trans_omp_distribute (code, clausesa);
@@ -6641,7 +6647,8 @@ gfc_trans_omp_target (gfc_code *code)
       break;
     case EXEC_OMP_TARGET_SIMD:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_SIMD, &block,
-                              &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE);
+                              &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE,
+                              false);
       if (TREE_CODE (stmt) != BIND_EXPR)
        stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
       else
@@ -6712,7 +6719,8 @@ gfc_trans_omp_taskloop (gfc_code *code, gfc_exec_op op)
       break;
     case EXEC_OMP_TASKLOOP_SIMD:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_SIMD, &block,
-                              &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE);
+                              &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE,
+                              false);
       if (TREE_CODE (stmt) != BIND_EXPR)
        stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
       else
@@ -6756,7 +6764,7 @@ gfc_trans_omp_master_masked_taskloop (gfc_code *code, gfc_exec_op op)
       stmt = gfc_trans_omp_do (code, EXEC_OMP_TASKLOOP, NULL,
                               code->op != EXEC_OMP_MASTER_TASKLOOP
                               ? &clausesa[GFC_OMP_SPLIT_TASKLOOP]
-                              : code->ext.omp_clauses, NULL);
+                              : code->ext.omp_clauses, NULL, false);
     }
   if (TREE_CODE (stmt) != BIND_EXPR)
     stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
@@ -7119,7 +7127,7 @@ gfc_trans_oacc_directive (gfc_code *code)
       return gfc_trans_oacc_construct (code);
     case EXEC_OACC_LOOP:
       return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses,
-                              NULL);
+                              NULL, false);
     case EXEC_OACC_UPDATE:
     case EXEC_OACC_CACHE:
     case EXEC_OACC_ENTER_DATA:
@@ -7159,7 +7167,7 @@ gfc_trans_omp_directive (gfc_code *code)
     case EXEC_OMP_SIMD:
     case EXEC_OMP_TASKLOOP:
       return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses,
-                              NULL);
+                              NULL, false);
     case EXEC_OMP_DISTRIBUTE_PARALLEL_DO:
     case EXEC_OMP_DISTRIBUTE_PARALLEL_DO_SIMD:
     case EXEC_OMP_DISTRIBUTE_SIMD:
diff --git a/gcc/tree.h b/gcc/tree.h
index 7542d97ce121..15e5147f40b0 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1524,6 +1524,11 @@ class auto_suppress_location_wrappers
 #define OMP_MASKED_COMBINED(NODE) \
   (OMP_MASKED_CHECK (NODE)->base.private_flag)

+/* True on an OACC_LOOP statement if it is part of a combined construct,
+   for example "#pragma acc kernels loop".  */
+#define OACC_LOOP_COMBINED(NODE) \
+  (OACC_LOOP_CHECK (NODE)->base.private_flag)
+
 /* Memory order for OMP_ATOMIC*.  */
 #define OMP_ATOMIC_MEMORY_ORDER(NODE) \
   (TREE_RANGE_CHECK (NODE, OMP_ATOMIC, \
--
2.33.0

* [PATCH 08/40] Annotate inner loops in "acc kernels loop" directives (Fortran).
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
                   ` (2 preceding siblings ...)
  2021-12-15 15:54 ` [PATCH 06/40] Add a "combined" flag for "acc kernels loop" etc directives Frederik Harwath
@ 2021-12-15 15:54 ` Frederik Harwath
  2021-12-15 15:54 ` [PATCH 09/40] Permit calls to builtins and intrinsics in kernels loops Frederik Harwath
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Sandra Loosemore, thomas, tobias, fortran

From: Sandra Loosemore <sandra@codesourcery.com>

Normally, explicit loop directives in a kernels region inhibit
automatic annotation of other loops in the same nest, on the theory
that users have indicated they want manual control over that section
of code.  However, user code appears to expect that the combined
"kernels loop" directive still allows annotation of inner loops.
This patch implements that behavior for Fortran.
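
As a rough illustration (not part of the patch itself), the number of
loops that stay associated with the combined directive, and are
therefore left alone before inner-loop annotation starts, can be
sketched in standalone C.  The struct and function names below are
hypothetical stand-ins for the gfc_omp_clauses fields read in the
openmp.c hunk; only the precedence order (tile over collapse, then
ordered, then a minimum of 1) is taken from that hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the clause fields the patch consults;
   the real data lives in gfc_omp_clauses.  */
struct loop_clauses
{
  int collapse;     /* N from collapse(N), 0 if absent.  */
  size_t tile_len;  /* Number of entries in tile(...), 0 if absent.  */
  int orderedc;     /* Loop count from ordered(N), 0 if absent.  */
};

/* Return how many nested DO loops belong to the directive itself.
   Loops nested deeper than this are candidates for annotation.  */
static int
associated_loop_depth (const struct loop_clauses *c)
{
  assert (c != NULL);

  int depth = c->collapse;

  /* A tile clause takes precedence over collapse.  */
  if (c->tile_len > 0)
    depth = (int) c->tile_len;

  /* An ordered(N) count overrides both.  */
  if (c->orderedc)
    depth = c->orderedc;

  /* With no such clause, only the outermost loop is associated.  */
  if (depth <= 0)
    depth = 1;

  return depth;
}
```

Under this sketch, a plain "acc kernels loop" yields a depth of 1, so
every nested loop below the outermost one is a candidate for
annotation, while "collapse(2)" yields 2 and only loops below the
second level are considered.  This appears consistent with the
"acc loop auto" counts expected by the two new tests.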

2020-08-19  Sandra Loosemore  <sandra@codesourcery.com>

        gcc/fortran/
        * openmp.c (annotate_do_loops_in_kernels): Handle
        EXEC_OACC_KERNELS_LOOP separately to permit annotation of inner
        loops in a combined "acc kernels loop" directive.

        gcc/testsuite/
        * gfortran.dg/goacc/kernels-loop-annotation-18.f95: New.
        * gfortran.dg/goacc/kernels-loop-annotation-19.f95: New.
        * gfortran.dg/goacc/combined-directives.f90: Adjust expected
        patterns.
        * gfortran.dg/goacc/private-explicit-kernels-1.f95: Likewise.
        * gfortran.dg/goacc/private-predetermined-kernels-1.f95:
        Likewise.
---
 gcc/fortran/openmp.c                          | 50 ++++++++++++++++++-
 .../gfortran.dg/goacc/combined-directives.f90 | 19 +++++--
 .../goacc/kernels-loop-annotation-18.f95      | 28 +++++++++++
 .../goacc/kernels-loop-annotation-19.f95      | 29 +++++++++++
 .../goacc/private-explicit-kernels-1.f95      |  7 ++-
 .../goacc/private-predetermined-kernels-1.f95 |  7 ++-
 6 files changed, 131 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 243b5e0a9ac6..b0b68b494778 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -9272,7 +9272,6 @@ annotate_do_loops_in_kernels (gfc_code *code, gfc_code *parent,

        case EXEC_OACC_PARALLEL_LOOP:
        case EXEC_OACC_PARALLEL:
-       case EXEC_OACC_KERNELS_LOOP:
        case EXEC_OACC_LOOP:
          /* Do not try to add automatic OpenACC annotations inside manually
             annotated loops.  Presumably, the user avoided doing it on
@@ -9317,6 +9316,55 @@ annotate_do_loops_in_kernels (gfc_code *code, gfc_code *parent,
            }
          break;

+       case EXEC_OACC_KERNELS_LOOP:
+         /* This is a combined "acc kernels loop" directive.  We want to
+            leave the outer loop alone but try to annotate any nested
+            loops in the body.  The expected structure nesting here is
+              EXEC_OACC_KERNELS_LOOP
+                EXEC_OACC_KERNELS_LOOP
+                  EXEC_DO
+                    EXEC_DO
+                      ...body...  */
+         if (code->block)
+           /* Might be empty?  */
+           {
+             gcc_assert (code->block->op == EXEC_OACC_KERNELS_LOOP);
+             gfc_omp_clauses *clauses = code->ext.omp_clauses;
+             int collapse = clauses->collapse;
+             gfc_expr_list *tile = clauses->tile_list;
+             gfc_code *inner = code->block->next;
+
+             gcc_assert (inner->op == EXEC_DO);
+             gcc_assert (inner->block->op == EXEC_DO);
+
+             /* We need to skip over nested loops covered by "collapse" or
+                "tile" clauses.  "Tile" takes precedence
+                (see gfc_trans_omp_do).  */
+             if (tile)
+               {
+                 collapse = 0;
+                 for (gfc_expr_list *el = tile; el; el = el->next)
+                   collapse++;
+               }
+             if (clauses->orderedc)
+               collapse = clauses->orderedc;
+             if (collapse <= 0)
+               collapse = 1;
+             for (int i = 1; i < collapse; i++)
+               {
+                 gcc_assert (inner->op == EXEC_DO);
+                 gcc_assert (inner->block->op == EXEC_DO);
+                 inner = inner->block->next;
+               }
+             if (inner)
+               /* Loop might have empty body?  */
+               annotate_do_loops_in_kernels (inner->block->next,
+                                             inner, goto_targets,
+                                             as_in_kernels_region);
+           }
+         walk_block = false;
+         break;
+
        case EXEC_DO_WHILE:
        case EXEC_DO_CONCURRENT:
          /* Traverse the body in a special state to allow EXIT statements
diff --git a/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
index 956349204f4d..562a4e40cd7d 100644
--- a/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
@@ -139,10 +139,21 @@ end subroutine test

 ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. collapse.2." 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "acc loop private.i. gang" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. worker" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. vector" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. seq" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. auto" 2 "gimple" } }
+
+! These are the parallel loop variants.
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. worker" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. vector" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. seq" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. auto" 1 "gimple" } }
+
+! These are the kernels loop variants.  Here the inner loops are annotated
+! separately.
+! { dg-final { scan-tree-dump-times "acc loop private.i. worker" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. vector" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. seq" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. auto" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop auto private.j." 4 "gimple" } }
+
 ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "acc loop private.i. independent" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
new file mode 100644
index 000000000000..e4e210a92dbb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
@@ -0,0 +1,28 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that "acc kernels loop" directive causes annotation of the entire
+! loop nest.
+
+subroutine f (a, b)
+
+  implicit none
+  real, intent (in), dimension(20) :: a
+  real, intent (out), dimension(20) :: b
+  integer :: k, l, m
+
+!$acc kernels loop
+  do k = 1, 20
+    do l = 1, 20
+      do m = 1, 20
+       b(m) = a(m);
+      end do
+    end do
+  end do
+
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 2 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
new file mode 100644
index 000000000000..5dd6e7f538a6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
@@ -0,0 +1,29 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that "acc kernels loop" directive causes annotation of the entire
+! loop nest in the presence of a collapse clause.
+
+subroutine f (a, b)
+
+  implicit none
+  real, intent (in), dimension(20) :: a
+  real, intent (out), dimension(20) :: b
+  integer :: k, l, m
+
+!$acc kernels loop collapse(2)
+  do k = 1, 20
+    do l = 1, 20
+      do m = 1, 20
+       b(m) = a(m);
+      end do
+    end do
+  end do
+
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop .*collapse.2." 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95 b/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95
index 5d563d226b0c..0c47045df9c8 100644
--- a/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95
@@ -73,8 +73,9 @@ program test

   !$acc kernels loop private(i2_1_c, j2_1_c) independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) independent" 1 "gimple" } }
   do i2_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j2_1_c\\)" 1 "gimple" } }
      do j2_1_c = 1, 100
      end do
   end do
@@ -130,9 +131,11 @@ program test

   !$acc kernels loop private(i3_1_c, j3_1_c, k3_1_c) independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) independent" 1 "gimple" } }
   do i3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j3_1_c\\)" 1 "gimple" } }
      do j3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(k3_1_c\\)" 1 "gimple" } }
         do k3_1_c = 1, 100
         end do
      end do
diff --git a/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95 b/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95
index 12a7854526a9..3357a20263e7 100644
--- a/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95
@@ -73,8 +73,9 @@ program test

   !$acc kernels loop independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) independent" 1 "gimple" } }
   do i2_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j2_1_c\\)" 1 "gimple" } }
      do j2_1_c = 1, 100
      end do
   end do
@@ -130,9 +131,11 @@ program test

   !$acc kernels loop independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) independent" 1 "gimple" } }
   do i3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j3_1_c\\)" 1 "gimple" } }
      do j3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(k3_1_c\\)" 1 "gimple" } }
         do k3_1_c = 1, 100
         end do
      end do
--
2.33.0

-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: Munich; Court of registration: Munich, HRB 106955


* [PATCH 09/40] Permit calls to builtins and intrinsics in kernels loops.
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
                   ` (3 preceding siblings ...)
  2021-12-15 15:54 ` [PATCH 08/40] Annotate inner loops in "acc kernels loop" directives (Fortran) Frederik Harwath
@ 2021-12-15 15:54 ` Frederik Harwath
  2021-12-15 15:54 ` [PATCH 10/40] Fix patterns in Fortran tests for kernels loop annotation Frederik Harwath
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC (permalink / raw)
  To: gcc-patches
  Cc: Sandra Loosemore, thomas, joseph, jason, nathan, tobias, fortran

From: Sandra Loosemore <sandra@codesourcery.com>

This tweak to the OpenACC kernels loop annotation relaxes the
restrictions on function calls in the loop body.  Normally, calls to
functions not explicitly marked with a parallelism attribute are not
permitted, but C/C++ builtins and Fortran intrinsics have known
semantics, so we can generally permit those without restriction.  If
any turn out to be problematic, we can add code here to recognize
them, or handle them in the processing of the "auto" annotations.
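
As a rough illustration of the rule described above, the following C
sketch models the decision outside of GCC.  The `struct callee` type and
the `call_allows_annotation` function are hypothetical and do not exist
in GCC; the two flags merely stand in for the builtin check and the
"oacc function" attribute lookup that the patch performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a callee declaration; not a GCC type.  */
struct callee
{
  bool is_builtin;      /* Models the builtin check on the callee.  */
  bool is_oacc_routine; /* Models the "oacc function" attribute.  */
};

/* Return true if a call to CALLEE leaves the enclosing loop nest
   eligible for annotation.  A null CALLEE models an indirect call,
   which is always rejected.  */
bool
call_allows_annotation (const struct callee *callee)
{
  if (callee == NULL)
    return false;
  return callee->is_builtin || callee->is_oacc_routine;
}
```

In this model, a direct call is accepted if either flag is set, which
matches the intent that builtins no longer need an explicit
parallelism attribute.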

2020-08-22  Sandra Loosemore  <sandra@codesourcery.com>

        gcc/c-family/
        * c-omp.c (annotate_loops_in_kernels_regions): Test for
        calls to builtins.

        gcc/fortran/
        * openmp.c (check_expr_for_invalid_calls): Check for intrinsic
        functions.

        gcc/testsuite/
        * c-c++-common/goacc/kernels-loop-annotation-20.c: New.
        * gfortran.dg/goacc/kernels-loop-annotation-20.f95: New.
---
 gcc/c-family/c-omp.c                          | 10 ++++---
 gcc/fortran/openmp.c                          |  9 ++++---
 .../goacc/kernels-loop-annotation-20.c        | 23 ++++++++++++++++
 .../goacc/kernels-loop-annotation-20.f95      | 26 +++++++++++++++++++
 4 files changed, 61 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95

diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 30757877eafe..e7c27f45e888 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -3545,8 +3545,9 @@ annotate_loops_in_kernels_regions (tree *nodeptr, int *walk_subtrees,
       break;

     case CALL_EXPR:
-      /* Direct function calls to functions marked as OpenACC routines are
-        allowed.  Reject indirect calls or calls to non-routines.  */
+      /* Direct function calls to builtins and functions marked as
+        OpenACC routines are allowed.  Reject indirect calls or calls
+        to non-routines.  */
       if (info->state >= as_in_kernels_loop)
        {
          tree fn = CALL_EXPR_FN (node), fn_decl = NULL_TREE;
@@ -3560,8 +3561,9 @@ annotate_loops_in_kernels_regions (tree *nodeptr, int *walk_subtrees,
            }
          if (fn_decl == NULL_TREE)
            do_not_annotate_loop_nest (info, as_invalid_call, node);
-         else if (!lookup_attribute ("oacc function",
-                                     DECL_ATTRIBUTES (fn_decl)))
+         else if (!fndecl_built_in_p (fn_decl, BUILT_IN_NORMAL)
+                  && !lookup_attribute ("oacc function",
+                                        DECL_ATTRIBUTES (fn_decl)))
            do_not_annotate_loop_nest (info, as_invalid_call, node);
        }
       break;
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index b0b68b494778..d5d996e378d7 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -9156,9 +9156,12 @@ check_expr_for_invalid_calls (gfc_expr **exprp, int *walk_subtrees,
   switch (expr->expr_type)
     {
     case EXPR_FUNCTION:
-      if (expr->value.function.esym
-         && (expr->value.function.esym->attr.oacc_routine_lop
-             != OACC_ROUTINE_LOP_NONE))
+      /* Permit calls to Fortran intrinsic functions and to routines
+        with an explicitly declared parallelism level.  */
+      if (expr->value.function.isym
+         || (expr->value.function.esym
+             && (expr->value.function.esym->attr.oacc_routine_lop
+                 != OACC_ROUTINE_LOP_NONE)))
        return 0;
       /* Else fall through.  */

diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
new file mode 100644
index 000000000000..5e3f02845713
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
@@ -0,0 +1,23 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that calls to built-in functions don't inhibit kernels loop
+   annotation.  */
+
+void foo (int n, int *input, int *out1, int *out2)
+{
+#pragma acc kernels
+  {
+    int i;
+
+    for (i = 0; i < n; i++)
+      {
+       out1[i] = __builtin_clz (input[i]);
+       out2[i] = __builtin_popcount (input[i]);
+      }
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
new file mode 100644
index 000000000000..5169a0a1676d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
@@ -0,0 +1,26 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with calls to intrinsics in the body can be annotated.
+
+subroutine f (n, input, out1, out2)
+  implicit none
+  integer :: n
+  integer, intent (in), dimension (n) :: input
+  integer, intent (out), dimension (n) :: out1, out2
+
+  integer :: i
+
+!$acc kernels
+
+  do i = 1, n
+      out1(i) = min (i, input(i))
+      out2(i) = not (input(i))
+  end do
+!$acc end kernels
+
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
--
2.33.0


* [PATCH 10/40] Fix patterns in Fortran tests for kernels loop annotation.
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
                   ` (4 preceding siblings ...)
  2021-12-15 15:54 ` [PATCH 09/40] Permit calls to builtins and intrinsics in kernels loops Frederik Harwath
@ 2021-12-15 15:54 ` Frederik Harwath
  2021-12-15 15:54 ` [PATCH 13/40] Fortran: Delinearize array accesses Frederik Harwath
  2021-12-16 12:00 ` [PATCH 40/40] openacc: Adjust testsuite to new "kernels" handling Frederik Harwath
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Sandra Loosemore, thomas, tobias, fortran

From: Sandra Loosemore <sandra@codesourcery.com>

Several of the Fortran tests for kernels loop annotation were failing
due to changes in the formatting of "acc loop" constructs in the dump
file.  Now the "auto" clause appears first, instead of after "private".
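
The effect of the clause reordering on the old and new patterns can be
sketched with POSIX extended regular expressions in C.  This is only an
approximation: the testsuite uses Tcl regular expressions, the sample
dump lines below are assumed rather than copied from a real dump, and
`dump_line_matches` is a hypothetical helper.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if TEXT contains a match for the POSIX extended regular
   expression PATTERN.  Returns false if PATTERN fails to compile.  */
bool
dump_line_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool found = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}

/* Assumed sample dump lines; the loop variable name is illustrative.  */
const char *const old_dump_line = "#pragma acc loop private(i) auto";
const char *const new_dump_line = "#pragma acc loop auto private(i)";

/* The old and new test patterns, with the parentheses escaped so that
   they match literally.  */
const char *const old_pattern = "acc loop private\\(.\\) auto";
const char *const new_pattern = "acc loop auto";
```

Under these assumptions the old pattern matches only the old line and
the new pattern matches only the new line, which is why the expected
output had to be updated.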

2020-08-23  Sandra Loosemore  <sandra@codesourcery.com>

        gcc/testsuite/
        * gfortran.dg/goacc/kernels-loop-annotation-1.f95: Update
        expected output.
        * gfortran.dg/goacc/kernels-loop-annotation-2.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-3.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-4.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-5.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-6.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-7.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-8.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-11.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-12.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-13.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-14.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-15.f95: Likewise.
        * gfortran.dg/goacc/kernels-loop-annotation-16.f95: Likewise.
---
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95  | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95 | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95 | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95 | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95 | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95 | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95 | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95  | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95  | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95  | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95  | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95  | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95  | 2 +-
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95  | 2 +-
 14 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
index 41f6307dbb17..42e751dbfb83 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
@@ -30,4 +30,4 @@ subroutine f (a, b, c)
 !$acc end kernels
 end subroutine f

-! { dg-final { scan-tree-dump-times "acc loop private\\(.\\) auto" 3 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 3 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
index d51482e4685d..6e2e2c41172b 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
@@ -31,4 +31,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
index 3c4956d70775..03c4234ce7cd 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
@@ -36,4 +36,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
index 3ec459f0a8df..6aeb3f2fe4d0 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
@@ -35,4 +35,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
index 91f431cca432..7d1cff64a3d9 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
@@ -32,4 +32,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
index 570c12d3ad70..dab0d4030d03 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
@@ -32,4 +32,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
index 6e44a304b28b..15ef670e246d 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
@@ -31,4 +31,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
index 4624a05247d9..2baaa594be18 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
@@ -29,4 +29,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
index daed8f7f6e9d..e629891e31f9 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
@@ -30,4 +30,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
index 0c4ad256b7eb..6c3300b70537 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
@@ -31,4 +31,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
index 1c3f87eed6e4..52a9e7e7a85b 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
@@ -31,5 +31,5 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
index 43173a70df24..60eb245a22a9 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
@@ -31,4 +31,4 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
index ec42213220e7..438a13acee18 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
@@ -44,5 +44,5 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private.* auto" 0 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }

diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
index 9188f70d9664..aa97e37c054c 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
@@ -46,5 +46,5 @@ function f (a, b)

 end function f

-! { dg-final { scan-tree-dump-times "acc loop private\\(i\\) auto" 2 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 2 "original" } }

--
2.33.0


* [PATCH 13/40] Fortran: Delinearize array accesses
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
                   ` (5 preceding siblings ...)
  2021-12-15 15:54 ` [PATCH 10/40] Fix patterns in Fortran tests for kernels loop annotation Frederik Harwath
@ 2021-12-15 15:54 ` Frederik Harwath
  2021-12-16 12:00 ` [PATCH 40/40] openacc: Adjust testsuite to new "kernels" handling Frederik Harwath
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-15 15:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: thomas, tobias, fortran, rguenther

The Fortran front end presently linearizes accesses to
multi-dimensional arrays by combining the indices for the various
dimensions into a series of explicit multiplies and adds, with
refactoring to allow CSE of invariant parts of the computation.
Unfortunately, this representation interferes with Graphite-based loop
optimizations.  It is difficult to recover the original
multi-dimensional form of the access by the time loop optimizations
run, because parts of it have already been optimized away or
transformed into a form that is not easily recognizable.  It therefore
seems better to have the Fortran front end produce delinearized
accesses to begin with: a set of nested ARRAY_REFs, similar to the
existing behavior of the C and C++ front ends.  This is a
long-standing problem that has previously been discussed, e.g., in
PR 14741 and PR 61000.

This patch is an initial implementation for explicit array accesses
only; it doesn't handle the accesses generated during scalarization of
whole-array or array-section operations, which follow a different code
path.
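
The difference between the two representations can be sketched in plain
C for a rank-3 array with lower bound 1 in every dimension and
column-major layout.  The extents N1, N2, N3 and both function names are
hypothetical and chosen for illustration only; the sketch does not
reflect the trees that the front end actually builds.

```c
#include <stddef.h>

/* Illustrative extents for a Fortran array a(N1, N2, N3).  */
enum { N1 = 4, N2 = 3, N3 = 2 };

/* Linearized form: a single flat element index computed as
   offset + i * stride[0] + j * stride[1] + k * stride[2].  */
ptrdiff_t
linearized_index (int i, int j, int k)
{
  const ptrdiff_t stride[3] = { 1, N1, N1 * N2 };
  /* The offset compensates for the lower bound of 1 in each dimension.  */
  const ptrdiff_t offset = -(stride[0] + stride[1] + stride[2]);
  return offset + i * stride[0] + j * stride[1] + k * stride[2];
}

/* Delinearized form: one subscript per dimension, as in nested array
   references.  The fastest-varying Fortran index is the innermost C
   subscript.  Returns the element index relative to the array start.  */
ptrdiff_t
nested_index (double (*a)[N2][N1], int i, int j, int k)
{
  const char *elem = (const char *) &a[k - 1][j - 1][i - 1];
  return (elem - (const char *) a) / (ptrdiff_t) sizeof (double);
}
```

Both functions yield the same element index for every valid (i, j, k),
but only the nested form keeps the per-dimension subscripts visible to
later analysis.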

Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>

gcc/ChangeLog:

        * expr.c (get_inner_reference): Handle NOP_EXPR.

gcc/fortran/ChangeLog:

        * lang.opt: Document -param=delinearize.
        * trans-array.c (get_class_array_vptr): New function.
        (get_array_lbound): New function.
        (get_array_ubound): New function.
        (gfc_conv_array_ref): Implement main delinearization logic.
        (build_array_ref): Adjust.

gcc/testsuite/ChangeLog:

        * gfortran.dg/assumed_type_2.f90: Adjust test expectations.
        * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise.
        * gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
        * gfortran.dg/graphite/block-2.f: Likewise.
        * gfortran.dg/graphite/block-3.f90: Likewise.
        * gfortran.dg/graphite/block-4.f90: Likewise.
        * gfortran.dg/graphite/id-9.f: Likewise.
        * gfortran.dg/inline_matmul_16.f90: Likewise.
        * gfortran.dg/inline_matmul_24.f90: Likewise.
        * gfortran.dg/no_arg_check_2.f90: Likewise.
        * gfortran.dg/pr32921.f: Likewise.
        * gfortran.dg/reassoc_4.f: Likewise.
        * gfortran.dg/vect/fast-math-mgrid-resid.f: Likewise.
---
 gcc/expr.c                                    |   1 +
 gcc/fortran/lang.opt                          |   4 +
 gcc/fortran/trans-array.c                     | 321 +++++++++++++-----
 gcc/testsuite/gfortran.dg/assumed_type_2.f90  |   6 +-
 .../gfortran.dg/goacc/kernels-loop-inner.f95  |   2 +-
 .../gfortran.dg/gomp/affinity-clause-1.f90    |   2 +-
 gcc/testsuite/gfortran.dg/graphite/block-2.f  |   9 +-
 .../gfortran.dg/graphite/block-3.f90          |   2 +-
 .../gfortran.dg/graphite/block-4.f90          |   2 +-
 gcc/testsuite/gfortran.dg/graphite/id-9.f     |   2 +-
 .../gfortran.dg/inline_matmul_16.f90          |   2 +
 .../gfortran.dg/inline_matmul_24.f90          |   2 +-
 gcc/testsuite/gfortran.dg/no_arg_check_2.f90  |   6 +-
 gcc/testsuite/gfortran.dg/pr32921.f           |   2 +-
 gcc/testsuite/gfortran.dg/reassoc_4.f         |   2 +-
 .../gfortran.dg/vect/fast-math-mgrid-resid.f  |   1 +
 16 files changed, 270 insertions(+), 96 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index eb33643bd770..188905b4fe4d 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -7759,6 +7759,7 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
          break;

        case VIEW_CONVERT_EXPR:
+       case NOP_EXPR:
          break;

        case MEM_REF:
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index a202c04c4a25..25c5a5a32c41 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -521,6 +521,10 @@ fdefault-real-16
 Fortran Var(flag_default_real_16)
 Set the default real kind to an 16 byte wide type.

+-param=delinearize=
+Common Joined UInteger Var(flag_delinearize_aref) Init(1) IntegerRange(0,1) Param Optimization
+Delinearize array references.
+
 fdollar-ok
 Fortran Var(flag_dollar_ok)
 Allow dollar signs in entity names.
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 5ceb261b6989..e84b4cb55f05 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -3747,11 +3747,9 @@ add_to_offset (tree *cst_offset, tree *offset, tree t)
     }
 }

-
 static tree
-build_array_ref (tree desc, tree offset, tree decl, tree vptr)
+get_class_array_vptr (tree desc, tree vptr)
 {
-  tree tmp;
   tree type;
   tree cdesc;

@@ -3775,19 +3773,74 @@ build_array_ref (tree desc, tree offset, tree decl, tree vptr)
          && GFC_CLASS_TYPE_P (TYPE_CANONICAL (type)))
        vptr = gfc_class_vptr_get (TREE_OPERAND (cdesc, 0));
     }
+  return vptr;
+}

+static tree
+build_array_ref (tree desc, tree offset, tree decl, tree vptr)
+{
+  tree tmp;
+  vptr = get_class_array_vptr (desc, vptr);
   tmp = gfc_conv_array_data (desc);
   tmp = build_fold_indirect_ref_loc (input_location, tmp);
   tmp = gfc_build_array_ref (tmp, offset, decl, vptr);
   return tmp;
 }

+/* Get the declared lower bound for rank N of array DECL which might
+   be either a bare array or a descriptor.  This differs from
+   gfc_conv_array_lbound because it gets information for temporary array
+   objects from AR instead of the descriptor (they can differ).  */
+
+static tree
+get_array_lbound (tree decl, int n, gfc_symbol *sym,
+                 gfc_array_ref *ar, gfc_se *se)
+{
+  if (sym->attr.temporary)
+    {
+      gfc_se tmpse;
+      gfc_init_se (&tmpse, se);
+      gfc_conv_expr_type (&tmpse, ar->as->lower[n], gfc_array_index_type);
+      gfc_add_block_to_block (&se->pre, &tmpse.pre);
+      return tmpse.expr;
+    }
+  else
+    return gfc_conv_array_lbound (decl, n);
+}
+
+/* Similarly for the upper bound.  */
+static tree
+get_array_ubound (tree decl, int n, gfc_symbol *sym,
+                 gfc_array_ref *ar, gfc_se *se)
+{
+  if (sym->attr.temporary)
+    {
+      gfc_se tmpse;
+      gfc_init_se (&tmpse, se);
+      gfc_conv_expr_type (&tmpse, ar->as->upper[n], gfc_array_index_type);
+      gfc_add_block_to_block (&se->pre, &tmpse.pre);
+      return tmpse.expr;
+    }
+  else
+    return gfc_conv_array_ubound (decl, n);
+}
+

 /* Build an array reference.  se->expr already holds the array descriptor.
    This should be either a variable, indirect variable reference or component
    reference.  For arrays which do not have a descriptor, se->expr will be
    the data pointer.
-   a(i, j, k) = base[offset + i * stride[0] + j * stride[1] + k * stride[2]]*/
+
+   There are two strategies here.  In the traditional case, multidimensional
+   arrays are explicitly linearized into a one-dimensional array, with the
+   index computed as if by
+   a(i, j, k) = base[offset + i * stride[0] + j * stride[1] + k * stride[2]]
+
+   However, we can often get better code using the Graphite framework
+   and scalar evolutions in the middle end, which expects to see
+   multidimensional array accesses represented as nested ARRAY_REFs, similar
+   to what the C/C++ front ends produce.  Delinearization is controlled
+   by flag_delinearize_aref.  */

 void
 gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
@@ -3798,11 +3851,16 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
   tree tmp;
   tree stride;
   tree decl = NULL_TREE;
+  tree cooked_decl = NULL_TREE;
+  tree vptr = se->class_vptr;
   gfc_se indexse;
   gfc_se tmpse;
   gfc_symbol * sym = expr->symtree->n.sym;
   char *var_name = NULL;
+  tree aref = NULL_TREE;
+  tree atype = NULL_TREE;

+  /* Handle coarrays.  */
   if (ar->dimen == 0)
     {
       gcc_assert (ar->codimen || sym->attr.select_rank_temporary
@@ -3862,15 +3920,160 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
        }
     }

+  /* Per comments above, DECL is not always a declaration.  It may be
+     either a variable, indirect variable reference, or component
+     reference.  It may have array or pointer type, or it may be a
+     descriptor with RECORD_TYPE.  */
   decl = se->expr;
   if (IS_CLASS_ARRAY (sym) && sym->attr.dummy && ar->as->type != AS_DEFERRED)
     decl = sym->backend_decl;

-  cst_offset = offset = gfc_index_zero_node;
-  add_to_offset (&cst_offset, &offset, gfc_conv_array_offset (decl));
+  /* A pointer array component can be detected from its field decl. Fix
+     the descriptor, mark the resulting variable decl and store it in
+     COOKED_DECL to pass to gfc_build_array_ref.  */
+  if (get_CFI_desc (sym, expr, &cooked_decl, ar))
+    cooked_decl = build_fold_indirect_ref_loc (input_location, cooked_decl);
+  if (!expr->ts.deferred && !sym->attr.codimension
+      && is_pointer_array (se->expr))
+    {
+      if (TREE_CODE (se->expr) == COMPONENT_REF)
+       cooked_decl = se->expr;
+      else if (TREE_CODE (se->expr) == INDIRECT_REF)
+       cooked_decl = TREE_OPERAND (se->expr, 0);
+      else
+       cooked_decl = se->expr;
+    }
+  else if (expr->ts.deferred
+          || (sym->ts.type == BT_CHARACTER
+              && sym->attr.select_type_temporary))
+    {
+      if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr)))
+       {
+         cooked_decl = se->expr;
+         if (TREE_CODE (cooked_decl) == INDIRECT_REF)
+           cooked_decl = TREE_OPERAND (cooked_decl, 0);
+       }
+      else
+       cooked_decl = sym->backend_decl;
+    }
+  else if (sym->ts.type == BT_CLASS)
+    {
+      if (UNLIMITED_POLY (sym))
+       {
+         gfc_expr *class_expr = gfc_find_and_cut_at_last_class_ref (expr);
+         gfc_init_se (&tmpse, NULL);
+         gfc_conv_expr (&tmpse, class_expr);
+         if (!se->class_vptr)
+           vptr = gfc_class_vptr_get (tmpse.expr);
+         gfc_free_expr (class_expr);
+         cooked_decl = tmpse.expr;
+       }
+      else
+       cooked_decl = NULL_TREE;
+    }
+
+  /* Find the base of the array; this normally has ARRAY_TYPE.  */
+  tree base = build_fold_indirect_ref_loc (input_location,
+                                          gfc_conv_array_data (se->expr));
+  tree type = TREE_TYPE (base);

-  /* Calculate the offsets from all the dimensions.  Make sure to associate
-     the final offset so that we form a chain of loop invariant summands.  */
+  /* Handle special cases, copied from gfc_build_array_ref.  After we get
+     through this, we know TYPE definitely is an ARRAY_TYPE.  */
+  if (GFC_ARRAY_TYPE_P (type) && GFC_TYPE_ARRAY_RANK (type) == 0)
+    {
+      gcc_assert (GFC_TYPE_ARRAY_CORANK (type) > 0);
+      se->expr = fold_convert (TYPE_MAIN_VARIANT (type), base);
+      return;
+    }
+  if (TREE_CODE (type) != ARRAY_TYPE)
+    {
+      gcc_assert (cooked_decl == NULL_TREE);
+      se->expr = base;
+      return;
+    }
+
+  /* Check for cases where we cannot delinearize.  */
+
+  bool delinearize = flag_delinearize_aref;
+
+  /* There is no point in trying to delinearize 1-dimensional arrays.  */
+  if (ar->dimen == 1)
+    delinearize = false;
+
+  if (delinearize
+      && (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr))
+         || (DECL_P (se->expr)
+             && DECL_LANG_SPECIFIC (se->expr)
+             && GFC_DECL_SAVED_DESCRIPTOR (se->expr))))
+    {
+      /* Descriptor arrays that may not be contiguous cannot
+        be delinearized without using the stride in the descriptor,
+        which generally involves introducing a division operation.
+        That's unlikely to produce optimal code, so avoid doing it.  */
+      tree desc = se->expr;
+      if (!GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr)))
+       desc = GFC_DECL_SAVED_DESCRIPTOR (se->expr);
+      tree tmptype = TREE_TYPE (desc);
+      if (POINTER_TYPE_P (tmptype))
+       tmptype = TREE_TYPE (tmptype);
+      enum gfc_array_kind akind = GFC_TYPE_ARRAY_AKIND (tmptype);
+      if (akind != GFC_ARRAY_ASSUMED_SHAPE_CONT
+         && akind != GFC_ARRAY_ASSUMED_RANK_CONT
+         && akind != GFC_ARRAY_ALLOCATABLE
+         && akind != GFC_ARRAY_POINTER_CONT)
+       delinearize = false;
+    }
+
+  /* See gfc_build_array_ref in trans.c.  If we have a cooked_decl or
+     vptr, then we most likely have to do pointer arithmetic using a
+     linearized array offset.  */
+  if (delinearize && cooked_decl)
+    delinearize = false;
+  else if (delinearize && get_class_array_vptr (se->expr, vptr))
+    delinearize = false;
+
+  if (!delinearize)
+    {
+      /* Initialize the offset from the array descriptor.  This accounts
+        for the array base being something other than zero.  */
+      cst_offset = offset = gfc_index_zero_node;
+      add_to_offset (&cst_offset, &offset, gfc_conv_array_offset (decl));
+    }
+  else
+    {
+      /* If we are delinearizing, build up the nested array type using the
+        dimension information we have for each rank.  */
+      atype = TREE_TYPE (type);
+      for (n = 0; n < ar->dimen; n++)
+       {
+         /* We're working from the outermost nested array reference inward
+            in this step.  ATYPE is the element type for the access in
+            this rank; build the new array type based on the bounds
+            information and store it back into ATYPE for the next rank's
+            processing.  */
+         tree lbound = get_array_lbound (decl, n, sym, ar, se);
+         tree ubound = get_array_ubound (decl, n, sym, ar, se);
+         tree dimen = build_range_type (TREE_TYPE (lbound),
+                                        lbound, ubound);
+         atype = build_array_type (atype, dimen);
+
+         /* Emit a DECL_EXPR for the array type so the gimplification of
+            its type sizes works correctly.  */
+         if (! TYPE_NAME (atype))
+           TYPE_NAME (atype) = build_decl (UNKNOWN_LOCATION, TYPE_DECL,
+                                           NULL_TREE, atype);
+         gfc_add_expr_to_block (&se->pre,
+                                build1 (DECL_EXPR, atype,
+                                        TYPE_NAME (atype)));
+       }
+
+      /* Cast base to the innermost array type.  */
+      if (DECL_P (base))
+       TREE_ADDRESSABLE (base) = 1;
+      aref = build1 (NOP_EXPR, atype, base);
+    }
+
+  /* Process indices in reverse order.  */
   for (n = ar->dimen - 1; n >= 0; n--)
     {
       /* Calculate the index for this dimension.  */
@@ -3888,16 +4091,7 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
          indexse.expr = save_expr (indexse.expr);

          /* Lower bound.  */
-         tmp = gfc_conv_array_lbound (decl, n);
-         if (sym->attr.temporary)
-           {
-             gfc_init_se (&tmpse, se);
-             gfc_conv_expr_type (&tmpse, ar->as->lower[n],
-                                 gfc_array_index_type);
-             gfc_add_block_to_block (&se->pre, &tmpse.pre);
-             tmp = tmpse.expr;
-           }
-
+         tmp = get_array_lbound (decl, n, sym, ar, se);
          cond = fold_build2_loc (input_location, LT_EXPR, logical_type_node,
                                  indexse.expr, tmp);
          msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' "
@@ -3912,16 +4106,7 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
             arrays.  */
          if (n < ar->dimen - 1 || ar->as->type != AS_ASSUMED_SIZE)
            {
-             tmp = gfc_conv_array_ubound (decl, n);
-             if (sym->attr.temporary)
-               {
-                 gfc_init_se (&tmpse, se);
-                 gfc_conv_expr_type (&tmpse, ar->as->upper[n],
-                                     gfc_array_index_type);
-                 gfc_add_block_to_block (&se->pre, &tmpse.pre);
-                 tmp = tmpse.expr;
-               }
-
+             tmp = get_array_ubound (decl, n, sym, ar, se);
              cond = fold_build2_loc (input_location, GT_EXPR,
                                      logical_type_node, indexse.expr, tmp);
              msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' "
@@ -3934,65 +4119,41 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, gfc_expr *expr,
            }
        }

-      /* Multiply the index by the stride.  */
-      stride = gfc_conv_array_stride (decl, n);
-      tmp = fold_build2_loc (input_location, MULT_EXPR, gfc_array_index_type,
-                            indexse.expr, stride);
-
-      /* And add it to the total.  */
-      add_to_offset (&cst_offset, &offset, tmp);
-    }
-
-  if (!integer_zerop (cst_offset))
-    offset = fold_build2_loc (input_location, PLUS_EXPR,
-                             gfc_array_index_type, offset, cst_offset);
-
-  /* A pointer array component can be detected from its field decl. Fix
-     the descriptor, mark the resulting variable decl and pass it to
-     build_array_ref.  */
-  decl = NULL_TREE;
-  if (get_CFI_desc (sym, expr, &decl, ar))
-    decl = build_fold_indirect_ref_loc (input_location, decl);
-  if (!expr->ts.deferred && !sym->attr.codimension
-      && is_pointer_array (se->expr))
-    {
-      if (TREE_CODE (se->expr) == COMPONENT_REF)
-       decl = se->expr;
-      else if (TREE_CODE (se->expr) == INDIRECT_REF)
-       decl = TREE_OPERAND (se->expr, 0);
-      else
-       decl = se->expr;
-    }
-  else if (expr->ts.deferred
-          || (sym->ts.type == BT_CHARACTER
-              && sym->attr.select_type_temporary))
-    {
-      if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se->expr)))
+      if (!delinearize)
        {
-         decl = se->expr;
-         if (TREE_CODE (decl) == INDIRECT_REF)
-           decl = TREE_OPERAND (decl, 0);
+         /* Multiply the index by the stride.  */
+         stride = gfc_conv_array_stride (decl, n);
+         tmp = fold_build2_loc (input_location, MULT_EXPR,
+                                gfc_array_index_type,
+                                indexse.expr, stride);
+
+         /* And add it to the total.  */
+         add_to_offset (&cst_offset, &offset, tmp);
        }
       else
-       decl = sym->backend_decl;
-    }
-  else if (sym->ts.type == BT_CLASS)
-    {
-      if (UNLIMITED_POLY (sym))
        {
-         gfc_expr *class_expr = gfc_find_and_cut_at_last_class_ref (expr);
-         gfc_init_se (&tmpse, NULL);
-         gfc_conv_expr (&tmpse, class_expr);
-         if (!se->class_vptr)
-           se->class_vptr = gfc_class_vptr_get (tmpse.expr);
-         gfc_free_expr (class_expr);
-         decl = tmpse.expr;
+         /* Peel off a layer of array nesting from ATYPE to
+            get the result type of the new ARRAY_REF.  */
+         atype = TREE_TYPE (atype);
+         aref = build4 (ARRAY_REF, atype, aref, indexse.expr,
+                        NULL_TREE, NULL_TREE);
        }
-      else
-       decl = NULL_TREE;
     }

-  se->expr = build_array_ref (se->expr, offset, decl, se->class_vptr);
+  if (!delinearize)
+    {
+      /* Build a linearized array reference using the offset from all
+        dimensions.  */
+      if (!integer_zerop (cst_offset))
+       offset = fold_build2_loc (input_location, PLUS_EXPR,
+                                 gfc_array_index_type, offset, cst_offset);
+      se->class_vptr = vptr;
+      vptr = get_class_array_vptr (se->expr, vptr);
+      se->expr = gfc_build_array_ref (base, offset, cooked_decl, vptr);
+    }
+  else
+    /* Return the outermost ARRAY_REF we already built.  */
+    se->expr = aref;
 }


diff --git a/gcc/testsuite/gfortran.dg/assumed_type_2.f90 b/gcc/testsuite/gfortran.dg/assumed_type_2.f90
index 5d3cd7eaece9..07be87ef1eb6 100644
--- a/gcc/testsuite/gfortran.dg/assumed_type_2.f90
+++ b/gcc/testsuite/gfortran.dg/assumed_type_2.f90
@@ -147,12 +147,12 @@ end

 ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_int," 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } }
-! { dg-final { scan-tree-dump-times "sub_scalar .&array_int.1.," 1 "original" } }
+! { dg-final { scan-tree-dump-times "sub_scalar .&.*array_int" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } }

-! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(real.kind=4..0:. . restrict\\) array_real_alloc.data" 1 "original" } }
+! { dg-final { scan-tree-dump-times "sub_scalar .&.*real.kind=4..0.*restrict.*array_real_alloc.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(character.kind=1..1:1. .\\) .array_char_ptr.data" 1 "original" } }
-! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(struct t2.0:. . restrict\\) array_t2_alloc.data" 1 "original" } }
+! { dg-final { scan-tree-dump-times "sub_scalar .&.*struct t2.0:..*restrict.*array_t2_alloc.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t3 .\\) .array_t3_ptr.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) array_class_t1_alloc._data.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) \\(array_class_t1_ptr._data.dat" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
index a3ad591f926c..d8d14c42be01 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
@@ -7,7 +7,7 @@ program main
    integer :: a(100,100), b(100,100)
    integer :: i, j, d

-   !$acc kernels ! { dg-message "optimized: assigned OpenACC seq loop parallelism" }
+   !$acc kernels ! { dg-message "optimized: assigned OpenACC gang loop parallelism" }
    do i=1,100
      do j=1,100
        a(i,j) = 1
diff --git a/gcc/testsuite/gfortran.dg/gomp/affinity-clause-1.f90 b/gcc/testsuite/gfortran.dg/gomp/affinity-clause-1.f90
index 13bdd36d0b4d..51c6013565a1 100644
--- a/gcc/testsuite/gfortran.dg/gomp/affinity-clause-1.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/affinity-clause-1.f90
@@ -22,7 +22,7 @@ end

 ! { dg-final { scan-tree-dump-times "D\\.\[0-9\]+ = .integer.kind=4.. __builtin_cosf ..real.kind=4.. a \\+ 1.0e\\+0\\);" 2 "original" } }

-! { dg-final { scan-tree-dump-times "#pragma omp task affinity\\(iterator\\(integer\\(kind=4\\) jj=2:5:2, integer\\(kind=4\\) i=D\\.\[0-9\]+:5:1\\):\\*\\(c_char \\*\\) &b\\\[.* <?i>? \\+ -1\\\]\\) affinity\\(iterator\\(integer\\(kind=4\\) jj=2:5:2, integer\\(kind=4\\) i=D\\.\[0-9\]+:5:1\\):\\*\\(c_char \\*\\) &d\\\[\\(.*jj \\* 5 \\+ .* <?i>?\\) \\+ -6\\\]\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "#pragma omp task affinity\\(iterator\\(integer\\(kind=4\\) jj=2:5:2, integer\\(kind=4\\) i=D\\.\[0-9\]+:5:1\\):\\*\\(c_char \\*\\) &b\\\[.* <?i>? \\+ -1\\\]\\) affinity\\(iterator\\(integer\\(kind=4\\) jj=2:5:2, integer\\(kind=4\\) i=D\\.\[0-9\]+:5:1\\):\\*\\(c_char \\*\\) &\\(\\(integer\\(kind.*?d\\).*?$" 1 "original" } }

 ! { dg final { scan-tree-dump-times "#pragma omp task affinity\\(iterator\\(integer\\(kind=4\\) i=D.3938:5:1\\):\\*\\(c_char \\*\\) &b\\\[\\(.* <?i>? \\+ -1\\\]\\) affinity\\(iterator\\(integer\\(kind=4\\) i=D\\.\[0-9\]+:5:1\\):\\*\\(c_char \\*\\) &d\\\[\\(\\(integer\\(kind=8\\)\\) i \\+ -1\\) \\* 6\\\]\\)"  1 "original" } }

diff --git a/gcc/testsuite/gfortran.dg/graphite/block-2.f b/gcc/testsuite/gfortran.dg/graphite/block-2.f
index bea8ddeb8267..266da378c5d9 100644
--- a/gcc/testsuite/gfortran.dg/graphite/block-2.f
+++ b/gcc/testsuite/gfortran.dg/graphite/block-2.f
@@ -1,5 +1,11 @@
 ! { dg-do compile }
 ! { dg-additional-options "-std=legacy" }
+
+! ldist introduces a __builtin_memset for the first loop and hence
+! breaks the testcase's assumption regarding the number of SCoPs
+! because Graphite cannot deal with the call.
+! { dg-additional-options "-fdisable-tree-ldist" }
+
       SUBROUTINE MATRIX_MUL_UNROLLED (A, B, C, L, M, N)
       DIMENSION A(L,M), B(M,N), C(L,N)

@@ -18,5 +24,4 @@
       RETURN
       END

-! Disabled for now as it requires delinearization.
-! { dg-final { scan-tree-dump-times "number of SCoPs: 2" 1 "graphite" { xfail *-*-* } } }
+! { dg-final { scan-tree-dump-times "number of SCoPs: 2" 1 "graphite" } }
diff --git a/gcc/testsuite/gfortran.dg/graphite/block-3.f90 b/gcc/testsuite/gfortran.dg/graphite/block-3.f90
index 452de7349050..0edca92bb894 100644
--- a/gcc/testsuite/gfortran.dg/graphite/block-3.f90
+++ b/gcc/testsuite/gfortran.dg/graphite/block-3.f90
@@ -12,6 +12,6 @@ enddo

 end subroutine matrix_multiply

-! { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" { xfail *-*-* } } }
+! { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" } }
 ! { dg-final { scan-tree-dump-times "will be loop blocked" 1 "graphite" { xfail *-*-* } } }

diff --git a/gcc/testsuite/gfortran.dg/graphite/block-4.f90 b/gcc/testsuite/gfortran.dg/graphite/block-4.f90
index 42af5b62444e..f2aed98bcf82 100644
--- a/gcc/testsuite/gfortran.dg/graphite/block-4.f90
+++ b/gcc/testsuite/gfortran.dg/graphite/block-4.f90
@@ -15,6 +15,6 @@ enddo

 end subroutine matrix_multiply

-! { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" { xfail *-*-* } } }
+! { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" } }
 ! { dg-final { scan-tree-dump-times "will be loop blocked" 1 "graphite" { xfail *-*-* } } }

diff --git a/gcc/testsuite/gfortran.dg/graphite/id-9.f b/gcc/testsuite/gfortran.dg/graphite/id-9.f
index c93937088972..885a9dfaa1bb 100644
--- a/gcc/testsuite/gfortran.dg/graphite/id-9.f
+++ b/gcc/testsuite/gfortran.dg/graphite/id-9.f
@@ -8,7 +8,7 @@
                   do l=1,3
                      do k=1,l
                      enddo
-                     bar(k,l)=bar(k,l)+(v3b-1.d0)
+                     bar(k,l)=bar(k,l)+(v3b-1.d0) ! { dg-bogus ".*iteration 2 invokes undefined behavior" "TODO" { xfail *-*-* }   }
                   enddo
             enddo
             do m=1,ne
diff --git a/gcc/testsuite/gfortran.dg/inline_matmul_16.f90 b/gcc/testsuite/gfortran.dg/inline_matmul_16.f90
index 580cb1ac9393..2a7f63b9c963 100644
--- a/gcc/testsuite/gfortran.dg/inline_matmul_16.f90
+++ b/gcc/testsuite/gfortran.dg/inline_matmul_16.f90
@@ -1,5 +1,7 @@
 ! { dg-do run }
 ! { dg-options "-ffrontend-optimize -fdump-tree-optimized -Wrealloc-lhs -finline-matmul-limit=1000 -O" }
+! { dg-additional-options "--param delinearize=0" } TODO
+
 ! PR 66094: Check functionality for MATMUL(TRANSPOSE(A),B)) for two-dimensional arrays
 program main
   implicit none
diff --git a/gcc/testsuite/gfortran.dg/inline_matmul_24.f90 b/gcc/testsuite/gfortran.dg/inline_matmul_24.f90
index 3168d5f10064..8d84f3cdb01b 100644
--- a/gcc/testsuite/gfortran.dg/inline_matmul_24.f90
+++ b/gcc/testsuite/gfortran.dg/inline_matmul_24.f90
@@ -39,4 +39,4 @@ program testMATMUL
       call abort()
     end if
 end program testMATMUL
-! { dg-final { scan-tree-dump-times "gamma5\\\[__var_1_do \\* 4 \\+ __var_2_do\\\]|gamma5\\\[NON_LVALUE_EXPR <__var_1_do> \\* 4 \\+ NON_LVALUE_EXPR <__var_2_do>\\\]" 1 "original" } }
+! { dg-final { scan-tree-dump-times "gamma5.*\\\[NON_LVALUE_EXPR <__var_1_do>\\\]\\\[NON_LVALUE_EXPR <__var_2_do>\\\]" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/no_arg_check_2.f90 b/gcc/testsuite/gfortran.dg/no_arg_check_2.f90
index 3570b9719ebb..0900dd82646f 100644
--- a/gcc/testsuite/gfortran.dg/no_arg_check_2.f90
+++ b/gcc/testsuite/gfortran.dg/no_arg_check_2.f90
@@ -129,12 +129,12 @@ end

 ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_int," 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } }
-! { dg-final { scan-tree-dump-times "sub_scalar .&array_int.1.," 1 "original" } }
+! { dg-final { scan-tree-dump-times "sub_scalar .&.*array_int" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .&scalar_t1," 1 "original" } }

-! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(real.kind=4..0:. . restrict\\) array_real_alloc.data" 1 "original" } }
+! { dg-final { scan-tree-dump-times "sub_scalar .&.*real.kind=4..0.*restrict.*array_real_alloc.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(character.kind=1..1:1. .\\) .array_char_ptr.data" 1 "original" } }
-! { dg-final { scan-tree-dump-times "sub_scalar .&\\(.\\(struct t2.0:. . restrict\\) array_t2_alloc.data" 1 "original" } }
+! { dg-final { scan-tree-dump-times "sub_scalar .&.*struct t2.0:..*restrict.*array_t2_alloc.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t3 .\\) .array_t3_ptr.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) array_class_t1_alloc._data.data" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "sub_scalar .\\(struct t1 .\\) \\(array_class_t1_ptr._data.dat" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/pr32921.f b/gcc/testsuite/gfortran.dg/pr32921.f
index 0661208edde5..853438609c43 100644
--- a/gcc/testsuite/gfortran.dg/pr32921.f
+++ b/gcc/testsuite/gfortran.dg/pr32921.f
@@ -45,4 +45,4 @@

       RETURN
       END
-! { dg-final { scan-tree-dump-times "stride" 4 "lim2" } }
+! { dg-final { scan-tree-dump-times "ubound" 4 "lim2" } }
diff --git a/gcc/testsuite/gfortran.dg/reassoc_4.f b/gcc/testsuite/gfortran.dg/reassoc_4.f
index fdcb46e835cf..2368b76aecb2 100644
--- a/gcc/testsuite/gfortran.dg/reassoc_4.f
+++ b/gcc/testsuite/gfortran.dg/reassoc_4.f
@@ -1,5 +1,5 @@
 ! { dg-do compile }
-! { dg-options "-O3 -ffast-math -fdump-tree-reassoc1 --param max-completely-peeled-insns=200" }
+! { dg-options "-O3 -ffast-math -fdump-tree-reassoc1 --param max-completely-peeled-insns=200 --param delinearize=0" }
       subroutine anisonl(w,vo,anisox,s,ii1,jj1,weight)
       integer ii1,jj1,i1,iii1,j1,jjj1,k1,l1,m1,n1
       real*8 w(3,3),vo(3,3),anisox(3,3,3,3),s(60,60),weight
diff --git a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
index 08965cc5e202..6c469b1964c6 100644
--- a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
+++ b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f
@@ -3,6 +3,7 @@
 ! { dg-options "-O3 --param vect-max-peeling-for-alignment=0 -fpredictive-commoning -fdump-tree-pcom-details -std=legacy" }
 ! { dg-additional-options "-mprefer-avx128" { target { i?86-*-* x86_64-*-* } } }
 ! { dg-additional-options "-mzarch" { target { s390*-*-* } } }
+! { dg-additional-options "--param delinearize=0" } TODO

 ******* RESID COMPUTES THE RESIDUAL:  R = V - AU
 *
--
2.33.0
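
For readers following the delinearization change in this patch, the
difference between the two code paths can be illustrated with a small C
sketch.  This is not code from the patch: the bounds (LB1, UB1, LB2, UB2)
and the helpers linear_index, delinearized_ref and references_agree are
hypothetical names chosen for illustration, and the sketch assumes a
contiguous, column-major rank-2 array.  The linearized form computes one
flat offset from per-dimension strides plus a descriptor-style base
offset; the delinearized form indexes a nested array type one dimension
at a time, starting with the last dimension, which keeps the
per-dimension subscripts visible instead of folding them into a single
expression.

```c
#include <stddef.h>

/* Hypothetical bounds of a Fortran-style array a(LB1:UB1, LB2:UB2).  */
enum { LB1 = 1, UB1 = 4, LB2 = 1, UB2 = 3 };
enum { EXT1 = UB1 - LB1 + 1, EXT2 = UB2 - LB2 + 1 };

/* Linearized access: multiply each index by its stride, sum the results
   and add a base offset that accounts for the non-zero lower bounds.  */
static size_t
linear_index (long i, long j)
{
  const long stride1 = 1;     /* First dimension varies fastest.  */
  const long stride2 = EXT1;  /* Assumes contiguous storage.  */
  const long offset = -(LB1 * stride1 + LB2 * stride2);
  return (size_t) (offset + i * stride1 + j * stride2);
}

/* Delinearized access: view the same storage as a nested array and
   apply one subscript per dimension, last dimension first.  C arrays
   are zero-based, so the lower bounds are subtracted explicitly.  */
static int *
delinearized_ref (int *base, long i, long j)
{
  int (*nested)[EXT1] = (int (*)[EXT1]) base;
  return &nested[j - LB2][i - LB1];
}

/* Return 1 if both forms address the same element for every valid
   (i, j) pair, 0 otherwise.  */
static int
references_agree (void)
{
  static int storage[EXT1 * EXT2];
  for (long j = LB2; j <= UB2; j++)
    for (long i = LB1; i <= UB1; i++)
      if (&storage[linear_index (i, j)] != delinearized_ref (storage, i, j))
        return 0;
  return 1;
}
```

Calling references_agree () from a test driver should return 1 for the
contiguous case shown here.  For a non-contiguous descriptor array the
nested view would no longer match the storage layout, which corresponds
to the cases the patch excludes from delinearization.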


* [PATCH 40/40] openacc: Adjust testsuite to new "kernels" handling
  2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
                   ` (6 preceding siblings ...)
  2021-12-15 15:54 ` [PATCH 13/40] Fortran: Delinearize array accesses Frederik Harwath
@ 2021-12-16 12:00 ` Frederik Harwath
  7 siblings, 0 replies; 9+ messages in thread
From: Frederik Harwath @ 2021-12-16 12:00 UTC (permalink / raw)
  To: gcc-patches, fortran; +Cc: Catherine_Moore



Adjust the testsuite to changed expectations with the new
Graphite-based "kernels" handling.

libgomp/ChangeLog:

        * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Adjust.
        * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/declare-vla-kernels-decompose-ice-1.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c:
        Adjust.
        * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/pr84955-1.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/pr85486.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Adjust.
        * testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Adjust.
        * testsuite/libgomp.oacc-fortran/if-1.f90: Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-acc-loop-reduction-2.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90:
        Adjust.
        * testsuite/libgomp.oacc-fortran/optional-private.f90: Adjust.
        * testsuite/libgomp.oacc-fortran/pr94358-1.f90: Adjust.
        * testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Adjust.

gcc/testsuite/ChangeLog:

        * c-c++-common/goacc-gomp/nesting-1.c: Adjust.
        * c-c++-common/goacc/cache-3-1.c: Adjust.
        * c-c++-common/goacc/classify-kernels-unparallelized.c: Adjust.
        * c-c++-common/goacc/classify-kernels.c: Adjust.
        * c-c++-common/goacc/classify-routine-nohost.c: Adjust.
        * c-c++-common/goacc/classify-serial.c: Adjust.
        * c-c++-common/goacc/if-clause-2.c: Adjust.
        * c-c++-common/goacc/kernels-1.c: Adjust.
        * c-c++-common/goacc/kernels-counter-var-redundant-load.c: Adjust.
        * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Adjust.
        * c-c++-common/goacc/kernels-decompose-1.c: Adjust.
        * c-c++-common/goacc/kernels-decompose-2.c: Adjust.
        * c-c++-common/goacc/kernels-decompose-ice-1.c: Adjust.
        * c-c++-common/goacc/kernels-decompose-ice-2.c: Adjust.
        * c-c++-common/goacc/kernels-double-reduction-n.c: Adjust.
        * c-c++-common/goacc/kernels-double-reduction.c: Adjust.
        * c-c++-common/goacc/kernels-loop-2.c: Adjust.
        * c-c++-common/goacc/kernels-loop-3.c: Adjust.
        * c-c++-common/goacc/kernels-loop-data-2.c: Adjust.
        * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Adjust.
        * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Adjust.
        * c-c++-common/goacc/kernels-loop-data-update.c: Adjust.
        * c-c++-common/goacc/kernels-loop-data.c: Adjust.
        * c-c++-common/goacc/kernels-loop-g.c: Adjust.
        * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Adjust.
        * c-c++-common/goacc/kernels-loop-n.c: Adjust.
        * c-c++-common/goacc/kernels-loop-nest.c: Adjust.
        * c-c++-common/goacc/kernels-loop.c: Adjust.
        * c-c++-common/goacc/kernels-one-counter-var.c: Adjust.
        * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Adjust.
        * c-c++-common/goacc/kernels-reduction.c: Adjust.
        * c-c++-common/goacc/loop-2-kernels.c: Adjust.
        * c-c++-common/goacc/loop-auto-1.c: Adjust.
        * c-c++-common/goacc/loop-auto-2.c: Adjust.
        * c-c++-common/goacc/nested-reductions-2-parallel.c: Adjust.
        * c-c++-common/goacc/omp_data_optimize-1.c: Adjust.
        * c-c++-common/goacc/routine-nohost-1.c: Adjust.
        * c-c++-common/goacc/uninit-copy-clause.c: Adjust.
        * g++.dg/goacc/omp_data_optimize-1.C: Adjust.
        * g++.dg/goacc/template.C: Adjust.
        * gcc.dg/goacc/loop-processing-1.c: Adjust.
        * gcc.dg/goacc/nested-function-1.c: Adjust.
        * gfortran.dg/directive_unroll_1.f90: Adjust.
        * gfortran.dg/directive_unroll_4.f90: Adjust.
        * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust.
        * gfortran.dg/goacc/classify-kernels.f95: Adjust.
        * gfortran.dg/goacc/classify-parallel.f95: Adjust.
        * gfortran.dg/goacc/classify-routine-nohost.f95: Adjust.
        * gfortran.dg/goacc/classify-routine.f95: Adjust.
        * gfortran.dg/goacc/classify-serial.f95: Adjust.
        * gfortran.dg/goacc/common-block-3.f90: Adjust.
        * gfortran.dg/goacc/declare-3.f95: Adjust.
        * gfortran.dg/goacc/gang-static.f95: Adjust.
        * gfortran.dg/goacc/kernels-decompose-1.f95: Adjust.
        * gfortran.dg/goacc/kernels-decompose-2.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-2.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-data-2.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-data-update.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-data.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-inner.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop-n.f95: Adjust.
        * gfortran.dg/goacc/kernels-loop.f95: Adjust.
        * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Adjust.
        * gfortran.dg/goacc/kernels-tree.f95: Adjust.
        * gfortran.dg/goacc/loop-2-kernels.f95: Adjust.
        * gfortran.dg/goacc/loop-auto-transfer-2.f90: Adjust.
        * gfortran.dg/goacc/loop-auto-transfer-3.f90: Adjust.
        * gfortran.dg/goacc/loop-auto-transfer-4.f90: Adjust.
        * gfortran.dg/goacc/nested-function-1.f90: Adjust.
        * gfortran.dg/goacc/nested-reductions-2-parallel.f90: Adjust.
        * gfortran.dg/goacc/omp_data_optimize-1.f90: Adjust.
        * gfortran.dg/goacc/private-explicit-kernels-1.f95: Adjust.
        * gfortran.dg/goacc/private-predetermined-kernels-1.f95: Adjust.
        * gfortran.dg/goacc/privatization-1-compute-loop.f90: Adjust.
        * gfortran.dg/goacc/routine-module-mod-1.f90: Adjust.
        * gfortran.dg/goacc/routine-multiple-directives-1.f90: Adjust.
        * gfortran.dg/goacc/uninit-copy-clause.f95: Adjust.
        * c-c++-common/goacc/loop-auto-reductions.c: New test.
        * c-c++-common/goacc/note-parallelism-kernels-loops-1.c: New test.
        * c-c++-common/goacc/note-parallelism-kernels-loops-parloops.c: New test.
        * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95: New
        test.
        * gfortran.dg/goacc/kernels-conversion.f95: New test.
        * gfortran.dg/goacc/kernels-reductions.f90: New test.
---
 .../c-c++-common/goacc-gomp/nesting-1.c       |  10 +-
 gcc/testsuite/c-c++-common/goacc/cache-3-1.c  |   2 +-
 .../goacc/classify-kernels-unparallelized.c   |  34 ++-
 .../c-c++-common/goacc/classify-kernels.c     |  21 +-
 .../goacc/classify-routine-nohost.c           |  20 +-
 .../c-c++-common/goacc/classify-serial.c      |   8 +-
 .../c-c++-common/goacc/if-clause-2.c          |   2 +-
 gcc/testsuite/c-c++-common/goacc/kernels-1.c  |  17 +-
 .../kernels-counter-var-redundant-load.c      |  20 +-
 .../kernels-counter-vars-function-scope.c     |  11 +-
 .../c-c++-common/goacc/kernels-decompose-1.c  |  31 ++-
 .../c-c++-common/goacc/kernels-decompose-2.c  |  58 +++--
 .../goacc/kernels-decompose-ice-1.c           |   7 +-
 .../goacc/kernels-decompose-ice-2.c           |   3 +-
 .../goacc/kernels-double-reduction-n.c        |   5 +-
 .../goacc/kernels-double-reduction.c          |   4 +-
 .../c-c++-common/goacc/kernels-loop-2.c       |  20 +-
 .../c-c++-common/goacc/kernels-loop-3.c       |   2 +
 .../c-c++-common/goacc/kernels-loop-data-2.c  |  18 +-
 .../goacc/kernels-loop-data-enter-exit-2.c    |  17 +-
 .../goacc/kernels-loop-data-enter-exit.c      |  18 +-
 .../goacc/kernels-loop-data-update.c          |  14 +-
 .../c-c++-common/goacc/kernels-loop-data.c    |  13 +-
 .../c-c++-common/goacc/kernels-loop-g.c       |  15 +-
 .../goacc/kernels-loop-mod-not-zero.c         |  11 +-
 .../c-c++-common/goacc/kernels-loop-n.c       |  11 +-
 .../c-c++-common/goacc/kernels-loop-nest.c    |  13 +-
 .../c-c++-common/goacc/kernels-loop.c         |  11 +-
 .../goacc/kernels-one-counter-var.c           |  13 +-
 .../kernels-parallel-loop-data-enter-exit.c   |  18 +-
 .../c-c++-common/goacc/kernels-reduction.c    |   9 +-
 .../c-c++-common/goacc/loop-2-kernels.c       |   6 +-
 .../c-c++-common/goacc/loop-auto-1.c          | 127 +++++------
 .../c-c++-common/goacc/loop-auto-2.c          |  37 +--
 .../c-c++-common/goacc/loop-auto-reductions.c |  22 ++
 .../goacc/nested-reductions-2-parallel.c      | 138 +++++++++++
 .../goacc/note-parallelism-kernels-loops-1.c  |  61 +++++
 .../note-parallelism-kernels-loops-parloops.c |  53 +++++
 .../c-c++-common/goacc/omp_data_optimize-1.c  | 208 ++++++++---------
 .../c-c++-common/goacc/routine-nohost-1.c     |   2 +-
 .../c-c++-common/goacc/uninit-copy-clause.c   |   6 -
 .../g++.dg/goacc/omp_data_optimize-1.C        |  32 +--
 gcc/testsuite/g++.dg/goacc/template.C         |  18 +-
 .../gcc.dg/goacc/loop-processing-1.c          |   9 +-
 .../gcc.dg/goacc/nested-function-1.c          |   3 +-
 .../gfortran.dg/directive_unroll_1.f90        |   2 +-
 .../gfortran.dg/directive_unroll_4.f90        |   2 +-
 ...assify-kernels-unparallelized-parloops.f95 |  44 ++++
 .../goacc/classify-kernels-unparallelized.f95 |  27 +--
 .../gfortran.dg/goacc/classify-kernels.f95    |  21 +-
 .../gfortran.dg/goacc/classify-parallel.f95   |   6 +-
 .../goacc/classify-routine-nohost.f95         |  18 +-
 .../gfortran.dg/goacc/classify-routine.f95    |  20 +-
 .../gfortran.dg/goacc/classify-serial.f95     |   8 +-
 .../gfortran.dg/goacc/common-block-3.f90      |  16 +-
 gcc/testsuite/gfortran.dg/goacc/declare-3.f95 |   2 +-
 .../gfortran.dg/goacc/gang-static.f95         |  14 +-
 .../gfortran.dg/goacc/kernels-conversion.f95  |  52 +++++
 .../gfortran.dg/goacc/kernels-decompose-1.f95 | 186 ++++++++++-----
 .../gfortran.dg/goacc/kernels-decompose-2.f95 | 114 +++++++---
 .../gfortran.dg/goacc/kernels-loop-2.f95      |  11 +-
 .../gfortran.dg/goacc/kernels-loop-data-2.f95 |  11 +-
 .../goacc/kernels-loop-data-enter-exit-2.f95  |  13 +-
 .../goacc/kernels-loop-data-enter-exit.f95    |  13 +-
 .../goacc/kernels-loop-data-update.f95        |  13 +-
 .../gfortran.dg/goacc/kernels-loop-data.f95   |  15 +-
 .../gfortran.dg/goacc/kernels-loop-inner.f95  |   6 +-
 .../gfortran.dg/goacc/kernels-loop-n.f95      |  14 +-
 .../gfortran.dg/goacc/kernels-loop.f95        |  10 +-
 .../kernels-parallel-loop-data-enter-exit.f95 |  13 +-
 .../gfortran.dg/goacc/kernels-reductions.f90  |  37 +++
 .../gfortran.dg/goacc/kernels-tree.f95        |   2 +-
 .../gfortran.dg/goacc/loop-2-kernels.f95      |   6 +-
 .../goacc/loop-auto-transfer-2.f90            |   2 -
 .../goacc/loop-auto-transfer-3.f90            |   8 -
 .../goacc/loop-auto-transfer-4.f90            |  30 ---
 .../gfortran.dg/goacc/nested-function-1.f90   |  12 +-
 .../goacc/nested-reductions-2-parallel.f90    | 177 +++++++++++++++
 .../gfortran.dg/goacc/omp_data_optimize-1.f90 | 214 +++++++++---------
 .../goacc/private-explicit-kernels-1.f95      |  13 +-
 .../goacc/private-predetermined-kernels-1.f95 |  16 +-
 .../goacc/privatization-1-compute-loop.f90    |   3 -
 .../goacc/routine-module-mod-1.f90            |   4 +-
 .../goacc/routine-multiple-directives-1.f90   |  32 +--
 .../gfortran.dg/goacc/uninit-copy-clause.f95  |   2 -
 .../libgomp.oacc-c++/privatized-ref-2.C       |   4 +-
 .../libgomp.oacc-c++/privatized-ref-3.C       |   4 +-
 .../acc_prof-kernels-1.c                      |   4 +-
 .../declare-vla-kernels-decompose-ice-1.c     |   4 -
 .../kernels-decompose-1.c                     |   8 +-
 .../kernels-private-vars-local-worker-1.c     |   6 +-
 .../kernels-private-vars-local-worker-2.c     |   6 +-
 .../kernels-private-vars-local-worker-3.c     |   6 +-
 .../kernels-private-vars-local-worker-4.c     |   8 +-
 .../kernels-private-vars-local-worker-5.c     |   6 +-
 .../kernels-private-vars-loop-gang-1.c        |   4 +-
 .../kernels-private-vars-loop-gang-2.c        |   4 +-
 .../kernels-private-vars-loop-gang-3.c        |   4 +-
 .../kernels-private-vars-loop-gang-4.c        |  15 +-
 .../kernels-private-vars-loop-gang-5.c        |  10 +-
 .../kernels-private-vars-loop-gang-6.c        |   4 +-
 .../kernels-private-vars-loop-vector-1.c      |   6 +-
 .../kernels-private-vars-loop-vector-2.c      |   6 +-
 .../kernels-private-vars-loop-worker-1.c      |   8 +-
 .../kernels-private-vars-loop-worker-2.c      |   6 +-
 .../kernels-private-vars-loop-worker-3.c      |   6 +-
 .../kernels-private-vars-loop-worker-4.c      |   6 +-
 .../kernels-private-vars-loop-worker-5.c      |   9 +-
 .../kernels-private-vars-loop-worker-6.c      |   6 +-
 .../kernels-private-vars-loop-worker-7.c      |   6 +-
 .../libgomp.oacc-c-c++-common/loop-auto-1.c   |  30 ++-
 .../libgomp.oacc-c-c++-common/parallel-dims.c |  39 ++--
 .../libgomp.oacc-c-c++-common/pr84955-1.c     |   1 -
 .../libgomp.oacc-c-c++-common/pr85381-2.c     |   8 +-
 .../libgomp.oacc-c-c++-common/pr85381-3.c     |   8 +-
 .../libgomp.oacc-c-c++-common/pr85381-4.c     |   4 +-
 .../libgomp.oacc-c-c++-common/pr85486-2.c     |   4 +-
 .../libgomp.oacc-c-c++-common/pr85486-3.c     |   4 +-
 .../libgomp.oacc-c-c++-common/pr85486.c       |   4 +-
 .../routine-nohost-1.c                        |   6 +-
 .../vector-length-128-1.c                     |   5 +-
 .../vector-length-128-2.c                     |   6 +-
 .../vector-length-128-3.c                     |   5 +-
 .../vector-length-128-4.c                     |   5 +-
 .../vector-length-128-5.c                     |   5 +-
 .../vector-length-128-6.c                     |   5 +-
 .../vector-length-128-7.c                     |   5 +-
 .../testsuite/libgomp.oacc-fortran/if-1.f90   |  32 +--
 .../kernels-acc-loop-reduction-2.f90          |  12 +-
 .../kernels-private-vars-loop-gang-1.f90      |   4 +-
 .../kernels-private-vars-loop-gang-2.f90      |   4 +-
 .../kernels-private-vars-loop-gang-3.f90      |   4 +-
 .../kernels-private-vars-loop-gang-6.f90      |   5 +-
 .../kernels-private-vars-loop-vector-1.f90    |   4 +-
 .../kernels-private-vars-loop-vector-2.f90    |  11 +-
 .../kernels-private-vars-loop-worker-1.f90    |   6 +-
 .../kernels-private-vars-loop-worker-2.f90    |   4 +-
 .../kernels-private-vars-loop-worker-3.f90    |   4 +-
 .../kernels-private-vars-loop-worker-4.f90    |   4 +-
 .../kernels-private-vars-loop-worker-5.f90    |   7 +-
 .../kernels-private-vars-loop-worker-6.f90    |   4 +-
 .../kernels-private-vars-loop-worker-7.f90    |   6 +-
 .../libgomp.oacc-fortran/optional-private.f90 |   2 -
 .../libgomp.oacc-fortran/pr94358-1.f90        |   2 -
 .../libgomp.oacc-fortran/routine-nohost-1.f90 |   4 +-
 145 files changed, 1697 insertions(+), 1109 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/loop-auto-reductions.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/note-parallelism-kernels-loops-parloops.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-conversion.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-reductions.f90



[-- Attachment #2: 0040-openacc-Adjust-testsuite-to-new-kernels-handling.patch.gz --]
[-- Type: application/gzip, Size: 41658 bytes --]

Thread overview: 9+ messages
2021-12-15 15:54 [PATCH 00/40] OpenACC "kernels" Improvements Frederik Harwath
2021-12-15 15:54 ` [PATCH 03/40] Kernels loops annotation: Fortran Frederik Harwath
2021-12-15 15:54 ` [PATCH 04/40] Additional Fortran testsuite fixes for kernels loops annotation pass Frederik Harwath
2021-12-15 15:54 ` [PATCH 06/40] Add a "combined" flag for "acc kernels loop" etc directives Frederik Harwath
2021-12-15 15:54 ` [PATCH 08/40] Annotate inner loops in "acc kernels loop" directives (Fortran) Frederik Harwath
2021-12-15 15:54 ` [PATCH 09/40] Permit calls to builtins and intrinsics in kernels loops Frederik Harwath
2021-12-15 15:54 ` [PATCH 10/40] Fix patterns in Fortran tests for kernels loop annotation Frederik Harwath
2021-12-15 15:54 ` [PATCH 13/40] Fortran: Delinearize array accesses Frederik Harwath
2021-12-16 12:00 ` [PATCH 40/40] openacc: Adjust testsuite to new "kernels" handling Frederik Harwath
