public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/2] [OpenACC] Kernels loop annotation
@ 2020-09-09 17:53 Sandra Loosemore
  2020-09-09 17:53 ` [PATCH 1/2] [OpenACC] Kernels loops annotation: C and C++ Sandra Loosemore
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Sandra Loosemore @ 2020-09-09 17:53 UTC
  To: gcc-patches

This set of patches implements C/C++ and Fortran front end support for
adding "acc loop auto" annotations to loop nests in OpenACC kernels
regions.  For background on this, refer to Thomas Schwinge's talk from
the 2019 GNU Tools Cauldron, at

https://gcc.gnu.org/wiki/cauldron2019talks?action=AttachFile&do=view&target=OpenACC+kernels-cauldron2019.pdf

In particular, pages 20-24 describe this part of the work.  We're
trying to identify loops that might be parallelizable and convert them
to ACC_LOOP tree structures for further analysis, instead of lowering
them to goto form early in compilation, as we do with ordinary
for/while/do loops in C/C++ and DO loops in Fortran.
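
To make the target concrete, here is a minimal sketch (not taken from
the patches or their test suite) of the kind of loop involved.  The
function names scale_plain and scale_annotated are made up for this
illustration.  The first function contains an ordinary for loop in a
kernels region, which is the sort of loop the annotation pass would
consider; the second writes the "acc loop auto" annotation out by hand.
Whether a given loop is actually annotated depends on the checks in
the patch.  Compiled without -fopenacc, the pragmas are ignored and
both functions simply run sequentially.

```c
/* Hypothetical example: an ordinary for loop inside a kernels region.
   With loop annotation enabled, a loop of this shape is a candidate
   for being handled as if "#pragma acc loop auto" preceded it.  */
void
scale_plain (float *dst, const float *src, int n)
{
#pragma acc kernels copyin(src[0:n]) copyout(dst[0:n])
  {
    for (int i = 0; i < n; i++)
      dst[i] = 2.0f * src[i];
  }
}

/* The same loop with the annotation written explicitly.  */
void
scale_annotated (float *dst, const float *src, int n)
{
#pragma acc kernels copyin(src[0:n]) copyout(dst[0:n])
  {
#pragma acc loop auto
    for (int i = 0; i < n; i++)
      dst[i] = 2.0f * src[i];
  }
}
```

Both functions compute the same result; the difference is only in
whether the loop annotation comes from the user or from the front end.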

The C/C++ patches depend on my earlier not-yet-reviewed patch series
to unify the loop tree representations in the two front ends, which I
most recently reposted here:

https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551927.html

These patches have all been previously committed to the OG10 branch to
unblock other work on subsequent processing of the loops with auto
annotations, but without proper review.  Here I've squashed the
follow-up bug fixes and incremental improvements I committed to the
branch into the original patches to simplify review.  For reference,
the corresponding commits on the OG10 branch were:

c96409c1f190e29fd9809890fb16d72556f3b7e6
fdbad20a57e03e05b608f19db41a454bc0cd1c47
6d670e648e76fe44589a42ee458098ff84d24af1
cb581bdb6689d74c1849b7e6bc139c6f122fdcc2
16d59cea8362c1ec731aa5b4db42a2817f036f23
7b436e90a4e03fdce5b0c6a8c452d3f23f1e136b
c2789b61cf29397295f39a43f5d1605ab8a32d87
1c9af55d7ff76e2e6b633af33e6e6991a0ba4c48
20f37fd2f9c8c52fff380982d6fc5eb2d88b3dd9
df5f2065bad30dc6aff9653237157c33fd4161cd

Sandra Loosemore (2):
  Kernels loops annotation: C and C++.
  Kernels loops annotation: Fortran.

 gcc/c-family/c-common.h                            |   1 +
 gcc/c-family/c-omp.c                               | 916 +++++++++++++++++++--
 gcc/c-family/c.opt                                 |   8 +
 gcc/c/c-decl.c                                     |  28 +
 gcc/c/c-parser.c                                   |   3 +
 gcc/cp/decl.c                                      |  44 +
 gcc/cp/parser.c                                    |   3 +
 gcc/cp/semantics.c                                 |   9 +
 gcc/doc/invoke.texi                                |  34 +-
 gcc/fortran/gfortran.h                             |   1 +
 gcc/fortran/lang.opt                               |   8 +
 gcc/fortran/openmp.c                               | 415 ++++++++++
 gcc/fortran/parse.c                                |   9 +
 gcc/fortran/trans-openmp.c                         |  30 +-
 .../goacc/classify-kernels-unparallelized.c        |   1 +
 .../c-c++-common/goacc/classify-kernels.c          |   1 +
 .../c-c++-common/goacc/combined-directives.c       |   2 +-
 .../goacc/kernels-counter-var-redundant-load.c     |   1 +
 .../goacc/kernels-counter-vars-function-scope.c    |   1 +
 .../goacc/kernels-double-reduction-n.c             |   1 +
 .../c-c++-common/goacc/kernels-double-reduction.c  |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c  |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-annotation-1.c |  26 +
 .../goacc/kernels-loop-annotation-10.c             |  32 +
 .../goacc/kernels-loop-annotation-11.c             |  27 +
 .../goacc/kernels-loop-annotation-12.c             |  28 +
 .../goacc/kernels-loop-annotation-13.c             |  27 +
 .../goacc/kernels-loop-annotation-14.c             |  22 +
 .../goacc/kernels-loop-annotation-15.c             |  22 +
 .../goacc/kernels-loop-annotation-16.c             |  26 +
 .../goacc/kernels-loop-annotation-17.c             |  26 +
 .../goacc/kernels-loop-annotation-18.c             |  18 +
 .../goacc/kernels-loop-annotation-19.c             |  19 +
 .../c-c++-common/goacc/kernels-loop-annotation-2.c |  21 +
 .../goacc/kernels-loop-annotation-20.c             |  23 +
 .../goacc/kernels-loop-annotation-21.c             |  42 +
 .../goacc/kernels-loop-annotation-22.c             |  41 +
 .../c-c++-common/goacc/kernels-loop-annotation-3.c |  24 +
 .../c-c++-common/goacc/kernels-loop-annotation-4.c |  34 +
 .../c-c++-common/goacc/kernels-loop-annotation-5.c |  27 +
 .../c-c++-common/goacc/kernels-loop-annotation-6.c |  27 +
 .../c-c++-common/goacc/kernels-loop-annotation-7.c |  26 +
 .../c-c++-common/goacc/kernels-loop-annotation-8.c |  27 +
 .../c-c++-common/goacc/kernels-loop-annotation-9.c |  26 +
 .../c-c++-common/goacc/kernels-loop-data-2.c       |   1 +
 .../goacc/kernels-loop-data-enter-exit-2.c         |   1 +
 .../goacc/kernels-loop-data-enter-exit.c           |   1 +
 .../c-c++-common/goacc/kernels-loop-data-update.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-data.c         |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-mod-not-zero.c |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-nest.c         |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop.c    |   1 +
 .../c-c++-common/goacc/kernels-one-counter-var.c   |   1 +
 .../goacc/kernels-parallel-loop-data-enter-exit.c  |   1 +
 .../c-c++-common/goacc/kernels-reduction.c         |   1 +
 .../goacc/classify-kernels-unparallelized.f95      |   1 +
 .../gfortran.dg/goacc/classify-kernels.f95         |   1 +
 .../gfortran.dg/goacc/combined-directives.f90      |  19 +-
 gcc/testsuite/gfortran.dg/goacc/common-block-3.f90 |   1 +
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 |   1 +
 .../goacc/kernels-loop-annotation-1.f95            |  33 +
 .../goacc/kernels-loop-annotation-10.f95           |  32 +
 .../goacc/kernels-loop-annotation-11.f95           |  34 +
 .../goacc/kernels-loop-annotation-12.f95           |  39 +
 .../goacc/kernels-loop-annotation-13.f95           |  38 +
 .../goacc/kernels-loop-annotation-14.f95           |  35 +
 .../goacc/kernels-loop-annotation-15.f95           |  35 +
 .../goacc/kernels-loop-annotation-16.f95           |  34 +
 .../goacc/kernels-loop-annotation-18.f95           |  28 +
 .../goacc/kernels-loop-annotation-19.f95           |  29 +
 .../goacc/kernels-loop-annotation-2.f95            |  32 +
 .../goacc/kernels-loop-annotation-20.f95           |  26 +
 .../goacc/kernels-loop-annotation-3.f95            |  33 +
 .../goacc/kernels-loop-annotation-4.f95            |  34 +
 .../goacc/kernels-loop-annotation-5.f95            |  35 +
 .../goacc/kernels-loop-annotation-6.f95            |  34 +
 .../goacc/kernels-loop-annotation-7.f95            |  48 ++
 .../goacc/kernels-loop-annotation-8.f95            |  50 ++
 .../goacc/kernels-loop-annotation-9.f95            |  34 +
 .../gfortran.dg/goacc/kernels-loop-data-2.f95      |   1 +
 .../goacc/kernels-loop-data-enter-exit-2.f95       |   1 +
 .../goacc/kernels-loop-data-enter-exit.f95         |   1 +
 .../gfortran.dg/goacc/kernels-loop-data-update.f95 |   1 +
 .../gfortran.dg/goacc/kernels-loop-data.f95        |   1 +
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 |   1 +
 gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95   |   1 +
 .../kernels-parallel-loop-data-enter-exit.f95      |   1 +
 .../goacc/private-explicit-kernels-1.f95           |   7 +-
 .../goacc/private-predetermined-kernels-1.f95      |   7 +-
 gcc/tree.h                                         |   5 +
 93 files changed, 2755 insertions(+), 81 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95

-- 
2.8.1



* [PATCH 1/2] [OpenACC] Kernels loops annotation: C and C++.
  2020-09-09 17:53 [PATCH 0/2] [OpenACC] Kernels loop annotation Sandra Loosemore
@ 2020-09-09 17:53 ` Sandra Loosemore
  2020-10-22 20:21   ` Jason Merrill
  2020-09-09 17:53 ` [PATCH 2/2] [OpenACC] Kernels loops annotation: Fortran Sandra Loosemore
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Sandra Loosemore @ 2020-09-09 17:53 UTC
  To: gcc-patches

This patch detects loops in kernels regions that are candidates for
parallelization, and adds "#pragma acc loop auto" annotations to them.
This annotation is controlled by the -fopenacc-kernels-annotate-loops
option, which is enabled by default.  -Wopenacc-kernels-annotate-loops
can be used to produce diagnostics about loops that cannot be
annotated.
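
As a hedged illustration of loops that would probably be rejected, the
sketch below shows two loop shapes that correspond to failure states
named in the patch (as_invalid_modification and as_invalid_break).
The function names fill_skipping and find_first_negative are made up
for this example, and the exact conditions and diagnostic text are
determined by the patch itself, not by this sketch.  With the patch
applied, something like "gcc -fopenacc -Wopenacc-kernels-annotate-loops"
would be the way to see which loops were not annotated.

```c
/* Hypothetical example: the loop body modifies the induction
   variable, so the iteration space is not fixed by the loop header
   (compare as_invalid_modification in the patch).  */
void
fill_skipping (int *out, int n)
{
#pragma acc kernels copy(out[0:n])
  {
    for (int i = 0; i < n; i++)
      {
        out[i] = 1;
        if (i % 2 == 0)
          i++;
      }
  }
}

/* Hypothetical example: the loop exits early with break
   (compare as_invalid_break in the patch).  Returns the index of the
   first negative element, or N if there is none.  */
int
find_first_negative (const int *v, int n)
{
  int pos = n;
#pragma acc kernels copyin(v[0:n]) copy(pos)
  {
    for (int i = 0; i < n; i++)
      if (v[i] < 0)
        {
          pos = i;
          break;
        }
  }
  return pos;
}
```

Loops like these are still compiled; they are simply left as ordinary
loops rather than being converted for further parallelization
analysis.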

2020-09-08  Sandra Loosemore  <sandra@codesourcery.com>

	gcc/c-family/
	* c-common.h (c_oacc_annotate_loops_in_kernels_regions): Declare.
	* c-omp.c: Include tree-iterator.h.
	(enum annotation_state): New.
	(struct annotation_info): New.
	(do_not_annotate_loop): New.
	(do_not_annotate_loop_nest): New.
	(annotation_error): New.
	(c_finish_omp_for_internal): New.
	(c_finish_omp_for): Use c_finish_omp_for_internal.
	(is_local_var): New.
	(end_test_ok_for_annotation_r): New.
	(end_test_ok_for_annotation): New.
	(lang_specific_unwrap_initializer): New.
	(annotate_for_loop): New.
	(annotate_and_check_for_loop): New.
	(annotate_loops_in_kernels_regions): New.
	(c_oacc_annotate_loops_in_kernels_regions): New.
	* c.opt (Wopenacc-kernels-annotate-loops): New.
	(fopenacc-kernels-annotate-loops): New.

	gcc/c/
	* c-decl.c (c_unwrap_for_init): New.
	(finish_function): Call c_oacc_annotate_loops_in_kernels_regions.
	* c-parser.c (c_parser_oacc_loop): Set OACC_LOOP_COMBINED.

	gcc/cp/
	* decl.c (cp_unwrap_for_init): New.
	(finish_function): Call c_oacc_annotate_loops_in_kernels_regions.
	* parser.c (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED.
	* semantics.c (handle_omp_array_sections_1): Call STRIP_NOPS
	on length and bound.
	(handle_omp_array_sections): Likewise.

	gcc/
	* doc/invoke.texi (Option Summary): Add entries for
	-Wopenacc-kernels-annotate-loops and
	-fno-openacc-kernels-annotate-loops.
	(Warning Options): Document -Wopenacc-kernels-annotate-loops.
	(Optimization Options): Document
	-fno-openacc-kernels-annotate-loops.
	* tree.h (OACC_LOOP_COMBINED): New.

	gcc/testsuite/
	* c-c++-common/goacc/classify-kernels-unparallelized.c: Add
	-fno-openacc-kernels-annotate-loops option.
	* c-c++-common/goacc/classify-kernels.c: Likewise.
	* c-c++-common/goacc/combined-directives.c: Likewise.
	* c-c++-common/goacc/kernels-counter-var-redundant-load.c:
	Likewise.
	* c-c++-common/goacc/kernels-counter-vars-function-scope.c:
	Likewise.
	* c-c++-common/goacc/kernels-double-reduction-n.c: Likewise.
	* c-c++-common/goacc/kernels-double-reduction.c: Likewise.
	* c-c++-common/goacc/kernels-loop-2.c: Likewise.
	* c-c++-common/goacc/kernels-loop-3.c: Likewise.
	* c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
	* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
	* c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
	* c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
	* c-c++-common/goacc/kernels-loop-data.c: Likewise.
	* c-c++-common/goacc/kernels-loop-g.c: Likewise.
	* c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
	* c-c++-common/goacc/kernels-loop-n.c: Likewise.
	* c-c++-common/goacc/kernels-loop-nest.c: Likewise.
	* c-c++-common/goacc/kernels-loop.c: Likewise.
	* c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
	* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c:
	Likewise.
	* c-c++-common/goacc/kernels-reduction.c: Likewise.
	* c-c++-common/goacc/kernels-loop-annotation-1.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-2.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-3.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-4.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-5.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-6.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-7.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-8.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-9.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-10.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-11.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-12.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-13.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-14.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-15.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-16.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-17.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-18.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-19.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-20.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-21.c: New.
	* c-c++-common/goacc/kernels-loop-annotation-22.c: New.
---
 gcc/c-family/c-common.h                            |   1 +
 gcc/c-family/c-omp.c                               | 916 +++++++++++++++++++--
 gcc/c-family/c.opt                                 |   8 +
 gcc/c/c-decl.c                                     |  28 +
 gcc/c/c-parser.c                                   |   3 +
 gcc/cp/decl.c                                      |  44 +
 gcc/cp/parser.c                                    |   3 +
 gcc/cp/semantics.c                                 |   9 +
 gcc/doc/invoke.texi                                |  34 +-
 .../goacc/classify-kernels-unparallelized.c        |   1 +
 .../c-c++-common/goacc/classify-kernels.c          |   1 +
 .../c-c++-common/goacc/combined-directives.c       |   2 +-
 .../goacc/kernels-counter-var-redundant-load.c     |   1 +
 .../goacc/kernels-counter-vars-function-scope.c    |   1 +
 .../goacc/kernels-double-reduction-n.c             |   1 +
 .../c-c++-common/goacc/kernels-double-reduction.c  |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c  |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-annotation-1.c |  26 +
 .../goacc/kernels-loop-annotation-10.c             |  32 +
 .../goacc/kernels-loop-annotation-11.c             |  27 +
 .../goacc/kernels-loop-annotation-12.c             |  28 +
 .../goacc/kernels-loop-annotation-13.c             |  27 +
 .../goacc/kernels-loop-annotation-14.c             |  22 +
 .../goacc/kernels-loop-annotation-15.c             |  22 +
 .../goacc/kernels-loop-annotation-16.c             |  26 +
 .../goacc/kernels-loop-annotation-17.c             |  26 +
 .../goacc/kernels-loop-annotation-18.c             |  18 +
 .../goacc/kernels-loop-annotation-19.c             |  19 +
 .../c-c++-common/goacc/kernels-loop-annotation-2.c |  21 +
 .../goacc/kernels-loop-annotation-20.c             |  23 +
 .../goacc/kernels-loop-annotation-21.c             |  42 +
 .../goacc/kernels-loop-annotation-22.c             |  41 +
 .../c-c++-common/goacc/kernels-loop-annotation-3.c |  24 +
 .../c-c++-common/goacc/kernels-loop-annotation-4.c |  34 +
 .../c-c++-common/goacc/kernels-loop-annotation-5.c |  27 +
 .../c-c++-common/goacc/kernels-loop-annotation-6.c |  27 +
 .../c-c++-common/goacc/kernels-loop-annotation-7.c |  26 +
 .../c-c++-common/goacc/kernels-loop-annotation-8.c |  27 +
 .../c-c++-common/goacc/kernels-loop-annotation-9.c |  26 +
 .../c-c++-common/goacc/kernels-loop-data-2.c       |   1 +
 .../goacc/kernels-loop-data-enter-exit-2.c         |   1 +
 .../goacc/kernels-loop-data-enter-exit.c           |   1 +
 .../c-c++-common/goacc/kernels-loop-data-update.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-data.c         |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-mod-not-zero.c |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c  |   1 +
 .../c-c++-common/goacc/kernels-loop-nest.c         |   1 +
 gcc/testsuite/c-c++-common/goacc/kernels-loop.c    |   1 +
 .../c-c++-common/goacc/kernels-one-counter-var.c   |   1 +
 .../goacc/kernels-parallel-loop-data-enter-exit.c  |   1 +
 .../c-c++-common/goacc/kernels-reduction.c         |   1 +
 gcc/tree.h                                         |   5 +
 54 files changed, 1603 insertions(+), 62 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 6abfe4b..d7938ba 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1220,6 +1220,7 @@ extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
 extern enum omp_clause_defaultmap_kind c_omp_predetermined_mapping (tree);
 extern tree c_omp_check_context_selector (location_t, tree);
 extern void c_omp_mark_declare_variant (location_t, tree, tree);
+extern void c_oacc_annotate_loops_in_kernels_regions (tree, tree (*) (tree));
 extern const char *c_omp_map_clause_name (tree, bool);
 
 /* Return next tree in the chain for chain_next walking of tree nodes.  */
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index d7cff0f..3c86f3f 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -35,7 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "gimplify.h"
 #include "langhooks.h"
-
+#include "tree-iterator.h"
 
 /* Complete a #pragma oacc wait construct.  LOC is the location of
    the #pragma.  */
@@ -694,6 +694,110 @@ c_omp_for_incr_canonicalize_ptr (location_t loc, tree decl, tree incr)
   return incr;
 }
 
+/* State of annotation traversal for FOR loops in kernels regions,
+   used to control processing and diagnostic messages that are deferred until
+   the entire loop has been scanned.  */
+enum annotation_state {
+  as_outer,
+  as_in_kernels_region,
+  as_in_kernels_loop,
+  /* The remaining state values represent conversion failures caught
+     while in as_in_kernels_loop state.  To test whether the traversal is
+     in the body of a kernels loop, use (state >= as_in_kernels_loop).  */
+  as_invalid_variable_type,
+  as_missing_initializer,
+  as_invalid_initializer,
+  as_missing_predicate,
+  as_invalid_predicate,
+  as_missing_increment,
+  as_invalid_increment,
+  as_explicit_annotation,
+  as_invalid_control_flow,
+  as_invalid_break,
+  as_invalid_return,
+  as_invalid_call,
+  as_invalid_modification
+};
+
+/* Structure used to hold state for automatic annotation of FOR loops
+   in kernels regions.  LOOP is the nearest enclosing loop, or
+   NULL_TREE if outside of a loop context.  VARS is a tree_list
+   containing the variables controlling LOOP's termination (the
+   induction variable and a possible limit variable).  STATE keeps
+   track of whether loop satisfies all criteria making it legal to
+   parallelize.  Otherwise, REASON is a statement that blocks
+   automatic parallelization, such as an unstructured jump or an
+   assignment to a variable in VARS, used for printing diagnostics.
+
+   These structures are chained through NEXT, which points to the
+   next-closest enclosing loop's or the kernels region's annotation info, if
+   any.  */
+
+struct annotation_info
+{
+  tree loop;
+  tree vars;
+  bool break_ok;
+  enum annotation_state state;
+  tree reason;
+  struct annotation_info *next;
+};
+
+/* Mark the current loop's INFO as not OK to annotate, recording STATE
+   and REASON for producing diagnostics later.  */
+
+static void
+do_not_annotate_loop (struct annotation_info *info,
+		      enum annotation_state state, tree reason)
+{
+  if (info->state == as_in_kernels_loop)
+    {
+      info->state = state;
+      info->reason = reason;
+    }
+}
+
+/* Mark the current loop identified by INFO and all of its ancestors (i.e.,
+   enclosing loops) as not OK to annotate.  Arguments are the same as
+   for do_not_annotate_loop.  */
+
+static void
+do_not_annotate_loop_nest (struct annotation_info *info,
+			   enum annotation_state state, tree reason)
+{
+  while (info != NULL)
+    {
+      do_not_annotate_loop (info, state, reason);
+      info = info->next;
+    }
+}
+
+/* If INFO is non-null, call do_not_annotate_loop with STATE and REASON
+   to record info for diagnosing an error later.  Otherwise emit an error now
+   at ELOCUS with message MSG and the optional arguments.  */
+
+static void annotation_error (struct annotation_info *,
+			      enum annotation_state, tree, location_t,
+			      const char *, ...) ATTRIBUTE_GCC_DIAG(5,6);
+static
+void annotation_error (struct annotation_info *info,
+			      enum annotation_state state,
+			      tree reason,
+			      location_t elocus,
+			      const char *msg, ...)
+{
+  if (info)
+    do_not_annotate_loop (info, state, reason);
+  else
+    {
+      auto_diagnostic_group d;
+      va_list ap;
+      va_start (ap, msg);
+      emit_diagnostic_valist (DK_ERROR, elocus, -1, msg, &ap);
+      va_end (ap);
+    }
+}
+
 /* Validate and generate OMP_FOR.
    DECLV is a vector of iteration variables, for each collapsed loop.
 
@@ -703,12 +807,19 @@ c_omp_for_incr_canonicalize_ptr (location_t loc, tree decl, tree incr)
    INITV, CONDV and INCRV are vectors containing initialization
    expressions, controlling predicates and increment expressions.
    BODY is the body of the loop and PRE_BODY statements that go before
-   the loop.  */
+   the loop.  FINAL_P is true if not inside a C++ template.
 
-tree
-c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
-		  tree orig_declv, tree initv, tree condv, tree incrv,
-		  tree body, tree pre_body, bool final_p)
+   INFO is null if called to parse an explicitly-annotated OMP for
+   loop, otherwise it holds state information for automatically
+   annotating a regular FOR loop in a kernels region.  In the former case,
+   malformed loops are hard errors; otherwise we just record the annotation
+   failure in INFO.  */
+
+static tree
+c_finish_omp_for_internal (location_t locus, enum tree_code code, tree declv,
+			   tree orig_declv, tree initv, tree condv, tree incrv,
+			   tree body, tree pre_body, bool final_p,
+			   struct annotation_info *info)
 {
   location_t elocus;
   bool fail = false;
@@ -732,12 +843,14 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
       if (!INTEGRAL_TYPE_P (TREE_TYPE (decl))
 	  && TREE_CODE (TREE_TYPE (decl)) != POINTER_TYPE)
 	{
-	  error_at (elocus, "invalid type for iteration variable %qE", decl);
+	  annotation_error (info, as_invalid_variable_type, decl, elocus,
+			    "invalid type for iteration variable %qE", decl);
 	  fail = true;
 	}
       else if (TYPE_ATOMIC (TREE_TYPE (decl)))
 	{
-	  error_at (elocus, "%<_Atomic%> iteration variable %qE", decl);
+	  annotation_error (info, as_invalid_variable_type, decl, elocus,
+			    "%<_Atomic%> iteration variable %qE", decl);
 	  fail = true;
 	  /* _Atomic iterator confuses stuff too much, so we risk ICE
 	     trying to diagnose it further.  */
@@ -753,7 +866,8 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 	  init = DECL_INITIAL (decl);
 	  if (init == NULL)
 	    {
-	      error_at (elocus, "%qE is not initialized", decl);
+	      annotation_error (info, as_missing_initializer, decl, elocus,
+				"%qE is not initialized", decl);
 	      init = integer_zero_node;
 	      fail = true;
 	    }
@@ -774,7 +888,8 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 
       if (cond == NULL_TREE)
 	{
-	  error_at (elocus, "missing controlling predicate");
+	  annotation_error (info, as_missing_predicate, NULL_TREE, elocus,
+			    "missing controlling predicate");
 	  fail = true;
 	}
       else
@@ -790,12 +905,14 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 	  if (EXPR_HAS_LOCATION (cond))
 	    elocus = EXPR_LOCATION (cond);
 
-	  if (TREE_CODE (cond) == LT_EXPR
-	      || TREE_CODE (cond) == LE_EXPR
-	      || TREE_CODE (cond) == GT_EXPR
-	      || TREE_CODE (cond) == GE_EXPR
-	      || TREE_CODE (cond) == NE_EXPR
-	      || TREE_CODE (cond) == EQ_EXPR)
+	  enum tree_code condcode = TREE_CODE (cond);
+
+	  if (condcode == LT_EXPR
+	      || condcode == LE_EXPR
+	      || condcode == GT_EXPR
+	      || condcode == GE_EXPR
+	      || condcode == NE_EXPR
+	      || condcode == EQ_EXPR)
 	    {
 	      tree op0 = TREE_OPERAND (cond, 0);
 	      tree op1 = TREE_OPERAND (cond, 1);
@@ -815,79 +932,88 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 	      if (TREE_CODE (op0) == NOP_EXPR
 		  && decl == TREE_OPERAND (op0, 0))
 		{
-		  TREE_OPERAND (cond, 0) = TREE_OPERAND (op0, 0);
-		  TREE_OPERAND (cond, 1)
-		    = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
-				   TREE_OPERAND (cond, 1));
+		  op0 = TREE_OPERAND (op0, 0);
+		  op1 = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
+					 op1);
 		}
 	      else if (TREE_CODE (op1) == NOP_EXPR
 		       && decl == TREE_OPERAND (op1, 0))
 		{
-		  TREE_OPERAND (cond, 1) = TREE_OPERAND (op1, 0);
-		  TREE_OPERAND (cond, 0)
-		    = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
-				   TREE_OPERAND (cond, 0));
+		  op1 = TREE_OPERAND (op1, 0);
+		  op0 = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
+					 op0);
 		}
 
-	      if (decl == TREE_OPERAND (cond, 0))
+	      if (decl == op0)
 		cond_ok = true;
-	      else if (decl == TREE_OPERAND (cond, 1))
+	      else if (decl == op1)
 		{
-		  TREE_SET_CODE (cond,
-				 swap_tree_comparison (TREE_CODE (cond)));
-		  TREE_OPERAND (cond, 1) = TREE_OPERAND (cond, 0);
-		  TREE_OPERAND (cond, 0) = decl;
+		  condcode = swap_tree_comparison (condcode);
+		  op1 = op0;
+		  op0 = decl;
 		  cond_ok = true;
 		}
 
-	      if (TREE_CODE (cond) == NE_EXPR
-		  || TREE_CODE (cond) == EQ_EXPR)
+	      if (condcode == NE_EXPR || condcode == EQ_EXPR)
 		{
 		  if (!INTEGRAL_TYPE_P (TREE_TYPE (decl)))
 		    {
-		      if (code == OACC_LOOP || TREE_CODE (cond) == EQ_EXPR)
+		      if (code == OACC_LOOP || condcode == EQ_EXPR)
 			cond_ok = false;
 		    }
-		  else if (operand_equal_p (TREE_OPERAND (cond, 1),
+		  else if (operand_equal_p (op1,
 					    TYPE_MIN_VALUE (TREE_TYPE (decl)),
 					    0))
-		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
-					 ? GT_EXPR : LE_EXPR);
-		  else if (operand_equal_p (TREE_OPERAND (cond, 1),
+		    condcode = (condcode == NE_EXPR ? GT_EXPR : LE_EXPR);
+		  else if (operand_equal_p (op1,
 					    TYPE_MAX_VALUE (TREE_TYPE (decl)),
 					    0))
-		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
-					 ? LT_EXPR : GE_EXPR);
-		  else if (code == OACC_LOOP || TREE_CODE (cond) == EQ_EXPR)
+		    condcode = (condcode == NE_EXPR ? LT_EXPR : GE_EXPR);
+		  else if (code == OACC_LOOP || condcode == EQ_EXPR)
 		    cond_ok = false;
 		}
 
-	      if (cond_ok && TREE_VEC_ELT (condv, i) != cond)
+	      if (cond_ok)
 		{
-		  tree ce = NULL_TREE, *pce = &ce;
-		  tree type = TREE_TYPE (TREE_OPERAND (cond, 1));
-		  for (tree c = TREE_VEC_ELT (condv, i); c != cond;
-		       c = TREE_OPERAND (c, 1))
+		  /* We postponed destructive changes to canonicalize
+		     cond until we're sure it is OK.  In the !error_p
+		     case where we are trying to transform a regular FOR_STMT
+		     to OMP_FOR, we don't want to destroy the original
+		     condition if we aren't going to be able to do the
+		     transformation anyway.  */
+		  TREE_SET_CODE (cond, condcode);
+		  TREE_OPERAND (cond, 0) = op0;
+		  TREE_OPERAND (cond, 1) = op1;
+
+		  if (TREE_VEC_ELT (condv, i) != cond)
 		    {
-		      *pce = build2 (COMPOUND_EXPR, type, TREE_OPERAND (c, 0),
-				     TREE_OPERAND (cond, 1));
-		      pce = &TREE_OPERAND (*pce, 1);
+		      tree ce = NULL_TREE, *pce = &ce;
+		      tree type = TREE_TYPE (op1);
+		      for (tree c = TREE_VEC_ELT (condv, i); c != cond;
+			   c = TREE_OPERAND (c, 1))
+			{
+			  *pce = build2 (COMPOUND_EXPR, type,
+					 TREE_OPERAND (c, 0), op1);
+			  pce = &TREE_OPERAND (*pce, 1);
+			}
+		      op1 = ce;
+		      TREE_VEC_ELT (condv, i) = cond;
 		    }
-		  TREE_OPERAND (cond, 1) = ce;
-		  TREE_VEC_ELT (condv, i) = cond;
 		}
 	    }
 
 	  if (!cond_ok)
 	    {
-	      error_at (elocus, "invalid controlling predicate");
+	      annotation_error (info, as_invalid_predicate, cond, elocus,
+				"invalid controlling predicate");
 	      fail = true;
 	    }
 	}
 
       if (incr == NULL_TREE)
 	{
-	  error_at (elocus, "missing increment expression");
+	  annotation_error (info, as_missing_increment, NULL_TREE, elocus,
+			    "missing increment expression");
 	  fail = true;
 	}
       else
@@ -986,9 +1112,11 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 			      if (i == NULL_TREE
 				  || !operand_equal_p (unit, i, 0))
 				{
-				  error_at (elocus,
-					    "increment is not constant 1 or "
-					    "-1 for %<!=%> condition");
+				  annotation_error (info,
+						    as_invalid_increment,
+						    incr, elocus,
+						    "increment is not constant 1 or "
+						    "-1 for %<!=%> condition");
 				  fail = true;
 				}
 			    }
@@ -1004,9 +1132,10 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 		    {
 		      if (!integer_onep (i) && !integer_minus_onep (i))
 			{
-			  error_at (elocus,
-				    "increment is not constant 1 or -1 for"
-				    " %<!=%> condition");
+			  annotation_error (info, as_invalid_increment,
+					    incr, elocus,
+					    "increment is not constant 1 or -1 for"
+					    " %<!=%> condition");
 			  fail = true;
 			}
 		    }
@@ -1018,7 +1147,8 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
 	    }
 	  if (!incr_ok)
 	    {
-	      error_at (elocus, "invalid increment expression");
+	      annotation_error (info, as_invalid_increment, incr,
+				elocus, "invalid increment expression");
 	      fail = true;
 	    }
 	}
@@ -1046,6 +1176,20 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
     }
 }
 
+/* External entry point to c_finish_omp_for_internal, called from the
+   parsers.  See above for description of the arguments.  */
+
+tree
+c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
+		  tree orig_declv, tree initv, tree condv, tree incrv,
+		  tree body, tree pre_body, bool final_p)
+{
+  return c_finish_omp_for_internal (locus, code, declv,
+				    orig_declv, initv, condv, incrv,
+				    body, pre_body, final_p, NULL);
+}
+
+
 /* Type for passing data in between c_omp_check_loop_iv and
    c_omp_check_loop_iv_r.  */
 
@@ -2579,3 +2723,657 @@ c_omp_map_clause_name (tree clause, bool oacc)
     }
   return omp_clause_code_name[OMP_CLAUSE_CODE (clause)];
 }
+
+
+/* The following functions implement automatic recognition and annotation of
+   for loops in OpenACC kernels regions.  Inside a kernels region, a nest of
+   for loops that does not contain any annotated OpenACC loops, nor break
+   or goto statements or assignments to the variables controlling loop
+   termination, is converted to an OMP_FOR node with an "acc loop auto"
+   annotation on each loop.  This feature is controlled by
+   flag_openacc_kernels_annotate_loops.  */
+
+/* Check whether DECL is the declaration of a local variable (or function
+   parameter) of integral type that does not have its address taken.  */
+
+static bool
+is_local_var (tree decl)
+{
+  return ((TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) == PARM_DECL)
+	  && DECL_CONTEXT (decl) != NULL
+	  && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL
+	  && INTEGRAL_TYPE_P (TREE_TYPE (decl))
+	  && !TREE_ADDRESSABLE (decl));
+}
+
+/* EXP is a loop bound expression for a comparison against local
+   variable DECL.  Check whether this is potentially valid in an OpenACC loop
+   context, namely that it can be precomputed when entering the loop
+   construct per the OpenACC specification.  DECL and the local variables
+   referenced in EXP, none of which may be modified in the body of the loop,
+   are added to the list in INFO to be checked later.
+
+   FIXME: Ideally we would like to make this test permissive rather than
+   restrictive, and allow the later conversion of the "auto" attribute to
+   either "seq" or "independent" to make the determination using dataflow,
+   alias analysis, etc rather than a tree traversal.  But presently it does
+   not do that and always just hoists the loop bound expression.  So the
+   current implementation only considers expressions involving unmodified
+   local variables and constants, using a tree walk.  */
+
+static tree
+end_test_ok_for_annotation_r (tree *tp, int *walk_subtrees,
+			      void *data)
+{
+  tree exp = *tp;
+  struct annotation_info *info = (struct annotation_info *) data;
+
+  switch (TREE_CODE_CLASS (TREE_CODE (exp)))
+    {
+    case tcc_constant:
+      /* Constants are trivially known to be invariant.  */
+      return NULL_TREE;
+
+    case tcc_declaration:
+      if (is_local_var (exp))
+	{
+	  tree t;
+	  /* Add it to the list of variables that can't be modified in the
+	     loop, only if not already present.  */
+	  for (t = info->vars; t && TREE_VALUE (t) != exp;
+	       t = TREE_CHAIN (t))
+	    ;
+	  if (!t)
+	    info->vars = tree_cons (NULL_TREE, exp, info->vars);
+	  return NULL_TREE;
+	}
+      else if (TREE_CODE (exp) == VAR_DECL && TREE_READONLY (exp))
+	return NULL_TREE;
+      else if (TREE_CODE (exp) == FUNCTION_DECL)
+	return NULL_TREE;
+      break;
+
+    case tcc_unary:
+    case tcc_binary:
+    case tcc_comparison:
+      /* Allow arithmetic expressions and comparisons provided
+	 that the operands are good.  */
+      return NULL_TREE;
+
+    default:
+      /* Handle some special cases.  */
+      switch (TREE_CODE (exp))
+	{
+	case COND_EXPR:
+	case TRUTH_ANDIF_EXPR:
+	case TRUTH_ORIF_EXPR:
+	case TRUTH_AND_EXPR:
+	case TRUTH_OR_EXPR:
+	case TRUTH_XOR_EXPR:
+	case TRUTH_NOT_EXPR:
+	  /* ?: and boolean operators are OK.  */
+	  return NULL_TREE;
+
+	case CALL_EXPR:
+	  /* Allow calls to constant functions with invariant operands.  */
+	  {
+	    tree fndecl = get_callee_fndecl (exp);
+	    if (fndecl && TREE_READONLY (fndecl))
+	      return NULL_TREE;
+	  }
+	  break;
+
+	case ADDR_EXPR:
+	  /* We can expect addresses of things to be invariant.  */
+	  return NULL_TREE;
+
+	default:
+	  break;
+	}
+    }
+
+  /* Reject anything else.  */
+  *walk_subtrees = 0;
+  return exp;
+}
+
+static bool
+end_test_ok_for_annotation (tree decl, tree exp,
+			    struct annotation_info *info)
+{
+  /* Traversal returns NULL_TREE if all is well.  */
+  if (!walk_tree (&exp, end_test_ok_for_annotation_r, info, NULL))
+    {
+      /* So far, so good.  Check the decl against any variables collected
+	 in the exp.  */
+      tree t;
+      for (t = info->vars; t; t = TREE_CHAIN (t))
+	if (TREE_VALUE (t) == decl)
+	  return false;
+      info->vars = tree_cons (NULL_TREE, decl, info->vars);
+      return true;
+    }
+  return false;
+}
+
+/* The initializer for a FOR_STMT is sometimes wrapped in various other
+   language-specific tree structures.  We need a hook to unwrap them.
+   This function takes a tree argument and should return either a
+   MODIFY_EXPR, VAR_DECL, or NULL_TREE.  */
+
+static tree (*lang_specific_unwrap_initializer) (tree);
+
+/* Try to annotate the given NODE, which must be a FOR_STMT, with a
+   "#pragma acc loop auto" annotation.  In practice, this means
+   building an OMP_FOR node for it.  DECL and INIT are the
+   previously-verified iteration variable and initializer.  Annotating
+   the loop may fail, in which case INFO is used to record the cause
+   of the failure and the original loop remains unchanged.  This
+   function returns the transformed loop if the transformation
+   succeeded, the original node otherwise.  */
+
+static tree
+annotate_for_loop (tree node, tree decl, tree init,
+		   struct annotation_info *info)
+{
+  gcc_checking_assert (TREE_CODE (node) == FOR_STMT);
+
+  location_t loc = EXPR_LOCATION (node);
+  tree cond = FOR_COND (node);
+  tree incr = FOR_EXPR (node);
+
+  gcc_assert (decl);
+  gcc_assert (cond);
+  gcc_assert (decl && TREE_CODE (decl) == VAR_DECL);
+
+  /* The C++ frontend can wrap the increment two levels deep inside a
+     cleanup expression, but c_finish_omp_for does not care about that.  */
+  if (incr != NULL_TREE && TREE_CODE (incr) == CLEANUP_POINT_EXPR)
+    incr = TREE_OPERAND (TREE_OPERAND (incr, 0), 0);
+  tree body = FOR_BODY (node);
+
+  tree declv = make_tree_vec (1);
+  tree initv = make_tree_vec (1);
+  tree condv = make_tree_vec (1);
+  tree incrv = make_tree_vec (1);
+  TREE_VEC_ELT (declv, 0) = decl;
+  TREE_VEC_ELT (initv, 0) = init;
+  TREE_VEC_ELT (condv, 0) = cond;
+  TREE_VEC_ELT (incrv, 0) = incr;
+
+  /* Do the actual transformation.  This can still fail because
+     c_finish_omp_for has some stricter checks than we have performed up to
+     this point.  */
+  tree omp_for = c_finish_omp_for_internal (loc, OACC_LOOP, declv, NULL_TREE,
+					    initv, condv, incrv, body,
+					    NULL_TREE, false, info);
+  if (omp_for != NULL_TREE)
+    {
+      /* Add an auto clause, then return the new loop.  */
+      tree auto_clause = build_omp_clause (loc, OMP_CLAUSE_AUTO);
+      OMP_CLAUSE_CHAIN (auto_clause) = OMP_FOR_CLAUSES (omp_for);
+      OMP_FOR_CLAUSES (omp_for) = auto_clause;
+      return omp_for;
+    }
+
+  return node;
+}
+
+/* Forward declaration.  */
+static tree annotate_loops_in_kernels_regions (tree *, int *, void *);
+
+/* Given a FOR_STMT NODE that is a candidate for parallelization, check its
+   body for validity, then try to annotate it with
+   "#pragma acc loop auto", possibly modifying the current node in place.
+   The INFO argument contains the traversal state at the point the loop
+   appears.  */
+
+static void
+check_and_annotate_for_loop (tree *nodeptr, tree_stmt_iterator *prev_tsi,
+			     struct annotation_info *info)
+{
+  tree node = *nodeptr;
+  gcc_assert (TREE_CODE (node) == FOR_STMT);
+  tree init = FOR_INIT_STMT (node);
+  tree cond = FOR_COND (node);
+  tree prev_stmt = NULL_TREE;
+  tree decl = NULL_TREE;
+  bool unlink_prev = false;
+  bool fix_decl = false;
+
+  /* This structure describes the current loop statement.  */
+  struct annotation_info loop_info
+    = { node, NULL_TREE, false, as_in_kernels_loop, NULL_TREE, info };
+
+  /* If we are in the body of an explicitly-annotated loop, do not add
+     annotations to this loop or any other nested loops.  */
+  if (info->state == as_explicit_annotation)
+    do_not_annotate_loop (&loop_info, as_explicit_annotation, info->reason);
+
+  /* We need to find the controlling variable for the loop in order
+     to detect whether it is modified in the body of the loop.
+     That is why we are doing some checks on the loop condition
+     that duplicate what c_finish_omp_for is doing.  */
+
+  /* First we need to find the decl and initializer for the
+     controlling variable.  Both the C and C++ front ends normally put
+     the initializer in the statement list just before the FOR_STMT
+     instead of in FOR_INIT_STMT.  If FOR_INIT_STMT happens to exist
+     but isn't a MODIFY_EXPR, give up rather than trying to
+     handle it.  */
+
+  else if (init != NULL_TREE && TREE_CODE (init) != MODIFY_EXPR)
+    do_not_annotate_loop (&loop_info, as_invalid_initializer, NULL_TREE);
+
+  /* Examine the statement before the loop to see if it is a
+     valid initializer.  It must be either a MODIFY_EXPR or VAR_DECL,
+     possibly wrapped in language-specific structure.  */
+  else if (init == NULL_TREE && prev_tsi != NULL && tsi_stmt (*prev_tsi))
+    {
+      prev_stmt = tsi_stmt (*prev_tsi);
+
+      /* Call the language-specific hook to unwrap prev_stmt.  */
+      prev_stmt = (*lang_specific_unwrap_initializer) (prev_stmt);
+
+      /* See if we have a valid MODIFY_EXPR.  */
+      if (TREE_CODE (prev_stmt) == MODIFY_EXPR
+	  && is_local_var (TREE_OPERAND (prev_stmt, 0))
+	  && !TREE_SIDE_EFFECTS (TREE_OPERAND (prev_stmt, 1)))
+	{
+	  decl = TREE_OPERAND (prev_stmt, 0);
+	  init = prev_stmt;
+	  unlink_prev = true;
+	}
+      else if (is_local_var (prev_stmt)
+	       && !TREE_SIDE_EFFECTS (DECL_INITIAL (prev_stmt)))
+	{
+	  /* If the preceding statement is the declaration of the loop
+	     variable with its initialization, build an assignment
+	     expression for the loop's initializer.  */
+	  decl = prev_stmt;
+	  init = build2 (MODIFY_EXPR, TREE_TYPE (decl), decl,
+			 DECL_INITIAL (decl));
+	  /* We need to remove the initializer from the decl if we
+	     end up using the init we just built instead.  */
+	  fix_decl = true;
+	}
+    }
+
+  if (init == NULL_TREE || decl == NULL_TREE)
+    /* There is nothing we can do to find the correct init statement for
+       this loop.  */
+    do_not_annotate_loop (&loop_info, as_missing_initializer, NULL_TREE);
+
+  /* The condition must be a comparison of the decl we found in
+     the initializer against an expression that can be hoisted
+     outside the loop.  */
+  if (loop_info.state > as_in_kernels_loop)
+    /* Skip validating condition if we've already got an error.  */
+    ;
+  else if (cond == NULL_TREE)
+    do_not_annotate_loop (&loop_info, as_missing_predicate, NULL_TREE);
+  else if (TREE_CODE_CLASS (TREE_CODE (cond)) != tcc_comparison)
+    do_not_annotate_loop (&loop_info, as_invalid_predicate, cond);
+  else
+    {
+      tree limit_exp = NULL_TREE;
+
+      if (TREE_OPERAND (cond, 0) == decl)
+	limit_exp = TREE_OPERAND (cond, 1);
+      else if (TREE_OPERAND (cond, 1) == decl)
+	limit_exp = TREE_OPERAND (cond, 0);
+
+      if (!limit_exp
+	  || !end_test_ok_for_annotation (decl, limit_exp, &loop_info))
+	do_not_annotate_loop (&loop_info, as_invalid_predicate, cond);
+    }
+
+  /* Walk the body.  This will process any nested loops, so we have to do it
+     even if we have already rejected this loop as a candidate for
+     annotation.  */
+  walk_tree (&FOR_BODY (node), annotate_loops_in_kernels_regions,
+	     (void *) &loop_info, NULL);
+
+  if (loop_info.state == as_in_kernels_loop)
+    {
+      /* If the traversal of the loop and all nested loops didn't hit
+	 any problems, attempt the actual transformation.  If it
+	 succeeds, replace this node with the annotated loop.  */
+      tree result = annotate_for_loop (node, decl, init, &loop_info);
+      if (result != node)
+	{
+	  /* Success!  */
+	  *nodeptr = result;
+
+	  if (unlink_prev)
+	    /* We don't need the previous statement that we consumed
+	       as an initializer in the new OMP_FOR any more.  */
+	    tsi_delink (prev_tsi);
+
+	  if (fix_decl)
+	    /* We no longer need the initializer expression on the
+	       decl of the loop variable and don't want to duplicate
+	       it.  The kernels conversion pass would interpret it as
+	       a stray assignment in a gang-single region.  */
+	    DECL_INITIAL (decl) = NULL_TREE;
+
+	  return;
+	}
+    }
+
+  /* If we got here, we have a FOR_STMT we could not convert to an
+     OMP loop.  */
+
+  if (loop_info.state == as_invalid_return)
+    /* This is diagnosed elsewhere as a hard error, so no warning is
+       needed here.  */
+    return;
+
+  /* Issue warnings about other problems.  */
+  auto_diagnostic_group d;
+  if (warning_at (EXPR_LOCATION (node),
+		  OPT_Wopenacc_kernels_annotate_loops,
+		  "loop cannot be annotated for OpenACC parallelization"))
+    {
+      location_t locus;
+      if (loop_info.reason && EXPR_HAS_LOCATION (loop_info.reason))
+	locus = EXPR_LOCATION (loop_info.reason);
+      else
+	locus = EXPR_LOCATION (node);
+      switch (loop_info.state)
+	{
+	case as_invalid_variable_type:
+	  inform (locus, "invalid type for iteration variable %qE",
+		  loop_info.reason);
+	  break;
+	case as_missing_initializer:
+	  inform (locus, "missing iteration variable initializer");
+	  break;
+	case as_invalid_initializer:
+	  inform (locus, "unrecognized initializer");
+	  break;
+	case as_missing_predicate:
+	  inform (locus, "missing controlling predicate");
+	  break;
+	case as_invalid_predicate:
+	  inform (locus, "invalid controlling predicate");
+	  break;
+	case as_missing_increment:
+	  inform (locus, "missing increment expression");
+	  break;
+	case as_invalid_increment:
+	  inform (locus, "invalid increment expression");
+	  break;
+	case as_explicit_annotation:
+	  inform (locus, "explicit OpenACC annotation in loop nest");
+	  break;
+	case as_invalid_control_flow:
+	  inform (locus, "loop contains unstructured control flow");
+	  break;
+	case as_invalid_break:
+	  inform (locus, "loop contains %<break%> statement");
+	  break;
+	case as_invalid_call:
+	  inform (locus, "loop contains call to non-oacc function");
+	  break;
+	case as_invalid_modification:
+	  inform (locus, "invalid modification of controlling variable");
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+    }
+}
+
+/* Traversal function for walk_tree.  Visit the tree, finding OpenACC
+   kernels regions.  DATA is NULL if we are outside of a kernels region,
+   otherwise it is a pointer to the enclosing kernels region's
+   annotation_info struct.  If the traversal encounters a for loop inside a
+   kernels region that is a candidate for parallelization, annotate it
+   with OpenACC loop directives.  */
+
+static tree
+annotate_loops_in_kernels_regions (tree *nodeptr, int *walk_subtrees,
+				   void *data)
+{
+  tree node = *nodeptr;
+  struct annotation_info *info = (struct annotation_info *) data;
+  gcc_assert (info);
+
+  switch (TREE_CODE (node))
+    {
+    case OACC_KERNELS:
+      /* Recursively process the body of the kernels region in a new info
+	 scope.  */
+      if (info->state == as_outer)
+	{
+	  struct annotation_info nested_info
+	    = { NULL_TREE, NULL_TREE, true,
+		as_in_kernels_region, NULL_TREE, info };
+	  walk_tree (&OMP_BODY (node), annotate_loops_in_kernels_regions,
+		     (void *) &nested_info, NULL);
+	  *walk_subtrees = 0;
+	}
+      break;
+
+    case OACC_LOOP:
+      /* Do not try to add automatic OpenACC annotations inside manually
+	 annotated loops.  Presumably, the user avoided doing it on
+	 purpose; for example, all available levels of parallelism may
+	 have been used up.  However, assume that the combined construct
+	 "#pragma acc kernels loop" means to try to process the whole
+	 loop nest.
+	 Note that a single OACC_LOOP construct represents an entire set
+	 of collapsed loops so we do not have to deal explicitly with the
+	 collapse clause here, as the Fortran front end does.  */
+      if (info->state == as_in_kernels_region && OACC_LOOP_COMBINED (node))
+	{
+	  walk_tree (&OMP_BODY (node), annotate_loops_in_kernels_regions,
+		     (void *) info, NULL);
+	  *walk_subtrees = 0;
+	}
+      else
+	{
+	  struct annotation_info nested_info
+	    = { NULL_TREE, NULL_TREE, false, as_explicit_annotation,
+		node, info };
+	  if (info->state >= as_in_kernels_region)
+	    do_not_annotate_loop_nest (info, as_explicit_annotation,
+				       node);
+	  walk_tree (&OMP_BODY (node), annotate_loops_in_kernels_regions,
+		     (void *) &nested_info, NULL);
+	  *walk_subtrees = 0;
+	}
+      break;
+
+    case FOR_STMT:
+      /* Try to annotate the loop if we are in a kernels region.
+	 This will do a recursive traversal of the loop body in a new
+	 info scope.  */
+      if (info->state >= as_in_kernels_region)
+	{
+	  check_and_annotate_for_loop (nodeptr, NULL, info);
+	  *walk_subtrees = 0;
+	}
+      break;
+
+    case LABEL_EXPR:
+      /* Possibly unstructured control flow.  Unless we perform further
+	 analyses, we must assume that such control flow may enter the
+	 current loop.  In this case, we must not parallelize the loop.  */
+      if (info->state >= as_in_kernels_loop
+	  && TREE_USED (LABEL_EXPR_LABEL (node)))
+	do_not_annotate_loop_nest (info, as_invalid_control_flow, node);
+      break;
+
+    case GOTO_EXPR:
+      /* Possibly unstructured control flow.  Unless we perform further
+	 analyses, we must assume that such control flow may leave the
+	 current loop.  In this case, we must not parallelize the loop.  */
+      if (info->state >= as_in_kernels_loop)
+	do_not_annotate_loop_nest (info, as_invalid_control_flow, node);
+      break;
+
+    case BREAK_STMT:
+      /* A break statement.  Whether or not this is valid depends on the
+	 enclosing context.  */
+      if (info->state >= as_in_kernels_loop && !info->break_ok)
+	do_not_annotate_loop (info, as_invalid_break, node);
+      break;
+
+    case RETURN_EXPR:
+      /* A return leaves the entire loop nest.  */
+      if (info->state >= as_in_kernels_loop)
+	do_not_annotate_loop_nest (info, as_invalid_return, node);
+      break;
+
+    case CALL_EXPR:
+      /* Direct function calls to builtins and functions marked as
+	 OpenACC routines are allowed.  Reject indirect calls or calls
+	 to non-routines.  */
+      if (info->state >= as_in_kernels_loop)
+	{
+	  tree fn = CALL_EXPR_FN (node), fn_decl = NULL_TREE;
+	  if (fn != NULL_TREE && TREE_CODE (fn) == FUNCTION_DECL)
+	    fn_decl = fn;
+	  else if (fn != NULL_TREE && TREE_CODE (fn) == ADDR_EXPR)
+	    {
+	      tree fn_op = TREE_OPERAND (fn, 0);
+	      if (fn_op != NULL_TREE && TREE_CODE (fn_op) == FUNCTION_DECL)
+		fn_decl = fn_op;
+	    }
+	  if (fn_decl == NULL_TREE)
+	    do_not_annotate_loop_nest (info, as_invalid_call, node);
+	  else if (!fndecl_built_in_p (fn_decl, BUILT_IN_NORMAL)
+		   && !lookup_attribute ("oacc function",
+					 DECL_ATTRIBUTES (fn_decl)))
+	    do_not_annotate_loop_nest (info, as_invalid_call, node);
+	}
+      break;
+
+    case MODIFY_EXPR:
+      /* See if this assignment's LHS is one of the variables that must
+	 not be modified in the loop body because they control termination
+	 of the loop (or an enclosing loop in the nest).  */
+      if (info->state >= as_in_kernels_loop)
+	{
+	  tree lhs = TREE_OPERAND (node, 0);
+	  if (!is_local_var (lhs))
+	    /* Early exit: This cannot be a variable we care about.  */
+	    break;
+	  /* Walk up the loop stack.  Invalidate the ones controlled by this
+	     variable.  There may be several, if this variable is the common
+	     iteration limit for several nested loops.  */
+	  for (struct annotation_info *outer_loop = info; outer_loop != NULL;
+	       outer_loop = outer_loop->next)
+	    for (tree t = outer_loop->vars; t != NULL_TREE; t = TREE_CHAIN (t))
+	      if (TREE_VALUE (t) == lhs)
+		{
+		  do_not_annotate_loop (outer_loop,
+					as_invalid_modification,
+					node);
+		  break;
+		}
+	}
+      break;
+
+    case SWITCH_STMT:
+      /* Needs special handling to allow break in the body.  */
+      if (info->state >= as_in_kernels_loop)
+	{
+	  bool save_break_ok = info->break_ok;
+
+	  walk_tree (&SWITCH_STMT_COND (node),
+		     annotate_loops_in_kernels_regions,
+		     (void *) info, NULL);
+	  info->break_ok = true;
+	  walk_tree (&SWITCH_STMT_BODY (node),
+		     annotate_loops_in_kernels_regions,
+		     (void *) info, NULL);
+	  info->break_ok = save_break_ok;
+	  *walk_subtrees = 0;
+	}
+      break;
+
+    case WHILE_STMT:
+      /* Needs special handling to allow break in the body.  */
+      if (info->state >= as_in_kernels_loop)
+	{
+	  bool save_break_ok = info->break_ok;
+
+	  walk_tree (&WHILE_COND (node), annotate_loops_in_kernels_regions,
+		     (void *) info, NULL);
+	  info->break_ok = true;
+	  walk_tree (&WHILE_BODY (node), annotate_loops_in_kernels_regions,
+		     (void *) info, NULL);
+	  info->break_ok = save_break_ok;
+	  *walk_subtrees = 0;
+	}
+      break;
+
+    case DO_STMT:
+      /* Needs special handling to allow break in the body.  */
+      if (info->state >= as_in_kernels_loop)
+	{
+	  bool save_break_ok = info->break_ok;
+
+	  walk_tree (&DO_COND (node), annotate_loops_in_kernels_regions,
+		     (void *) info, NULL);
+	  info->break_ok = true;
+	  walk_tree (&DO_BODY (node), annotate_loops_in_kernels_regions,
+		     (void *) info, NULL);
+	  info->break_ok = save_break_ok;
+	  *walk_subtrees = 0;
+	}
+      break;
+
+    case STATEMENT_LIST:
+      /* We iterate over these explicitly so that we can track the previous
+	 statement in the chain.  It may be the initializer for a following
+	 FOR_STMT node.  */
+      if (info->state >= as_in_kernels_region)
+	{
+	  tree_stmt_iterator i = tsi_start (node);
+	  tree_stmt_iterator prev, *prev_tsi = NULL;
+	  while (!tsi_end_p (i))
+	    {
+	      tree *stmtptr = tsi_stmt_ptr (i);
+	      if (TREE_CODE (*stmtptr) == FOR_STMT)
+		{
+		  check_and_annotate_for_loop (stmtptr, prev_tsi, info);
+		  *walk_subtrees = 0;
+		}
+	      else
+		walk_tree (stmtptr, annotate_loops_in_kernels_regions,
+			   (void *) info, NULL);
+	      prev = i;
+	      prev_tsi = &prev;
+	      tsi_next (&i);
+	    }
+	  *walk_subtrees = 0;
+	}
+      break;
+
+    default:
+      break;
+    }
+
+  return NULL_TREE;
+}
+
+/* Find for loops in OpenACC kernels regions that do not have OpenACC
+   annotations but look like they might benefit from automatic
+   parallelization.  Convert them from FOR_STMT to OMP_FOR nodes and
+   add the equivalent of "#pragma acc loop auto" annotations for them.
+   Assumes flag_openacc_kernels_annotate_loops is set.  */
+
+void
+c_oacc_annotate_loops_in_kernels_regions (tree decl,
+					  tree (*unwrap_fn) (tree))
+{
+  struct annotation_info info
+    = { NULL_TREE, NULL_TREE, true, as_outer, NULL_TREE, NULL };
+  lang_specific_unwrap_initializer = unwrap_fn;
+  walk_tree (&DECL_SAVED_TREE (decl), annotate_loops_in_kernels_regions,
+	     (void *) &info, NULL);
+}
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index c1d8fd3..b146aad 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -975,6 +975,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Init(-1) Warning
 Warn if an old-style parameter definition is used.
 
+Wopenacc-kernels-annotate-loops
+C ObjC C++ ObjC++ Warning Var(warn_openacc_kernels_annotate_loops) Init(0)
+Warn about loops in OpenACC kernels regions that cannot be parallelized.
+
 Wopenmp-simd
 C C++ Var(warn_openmp_simd) Warning LangEnabledBy(C C++,Wall)
 Warn if a simd directive is overridden by the vectorizer cost model.
@@ -1744,6 +1748,10 @@ fopenacc-dim=
 C ObjC C++ ObjC++ LTO Joined Var(flag_openacc_dims)
 Specify default OpenACC compute dimensions.
 
+fopenacc-kernels-annotate-loops
+C ObjC C++ ObjC++ LTO Optimization Var(flag_openacc_kernels_annotate_loops) Init(1)
+Automatically parallelize unannotated loops in OpenACC kernels regions.
+
 fopenmp
 C ObjC C++ ObjC++ LTO Var(flag_openmp)
 Enable OpenMP (implies -frecursive in Fortran).
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 8204db2..02b7f2f 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -9885,6 +9885,29 @@ temp_pop_parm_decls (void)
   pop_scope ();
 }
 \f
+/* Function passed to c_oacc_annotate_loops_in_kernels_regions to do
+   language-specific unwrapping of an initializer expression.  */
+static tree
+c_unwrap_for_init (tree x)
+{
+  if (!x)
+    return NULL_TREE;
+
+  while (true)
+    switch (TREE_CODE (x))
+      {
+      case MODIFY_EXPR:
+      case VAR_DECL:
+	return x;
+
+      case DECL_EXPR:
+	x = TREE_OPERAND (x, 0);
+	break;
+
+      default:
+	return NULL_TREE;
+      }
+}
 
 /* Finish up a function declaration and compile that function
    all the way to assembler language output.  Then free the storage
@@ -9987,6 +10010,11 @@ finish_function (location_t end_loc)
   if (warn_unused_parameter)
     do_warn_unused_parameter (fndecl);
 
+  /* If requested, automatically annotate suitable loops in OpenACC kernels
+     regions with OpenACC loop annotations to allow auto-parallelization.  */
+  if (flag_openacc && flag_openacc_kernels_annotate_loops)
+    c_oacc_annotate_loops_in_kernels_regions (fndecl, c_unwrap_for_init);
+
   /* Store the end of the function, so that we get good line number
      info for the epilogue.  */
   cfun->function_end_locus = end_loc;
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 2e6775a..12fb4b7 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -16831,6 +16831,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 		    omp_clause_mask mask, tree *cclauses, bool *if_p)
 {
   bool is_parallel = ((mask >> PRAGMA_OACC_CLAUSE_REDUCTION) & 1) == 1;
+  bool is_combined = (cclauses != NULL);
 
   strcat (p_name, " loop");
   mask |= OACC_LOOP_CLAUSE_MASK;
@@ -16849,6 +16850,8 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
   tree block = c_begin_compound_stmt (true);
   tree stmt = c_parser_omp_for_loop (loc, parser, OACC_LOOP, clauses, NULL,
 				     if_p);
+  if (stmt && stmt != error_mark_node)
+    OACC_LOOP_COMBINED (stmt) = is_combined;
   block = c_end_compound_stmt (loc, block, true);
   add_stmt (block);
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 31d6874..2d83c41 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -16960,6 +16960,45 @@ emit_coro_helper (tree helper)
   expand_or_defer_fn (helper);
 }
 
+
+/* Function passed to c_oacc_annotate_loops_in_kernels_regions to do
+   language-specific unwrapping of an initializer expression.  */
+static tree
+cp_unwrap_for_init (tree x)
+{
+  if (!x)
+    return NULL_TREE;
+
+  while (true)
+    switch (TREE_CODE (x))
+      {
+      case MODIFY_EXPR:
+      case VAR_DECL:
+	return x;
+
+      case CLEANUP_POINT_EXPR:
+	x = TREE_OPERAND (x, 0);
+	break;
+
+      case EXPR_STMT:
+	x = TREE_OPERAND (x, 0);
+	break;
+
+      case DECL_EXPR:
+	x = TREE_OPERAND (x, 0);
+	break;
+
+      case CONVERT_EXPR:
+	if (TREE_TYPE (x) != void_type_node)
+	  return NULL_TREE;
+	x = TREE_OPERAND (x, 0);
+	break;
+
+      default:
+	return NULL_TREE;
+      }
+}
+
 /* Finish up a function declaration and compile that function
    all the way to assembler language output.  The free the storage
    for the function definition. INLINE_P is TRUE if we just
@@ -17264,6 +17303,11 @@ finish_function (bool inline_p)
       && !DECL_CLONED_FUNCTION_P (fndecl))
     do_warn_unused_parameter (fndecl);
 
+  /* If requested, automatically annotate suitable loops in OpenACC kernels
+     regions with OpenACC loop annotations to allow auto-parallelization.  */
+  if (flag_openacc && flag_openacc_kernels_annotate_loops)
+    c_oacc_annotate_loops_in_kernels_regions (fndecl, cp_unwrap_for_init);
+
   /* Genericize before inlining.  */
   if (!processing_template_decl
       && !DECL_IMMEDIATE_FUNCTION_P (fndecl)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 9849e59..1a11f2b 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -41281,6 +41281,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 		     omp_clause_mask mask, tree *cclauses, bool *if_p)
 {
   bool is_parallel = ((mask >> PRAGMA_OACC_CLAUSE_REDUCTION) & 1) == 1;
+  bool is_combined = (cclauses != NULL);
 
   strcat (p_name, " loop");
   mask |= OACC_LOOP_CLAUSE_MASK;
@@ -41299,6 +41300,8 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
   tree block = begin_omp_structured_block ();
   int save = cp_parser_begin_omp_structured_block (parser);
   tree stmt = cp_parser_omp_for_loop (parser, OACC_LOOP, clauses, NULL, if_p);
+  if (stmt && stmt != error_mark_node)
+    OACC_LOOP_COMBINED (stmt) = is_combined;
   cp_parser_end_omp_structured_block (parser, save);
   add_stmt (finish_omp_structured_block (block));
 
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 107d39d..efdb393 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -4900,6 +4900,10 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
     length = mark_rvalue_use (length);
   /* We need to reduce to real constant-values for checks below.  */
   if (length)
+    STRIP_NOPS (length);
+  if (low_bound)
+    STRIP_NOPS (low_bound);
+  if (length)
     length = fold_simple (length);
   if (low_bound)
     low_bound = fold_simple (low_bound);
@@ -5204,6 +5208,11 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
 	  tree low_bound = TREE_PURPOSE (t);
 	  tree length = TREE_VALUE (t);
 
+	  if (length)
+	    STRIP_NOPS (length);
+	  if (low_bound)
+	    STRIP_NOPS (low_bound);
+
 	  i--;
 	  if (low_bound
 	      && TREE_CODE (low_bound) == INTEGER_CST
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bca8c85..af83a5f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -348,7 +348,8 @@ Objective-C and Objective-C++ Dialects}.
 -Wmissing-include-dirs  -Wmissing-noreturn  -Wno-missing-profile @gol
 -Wno-multichar  -Wmultistatement-macros  -Wnonnull  -Wnonnull-compare @gol
 -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} @gol
--Wnull-dereference  -Wno-odr  -Wopenmp-simd  @gol
+-Wnull-dereference  -Wno-odr @gol
+-Wopenacc-kernels-annotate-loops  -Wopenmp-simd @gol
 -Wno-overflow  -Woverlength-strings  -Wno-override-init-side-effects @gol
 -Wpacked  -Wno-packed-bitfield-compat  -Wpacked-not-aligned  -Wpadded @gol
 -Wparentheses  -Wno-pedantic-ms-format @gol
@@ -500,7 +501,8 @@ Objective-C and Objective-C++ Dialects}.
 -fmerge-constants  -fmodulo-sched  -fmodulo-sched-allow-regmoves @gol
 -fmove-loop-invariants  -fno-branch-count-reg @gol
 -fno-defer-pop  -fno-fp-int-builtin-inexact  -fno-function-cse @gol
--fno-guess-branch-probability  -fno-inline  -fno-math-errno  -fno-peephole @gol
+-fno-guess-branch-probability  -fno-inline  -fno-math-errno @gol
+-fno-openacc-kernels-annotate-loops  -fno-peephole @gol
 -fno-peephole2  -fno-printf-return-value  -fno-sched-interblock @gol
 -fno-sched-spec  -fno-signed-zeros @gol
 -fno-toplevel-reorder  -fno-trapping-math  -fno-zero-initialized-in-bss @gol
@@ -8305,6 +8307,13 @@ Do not warn about compile-time overflow in constant expressions.
 Warn about One Definition Rule violations during link-time optimization.
 Enabled by default.
 
+@item -Wopenacc-kernels-annotate-loops
+@opindex Wopenacc-kernels-annotate-loops
+@opindex Wno-openacc-kernels-annotate-loops
+Warn about @code{for} (C/C++) or @code{DO} (Fortran) loops in OpenACC
+kernels regions that cannot be automatically annotated for
+parallelization with @option{-fopenacc-kernels-annotate-loops}.
+
 @item -Wopenmp-simd
 @opindex Wopenmp-simd
 @opindex Wno-openmp-simd
@@ -13647,6 +13656,27 @@ approximation is enabled.  The default value is 2.
 
 @end table
 
+@item -fno-openacc-kernels-annotate-loops
+@opindex fno-openacc-kernels-annotate-loops
+@opindex fopenacc-kernels-annotate-loops
+@cindex kernels regions, OpenACC
+Disable automatic parallelization of unannotated loops in OpenACC
+kernels regions.  The default is to attempt to add implicit
+@code{acc loop auto} annotations to loops in kernels regions if
+@option{-fopenacc} is enabled.
+
+Note that you can use @option{-Wopenacc-kernels-annotate-loops} to
+diagnose loops that cannot be automatically annotated
+(@pxref{Warning Options}).  Reasons why automatic loop annotations
+cannot be applied include premature exits, calls to functions without
+an @code{acc routine} annotation, or unstructured control flow in
+the loop body.  In C and C++, the loop variable initialization, end
+test, and increment expressions must additionally conform to
+restrictions similar to those for explicitly-annotated loops, and the
+loop variable must not be otherwise modified in the body of the loop.
+An explicit @code{acc loop} annotation disables automatic annotations
+on any nested or containing loops.
+
 @end table
 
 @node Instrumentation Options
diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
index d4c4b2c..a6e2d0b 100644
--- a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
+++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
@@ -2,6 +2,7 @@
    OpenACC kernels.  */
 
 /* { dg-additional-options "-O2" }
+   { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
    { dg-additional-options "-fopt-info-optimized-omp" }
    { dg-additional-options "-fdump-tree-ompexp" }
    { dg-additional-options "-fdump-tree-parloops1-all" }
diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c
index 16e9b9e..bb21c9c 100644
--- a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c
+++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c
@@ -2,6 +2,7 @@
    kernels.  */
 
 /* { dg-additional-options "-O2" }
+   { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
    { dg-additional-options "-fopt-info-optimized-omp" }
    { dg-additional-options "-fdump-tree-ompexp" }
    { dg-additional-options "-fdump-tree-parloops1-all" }
diff --git a/gcc/testsuite/c-c++-common/goacc/combined-directives.c b/gcc/testsuite/c-c++-common/goacc/combined-directives.c
index c2a3c57..2519f23 100644
--- a/gcc/testsuite/c-c++-common/goacc/combined-directives.c
+++ b/gcc/testsuite/c-c++-common/goacc/combined-directives.c
@@ -110,7 +110,7 @@ test ()
 // { dg-final { scan-tree-dump-times "acc loop worker" 2 "gimple" } }
 // { dg-final { scan-tree-dump-times "acc loop vector" 2 "gimple" } }
 // { dg-final { scan-tree-dump-times "acc loop seq" 2 "gimple" } }
-// { dg-final { scan-tree-dump-times "acc loop auto" 2 "gimple" } }
+// { dg-final { scan-tree-dump-times "acc loop auto" 6 "gimple" } }
 // { dg-final { scan-tree-dump-times "acc loop tile.2, 3" 2 "gimple" } }
 // { dg-final { scan-tree-dump-times "acc loop independent private.i" 2 "gimple" } }
 // { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } }
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c b/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
index 0304254..c37152c 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-dom3" } */
 
 #include <stdlib.h>
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
index c475333..b1f4302 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
index 8f7f415..e87aab3 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fopt-info-optimized-omp" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
index c11d36f..2323857 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fopt-info-optimized-omp" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
index acef6a1..adca30b 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
index 75e2bb7..5f16085 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
new file mode 100644
index 0000000..c7b5ac8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
@@ -0,0 +1,26 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that all loops in the nest are annotated.  */
+
+void f (float a[16][16], float b[16][16], float c[16][16])
+{
+  int i, j, k;
+
+#pragma acc kernels copyin(a[0:16][0:16], b[0:16][0:16]) copyout(c[0:16][0:16])
+  {
+    for (i = 0; i < 16; i++) {
+      for (j = 0; j < 16; j++) {
+	float t = 0;
+	for (k = 0; k < 16; k++)
+	  t += a[i][k] * b[k][j];
+	c[i][j] = t;
+      }
+    }
+  }
+
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 3 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
new file mode 100644
index 0000000..58b41d2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
@@ -0,0 +1,32 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a random goto in the body can't be annotated.  */
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
+      {
+	if (a[i] < 0)
+	  {
+	    t = 0;
+	    goto bad;
+	  }
+	t += a[i] * b[i];
+      }
+  bad:
+    ;
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
new file mode 100644
index 0000000..e9d2ef4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a random label in the body triggers a warning.  */
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i = n - 1;
+
+#pragma acc kernels
+  {
+    goto spaghetti;
+    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
+      {
+      spaghetti:
+	t += a[i] * b[i];
+      }
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
new file mode 100644
index 0000000..ba408bc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
@@ -0,0 +1,28 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that in a situation with nested loops, a problem that prevents
+   annotation of the inner loop only still allows the outer loop to be
+   annotated.  */
+
+float f (float *a, float *b, int n)
+{
+  float t = 0;
+
+#pragma acc kernels
+  {
+    for (int i = 0; i < n; i++)
+      for (int j = 0; j <= i; j++)  /* { dg-warning "loop cannot be annotated" } */
+       {
+         if (a[i] < 0 || b[j] < 0)
+           j = i;
+         else
+           t += a[i] * b[j];
+       }
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
new file mode 100644
index 0000000..64433e8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that in a situation with nested loops, a problem that prevents
+   annotation of the outer loop only still allows the inner loop to be
+   annotated.  */
+
+float f (float *a, float *b, int n)
+{
+  float t = 0;
+
+#pragma acc kernels
+  {
+    for (int i = 0; i < n; i++)	  /* { dg-warning "loop cannot be annotated" } */
+      {
+	if (a[i] < 0)
+	  n = i;
+	for (int j = 0; j <= i; j++)
+	  t += a[i] * b[j];
+      }
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
new file mode 100644
index 0000000..379e6ba
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
@@ -0,0 +1,22 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that an explicit annotation on an outer loop suppresses annotation
+   of inner loops, and produces a diagnostic.  */
+
+void f (float *a, float *b)
+{
+  float t = 0;
+
+#pragma acc kernels
+  {
+#pragma acc loop seq
+    for (int l = 0; l < 20; l++)
+      for (int m = 0; m < 20; m++)	/* { dg-warning "loop cannot be annotated" } */
+        b[m] = a[m];
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
new file mode 100644
index 0000000..9a2a7ca
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
@@ -0,0 +1,22 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that an explicit annotation on an inner loop suppresses annotation
+   of outer loops, and produces a diagnostic.  */
+
+void f (float *a, float *b)
+{
+  float t = 0;
+
+#pragma acc kernels
+  {
+    for (int l = 0; l < 20; l++)	/* { dg-warning "loop cannot be annotated" } */
+#pragma acc loop seq
+      for (int m = 0; m < 20; m++)
+        b[m] = a[m];
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
new file mode 100644
index 0000000..075f897
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
@@ -0,0 +1,26 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a modification of the loop variable in the
+   body cannot be annotated.  */
+
+float f (float *a, float *b, int n)
+{
+  float t = 0;
+
+#pragma acc kernels
+  {
+    for (int i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
+      {
+	if (a[i] < 0 || b[i] < 0)
+	  i = n;
+	else
+	  t += a[i] * b[i];
+      }
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
new file mode 100644
index 0000000..5076789
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
@@ -0,0 +1,26 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a modification of the loop iteration count
+   variable in the body cannot be annotated.  */
+
+float f (float *a, float *b, int n)
+{
+  float t = 0;
+
+#pragma acc kernels
+  {
+    for (int i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
+      {
+	if (a[i] < 0 || b[i] < 0)
+	  n = i;
+	else
+	  t += a[i] * b[i];
+      }
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
new file mode 100644
index 0000000..89ec644
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
@@ -0,0 +1,18 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that "acc kernels loop" directive causes annotation of the entire
+   loop nest.  */
+
+void f (float *a, float *b)
+{
+#pragma acc kernels loop
+  for (int k = 0; k < 20; k++)
+    for (int l = 0; l < 20; l++)
+      for (int m = 0; m < 20; m++)
+	b[m] = a[m];
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 2 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
new file mode 100644
index 0000000..77a3b7a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
@@ -0,0 +1,19 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that "acc kernels loop" directive causes annotation of the entire
+   loop nest in the presence of a collapse clause.  */
+
+void f (float *a, float *b)
+{
+#pragma acc kernels loop collapse(2)
+  for (int k = 0; k < 20; k++)
+    for (int l = 0; l < 20; l++)
+      for (int m = 0; m < 20; m++)
+	b[m] = a[m];
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop collapse.2." 1 "original" } } */
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
new file mode 100644
index 0000000..9e0a946
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
@@ -0,0 +1,21 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a variable bound can be annotated.  */
+
+float f (float *a, float *b, int n)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)
+      t += a[i] * b[i];
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
new file mode 100644
index 0000000..5e3f028
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
@@ -0,0 +1,23 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that calls to built-in functions don't inhibit kernels loop
+   annotation.  */
+
+void foo (int n, int *input, int *out1, int *out2)
+{
+#pragma acc kernels
+  {
+    int i;
+
+    for (i = 0; i < n; i++)
+      {
+	out1[i] = __builtin_clz (input[i]);
+	out2[i] = __builtin_popcount (input[i]);
+      }
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
new file mode 100644
index 0000000..f87444e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
@@ -0,0 +1,42 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test for rejecting annotation on loops that have various subexpressions
+   in the loop end test that are not loop-invariant.  */
+
+extern int g (int);
+extern int x;
+extern int gg (int, int) __attribute__ ((const));
+
+void f (float *a, float *b, int n)
+{
+
+  int j;
+#pragma acc kernels
+  {
+    /* Non-constant function call.  */
+    for (int i = 0; i < g(n); i++)	/* { dg-warning "loop cannot be annotated" } */
+      a[i] = b[i];
+
+    /* Global variable.  */
+    for (int i = x; i < n + x; i++)	/* { dg-warning "loop cannot be annotated" } */
+      a[i] = b[i];
+
+    /* Explicit reference to the loop variable.  */
+    for (int i = 0; i < gg (i, n); i++)	/* { dg-warning "loop cannot be annotated" } */
+      a[i] = b[i];
+
+    /* Reference to a variable that is modified in the body of the loop.  */
+    j = 0;
+    for (int i = 0; i < gg (j, n); i++)	/* { dg-warning "loop cannot be annotated" } */
+      {
+	a[i] = b[i];
+	j = i;
+      }
+
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
new file mode 100644
index 0000000..6a5099d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
@@ -0,0 +1,41 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test for accepting annotation on loops that have various forms of
+   loop-invariant expressions in their end test.  */
+
+extern const int x;
+extern int g (int) __attribute__ ((const));
+
+void f (float *a, float *b, int n)
+{
+
+  int j;
+#pragma acc kernels
+  {
+    /* Reversed form of comparison.  */
+    for (int i = 0; n >= i; i++)
+      a[i] = b[i];
+
+    /* Constant function call.  */
+    for (int i = 0; i < g(n); i++)
+      a[i] = b[i];
+
+    /* Constant global variable.  */
+    for (int i = 0; i < x; i++)
+      a[i] = b[i];
+
+    /* Complicated expression involving conditionals, etc.  */
+    for (int i = 0; i < ((x == 4) ? (n << 2) : (n << 3)); i++)
+      a[i] = b[i];
+
+    /* Reference to a local variable not modified in the loop.  */
+    j = ((x == 4) ? (n << 2) : (n << 3));
+    for (int i = 0; i < j; i++)
+      a[i] = b[i];
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 5 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
new file mode 100644
index 0000000..f60070e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
@@ -0,0 +1,24 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a conditional in the body can be annotated.  */
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)
+      if (a[i] > 0 && b[i] > 0)
+	t += a[i] * b[i];
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
new file mode 100644
index 0000000..949871c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
@@ -0,0 +1,34 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a switch and break in the body can be annotated.  */
+
+#define n 16
+
+float f (float *a, float *b, int state)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)
+      switch (state)
+	{
+	case 0:
+	default:
+	  t += a[i] * b[i];
+	  break;
+
+	case 1:
+	  if (a[i] > 0 && b[i] > 0)
+	    t += a[i] * b[i];
+	  break;
+	}
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
new file mode 100644
index 0000000..03dfe8f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a continue statement in the body can be annotated.  */
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)
+      {
+	if (a[i] < 0 || b[i] < 0)
+	  continue;
+	t += a[i] * b[i];
+      }
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
new file mode 100644
index 0000000..ede6b3c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a break statement in the body cannot be annotated.  */
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
+      {
+	if (a[i] < 0 || b[i] < 0)
+	  break;
+	t += a[i] * b[i];
+      }
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
new file mode 100644
index 0000000..20ee299
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
@@ -0,0 +1,26 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with a random function call in the body cannot be
+   annotated.  */
+
+extern float g (float);
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
+      t += g (a[i] * b[i]);
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
new file mode 100644
index 0000000..796f048
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a loop with an openacc function call in the body can be
+   annotated.  */
+
+#pragma acc routine worker
+extern float g (float);
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)
+      t += g (a[i] * b[i]);
+  }
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
new file mode 100644
index 0000000..048f1b0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
@@ -0,0 +1,26 @@
+/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
+/* { dg-additional-options "-fdump-tree-original" } */
+/* { dg-do compile } */
+
+/* Test that a kernels loop with a return in the body triggers a hard
+   error.  */
+
+#define n 16
+
+float f (float *a, float *b)
+{
+  float t = 0;
+  int i;
+
+#pragma acc kernels
+  {
+    for (i = 0; i < n; i++)
+      {
+	if (a[i] < 0 || b[i] < 0)
+	  return 0.0;	/* { dg-error "invalid branch" } */
+	t += a[i] * b[i];
+      }
+  }
+  return t;
+}
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c
index 7180021..9a97de6 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c
index 0c9f833..31e8378 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c
index 0bd21b6..ad59155 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c
index dd5a841..4acffef 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c
index a658182..327aa05 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
index 73b469d..26c65fe 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
@@ -1,5 +1,6 @@
 /* { dg-additional-options "-O2" } */
 /* { dg-additional-options "-g" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
index 5592623..8955cf2 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
index e86be1b..d88a61d 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
index 2b0e186..5943d56 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop.c
index 9619d53..ad525cd 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-loop.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c b/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
index 69539b2..f799baf 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c b/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c
index 81b0fee..b8093b5 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c b/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
index 5921b88..105cbcf 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
@@ -1,4 +1,5 @@
 /* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index 9ec24a3..9ee9124 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1492,6 +1492,11 @@ class auto_suppress_location_wrappers
 #define OMP_TARGET_COMBINED(NODE) \
   (OMP_TARGET_CHECK (NODE)->base.private_flag)
 
+/* True on an OACC_LOOP statement if it is part of a combined construct,
+   for example "#pragma acc kernels loop".  */
+#define OACC_LOOP_COMBINED(NODE) \
+  (OACC_LOOP_CHECK (NODE)->base.private_flag)
+
 /* Memory order for OMP_ATOMIC*.  */
 #define OMP_ATOMIC_MEMORY_ORDER(NODE) \
   (TREE_RANGE_CHECK (NODE, OMP_ATOMIC, \
-- 
2.8.1



* [PATCH 2/2] [OpenACC] Kernels loops annotation: Fortran.
  2020-09-09 17:53 [PATCH 0/2] [OpenACC] Kernels loop annotation Sandra Loosemore
  2020-09-09 17:53 ` [PATCH 1/2] [OpenACC] Kernels loops annotation: C and C++ Sandra Loosemore
@ 2020-09-09 17:53 ` Sandra Loosemore
  2020-09-10 10:20 ` [PATCH 0/2] [OpenACC] Kernels loop annotation Richard Biener
  2020-10-20 16:34 ` Sandra Loosemore
  3 siblings, 0 replies; 7+ messages in thread
From: Sandra Loosemore @ 2020-09-09 17:53 UTC (permalink / raw)
  To: gcc-patches

This patch implements the Fortran support for adding "acc loop auto"
annotations to DO loops in OpenACC kernels regions.  It implements
the same -fopenacc-kernels-annotate-loops and
-Wopenacc-kernels-annotate-loops options that were previously added
(and documented) for the C/C++ front ends.

2020-09-08  Sandra Loosemore  <sandra@codesourcery.com>
	    Gergö Barany <gergo@codesourcery.com>

	gcc/fortran/

	* gfortran.h (gfc_oacc_annotate_loops_in_kernels_regions):
	Declare.
	* lang.opt (Wopenacc-kernels-annotate-loops): New.
	(fopenacc-kernels-annotate-loops): New.
	* openmp.c: Include options.h.
	(enum annotation_state): New.
	(enum annotation_result): New.
	(check_code_for_invalid_calls): New.
	(check_expr_for_invalid_calls): New.
	(check_for_invalid_calls): New.
	(annotate_do_loop): New.
	(annotate_do_loops_in_kernels): New.
	(compute_goto_targets): New.
	(gfc_oacc_annotate_loops_in_kernels_regions): New.
	* parse.c (gfc_parse_file): Handle
	-fopenacc-kernels-annotate-loops.
	* trans-openmp.c (gfc_trans_omp_do): Add combined parameter.
	Use it to set OACC_LOOP_COMBINED.  Adjust call sites.

	gcc/testsuite/
	* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Add
	-fno-openacc-kernels-annotate-loops option.
	* gfortran.dg/goacc/classify-kernels.f95: Likewise.
	* gfortran.dg/goacc/combined-directives.f90: Adjust patterns.
	* gfortran.dg/goacc/common-block-3.f90: Add
	-fno-openacc-kernels-annotate-loops option.
	* gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
	* gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
	* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95:
	Likewise.
	* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95:
	Likewise.
	* gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
	* gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
	* gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
	* gfortran.dg/goacc/kernels-loop.f95: Likewise.
	* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95:
	Likewise.
	* gfortran.dg/goacc/private-explicit-kernels-1.f95: Adjust
	patterns.
	* gfortran.dg/goacc/private-predetermined-kernels-1.f95:
	Likewise.
	* gfortran.dg/goacc/kernels-loop-annotation-1.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-2.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-3.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-4.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-5.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-6.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-7.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-8.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-9.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-10.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-11.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-12.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-13.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-14.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-15.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-16.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-18.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-19.f95: New.
	* gfortran.dg/goacc/kernels-loop-annotation-20.f95: New.
---
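
For readers of the openmp.c changes: annotate_do_loops_in_kernels
distinguishes a problem that disqualifies only the enclosing loop
(EXIT, reported as ar_invalid_loop) from one that disqualifies the
whole nest (GOTO, I/O, calls; reported as ar_invalid_nest).  The
standalone C sketch below models only that propagation.  It is an
illustration under simplifying assumptions, not the code in the patch:
every toy_* name is hypothetical, and kernels regions, DO WHILE,
statement labels, and call checking are all left out.

```c
#include <stddef.h>

/* Hypothetical miniature statement kinds; the real walker switches on
   gfc_code::op (EXEC_DO, EXEC_EXIT, EXEC_GOTO, ...).  */
enum toy_op { TOY_ASSIGN, TOY_DO, TOY_EXIT, TOY_GOTO };

/* Modelled on enum annotation_result from the patch.  */
enum toy_result { TOY_OK, TOY_INVALID_LOOP, TOY_INVALID_NEST };

struct toy_stmt
{
  enum toy_op op;
  struct toy_stmt *body;  /* Loop body, used for TOY_DO only.  */
  struct toy_stmt *next;  /* Next statement in the same block.  */
  int annotated;          /* Nonzero if the loop would be annotated.  */
};

/* Walk the statement list S.  IN_LOOP is nonzero when S is nested in
   a DO loop.  Loops whose bodies are clean get marked as annotated.  */
enum toy_result
toy_walk (struct toy_stmt *s, int in_loop)
{
  enum toy_result retval = TOY_OK;

  for (; s; s = s->next)
    switch (s->op)
      {
      case TOY_DO:
        {
          enum toy_result r = toy_walk (s->body, 1);
          if (r == TOY_OK)
            s->annotated = 1;
          else if (r == TOY_INVALID_NEST && in_loop)
            /* The enclosing loop is invalid too; stop here.  */
            return r;
          break;
        }
      case TOY_EXIT:
        /* Only the innermost enclosing loop is disqualified.  */
        if (in_loop)
          retval = TOY_INVALID_LOOP;
        break;
      case TOY_GOTO:
        /* The whole nest is disqualified.  */
        if (in_loop)
          return TOY_INVALID_NEST;
        break;
      default:
        break;
      }
  return retval;
}

/* Outer loop containing a loop with an EXIT, then a clean loop.
   Result bits: 0 = outer, 1 = first inner, 2 = second inner.  */
int
toy_demo_exit (void)
{
  struct toy_stmt assign = { TOY_ASSIGN, NULL, NULL, 0 };
  struct toy_stmt exit_stmt = { TOY_EXIT, NULL, NULL, 0 };
  struct toy_stmt inner2 = { TOY_DO, &assign, NULL, 0 };
  struct toy_stmt inner1 = { TOY_DO, &exit_stmt, &inner2, 0 };
  struct toy_stmt outer = { TOY_DO, &inner1, NULL, 0 };

  toy_walk (&outer, 0);
  return (outer.annotated
          | (inner1.annotated << 1)
          | (inner2.annotated << 2));
}

/* Same shape, but the first inner loop contains a GOTO instead.  */
int
toy_demo_goto (void)
{
  struct toy_stmt assign = { TOY_ASSIGN, NULL, NULL, 0 };
  struct toy_stmt goto_stmt = { TOY_GOTO, NULL, NULL, 0 };
  struct toy_stmt inner2 = { TOY_DO, &assign, NULL, 0 };
  struct toy_stmt inner1 = { TOY_DO, &goto_stmt, &inner2, 0 };
  struct toy_stmt outer = { TOY_DO, &inner1, NULL, 0 };

  toy_walk (&outer, 0);
  return (outer.annotated
          | (inner1.annotated << 1)
          | (inner2.annotated << 2));
}
```

In this model toy_demo_exit () yields 5: the outer loop and the second
inner loop are marked, and only the loop containing the EXIT is
skipped.  toy_demo_goto () yields 0, because the GOTO rejects the
whole nest and the traversal stops before reaching the second inner
loop.
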
 gcc/fortran/gfortran.h                             |   1 +
 gcc/fortran/lang.opt                               |   8 +
 gcc/fortran/openmp.c                               | 415 +++++++++++++++++++++
 gcc/fortran/parse.c                                |   9 +
 gcc/fortran/trans-openmp.c                         |  30 +-
 .../goacc/classify-kernels-unparallelized.f95      |   1 +
 .../gfortran.dg/goacc/classify-kernels.f95         |   1 +
 .../gfortran.dg/goacc/combined-directives.f90      |  19 +-
 gcc/testsuite/gfortran.dg/goacc/common-block-3.f90 |   1 +
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 |   1 +
 .../goacc/kernels-loop-annotation-1.f95            |  33 ++
 .../goacc/kernels-loop-annotation-10.f95           |  32 ++
 .../goacc/kernels-loop-annotation-11.f95           |  34 ++
 .../goacc/kernels-loop-annotation-12.f95           |  39 ++
 .../goacc/kernels-loop-annotation-13.f95           |  38 ++
 .../goacc/kernels-loop-annotation-14.f95           |  35 ++
 .../goacc/kernels-loop-annotation-15.f95           |  35 ++
 .../goacc/kernels-loop-annotation-16.f95           |  34 ++
 .../goacc/kernels-loop-annotation-18.f95           |  28 ++
 .../goacc/kernels-loop-annotation-19.f95           |  29 ++
 .../goacc/kernels-loop-annotation-2.f95            |  32 ++
 .../goacc/kernels-loop-annotation-20.f95           |  26 ++
 .../goacc/kernels-loop-annotation-3.f95            |  33 ++
 .../goacc/kernels-loop-annotation-4.f95            |  34 ++
 .../goacc/kernels-loop-annotation-5.f95            |  35 ++
 .../goacc/kernels-loop-annotation-6.f95            |  34 ++
 .../goacc/kernels-loop-annotation-7.f95            |  48 +++
 .../goacc/kernels-loop-annotation-8.f95            |  50 +++
 .../goacc/kernels-loop-annotation-9.f95            |  34 ++
 .../gfortran.dg/goacc/kernels-loop-data-2.f95      |   1 +
 .../goacc/kernels-loop-data-enter-exit-2.f95       |   1 +
 .../goacc/kernels-loop-data-enter-exit.f95         |   1 +
 .../gfortran.dg/goacc/kernels-loop-data-update.f95 |   1 +
 .../gfortran.dg/goacc/kernels-loop-data.f95        |   1 +
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 |   1 +
 gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95   |   1 +
 .../kernels-parallel-loop-data-enter-exit.f95      |   1 +
 .../goacc/private-explicit-kernels-1.f95           |   7 +-
 .../goacc/private-predetermined-kernels-1.f95      |   7 +-
 39 files changed, 1152 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index d0cea83..bbde046 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3328,6 +3328,7 @@ void gfc_resolve_oacc_declare (gfc_namespace *);
 void gfc_resolve_oacc_parallel_loop_blocks (gfc_code *, gfc_namespace *);
 void gfc_resolve_oacc_blocks (gfc_code *, gfc_namespace *);
 void gfc_resolve_oacc_routines (gfc_namespace *);
+void gfc_oacc_annotate_loops_in_kernels_regions (gfc_namespace *);
 
 /* expr.c */
 void gfc_free_actual_arglist (gfc_actual_arglist *);
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index da4b1aa..34dc9ee 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -285,6 +285,10 @@ Wuse-without-only
 Fortran Var(warn_use_without_only) Warning
 Warn about USE statements that have no ONLY qualifier.
 
+Wopenacc-kernels-annotate-loops
+Fortran
+; Documented in C
+
 Wopenmp-simd
 Fortran
 ; Documented in C
@@ -687,6 +691,10 @@ fopenacc-dim=
 Fortran LTO Joined Var(flag_openacc_dims)
 ; Documented in C
 
+fopenacc-kernels-annotate-loops
+Fortran LTO Optimization
+; Documented in C
+
 fopenmp
 Fortran LTO
 ; Documented in C
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index d0e516c..a7c4331 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "parse.h"
 #include "diagnostic.h"
 #include "gomp-constants.h"
+#include "options.h"
 
 /* Match an end of OpenMP directive.  End of OpenMP directive is optional
    whitespace, followed by '\n' or comment '!'.  */
@@ -7039,3 +7040,417 @@ gfc_resolve_omp_udrs (gfc_symtree *st)
   for (omp_udr = st->n.omp_udr; omp_udr; omp_udr = omp_udr->next)
     gfc_resolve_omp_udr (omp_udr);
 }
+
+
+/* The following functions implement automatic recognition and annotation of
+   DO loops in OpenACC kernels regions.  Inside a kernels region, a nest of
+   DO loops that does not contain any annotated OpenACC loops, nor EXIT
+   or GOTO statements, gets an automatic "acc loop auto" annotation
+   on each loop.
+   This feature is controlled by flag_openacc_kernels_annotate_loops.  */
+
+
+/* State of the annotation traversal for DO loops in kernels regions.  */
+enum annotation_state {
+  as_outer,
+  as_in_kernels_region,
+  as_in_kernels_loop,
+  as_in_kernels_inner_loop
+};
+
+/* Return status of annotation traversal.  */
+enum annotation_result {
+  ar_ok,
+  ar_invalid_loop,
+  ar_invalid_nest
+};
+
+/* Code walk function for check_for_invalid_calls.  */
+
+static int
+check_code_for_invalid_calls (gfc_code **codep, int *walk_subtrees,
+			      void *data ATTRIBUTE_UNUSED)
+{
+  gfc_code *code = *codep;
+  switch (code->op)
+    {
+    case EXEC_CALL:
+      /* Calls to OpenACC routines are permitted.  */
+      if (code->resolved_sym
+	  && (code->resolved_sym->attr.oacc_routine_lop
+	      != OACC_ROUTINE_LOP_NONE))
+	return 0;
+      /* Else fall through.  */
+
+    case EXEC_CALL_PPC:
+    case EXEC_ASSIGN_CALL:
+      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+		   "Subroutine call at %L prevents annotation of loop nest",
+		   &code->loc);
+      *walk_subtrees = 0;
+      return 1;
+
+    default:
+      return 0;
+    }
+}
+
+/* Expr walk function for check_for_invalid_calls.  */
+
+static int
+check_expr_for_invalid_calls (gfc_expr **exprp, int *walk_subtrees,
+			      void *data ATTRIBUTE_UNUSED)
+{
+  gfc_expr *expr = *exprp;
+  switch (expr->expr_type)
+    {
+    case EXPR_FUNCTION:
+      /* Permit calls to Fortran intrinsic functions and to routines
+	 with an explicitly declared parallelism level.  */
+      if (expr->value.function.isym
+	  || (expr->value.function.esym
+	      && (expr->value.function.esym->attr.oacc_routine_lop
+		  != OACC_ROUTINE_LOP_NONE)))
+	return 0;
+      /* Else fall through.  */
+
+    case EXPR_COMPCALL:
+      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+		   "Function call at %L prevents annotation of loop nest",
+		   &expr->where);
+      *walk_subtrees = 0;
+      return 1;
+
+    default:
+      return 0;
+    }
+}
+
+/* Return TRUE if the DO loop CODE contains function or procedure
+   calls that ought to prohibit annotation.  This traversal is
+   separate from the main annotation tree walk because we need to walk
+   expressions as well as executable statements.  */
+
+static bool
+check_for_invalid_calls (gfc_code *code)
+{
+  gcc_assert (code->op == EXEC_DO);
+  return gfc_code_walker (&code, check_code_for_invalid_calls,
+			  check_expr_for_invalid_calls, NULL);
+}
+
+/* Annotate DO loop CODE with OpenACC "loop auto".  */
+
+static void
+annotate_do_loop (gfc_code *code, gfc_code *parent)
+{
+
+  /* A DO loop's body is another phony DO node whose next pointer starts
+     the actual body.  */
+  gcc_assert (code->op == EXEC_DO);
+  gcc_assert (code->block->op == EXEC_DO);
+
+  /* Build the "acc loop auto" annotation and add the loop as its
+     body.  */
+  gfc_omp_clauses *clauses = gfc_get_omp_clauses ();
+  clauses->par_auto = 1;
+  gfc_code *oacc_loop = gfc_get_code (EXEC_OACC_LOOP);
+  oacc_loop->block = gfc_get_code (EXEC_OACC_LOOP);
+  oacc_loop->block->next = code;
+  oacc_loop->ext.omp_clauses = clauses;
+  oacc_loop->loc = code->loc;
+  oacc_loop->block->loc = code->loc;
+
+  /* Splice the annotation into the place of the original loop.  */
+  if (parent->block == code)
+    parent->block = oacc_loop;
+  else
+    {
+      gfc_code *prev = parent->block;
+      while (prev != code && prev->next != code)
+	{
+	  prev = prev->next;
+	  gcc_assert (prev != NULL);
+	}
+      prev->next = oacc_loop;
+    }
+  oacc_loop->next = code->next;
+  code->next = NULL;
+}
+
+/* Recursively traverse CODE in block PARENT, finding OpenACC kernels
+   regions.  GOTO_TARGETS keeps track of statement labels that are
+   targets of gotos in the current function, while STATE keeps track
+   of the current context of the traversal.  If the traversal
+   encounters a DO loop inside a kernels region, annotate it with
+   OpenACC loop directives if appropriate.  Return the status of the
+   traversal.  */
+
+static enum annotation_result
+annotate_do_loops_in_kernels (gfc_code *code, gfc_code *parent,
+			      hash_set <gfc_st_label *> *goto_targets,
+			      annotation_state state)
+{
+  gfc_code *next_code = NULL;
+  enum annotation_result retval = ar_ok;
+
+  for ( ; code; code = next_code)
+    {
+      bool walk_block = true;
+      next_code = code->next;
+
+      if (state >= as_in_kernels_loop
+	  && code->here && goto_targets->contains (code->here))
+	/* This statement has a label that is the target of a GOTO or some
+	   other jump.  Do not try to sort out the details, just reject
+	   this loop nest.  */
+	{
+	  gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+		       "Possible control transfer to label at %L "
+		       "prevents annotation of loop nest",
+		       &code->loc);
+	  return ar_invalid_nest;
+	}
+
+      switch (code->op)
+	{
+	case EXEC_OACC_KERNELS:
+	  /* Enter kernels region.  */
+	  annotate_do_loops_in_kernels (code->block->next, code,
+					goto_targets,
+					as_in_kernels_region);
+	  walk_block = false;
+	  break;
+
+	case EXEC_OACC_PARALLEL_LOOP:
+	case EXEC_OACC_PARALLEL:
+	case EXEC_OACC_LOOP:
+	  /* Do not try to add automatic OpenACC annotations inside manually
+	     annotated loops.  Presumably, the user avoided doing it on
+	     purpose; for example, all available levels of parallelism may
+	     have been used up.  */
+	  if (state >= as_in_kernels_region)
+	    {
+	      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+			   "Explicit loop annotation at %L "
+			   "prevents annotation of loop nest",
+			   &code->loc);
+	      return ar_invalid_nest;
+	    }
+	  walk_block = false;
+	  break;
+
+	case EXEC_DO:
+	  if (state >= as_in_kernels_region)
+	    {
+	      /* A DO loop's body is another phony DO node whose next
+		 pointer starts the actual body.  Skip the phony node.  */
+	      gcc_assert (code->block->op == EXEC_DO);
+	      enum annotation_result result
+		= annotate_do_loops_in_kernels (code->block->next, code,
+						goto_targets,
+						as_in_kernels_loop);
+	      /* Check for function/procedure calls in the body of the
+		 loop that would prevent parallelization.  Unlike in C/C++,
+		 we do not have to check that there is no modification of
+		 the loop variable or loop count since they are already
+		 handled by the semantics of DO loops in the FORTRAN
+		 language.  */
+	      if (result != ar_invalid_nest && check_for_invalid_calls (code))
+		result = ar_invalid_nest;
+	      if (result == ar_ok)
+		annotate_do_loop (code, parent);
+	      else if (result == ar_invalid_nest
+		       && state >= as_in_kernels_loop)
+		/* The outer loop is invalid, too, so stop traversal.  */
+		return result;
+	      walk_block = false;
+	    }
+	  break;
+
+	case EXEC_OACC_KERNELS_LOOP:
+	  /* This is a combined "acc kernels loop" directive.  We want to
+	     leave the outer loop alone but try to annotate any nested
+	     loops in the body.  The expected structure nesting here is
+	       EXEC_OACC_KERNELS_LOOP
+		 EXEC_OACC_KERNELS_LOOP
+		   EXEC_DO
+		     EXEC_DO
+		       ...body...  */
+	  if (code->block)
+	    /* Might be empty?  */
+	    {
+	      gcc_assert (code->block->op == EXEC_OACC_KERNELS_LOOP);
+	      gfc_omp_clauses *clauses = code->ext.omp_clauses;
+	      int collapse = clauses->collapse;
+	      gfc_expr_list *tile = clauses->tile_list;
+	      gfc_code *inner = code->block->next;
+
+	      gcc_assert (inner->op == EXEC_DO);
+	      gcc_assert (inner->block->op == EXEC_DO);
+
+	      /* We need to skip over nested loops covered by "collapse" or
+		 "tile" clauses.  "Tile" takes precedence
+		 (see gfc_trans_omp_do).  */
+	      if (tile)
+		{
+		  collapse = 0;
+		  for (gfc_expr_list *el = tile; el; el = el->next)
+		    collapse++;
+		}
+	      if (clauses->orderedc)
+		collapse = clauses->orderedc;
+	      if (collapse <= 0)
+		collapse = 1;
+	      for (int i = 1; i < collapse; i++)
+		{
+		  gcc_assert (inner->op == EXEC_DO);
+		  gcc_assert (inner->block->op == EXEC_DO);
+		  inner = inner->block->next;
+		}
+	      if (inner)
+		/* Loop might have empty body?  */
+		annotate_do_loops_in_kernels (inner->block->next,
+					      inner, goto_targets,
+					      as_in_kernels_region);
+	    }
+	  walk_block = false;
+	  break;
+
+	case EXEC_DO_WHILE:
+	case EXEC_DO_CONCURRENT:
+	  /* Traverse the body in a special state to allow EXIT statements
+	     from these loops.  */
+	  if (state >= as_in_kernels_loop)
+	    {
+	      enum annotation_result result
+		= annotate_do_loops_in_kernels (code->block, code,
+						goto_targets,
+						as_in_kernels_inner_loop);
+	      if (result == ar_invalid_nest)
+		return result;
+	      else if (result != ar_ok)
+		retval = result;
+	      walk_block = false;
+	    }
+	  break;
+
+	case EXEC_GOTO:
+	case EXEC_ARITHMETIC_IF:
+	case EXEC_STOP:
+	case EXEC_ERROR_STOP:
+	  /* A jump that may leave this loop.  */
+	  if (state >= as_in_kernels_loop)
+	    {
+	      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+			   "Possible unstructured control flow at %L "
+			   "prevents annotation of loop nest",
+			   &code->loc);
+	      return ar_invalid_nest;
+	    }
+	  break;
+
+	case EXEC_RETURN:
+	  /* A return from a kernels region is diagnosed elsewhere as a
+	     hard error, so no warning is needed here.  */
+	  if (state >= as_in_kernels_loop)
+	    return ar_invalid_nest;
+	  break;
+
+	case EXEC_EXIT:
+	  if (state == as_in_kernels_loop)
+	    {
+	      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+			   "Exit at %L prevents annotation of loop",
+			   &code->loc);
+	      retval = ar_invalid_loop;
+	    }
+	  break;
+
+	case EXEC_BACKSPACE:
+	case EXEC_CLOSE:
+	case EXEC_ENDFILE:
+	case EXEC_FLUSH:
+	case EXEC_INQUIRE:
+	case EXEC_OPEN:
+	case EXEC_READ:
+	case EXEC_REWIND:
+	case EXEC_WRITE:
+	  /* Executing side-effecting I/O statements in parallel doesn't
+	     make much sense.  If this is what users want, they can always
+	     add explicit annotations on the loop nest.  */
+	  if (state >= as_in_kernels_loop)
+	    {
+	      gfc_warning (OPT_Wopenacc_kernels_annotate_loops,
+			   "I/O statement at %L prevents annotation of loop",
+			   &code->loc);
+	      return ar_invalid_nest;
+	    }
+	  break;
+
+	default:
+	  break;
+	}
+
+      /* Visit nested statements, if any, returning early if we hit
+	 any problems.  */
+      if (walk_block)
+	{
+	  enum annotation_result result
+	    = annotate_do_loops_in_kernels (code->block, code,
+					    goto_targets, state);
+	  if (result == ar_invalid_nest)
+	    return result;
+	  else if (result != ar_ok)
+	    retval = result;
+	}
+    }
+  return retval;
+}
+
+/* Traverse CODE to find all the labels referenced by GOTO and similar
+   statements and store them in GOTO_TARGETS.  */
+
+static void
+compute_goto_targets (gfc_code *code, hash_set <gfc_st_label *> *goto_targets)
+{
+  for ( ; code; code = code->next)
+    {
+      switch (code->op)
+	{
+	case EXEC_GOTO:
+	case EXEC_LABEL_ASSIGN:
+	  goto_targets->add (code->label1);
+	  gcc_fallthrough ();
+
+	case EXEC_ARITHMETIC_IF:
+	  goto_targets->add (code->label2);
+	  goto_targets->add (code->label3);
+	  gcc_fallthrough ();
+
+	default:
+	  /* Visit nested statements, if any.  */
+	  if (code->block != NULL)
+	    compute_goto_targets (code->block, goto_targets);
+	}
+    }
+}
+
+/* Find DO loops in OpenACC kernels regions that do not have OpenACC
+   annotations but look like they might benefit from automatic
+   parallelization.  Add "acc loop auto" annotations for them.  Assumes
+   flag_openacc_kernels_annotate_loops is set.  */
+
+void
+gfc_oacc_annotate_loops_in_kernels_regions (gfc_namespace *ns)
+{
+  if (ns->proc_name)
+    {
+      hash_set <gfc_st_label *> goto_targets;
+      compute_goto_targets (ns->code, &goto_targets);
+      annotate_do_loops_in_kernels (ns->code, NULL, &goto_targets, as_outer);
+    }
+
+  for (ns = ns->contained; ns; ns = ns->sibling)
+    gfc_oacc_annotate_loops_in_kernels_regions (ns);
+}
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 6669621..a19286a 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -6612,6 +6612,15 @@ done:
   if (flag_c_prototypes || flag_c_prototypes_external)
     fprintf (stdout, "\n#ifdef __cplusplus\n}\n#endif\n");
 
+  /* Add annotations on loops in OpenACC kernels regions if requested.  This
+     is most easily done on this representation close to the source code.  */
+  if (flag_openacc && flag_openacc_kernels_annotate_loops)
+    {
+      gfc_current_ns = gfc_global_ns_list;
+      for (; gfc_current_ns; gfc_current_ns = gfc_current_ns->sibling)
+	gfc_oacc_annotate_loops_in_kernels_regions (gfc_current_ns);
+    }
+
   /* Do the translation.  */
   translate_all_program_units (gfc_global_ns_list);
 
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 7d3365f..8956530 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -4315,7 +4315,8 @@ typedef struct dovar_init_d {
 
 static tree
 gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
-		  gfc_omp_clauses *do_clauses, tree par_clauses)
+		  gfc_omp_clauses *do_clauses, tree par_clauses,
+		  bool combined)
 {
   gfc_se se;
   tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls;
@@ -4645,7 +4646,10 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
     case EXEC_OMP_DO: stmt = make_node (OMP_FOR); break;
     case EXEC_OMP_DISTRIBUTE: stmt = make_node (OMP_DISTRIBUTE); break;
     case EXEC_OMP_TASKLOOP: stmt = make_node (OMP_TASKLOOP); break;
-    case EXEC_OACC_LOOP: stmt = make_node (OACC_LOOP); break;
+    case EXEC_OACC_LOOP:
+      stmt = make_node (OACC_LOOP);
+      OACC_LOOP_COMBINED (stmt) = combined;
+      break;
     default: gcc_unreachable ();
     }
 
@@ -4739,7 +4743,8 @@ gfc_trans_oacc_combined_directive (gfc_code *code)
     pblock = &block;
   else
     pushlevel ();
-  stmt = gfc_trans_omp_do (code, EXEC_OACC_LOOP, pblock, &loop_clauses, NULL);
+  stmt = gfc_trans_omp_do (code, EXEC_OACC_LOOP, pblock, &loop_clauses, NULL,
+			   true);
   protected_set_expr_location (stmt, loc);
   if (TREE_CODE (stmt) != BIND_EXPR)
     stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
@@ -5180,7 +5185,7 @@ gfc_trans_omp_do_simd (gfc_code *code, stmtblock_t *pblock,
     omp_do_clauses
       = gfc_trans_omp_clauses (&block, &clausesa[GFC_OMP_SPLIT_DO], code->loc);
   body = gfc_trans_omp_do (code, EXEC_OMP_SIMD, pblock ? pblock : &block,
-			   &clausesa[GFC_OMP_SPLIT_SIMD], omp_clauses);
+			   &clausesa[GFC_OMP_SPLIT_SIMD], omp_clauses, false);
   if (pblock == NULL)
     {
       if (TREE_CODE (body) != BIND_EXPR)
@@ -5233,7 +5238,7 @@ gfc_trans_omp_parallel_do (gfc_code *code, stmtblock_t *pblock,
 	pushlevel ();
     }
   stmt = gfc_trans_omp_do (code, EXEC_OMP_DO, new_pblock,
-			   &clausesa[GFC_OMP_SPLIT_DO], omp_clauses);
+			   &clausesa[GFC_OMP_SPLIT_DO], omp_clauses, false);
   if (pblock == NULL)
     {
       if (TREE_CODE (stmt) != BIND_EXPR)
@@ -5476,7 +5481,8 @@ gfc_trans_omp_distribute (gfc_code *code, gfc_omp_clauses *clausesa)
     case EXEC_OMP_TARGET_TEAMS_DISTRIBUTE_SIMD:
     case EXEC_OMP_TEAMS_DISTRIBUTE_SIMD:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_SIMD, &block,
-			       &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE);
+			       &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE,
+			       false);
       if (TREE_CODE (stmt) != BIND_EXPR)
 	stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
       else
@@ -5532,7 +5538,7 @@ gfc_trans_omp_teams (gfc_code *code, gfc_omp_clauses *clausesa,
     case EXEC_OMP_TEAMS_DISTRIBUTE:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_DISTRIBUTE, NULL,
 			       &clausesa[GFC_OMP_SPLIT_DISTRIBUTE],
-			       NULL);
+			       NULL, false);
       break;
     default:
       stmt = gfc_trans_omp_distribute (code, clausesa);
@@ -5600,7 +5606,8 @@ gfc_trans_omp_target (gfc_code *code)
       break;
     case EXEC_OMP_TARGET_SIMD:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_SIMD, &block,
-			       &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE);
+			       &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE,
+			       false);
       if (TREE_CODE (stmt) != BIND_EXPR)
 	stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
       else
@@ -5670,7 +5677,8 @@ gfc_trans_omp_taskloop (gfc_code *code)
       break;
     case EXEC_OMP_TASKLOOP_SIMD:
       stmt = gfc_trans_omp_do (code, EXEC_OMP_SIMD, &block,
-			       &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE);
+			       &clausesa[GFC_OMP_SPLIT_SIMD], NULL_TREE,
+			       false);
       if (TREE_CODE (stmt) != BIND_EXPR)
 	stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
       else
@@ -5948,7 +5956,7 @@ gfc_trans_oacc_directive (gfc_code *code)
       return gfc_trans_oacc_construct (code);
     case EXEC_OACC_LOOP:
       return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses,
-			       NULL);
+			       NULL, false);
     case EXEC_OACC_UPDATE:
     case EXEC_OACC_CACHE:
     case EXEC_OACC_ENTER_DATA:
@@ -5985,7 +5993,7 @@ gfc_trans_omp_directive (gfc_code *code)
     case EXEC_OMP_SIMD:
     case EXEC_OMP_TASKLOOP:
       return gfc_trans_omp_do (code, code->op, NULL, code->ext.omp_clauses,
-			       NULL);
+			       NULL, false);
     case EXEC_OMP_DISTRIBUTE_PARALLEL_DO:
     case EXEC_OMP_DISTRIBUTE_PARALLEL_DO_SIMD:
     case EXEC_OMP_DISTRIBUTE_SIMD:
diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
index 0877242..fe8d09f 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels-unparallelized.f95
@@ -2,6 +2,7 @@
 ! OpenACC kernels.
 
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fopt-info-optimized-omp" }
 ! { dg-additional-options "-fdump-tree-ompexp" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
diff --git a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95 b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
index f2c4736..1a0b8b8 100644
--- a/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/classify-kernels.f95
@@ -2,6 +2,7 @@
 ! kernels.
 
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fopt-info-optimized-omp" }
 ! { dg-additional-options "-fdump-tree-ompexp" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
diff --git a/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
index 9563492..562a4e4 100644
--- a/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/combined-directives.f90
@@ -139,10 +139,21 @@ end subroutine test
 
 ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. collapse.2." 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "acc loop private.i. gang" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. worker" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. vector" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. seq" 2 "gimple" } }
-! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. auto" 2 "gimple" } }
+
+! These are the parallel loop variants.
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. worker" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. vector" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. seq" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. auto" 1 "gimple" } }
+
+! These are the kernels loop variants.  Here the inner loops are annotated
+! separately.
+! { dg-final { scan-tree-dump-times "acc loop private.i. worker" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. vector" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. seq" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop private.i. auto" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "acc loop auto private.j." 4 "gimple" } }
+
 ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "acc loop private.i. independent" 2 "gimple" } }
 ! { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
index 5defe2e..d2816c3 100644
--- a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
@@ -1,4 +1,5 @@
 ! { dg-options "-fopenacc -fdump-tree-omplower" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 
 module consts
   integer, parameter :: n = 100
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
index ef53324..63774ff 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
new file mode 100644
index 0000000..42e751d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
@@ -0,0 +1,33 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that all loops in the nest are annotated. 
+
+subroutine f (a, b, c)
+  implicit none
+
+  real, intent (in), dimension(16,16) :: a
+  real, intent (in), dimension(16,16) :: b
+  real, intent (out), dimension(16,16) :: c
+  
+  integer :: i, j, k
+  real :: t
+
+!$acc kernels copyin(a(1:16,1:16), b(1:16,1:16)) copyout(c(1:16,1:16))
+
+  do i = 1, 16
+    do j = 1, 16
+      t = 0
+      do k = 1, 16
+        t = t + a(i,k) * b(k,j)
+      end do
+      c(i,j) = t;
+    end do
+  end do
+
+!$acc end kernels
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 3 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
new file mode 100644
index 0000000..f612c5be
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
@@ -0,0 +1,32 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a random goto in the body can't be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      go to 10  ! { dg-warning "Possible unstructured control flow" }
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+10  f = t
+
+!$acc end kernels
+
+end function f
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
new file mode 100644
index 0000000..6e2e2c4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-additional-options "-std=legacy" }
+! { dg-do compile }
+
+! Test that a loop with a random label in the body cannot be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  goto 10
+
+  do i = 1, 16
+10  t = t + a(i) * b(i)  ! { dg-warning "Possible control transfer to label" }
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
new file mode 100644
index 0000000..03c4234
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
@@ -0,0 +1,39 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that in a situation with nested loops, a problem that prevents
+! annotation of the inner loop only still allows the outer loop to be
+! annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    do j = 1, 16
+      if (a(i) < 0 .or. b(j) < 0) then
+        exit  ! { dg-warning "Exit" }
+      else
+        t = t + a(i) * b(j)
+      end if
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
new file mode 100644
index 0000000..6aeb3f2
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
@@ -0,0 +1,38 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that in a situation with nested loops, a problem that prevents
+! annotation of the outer loop only still allows the inner loop to be
+! annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0) then
+      exit  ! { dg-warning "Exit" }
+    end if
+    do j = 1, 16
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
new file mode 100644
index 0000000..7d1cff6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
@@ -0,0 +1,35 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that an explicit annotation on an outer loop suppresses annotation
+! of inner loops, and produces a diagnostic.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+!$acc loop seq  ! { dg-warning "Explicit loop annotation" }
+  do i = 1, 16
+    do j = 1, 16
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
new file mode 100644
index 0000000..dab0d40
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
@@ -0,0 +1,35 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that an explicit annotation on an inner loop suppresses annotation
+! of the outer loop, and produces a diagnostic.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    !$acc loop seq  ! { dg-warning "Explicit loop annotation" }
+    do j = 1, 16
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
new file mode 100644
index 0000000..15ef670
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that loops containing I/O statements can't be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i, j
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    do j = 1, 16
+      print *, " i =", i, " j =", j  ! { dg-warning "I/O statement" }
+      t = t + a(i) * b(j)
+    end do
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
new file mode 100644
index 0000000..e4e210a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
@@ -0,0 +1,28 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that "acc kernels loop" directive causes annotation of the entire
+! loop nest.
+
+subroutine f (a, b)
+
+  implicit none
+  real, intent (in), dimension(20) :: a
+  real, intent (out), dimension(20) :: b
+  integer :: k, l, m
+
+!$acc kernels loop
+  do k = 1, 20
+    do l = 1, 20
+      do m = 1, 20
+	b(m) = a(m);
+      end do
+    end do
+  end do
+
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 2 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
new file mode 100644
index 0000000..5dd6e7f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
@@ -0,0 +1,29 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that "acc kernels loop" directive causes annotation of the entire
+! loop nest in the presence of a collapse clause.
+
+subroutine f (a, b)
+
+  implicit none
+  real, intent (in), dimension(20) :: a
+  real, intent (out), dimension(20) :: b
+  integer :: k, l, m
+
+!$acc kernels loop collapse(2)
+  do k = 1, 20
+    do l = 1, 20
+      do m = 1, 20
+	b(m) = a(m);
+      end do
+    end do
+  end do
+
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop .*collapse.2." 1 "original" } }
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
new file mode 100644
index 0000000..2baaa59
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
@@ -0,0 +1,32 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a variable bound can be annotated. 
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (:) :: a, b
+
+  integer :: i, n
+  real :: t
+
+  t = 0.0
+  n = size (a)
+
+!$acc kernels
+
+  do i = 1, n
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
new file mode 100644
index 0000000..5169a0a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
@@ -0,0 +1,26 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with calls to intrinsics in the body can be annotated.
+
+subroutine f (n, input, out1, out2)
+  implicit none
+  integer :: n
+  integer, intent (in), dimension (n) :: input
+  integer, intent (out), dimension (n) :: out1, out2
+
+  integer :: i
+
+!$acc kernels
+
+  do i = 1, n
+      out1(i) = min (i, input(i))
+      out2(i) = not (input(i))
+  end do
+!$acc end kernels
+
+end subroutine f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
new file mode 100644
index 0000000..e629891
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
@@ -0,0 +1,33 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a conditional in the body can be annotated. 
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) > 0 .and. b(i) > 0) then
+      t = t + a(i) * b(i)
+    end if
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
new file mode 100644
index 0000000..6c3300b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a case construct in the body can be annotated. 
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+!$acc kernels
+
+  do i = 1, 16
+    select case (i)
+      case (1)
+        t = a(i) * b(i)
+      case default
+        t = t + a(i) * b(i)
+    end select
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
new file mode 100644
index 0000000..52a9e7e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
@@ -0,0 +1,35 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a cycle statement in the body can be annotated. 
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      cycle
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
new file mode 100644
index 0000000..60eb245
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with an exit statement in the body cannot be annotated.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      exit	! { dg-warning "Exit" }
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
new file mode 100644
index 0000000..438a13a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
@@ -0,0 +1,48 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a random function call in the body cannot
+! be annotated. 
+
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  interface
+    function g (x)
+      real :: g
+      real, intent (in) :: x
+    end function g
+
+    subroutine h (x)
+      real, intent (in) :: x
+    end subroutine h
+  end interface
+
+  t = 0.0
+
+!$acc kernels
+  do i = 1, 16
+    t = t + g (a(i) * b(i))  ! { dg-warning "Function call" }
+  end do
+
+  do i = 1, 16
+    call h (t) ! { dg-warning "Subroutine call" }
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
new file mode 100644
index 0000000..aa97e37
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
@@ -0,0 +1,50 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a call to a declared OpenACC function/subroutine
+! can be annotated.
+
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  interface
+    function g (x)
+      !$acc routine worker
+      real :: g
+      real, intent (in) :: x
+    end function g
+
+    subroutine h (x)
+      !$acc routine worker
+      real, intent (in) :: x
+    end subroutine h
+  end interface
+
+  t = 0.0
+
+!$acc kernels
+  do i = 1, 16
+    t = t + g (a(i) * b(i))
+  end do
+
+  do i = 1, 16
+    call h (t)
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+!$acc end kernels
+
+end function f
+
+! { dg-final { scan-tree-dump-times "acc loop auto" 2 "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95
new file mode 100644
index 0000000..f5aa5a0
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95
@@ -0,0 +1,34 @@
+! { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-Wopenacc-kernels-annotate-loops" }
+! { dg-additional-options "-fdump-tree-original" }
+! { dg-do compile }
+
+! Test that a loop with a return statement in the body gives a hard
+! error.
+
+function f (a, b)
+  implicit none
+
+  real :: f
+  real, intent (in), dimension (16) :: a, b
+
+  integer :: i
+  real :: t
+
+  t = 0.0
+
+!$acc kernels
+
+  do i = 1, 16
+    if (a(i) < 0 .or. b(i) < 0) then
+      f = 0.0
+      return	! { dg-error "invalid branch" }
+    end if
+    t = t + a(i) * b(i)
+  end do
+
+  f = t
+
+!$acc end kernels
+
+end function f
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
index 2f1dcd6..c1f6ef8 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
index 447e85d6..313e3df 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
index 4edb288..2667106 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
index fc113e1..d79ed79 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95
index 94522f5..d8ef52a 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95
index b9c4aea..6b73341 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95
index 6dc7b2e..aadfcfc 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95
index 48c20b9..0d45c5c 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95
@@ -1,4 +1,5 @@
 ! { dg-additional-options "-O2" }
+! { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
 ! { dg-additional-options "-fdump-tree-parloops1-all" }
 ! { dg-additional-options "-fdump-tree-optimized" }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95 b/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95
index 5d563d2..0c47045 100644
--- a/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/private-explicit-kernels-1.f95
@@ -73,8 +73,9 @@ program test
 
   !$acc kernels loop private(i2_1_c, j2_1_c) independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) independent" 1 "gimple" } }
   do i2_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j2_1_c\\)" 1 "gimple" } }
      do j2_1_c = 1, 100
      end do
   end do
@@ -130,9 +131,11 @@ program test
 
   !$acc kernels loop private(i3_1_c, j3_1_c, k3_1_c) independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) independent" 1 "gimple" } }
   do i3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j3_1_c\\)" 1 "gimple" } }
      do j3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(k3_1_c\\)" 1 "gimple" } }
         do k3_1_c = 1, 100
         end do
      end do
diff --git a/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95 b/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95
index 12a7854..3357a20 100644
--- a/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/private-predetermined-kernels-1.f95
@@ -73,8 +73,9 @@ program test
 
   !$acc kernels loop independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) private\\(j2_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i2_1_c\\) independent" 1 "gimple" } }
   do i2_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j2_1_c\\)" 1 "gimple" } }
      do j2_1_c = 1, 100
      end do
   end do
@@ -130,9 +131,11 @@ program test
 
   !$acc kernels loop independent
   ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "original" } }
-  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) private\\(j3_1_c\\) private\\(k3_1_c\\) independent" 1 "gimple" } }
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop private\\(i3_1_c\\) independent" 1 "gimple" } }
   do i3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(j3_1_c\\)" 1 "gimple" } }
      do j3_1_c = 1, 100
+  ! { dg-final { scan-tree-dump-times "#pragma acc loop auto private\\(k3_1_c\\)" 1 "gimple" } }
         do k3_1_c = 1, 100
         end do
      end do
-- 
2.8.1



* Re: [PATCH 0/2] [OpenACC] Kernels loop annotation
  2020-09-09 17:53 [PATCH 0/2] [OpenACC] Kernels loop annotation Sandra Loosemore
  2020-09-09 17:53 ` [PATCH 1/2] [OpenACC] Kernels loops annotation: C and C++ Sandra Loosemore
  2020-09-09 17:53 ` [PATCH 2/2] [OpenACC] Kernels loops annotation: Fortran Sandra Loosemore
@ 2020-09-10 10:20 ` Richard Biener
  2020-09-10 16:00   ` Sandra Loosemore
  2020-10-20 16:34 ` Sandra Loosemore
  3 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2020-09-10 10:20 UTC (permalink / raw)
  To: Sandra Loosemore; +Cc: GCC Patches

On Wed, Sep 9, 2020 at 7:55 PM Sandra Loosemore <sandra@codesourcery.com> wrote:
>
> This set of patches implements C/C++ and Fortran front end support for
> adding "acc loop auto" annotations to loop nests in OpenACC kernels
> regions.  For background on this, refer to Thomas Schwinge's talk from
> last year's cauldron, at
>
> https://gcc.gnu.org/wiki/cauldron2019talks?action=AttachFile&do=view&target=OpenACC+kernels-cauldron2019.pdf
>
> In particular, pages 20-24 describe this part of the work.  We're
> trying to identify loops that might be parallelizable and convert them
> to ACC_LOOP tree structures for further analysis, instead of lowering
> them to goto form early in compilation, as we do with ordinary
> for/while/do loops in C/C++ and DO loops in Fortran.

So the issue I ran into when trying a simplistic "transfer" of DO CONCURRENT
is that variables in DO CONCURRENT scope get moved to function scope
by simplification, and nothing prevents optimizers from extending the
lifetime of those.  That means we eventually end up creating additional
cross-iteration dependences, and the result is a loop that no longer
satisfies 'DO CONCURRENT'.
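
To picture the kind of thing meant here, a hand-written C analogue may
help.  The function names are hypothetical and this is not output from any
GCC pass; it only sketches a loop-local temporary ending up at function
scope, under the assumption that nothing re-privatizes it later:

```c
/* Before: t is declared inside the loop body, so each iteration
   conceptually has its own copy and iterations do not interact.  */
void
scoped_temp (int n, const int *a, int *b)
{
  for (int i = 0; i < n; i++)
    {
      int t = a[i] * 2;
      b[i] = t + 1;
    }
}

/* After hoisting: there is a single t for the whole function.  Run
   sequentially the result is unchanged, but every iteration now writes
   and reads the same storage, so the iterations are independent only
   if t is made private to each iteration again.  */
void
hoisted_temp (int n, const int *a, int *b)
{
  int t;
  for (int i = 0; i < n; i++)
    {
      t = a[i] * 2;
      b[i] = t + 1;
    }
}
```

Both versions compute the same values when run serially; the difference
only matters once something tries to run the iterations concurrently.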

I realize OACC handling is hacked in place in a set of passes during early
optimization, so these kinds of transforms simply might not happen "yet"
(by luck - nothing made them "invalid" on GIMPLE).

I didn't look at how you "annotate" or how long the annotation persists
(the headers of the two patches don't say either), so maybe you will
not have such issues by design?

Richard.

> The C/C++ patches depend on my earlier not-yet-reviewed patch series
> to unify the loop tree representations in the two front ends, which I
> most recently reposted here:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551927.html
>
> These patches have all been previously committed to the OG10 branch to
> unblock other work on subsequent processing of the loops with auto
> annotations, but without proper review.  Here I've mashed up the
> followup bug fixes and incremental improvements I committed to the
> branch together with the original patches to simplify review.  If
> anyone cares, the corresponding commits on the OG10 branch were:
>
> c96409c1f190e29fd9809890fb16d72556f3b7e6
> fdbad20a57e03e05b608f19db41a454bc0cd1c47
> 6d670e648e76fe44589a42ee458098ff84d24af1
> cb581bdb6689d74c1849b7e6bc139c6f122fdcc2
> 16d59cea8362c1ec731aa5b4db42a2817f036f23
> 7b436e90a4e03fdce5b0c6a8c452d3f23f1e136b
> c2789b61cf29397295f39a43f5d1605ab8a32d87
> 1c9af55d7ff76e2e6b633af33e6e6991a0ba4c48
> 20f37fd2f9c8c52fff380982d6fc5eb2d88b3dd9
> df5f2065bad30dc6aff9653237157c33fd4161cd
>
> Sandra Loosemore (2):
>   Kernels loops annotation: C and C++.
>   Kernels loops annotation: Fortran.
>
>  gcc/c-family/c-common.h                            |   1 +
>  gcc/c-family/c-omp.c                               | 916 +++++++++++++++++++--
>  gcc/c-family/c.opt                                 |   8 +
>  gcc/c/c-decl.c                                     |  28 +
>  gcc/c/c-parser.c                                   |   3 +
>  gcc/cp/decl.c                                      |  44 +
>  gcc/cp/parser.c                                    |   3 +
>  gcc/cp/semantics.c                                 |   9 +
>  gcc/doc/invoke.texi                                |  34 +-
>  gcc/fortran/gfortran.h                             |   1 +
>  gcc/fortran/lang.opt                               |   8 +
>  gcc/fortran/openmp.c                               | 415 ++++++++++
>  gcc/fortran/parse.c                                |   9 +
>  gcc/fortran/trans-openmp.c                         |  30 +-
>  .../goacc/classify-kernels-unparallelized.c        |   1 +
>  .../c-c++-common/goacc/classify-kernels.c          |   1 +
>  .../c-c++-common/goacc/combined-directives.c       |   2 +-
>  .../goacc/kernels-counter-var-redundant-load.c     |   1 +
>  .../goacc/kernels-counter-vars-function-scope.c    |   1 +
>  .../goacc/kernels-double-reduction-n.c             |   1 +
>  .../c-c++-common/goacc/kernels-double-reduction.c  |   1 +
>  gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c  |   1 +
>  gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c  |   1 +
>  .../c-c++-common/goacc/kernels-loop-annotation-1.c |  26 +
>  .../goacc/kernels-loop-annotation-10.c             |  32 +
>  .../goacc/kernels-loop-annotation-11.c             |  27 +
>  .../goacc/kernels-loop-annotation-12.c             |  28 +
>  .../goacc/kernels-loop-annotation-13.c             |  27 +
>  .../goacc/kernels-loop-annotation-14.c             |  22 +
>  .../goacc/kernels-loop-annotation-15.c             |  22 +
>  .../goacc/kernels-loop-annotation-16.c             |  26 +
>  .../goacc/kernels-loop-annotation-17.c             |  26 +
>  .../goacc/kernels-loop-annotation-18.c             |  18 +
>  .../goacc/kernels-loop-annotation-19.c             |  19 +
>  .../c-c++-common/goacc/kernels-loop-annotation-2.c |  21 +
>  .../goacc/kernels-loop-annotation-20.c             |  23 +
>  .../goacc/kernels-loop-annotation-21.c             |  42 +
>  .../goacc/kernels-loop-annotation-22.c             |  41 +
>  .../c-c++-common/goacc/kernels-loop-annotation-3.c |  24 +
>  .../c-c++-common/goacc/kernels-loop-annotation-4.c |  34 +
>  .../c-c++-common/goacc/kernels-loop-annotation-5.c |  27 +
>  .../c-c++-common/goacc/kernels-loop-annotation-6.c |  27 +
>  .../c-c++-common/goacc/kernels-loop-annotation-7.c |  26 +
>  .../c-c++-common/goacc/kernels-loop-annotation-8.c |  27 +
>  .../c-c++-common/goacc/kernels-loop-annotation-9.c |  26 +
>  .../c-c++-common/goacc/kernels-loop-data-2.c       |   1 +
>  .../goacc/kernels-loop-data-enter-exit-2.c         |   1 +
>  .../goacc/kernels-loop-data-enter-exit.c           |   1 +
>  .../c-c++-common/goacc/kernels-loop-data-update.c  |   1 +
>  .../c-c++-common/goacc/kernels-loop-data.c         |   1 +
>  gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c  |   1 +
>  .../c-c++-common/goacc/kernels-loop-mod-not-zero.c |   1 +
>  gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c  |   1 +
>  .../c-c++-common/goacc/kernels-loop-nest.c         |   1 +
>  gcc/testsuite/c-c++-common/goacc/kernels-loop.c    |   1 +
>  .../c-c++-common/goacc/kernels-one-counter-var.c   |   1 +
>  .../goacc/kernels-parallel-loop-data-enter-exit.c  |   1 +
>  .../c-c++-common/goacc/kernels-reduction.c         |   1 +
>  .../goacc/classify-kernels-unparallelized.f95      |   1 +
>  .../gfortran.dg/goacc/classify-kernels.f95         |   1 +
>  .../gfortran.dg/goacc/combined-directives.f90      |  19 +-
>  gcc/testsuite/gfortran.dg/goacc/common-block-3.f90 |   1 +
>  gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 |   1 +
>  .../goacc/kernels-loop-annotation-1.f95            |  33 +
>  .../goacc/kernels-loop-annotation-10.f95           |  32 +
>  .../goacc/kernels-loop-annotation-11.f95           |  34 +
>  .../goacc/kernels-loop-annotation-12.f95           |  39 +
>  .../goacc/kernels-loop-annotation-13.f95           |  38 +
>  .../goacc/kernels-loop-annotation-14.f95           |  35 +
>  .../goacc/kernels-loop-annotation-15.f95           |  35 +
>  .../goacc/kernels-loop-annotation-16.f95           |  34 +
>  .../goacc/kernels-loop-annotation-18.f95           |  28 +
>  .../goacc/kernels-loop-annotation-19.f95           |  29 +
>  .../goacc/kernels-loop-annotation-2.f95            |  32 +
>  .../goacc/kernels-loop-annotation-20.f95           |  26 +
>  .../goacc/kernels-loop-annotation-3.f95            |  33 +
>  .../goacc/kernels-loop-annotation-4.f95            |  34 +
>  .../goacc/kernels-loop-annotation-5.f95            |  35 +
>  .../goacc/kernels-loop-annotation-6.f95            |  34 +
>  .../goacc/kernels-loop-annotation-7.f95            |  48 ++
>  .../goacc/kernels-loop-annotation-8.f95            |  50 ++
>  .../goacc/kernels-loop-annotation-9.f95            |  34 +
>  .../gfortran.dg/goacc/kernels-loop-data-2.f95      |   1 +
>  .../goacc/kernels-loop-data-enter-exit-2.f95       |   1 +
>  .../goacc/kernels-loop-data-enter-exit.f95         |   1 +
>  .../gfortran.dg/goacc/kernels-loop-data-update.f95 |   1 +
>  .../gfortran.dg/goacc/kernels-loop-data.f95        |   1 +
>  gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 |   1 +
>  gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95   |   1 +
>  .../kernels-parallel-loop-data-enter-exit.f95      |   1 +
>  .../goacc/private-explicit-kernels-1.f95           |   7 +-
>  .../goacc/private-predetermined-kernels-1.f95      |   7 +-
>  gcc/tree.h                                         |   5 +
>  93 files changed, 2755 insertions(+), 81 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
>  create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-1.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-10.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-11.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-12.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-13.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-14.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-15.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-16.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-18.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-19.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-2.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-20.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-3.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-4.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-5.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-6.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-7.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-8.f95
>  create mode 100644 gcc/testsuite/gfortran.dg/goacc/kernels-loop-annotation-9.f95
>
> --
> 2.8.1
>


* Re: [PATCH 0/2] [OpenACC] Kernels loop annotation
  2020-09-10 10:20 ` [PATCH 0/2] [OpenACC] Kernels loop annotation Richard Biener
@ 2020-09-10 16:00   ` Sandra Loosemore
  0 siblings, 0 replies; 7+ messages in thread
From: Sandra Loosemore @ 2020-09-10 16:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On 9/10/20 4:20 AM, Richard Biener wrote:
> On Wed, Sep 9, 2020 at 7:55 PM Sandra Loosemore <sandra@codesourcery.com> wrote:
>>
>> This set of patches implements C/C++ and Fortran front end support for
>> adding "acc loop auto" annotations to loop nests in OpenACC kernels
>> regions.  For background on this, refer to Thomas Schwinge's talk from
>> last year's cauldron, at
>>
>> https://gcc.gnu.org/wiki/cauldron2019talks?action=AttachFile&do=view&target=OpenACC+kernels-cauldron2019.pdf
>>
>> In particular, pages 20-24 describe this part of the work.  We're
>> trying to identify loops that might be parallelizable and convert them
>> to ACC_LOOP tree structures for further analysis, instead of lowering
>> them to goto form early in compilation, as we do with ordinary
>> for/while/do loops in C/C++ and DO loops in Fortran.
> 
> So the issue I ran into when trying a simplistic "transfer" of DO CONCURRENT
> is that variables in DO CONCURRENT scope get moved to function scope
> by simplification and nothing prevents optimizers from extending lifetime
> of those which means we end up eventually creating additional cross-iteration
> dependences and the result is a loop that is no longer satisfying 'DO
> CONCURRENT'.

I don't have any background on this issue, but I think it must be 
orthogonal?  My patch only examines EXEC_DO, not EXEC_DO_CONCURRENT.

> I realize OACC handling is hacked in place in a set of passes during early
> optimization so these kind of transforms simply might not happen "yet"
> (by luck - nothing made them "invalid" on GIMPLE).
> 
> I didn't look at the how you "annotate" and until when the annotation prevails
> (the headers of the two patches don't say so either) so maybe you will
> not have such issues by design?

The strategy is pretty simple:  it does a code walk to examine the 
parsed form of ordinary loop constructs (EXEC_DO in Fortran, FOR_STMT in 
the newly combined C/C++ representation) within a kernels region.  If 
any loop in a nest has an explicit "acc loop" annotation, the annotator 
ignores that entire nest on the theory that the user has already 
indicated what parallelism they want, except for combined "acc kernels 
loop" directives where the intent in actual code seems to be to try to 
optimize the entire nest.  It does some sanity checks about modification 
of the loop variable in the body of the loop, etc.  If it looks 
plausible, the annotator changes the representation to the equivalent of 
"acc loop auto", and it's up to later passes to figure out whether 
"auto" can be compiled as "parallel" or if it has to fall back to "seq".
I tried to add a lot of comments throughout the code explaining the 
rationale for the various heuristics and restrictions controlling the 
annotation.
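
As a rough sketch of the decision described above, here is a toy model in
C.  All of the names (toy_loop, toy_decide, and so on) are hypothetical;
they are not the data structures or functions in the patch, which works on
the front ends' parsed trees and tracks more state than this.  The sanity
checks are collapsed into a single flag for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy description of one loop in a nest inside a kernels region.  */
struct toy_loop
{
  bool has_explicit_acc_loop;   /* Carries an "acc loop" annotation.  */
  bool from_combined_directive; /* ... that came from "acc kernels loop".  */
  bool passes_sanity_checks;    /* E.g. body does not modify the loop var.  */
  struct toy_loop *inner;       /* Next loop inward in the nest, or NULL.  */
};

enum toy_decision { TOY_LEAVE_ALONE, TOY_ANNOTATE_AUTO };

/* Return true if any loop in the nest has an explicit "acc loop"
   annotation that did not come from a combined "acc kernels loop".  */
static bool
toy_nest_has_user_annotation (const struct toy_loop *l)
{
  for (; l != NULL; l = l->inner)
    if (l->has_explicit_acc_loop && !l->from_combined_directive)
      return true;
  return false;
}

/* Decide what to do with LOOP, which belongs to the nest whose
   outermost loop is OUTERMOST.  */
enum toy_decision
toy_decide (const struct toy_loop *outermost, const struct toy_loop *loop)
{
  /* The user already said what parallelism they want for this nest.  */
  if (toy_nest_has_user_annotation (outermost))
    return TOY_LEAVE_ALONE;
  /* The loop of a combined directive is already an "acc loop".  */
  if (loop->has_explicit_acc_loop)
    return TOY_LEAVE_ALONE;
  /* Anything that looks implausible stays an ordinary loop.  */
  if (!loop->passes_sanity_checks)
    return TOY_LEAVE_ALONE;
  /* Otherwise mark it "auto" and let later passes pick parallel or seq.  */
  return TOY_ANNOTATE_AUTO;
}
```

In this model the outer loop of a combined "acc kernels loop" keeps its own
clauses while the plain loops nested inside it become candidates for
"auto", which is the shape the updated Fortran private-*-kernels-1 tests
expect in the gimple dump.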

-Sandra


* Re: [PATCH 0/2] [OpenACC] Kernels loop annotation
  2020-09-09 17:53 [PATCH 0/2] [OpenACC] Kernels loop annotation Sandra Loosemore
                   ` (2 preceding siblings ...)
  2020-09-10 10:20 ` [PATCH 0/2] [OpenACC] Kernels loop annotation Richard Biener
@ 2020-10-20 16:34 ` Sandra Loosemore
  3 siblings, 0 replies; 7+ messages in thread
From: Sandra Loosemore @ 2020-10-20 16:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: Joseph Myers, jason, Nathan Sidwell, fortran, Tobias Burnus

On 9/9/20 11:53 AM, Sandra Loosemore wrote:
> This set of patches implements C/C++ and Fortran front end support for
> adding "acc loop auto" annotations to loop nests in OpenACC kernels
> regions.  For background on this, refer to Thomas Schwinge's talk from
> last year's cauldron, at
> 
> https://gcc.gnu.org/wiki/cauldron2019talks?action=AttachFile&do=view&target=OpenACC+kernels-cauldron2019.pdf
> 
> In particular, pages 20-24 describe this part of the work.  We're
> trying to identify loops that might be parallelizable and convert them
> to ACC_LOOP tree structures for further analysis, instead of lowering
> them to goto form early in compilation, as we do with ordinary
> for/while/do loops in C/C++ and DO loops in Fortran.

Ping!

Overview: 
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553518.html

C/C++ part: 
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553519.html

Fortran part: 
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553520.html

-Sandra




* Re: [PATCH 1/2] [OpenACC] Kernels loops annotation: C and C++.
  2020-09-09 17:53 ` [PATCH 1/2] [OpenACC] Kernels loops annotation: C and C++ Sandra Loosemore
@ 2020-10-22 20:21   ` Jason Merrill
  0 siblings, 0 replies; 7+ messages in thread
From: Jason Merrill @ 2020-10-22 20:21 UTC (permalink / raw)
  To: Sandra Loosemore, gcc-patches, Jakub Jelinek

On 9/9/20 1:53 PM, Sandra Loosemore wrote:
> This patch detects loops in kernels regions that are candidates for
> parallelization, and adds "#pragma acc loop auto" annotations to them.
> This annotation is controlled by the -fopenacc-kernels-annotate-loops
> option, which is enabled by default.  -Wopenacc-kernels-annotate-loops
> can be used to produce diagnostics about loops that cannot be
> annotated.

For the main body of the patch, I hope Jakub will look at it as the 
OpenMP maintainer.  I just have a couple of minor comments below.

> 2020-09-08  Sandra Loosemore  <sandra@codesourcery.com>
> 
> 	gcc/c-family/
> 	* c-common.h (c_oacc_annotate_loops_in_kernels_regions): Declare.
> 	* c-omp.c: Include tree-iterator.h.
> 	(enum annotation_state): New.
> 	(struct annotation_info): New.
> 	(do_not_annotate_loop): New.
> 	(do_not_annotate_loop_nest): New.
> 	(annotation_error): New.
> 	(c_finish_omp_for_internal): New.
> 	(c_finish_omp_for): Use c_finish_omp_for_internal.
> 	(is_local_var): New.
> 	(end_test_ok_for_annotation_r): New.
> 	(end_test_ok_for_annotation): New.
> 	(lang_specific_unwrap_initializer): New.
> 	(annotate_for_loop): New.
> 	(annotate_and_check_for_loop): New.
> 	(annotate_loops_in_kernels_regions): New.
> 	(c_oacc_annotate_loops_in_kernels_regions): New.
> 	* c.opt (Wopenacc-kernels-annotate-loops): New.
> 	(fopenacc-kernels-annotate-loops): New.
> 
> 	gcc/c/
> 	* c-decl.c (c_unwrap_for_init): New.
> 	(finish_function): Call c_oacc_annotate_loops_in_kernels_regions.
> 	* c-parser.c (c_parser_oacc_loop): Set OACC_LOOP_COMBINED.
> 
> 	gcc/cp/
> 	* decl.c (cp_unwrap_for_init): New.
> 	(finish_function): Call c_oacc_annotate_loops_in_kernels_regions.
> 	* parser.c (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED.
> 	* semantics.c (handle_omp_array_sections_1): Call STRIP_NOPS
> 	on length and bound.
> 	(handle_omp_array_sections): Likewise.
> 
> 	gcc/
> 	* doc/invoke.texi (Option Summary): Add entries for
> 	-Wopenacc-kernels-annotate-loops and
> 	-fno-openacc-kernels-annotate-loops.
> 	(Warning Options): Document -Wopenacc-kernels-annotate-loops.
> 	(Optimization Options): Document
> 	-fno-openacc-kernels-annotate-loops.
> 	* tree.h (OACC_LOOP_COMBINED): New.
> 
> 	gcc/testsuite/
> 	* c-c++-common/goacc/classify-kernels-unparallelized.c: Add
> 	-fno-openacc-kernels-annotate-loops option.
> 	* c-c++-common/goacc/classify-kernels.c: Likewise.
> 	* c-c++-common/goacc/combined-directives.c: Likewise.
> 	* c-c++-common/goacc/kernels-counter-var-redundant-load.c:
> 	Likewise.
> 	* c-c++-common/goacc/kernels-counter-vars-function-scope.c:
> 	Likewise.
> 	* c-c++-common/goacc/kernels-double-reduction-n.c: Likewise.
> 	* c-c++-common/goacc/kernels-double-reduction.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-2.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-3.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-data.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-g.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-n.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-nest.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop.c: Likewise.
> 	* c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
> 	* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c:
> 	Likewise.
> 	* c-c++-common/goacc/kernels-reduction.c: Likewise.
> 	* c-c++-common/goacc/kernels-loop-annotation-1.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-2.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-3.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-4.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-5.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-6.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-7.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-8.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-9.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-10.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-11.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-12.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-13.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-14.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-15.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-16.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-17.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-18.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-19.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-20.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-21.c: New.
> 	* c-c++-common/goacc/kernels-loop-annotation-22.c: New.
> ---
>   gcc/c-family/c-common.h                            |   1 +
>   gcc/c-family/c-omp.c                               | 916 +++++++++++++++++++--
>   gcc/c-family/c.opt                                 |   8 +
>   gcc/c/c-decl.c                                     |  28 +
>   gcc/c/c-parser.c                                   |   3 +
>   gcc/cp/decl.c                                      |  44 +
>   gcc/cp/parser.c                                    |   3 +
>   gcc/cp/semantics.c                                 |   9 +
>   gcc/doc/invoke.texi                                |  34 +-
>   .../goacc/classify-kernels-unparallelized.c        |   1 +
>   .../c-c++-common/goacc/classify-kernels.c          |   1 +
>   .../c-c++-common/goacc/combined-directives.c       |   2 +-
>   .../goacc/kernels-counter-var-redundant-load.c     |   1 +
>   .../goacc/kernels-counter-vars-function-scope.c    |   1 +
>   .../goacc/kernels-double-reduction-n.c             |   1 +
>   .../c-c++-common/goacc/kernels-double-reduction.c  |   1 +
>   gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c  |   1 +
>   gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c  |   1 +
>   .../c-c++-common/goacc/kernels-loop-annotation-1.c |  26 +
>   .../goacc/kernels-loop-annotation-10.c             |  32 +
>   .../goacc/kernels-loop-annotation-11.c             |  27 +
>   .../goacc/kernels-loop-annotation-12.c             |  28 +
>   .../goacc/kernels-loop-annotation-13.c             |  27 +
>   .../goacc/kernels-loop-annotation-14.c             |  22 +
>   .../goacc/kernels-loop-annotation-15.c             |  22 +
>   .../goacc/kernels-loop-annotation-16.c             |  26 +
>   .../goacc/kernels-loop-annotation-17.c             |  26 +
>   .../goacc/kernels-loop-annotation-18.c             |  18 +
>   .../goacc/kernels-loop-annotation-19.c             |  19 +
>   .../c-c++-common/goacc/kernels-loop-annotation-2.c |  21 +
>   .../goacc/kernels-loop-annotation-20.c             |  23 +
>   .../goacc/kernels-loop-annotation-21.c             |  42 +
>   .../goacc/kernels-loop-annotation-22.c             |  41 +
>   .../c-c++-common/goacc/kernels-loop-annotation-3.c |  24 +
>   .../c-c++-common/goacc/kernels-loop-annotation-4.c |  34 +
>   .../c-c++-common/goacc/kernels-loop-annotation-5.c |  27 +
>   .../c-c++-common/goacc/kernels-loop-annotation-6.c |  27 +
>   .../c-c++-common/goacc/kernels-loop-annotation-7.c |  26 +
>   .../c-c++-common/goacc/kernels-loop-annotation-8.c |  27 +
>   .../c-c++-common/goacc/kernels-loop-annotation-9.c |  26 +
>   .../c-c++-common/goacc/kernels-loop-data-2.c       |   1 +
>   .../goacc/kernels-loop-data-enter-exit-2.c         |   1 +
>   .../goacc/kernels-loop-data-enter-exit.c           |   1 +
>   .../c-c++-common/goacc/kernels-loop-data-update.c  |   1 +
>   .../c-c++-common/goacc/kernels-loop-data.c         |   1 +
>   gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c  |   1 +
>   .../c-c++-common/goacc/kernels-loop-mod-not-zero.c |   1 +
>   gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c  |   1 +
>   .../c-c++-common/goacc/kernels-loop-nest.c         |   1 +
>   gcc/testsuite/c-c++-common/goacc/kernels-loop.c    |   1 +
>   .../c-c++-common/goacc/kernels-one-counter-var.c   |   1 +
>   .../goacc/kernels-parallel-loop-data-enter-exit.c  |   1 +
>   .../c-c++-common/goacc/kernels-reduction.c         |   1 +
>   gcc/tree.h                                         |   5 +
>   54 files changed, 1603 insertions(+), 62 deletions(-)
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
>   create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
> 
> diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> index 6abfe4b..d7938ba 100644
> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -1220,6 +1220,7 @@ extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
>   extern enum omp_clause_defaultmap_kind c_omp_predetermined_mapping (tree);
>   extern tree c_omp_check_context_selector (location_t, tree);
>   extern void c_omp_mark_declare_variant (location_t, tree, tree);
> +extern void c_oacc_annotate_loops_in_kernels_regions (tree, tree (*) (tree));
>   extern const char *c_omp_map_clause_name (tree, bool);
>   
>   /* Return next tree in the chain for chain_next walking of tree nodes.  */
> diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
> index d7cff0f..3c86f3f 100644
> --- a/gcc/c-family/c-omp.c
> +++ b/gcc/c-family/c-omp.c
> @@ -35,7 +35,7 @@ along with GCC; see the file COPYING3.  If not see
>   #include "attribs.h"
>   #include "gimplify.h"
>   #include "langhooks.h"
> -
> +#include "tree-iterator.h"
>   
>   /* Complete a #pragma oacc wait construct.  LOC is the location of
>      the #pragma.  */
> @@ -694,6 +694,110 @@ c_omp_for_incr_canonicalize_ptr (location_t loc, tree decl, tree incr)
>     return incr;
>   }
>   
> +/* State of annotation traversal for FOR loops in kernels regions,
> +   used to control processing and diagnostic messages that are deferred until
> +   the entire loop has been scanned.  */
> +enum annotation_state {
> +  as_outer,
> +  as_in_kernels_region,
> +  as_in_kernels_loop,
> +  /* The remaining state values represent conversion failures caught
> +     while in as_in_kernels_loop state.  To test whether the traversal is
> +     in the body of a kernels loop, use (state >= as_in_kernels_loop).  */
> +  as_invalid_variable_type,
> +  as_missing_initializer,
> +  as_invalid_initializer,
> +  as_missing_predicate,
> +  as_invalid_predicate,
> +  as_missing_increment,
> +  as_invalid_increment,
> +  as_explicit_annotation,
> +  as_invalid_control_flow,
> +  as_invalid_break,
> +  as_invalid_return,
> +  as_invalid_call,
> +  as_invalid_modification
> +};
> +
> +/* Structure used to hold state for automatic annotation of FOR loops
> +   in kernels regions.  LOOP is the nearest enclosing loop, or
> +   NULL_TREE if outside of a loop context.  VARS is a tree_list
> +   containing the variables controlling LOOP's termination (the
> +   induction variable and a possible limit variable).  BREAK_OK is
> +   true if a break statement in the current context does not prevent
> +   annotation.  STATE keeps track of whether LOOP satisfies all
> +   criteria making it legal to parallelize.  If it does not, REASON is
> +   a statement that blocks automatic parallelization, such as an
> +   unstructured jump or an assignment to a variable in VARS, used for
> +   printing diagnostics.
> +
> +   These structures are chained through NEXT, which points to the
> +   next-closest enclosing loop's or the kernels region's annotation info, if
> +   any.  */
> +
> +struct annotation_info
> +{
> +  tree loop;
> +  tree vars;
> +  bool break_ok;
> +  enum annotation_state state;
> +  tree reason;
> +  struct annotation_info *next;
> +};
> +
> +/* Mark the current loop's INFO as not OK to annotate, recording STATE
> +   and REASON for producing diagnostics later.  */
> +
> +static void
> +do_not_annotate_loop (struct annotation_info *info,
> +		      enum annotation_state state, tree reason)
> +{
> +  if (info->state == as_in_kernels_loop)
> +    {
> +      info->state = state;
> +      info->reason = reason;
> +    }
> +}
> +
> +/* Mark the current loop identified by INFO and all of its ancestors (i.e.,
> +   enclosing loops) as not OK to annotate.  Arguments are the same as
> +   for do_not_annotate_loop.  */
> +
> +static void
> +do_not_annotate_loop_nest (struct annotation_info *info,
> +			   enum annotation_state state, tree reason)
> +{
> +  while (info != NULL)
> +    {
> +      do_not_annotate_loop (info, state, reason);
> +      info = info->next;
> +    }
> +}
> +
> +/* If INFO is non-null, call do_not_annotate_loop with STATE and REASON
> +   to record info for diagnosing an error later.  Otherwise emit an error now
> +   at ELOCUS with message MSG and the optional arguments.  */
> +
> +static void annotation_error (struct annotation_info *,
> +			      enum annotation_state, tree, location_t,
> +			      const char *, ...) ATTRIBUTE_GCC_DIAG(5,6);
> +static void
> +annotation_error (struct annotation_info *info,
> +		  enum annotation_state state,
> +		  tree reason,
> +		  location_t elocus,
> +		  const char *msg, ...)
> +{
> +  if (info)
> +    do_not_annotate_loop (info, state, reason);
> +  else
> +    {
> +      auto_diagnostic_group d;
> +      va_list ap;
> +      va_start (ap, msg);
> +      emit_diagnostic_valist (DK_ERROR, elocus, -1, msg, &ap);
> +      va_end (ap);
> +    }
> +}
> +
>   /* Validate and generate OMP_FOR.
>      DECLV is a vector of iteration variables, for each collapsed loop.
>   
> @@ -703,12 +807,19 @@ c_omp_for_incr_canonicalize_ptr (location_t loc, tree decl, tree incr)
>      INITV, CONDV and INCRV are vectors containing initialization
>      expressions, controlling predicates and increment expressions.
>      BODY is the body of the loop and PRE_BODY statements that go before
> -   the loop.  */
> +   the loop.  FINAL_P is true if not inside a C++ template.
>   
> -tree
> -c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
> -		  tree orig_declv, tree initv, tree condv, tree incrv,
> -		  tree body, tree pre_body, bool final_p)
> +   INFO is null if called to parse an explicitly-annotated OMP for
> +   loop, otherwise it holds state information for automatically
> +   annotating a regular FOR loop in a kernels region.  In the former case,
> +   malformed loops are hard errors; otherwise we just record the annotation
> +   failure in INFO.  */
> +
> +static tree
> +c_finish_omp_for_internal (location_t locus, enum tree_code code, tree declv,
> +			   tree orig_declv, tree initv, tree condv, tree incrv,
> +			   tree body, tree pre_body, bool final_p,
> +			   struct annotation_info *info)
>   {
>     location_t elocus;
>     bool fail = false;
> @@ -732,12 +843,14 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>         if (!INTEGRAL_TYPE_P (TREE_TYPE (decl))
>   	  && TREE_CODE (TREE_TYPE (decl)) != POINTER_TYPE)
>   	{
> -	  error_at (elocus, "invalid type for iteration variable %qE", decl);
> +	  annotation_error (info, as_invalid_variable_type, decl, elocus,
> +			    "invalid type for iteration variable %qE", decl);
>   	  fail = true;
>   	}
>         else if (TYPE_ATOMIC (TREE_TYPE (decl)))
>   	{
> -	  error_at (elocus, "%<_Atomic%> iteration variable %qE", decl);
> +	  annotation_error (info, as_invalid_variable_type, decl, elocus,
> +			    "%<_Atomic%> iteration variable %qE", decl);
>   	  fail = true;
>   	  /* _Atomic iterator confuses stuff too much, so we risk ICE
>   	     trying to diagnose it further.  */
> @@ -753,7 +866,8 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>   	  init = DECL_INITIAL (decl);
>   	  if (init == NULL)
>   	    {
> -	      error_at (elocus, "%qE is not initialized", decl);
> +	      annotation_error (info, as_missing_initializer, decl, elocus,
> +				"%qE is not initialized", decl);
>   	      init = integer_zero_node;
>   	      fail = true;
>   	    }
> @@ -774,7 +888,8 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>   
>         if (cond == NULL_TREE)
>   	{
> -	  error_at (elocus, "missing controlling predicate");
> +	  annotation_error (info, as_missing_predicate, NULL_TREE, elocus,
> +			    "missing controlling predicate");
>   	  fail = true;
>   	}
>         else
> @@ -790,12 +905,14 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>   	  if (EXPR_HAS_LOCATION (cond))
>   	    elocus = EXPR_LOCATION (cond);
>   
> -	  if (TREE_CODE (cond) == LT_EXPR
> -	      || TREE_CODE (cond) == LE_EXPR
> -	      || TREE_CODE (cond) == GT_EXPR
> -	      || TREE_CODE (cond) == GE_EXPR
> -	      || TREE_CODE (cond) == NE_EXPR
> -	      || TREE_CODE (cond) == EQ_EXPR)
> +	  enum tree_code condcode = TREE_CODE (cond);
> +
> +	  if (condcode == LT_EXPR
> +	      || condcode == LE_EXPR
> +	      || condcode == GT_EXPR
> +	      || condcode == GE_EXPR
> +	      || condcode == NE_EXPR
> +	      || condcode == EQ_EXPR)
>   	    {
>   	      tree op0 = TREE_OPERAND (cond, 0);
>   	      tree op1 = TREE_OPERAND (cond, 1);
> @@ -815,79 +932,88 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>   	      if (TREE_CODE (op0) == NOP_EXPR
>   		  && decl == TREE_OPERAND (op0, 0))
>   		{
> -		  TREE_OPERAND (cond, 0) = TREE_OPERAND (op0, 0);
> -		  TREE_OPERAND (cond, 1)
> -		    = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
> -				   TREE_OPERAND (cond, 1));
> +		  op0 = TREE_OPERAND (op0, 0);
> +		  op1 = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
> +					 op1);
>   		}
>   	      else if (TREE_CODE (op1) == NOP_EXPR
>   		       && decl == TREE_OPERAND (op1, 0))
>   		{
> -		  TREE_OPERAND (cond, 1) = TREE_OPERAND (op1, 0);
> -		  TREE_OPERAND (cond, 0)
> -		    = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
> -				   TREE_OPERAND (cond, 0));
> +		  op1 = TREE_OPERAND (op1, 0);
> +		  op0 = fold_build1_loc (elocus, NOP_EXPR, TREE_TYPE (decl),
> +					 op0);
>   		}
>   
> -	      if (decl == TREE_OPERAND (cond, 0))
> +	      if (decl == op0)
>   		cond_ok = true;
> -	      else if (decl == TREE_OPERAND (cond, 1))
> +	      else if (decl == op1)
>   		{
> -		  TREE_SET_CODE (cond,
> -				 swap_tree_comparison (TREE_CODE (cond)));
> -		  TREE_OPERAND (cond, 1) = TREE_OPERAND (cond, 0);
> -		  TREE_OPERAND (cond, 0) = decl;
> +		  condcode = swap_tree_comparison (condcode);
> +		  op1 = op0;
> +		  op0 = decl;
>   		  cond_ok = true;
>   		}
>   
> -	      if (TREE_CODE (cond) == NE_EXPR
> -		  || TREE_CODE (cond) == EQ_EXPR)
> +	      if (condcode == NE_EXPR || condcode == EQ_EXPR)
>   		{
>   		  if (!INTEGRAL_TYPE_P (TREE_TYPE (decl)))
>   		    {
> -		      if (code == OACC_LOOP || TREE_CODE (cond) == EQ_EXPR)
> +		      if (code == OACC_LOOP || condcode == EQ_EXPR)
>   			cond_ok = false;
>   		    }
> -		  else if (operand_equal_p (TREE_OPERAND (cond, 1),
> +		  else if (operand_equal_p (op1,
>   					    TYPE_MIN_VALUE (TREE_TYPE (decl)),
>   					    0))
> -		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
> -					 ? GT_EXPR : LE_EXPR);
> -		  else if (operand_equal_p (TREE_OPERAND (cond, 1),
> +		    condcode = (condcode == NE_EXPR ? GT_EXPR : LE_EXPR);
> +		  else if (operand_equal_p (op1,
>   					    TYPE_MAX_VALUE (TREE_TYPE (decl)),
>   					    0))
> -		    TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR
> -					 ? LT_EXPR : GE_EXPR);
> -		  else if (code == OACC_LOOP || TREE_CODE (cond) == EQ_EXPR)
> +		    condcode = (condcode == NE_EXPR ? LT_EXPR : GE_EXPR);
> +		  else if (code == OACC_LOOP || condcode == EQ_EXPR)
>   		    cond_ok = false;
>   		}
>   
> -	      if (cond_ok && TREE_VEC_ELT (condv, i) != cond)
> +	      if (cond_ok)
>   		{
> -		  tree ce = NULL_TREE, *pce = &ce;
> -		  tree type = TREE_TYPE (TREE_OPERAND (cond, 1));
> -		  for (tree c = TREE_VEC_ELT (condv, i); c != cond;
> -		       c = TREE_OPERAND (c, 1))
> +		  /* We postponed destructive changes to canonicalize
> +		     cond until we're sure it is OK.  In the non-null INFO
> +		     case where we are trying to transform a regular FOR_STMT
> +		     to OMP_FOR, we don't want to destroy the original
> +		     condition if we aren't going to be able to do the
> +		     transformation anyway.  */
> +		  TREE_SET_CODE (cond, condcode);
> +		  TREE_OPERAND (cond, 0) = op0;
> +		  TREE_OPERAND (cond, 1) = op1;
> +
> +		  if (TREE_VEC_ELT (condv, i) != cond)
>   		    {
> -		      *pce = build2 (COMPOUND_EXPR, type, TREE_OPERAND (c, 0),
> -				     TREE_OPERAND (cond, 1));
> -		      pce = &TREE_OPERAND (*pce, 1);
> +		      tree ce = NULL_TREE, *pce = &ce;
> +		      tree type = TREE_TYPE (op1);
> +		      for (tree c = TREE_VEC_ELT (condv, i); c != cond;
> +			   c = TREE_OPERAND (c, 1))
> +			{
> +			  *pce = build2 (COMPOUND_EXPR, type,
> +					 TREE_OPERAND (c, 0), op1);
> +			  pce = &TREE_OPERAND (*pce, 1);
> +			}
> +		      TREE_OPERAND (cond, 1) = ce;
> +		      TREE_VEC_ELT (condv, i) = cond;
>   		    }
> -		  TREE_OPERAND (cond, 1) = ce;
> -		  TREE_VEC_ELT (condv, i) = cond;
>   		}
>   	    }
>   
>   	  if (!cond_ok)
>   	    {
> -	      error_at (elocus, "invalid controlling predicate");
> +	      annotation_error (info, as_invalid_predicate, cond, elocus,
> +				"invalid controlling predicate");
>   	      fail = true;
>   	    }
>   	}
>   
>         if (incr == NULL_TREE)
>   	{
> -	  error_at (elocus, "missing increment expression");
> +	  annotation_error (info, as_missing_increment, NULL_TREE, elocus,
> +			    "missing increment expression");
>   	  fail = true;
>   	}
>         else
> @@ -986,9 +1112,11 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>   			      if (i == NULL_TREE
>   				  || !operand_equal_p (unit, i, 0))
>   				{
> -				  error_at (elocus,
> -					    "increment is not constant 1 or "
> -					    "-1 for %<!=%> condition");
> +				  annotation_error (info,
> +						    as_invalid_increment,
> +						    incr, elocus,
> +						    "increment is not constant 1 or "
> +						    "-1 for %<!=%> condition");
>   				  fail = true;
>   				}
>   			    }
> @@ -1004,9 +1132,10 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>   		    {
>   		      if (!integer_onep (i) && !integer_minus_onep (i))
>   			{
> -			  error_at (elocus,
> -				    "increment is not constant 1 or -1 for"
> -				    " %<!=%> condition");
> +			  annotation_error (info, as_invalid_increment,
> +					    incr, elocus,
> +					    "increment is not constant 1 or -1 for"
> +					    " %<!=%> condition");
>   			  fail = true;
>   			}
>   		    }
> @@ -1018,7 +1147,8 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>   	    }
>   	  if (!incr_ok)
>   	    {
> -	      error_at (elocus, "invalid increment expression");
> +	      annotation_error (info, as_invalid_increment, incr,
> +				elocus, "invalid increment expression");
>   	      fail = true;
>   	    }
>   	}
> @@ -1046,6 +1176,20 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
>       }
>   }
>   
> +/* External entry point to c_finish_omp_for_internal, called from the
> +   parsers.  See above for description of the arguments.  */
> +
> +tree
> +c_finish_omp_for (location_t locus, enum tree_code code, tree declv,
> +		  tree orig_declv, tree initv, tree condv, tree incrv,
> +		  tree body, tree pre_body, bool final_p)
> +{
> +  return c_finish_omp_for_internal (locus, code, declv,
> +				    orig_declv, initv, condv, incrv,
> +				    body, pre_body, final_p, NULL);
> +}
> +
> +
>   /* Type for passing data in between c_omp_check_loop_iv and
>      c_omp_check_loop_iv_r.  */
>   
> @@ -2579,3 +2723,657 @@ c_omp_map_clause_name (tree clause, bool oacc)
>       }
>     return omp_clause_code_name[OMP_CLAUSE_CODE (clause)];
>   }
> +
> +
> +/* The following functions implement automatic recognition and annotation of
> +   for loops in OpenACC kernels regions.  Inside a kernels region, a nest of
> +   for loops that does not contain any annotated OpenACC loops, nor break
> +   or goto statements or assignments to the variables controlling loop
> +   termination, is converted to an OMP_FOR node with an "acc loop auto"
> +   annotation on each loop.  This feature is controlled by
> +   flag_openacc_kernels_annotate_loops.  */
> +
> +/* Check whether DECL is the declaration of a local variable (or function
> +   parameter) of integral type that does not have its address taken.  */
> +
> +static bool
> +is_local_var (tree decl)
> +{
> +  return ((TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) == PARM_DECL)
> +	  && DECL_CONTEXT (decl) != NULL
> +	  && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL
> +	  && INTEGRAL_TYPE_P (TREE_TYPE (decl))
> +	  && !TREE_ADDRESSABLE (decl));
> +}
> +
> +/* EXP is a loop bound expression for a comparison against local
> +   variable DECL.  Check whether this is potentially valid in an OpenACC loop
> +   context, namely that it can be precomputed when entering the loop
> +   construct per the OpenACC specification.  DECL and the local variables
> +   referenced in EXP, none of which may be modified in the body of the
> +   loop, are added to the list in INFO to be checked later.
> +
> +   FIXME: Ideally we would like to make this test permissive rather than
> +   restrictive, and allow the later conversion of the "auto" attribute to
> +   either "seq" or "independent" to make the determination using dataflow,
> +   alias analysis, etc rather than a tree traversal.  But presently it does
> +   not do that and always just hoists the loop bound expression.  So the
> +   current implementation only considers expressions involving unmodified
> +   local variables and constants, using a tree walk.  */
> +
> +static tree
> +end_test_ok_for_annotation_r (tree *tp, int *walk_subtrees,
> +			      void *data)
> +{
> +  tree exp = *tp;
> +  struct annotation_info *info = (struct annotation_info *) data;
> +
> +  switch (TREE_CODE_CLASS (TREE_CODE (exp)))
> +    {
> +    case tcc_constant:
> +      /* Constants are trivially known to be invariant.  */
> +      return NULL_TREE;
> +
> +    case tcc_declaration:
> +      if (is_local_var (exp))
> +	{
> +	  tree t;
> +	  /* Add it to the list of variables that can't be modified in the
> +	     loop, only if not already present.  */
> +	  for (t = info->vars; t && TREE_VALUE (t) != exp;
> +	       t = TREE_CHAIN (t))
> +	    ;
> +	  if (!t)
> +	    info->vars = tree_cons (NULL_TREE, exp, info->vars);
> +	  return NULL_TREE;
> +	}
> +      else if (TREE_CODE (exp) == VAR_DECL && TREE_READONLY (exp))
> +	return NULL_TREE;
> +      else if (TREE_CODE (exp) == FUNCTION_DECL)
> +	return NULL_TREE;
> +      break;
> +
> +    case tcc_unary:
> +    case tcc_binary:
> +    case tcc_comparison:
> +      /* Allow arithmetic expressions and comparisons provided
> +	 that the operands are good.  */
> +      return NULL_TREE;
> +
> +    default:
> +      /* Handle some special cases.  */
> +      switch (TREE_CODE (exp))
> +	{
> +	case COND_EXPR:
> +	case TRUTH_ANDIF_EXPR:
> +	case TRUTH_ORIF_EXPR:
> +	case TRUTH_AND_EXPR:
> +	case TRUTH_OR_EXPR:
> +	case TRUTH_XOR_EXPR:
> +	case TRUTH_NOT_EXPR:
> +	  /* ?: and boolean operators are OK.  */
> +	  return NULL_TREE;
> +
> +	case CALL_EXPR:
> +	  /* Allow calls to constant functions with invariant operands.  */
> +	  {
> +	    tree fndecl = get_callee_fndecl (exp);
> +	    if (fndecl && TREE_READONLY (fndecl))
> +	      return NULL_TREE;
> +	  }
> +	  break;
> +
> +	case ADDR_EXPR:
> +	  /* We can expect addresses of things to be invariant.  */
> +	  return NULL_TREE;
> +
> +	default:
> +	  break;
> +	}
> +    }
> +
> +  /* Reject anything else.  */
> +  *walk_subtrees = 0;
> +  return exp;
> +}
> +
> +static bool
> +end_test_ok_for_annotation (tree decl, tree exp,
> +			    struct annotation_info *info)
> +{
> +  /* Traversal returns NULL_TREE if all is well.  */
> +  if (!walk_tree (&exp, end_test_ok_for_annotation_r, info, NULL))
> +    {
> +      /* So far, so good.  Check the decl against any variables collected
> +	 in the exp.  */
> +      tree t;
> +      for (t = info->vars; t; t = TREE_CHAIN (t))
> +	if (TREE_VALUE (t) == decl)
> +	  return false;
> +      info->vars = tree_cons (NULL_TREE, decl, info->vars);
> +      return true;
> +    }
> +  return false;
> +}
> +
> +/* The initializer for a FOR_STMT is sometimes wrapped in various other
> +   language-specific tree structures.  We need a hook to unwrap them.
> +   This function takes a tree argument and should return either a
> +   MODIFY_EXPR, VAR_DECL, or NULL_TREE.  */
> +
> +static tree (*lang_specific_unwrap_initializer) (tree);
> +
> +/* Try to annotate the given NODE, which must be a FOR_STMT, with a
> +   "#pragma acc loop auto" annotation.  In practice, this means
> +   building an OMP_FOR node for it.  DECL and INIT are the
> +   previously-verified iteration variable and initializer.  Annotating
> +   the loop may fail, in which case INFO is used to record the cause
> +   of the failure and the original loop remains unchanged.  This
> +   function returns the transformed loop if the transformation
> +   succeeded, the original node otherwise.  */
> +
> +static tree
> +annotate_for_loop (tree node, tree decl, tree init,
> +		   struct annotation_info *info)
> +{
> +  gcc_checking_assert (TREE_CODE (node) == FOR_STMT);
> +
> +  location_t loc = EXPR_LOCATION (node);
> +  tree cond = FOR_COND (node);
> +  tree incr = FOR_EXPR (node);
> +
> +  gcc_assert (decl);
> +  gcc_assert (cond);
> +  gcc_assert (decl && TREE_CODE (decl) == VAR_DECL);
> +
> +  /* The C++ frontend can wrap the increment two levels deep inside a
> +     cleanup expression, but c_finish_omp_for does not care about that.  */
> +  if (incr != NULL_TREE && TREE_CODE (incr) == CLEANUP_POINT_EXPR)
> +    incr = TREE_OPERAND (TREE_OPERAND (incr, 0), 0);
> +  tree body = FOR_BODY (node);
> +
> +  tree declv = make_tree_vec (1);
> +  tree initv = make_tree_vec (1);
> +  tree condv = make_tree_vec (1);
> +  tree incrv = make_tree_vec (1);
> +  TREE_VEC_ELT (declv, 0) = decl;
> +  TREE_VEC_ELT (initv, 0) = init;
> +  TREE_VEC_ELT (condv, 0) = cond;
> +  TREE_VEC_ELT (incrv, 0) = incr;
> +
> +  /* Do the actual transformation.  This can still fail because
> +     c_finish_omp_for has some stricter checks than we have performed up to
> +     this point.  */
> +  tree omp_for = c_finish_omp_for_internal (loc, OACC_LOOP, declv, NULL_TREE,
> +					    initv, condv, incrv, body,
> +					    NULL_TREE, false, info);
> +  if (omp_for != NULL_TREE)
> +    {
> +      /* Add an auto clause, then return the new loop.  */
> +      tree auto_clause = build_omp_clause (loc, OMP_CLAUSE_AUTO);
> +      OMP_CLAUSE_CHAIN (auto_clause) = OMP_FOR_CLAUSES (omp_for);
> +      OMP_FOR_CLAUSES (omp_for) = auto_clause;
> +      return omp_for;
> +    }
> +
> +  return node;
> +}
> +
> +/* Forward declaration.  */
> +static tree annotate_loops_in_kernels_regions (tree *, int *, void *);
> +
> +/* Given a FOR_STMT NODE that is a candidate for parallelization, check its
> +   body for validity, then try to annotate it with
> +   "#pragma acc loop auto", possibly modifying the current node in place.
> +   The INFO argument contains the traversal state at the point the loop
> +   appears.  */
> +
> +static void
> +check_and_annotate_for_loop (tree *nodeptr, tree_stmt_iterator *prev_tsi,
> +			     struct annotation_info *info)
> +{
> +  tree node = *nodeptr;
> +  gcc_assert (TREE_CODE (node) == FOR_STMT);
> +  tree init = FOR_INIT_STMT (node);
> +  tree cond = FOR_COND (node);
> +  tree prev_stmt = NULL_TREE;
> +  tree decl = NULL_TREE;
> +  bool unlink_prev = false;
> +  bool fix_decl = false;
> +
> +  /* This structure describes the current loop statement.  */
> +  struct annotation_info loop_info
> +    = { node, NULL_TREE, false, as_in_kernels_loop, NULL_TREE, info };
> +
> +  /* If we are in the body of an explicitly-annotated loop, do not add
> +     annotations to this loop or any other nested loops.  */
> +  if (info->state == as_explicit_annotation)
> +    do_not_annotate_loop (&loop_info, as_explicit_annotation, info->reason);
> +
> +  /* We need to find the controlling variable for the loop in order
> +     to detect whether it is modified in the body of the loop.
> +     That is why we are doing some checks on the loop condition
> +     that duplicate what c_finish_omp_for is doing.  */
> +
> +  /* First we need to find the decl and initializer for the
> +     controlling variable.  Both the C and C++ front ends normally put
> +     the initializer in the statement list just before the FOR_STMT
> +     instead of in FOR_INIT_STMT.  If FOR_INIT_STMT happens to exist
> +     but isn't a MODIFY_EXPR, give up because we don't know how to
> +     handle it.  */
> +
> +  else if (init != NULL_TREE && TREE_CODE (init) != MODIFY_EXPR)
> +    do_not_annotate_loop (&loop_info, as_invalid_initializer, NULL_TREE);
> +
> +  /* Examine the statement before the loop to see if it is a
> +     valid initializer.  It must be either a MODIFY_EXPR or VAR_DECL,
> +     possibly wrapped in language-specific structure.  */
> +  else if (init == NULL_TREE && prev_tsi != NULL && tsi_stmt (*prev_tsi))
> +    {
> +      prev_stmt = tsi_stmt (*prev_tsi);
> +
> +      /* Call the language-specific hook to unwrap prev_stmt.  */
> +      prev_stmt = (*lang_specific_unwrap_initializer) (prev_stmt);
> +
> +      /* See if we have a valid MODIFY_EXPR.  The hook may return
> +	 NULL_TREE if it does not recognize the statement.  */
> +      if (prev_stmt == NULL_TREE)
> +	;
> +      else if (TREE_CODE (prev_stmt) == MODIFY_EXPR
> +	       && is_local_var (TREE_OPERAND (prev_stmt, 0))
> +	       && !TREE_SIDE_EFFECTS (TREE_OPERAND (prev_stmt, 1)))
> +	{
> +	  decl = TREE_OPERAND (prev_stmt, 0);
> +	  init = prev_stmt;
> +	  unlink_prev = true;
> +	}
> +      else if (is_local_var (prev_stmt)
> +	       && DECL_INITIAL (prev_stmt) != NULL_TREE
> +	       && !TREE_SIDE_EFFECTS (DECL_INITIAL (prev_stmt)))
> +	{
> +	  /* If the preceding statement is the declaration of the loop
> +	     variable with its initialization, build an assignment
> +	     expression for the loop's initializer.  */
> +	  decl = prev_stmt;
> +	  init = build2 (MODIFY_EXPR, TREE_TYPE (decl), decl,
> +			 DECL_INITIAL (decl));
> +	  /* We need to remove the initializer from the decl if we
> +	     end up using the init we just built instead.  */
> +	  fix_decl = true;
> +	}
> +    }
> +
> +  if (init == NULL_TREE || decl == NULL_TREE)
> +    /* There is nothing we can do to find the correct init statement for
> +       this loop.  */
> +    do_not_annotate_loop (&loop_info, as_missing_initializer, NULL_TREE);
> +
> +  /* The condition must be a comparison of the decl we found in
> +     the initializer against an expression that can be hoisted
> +     outside the loop.  */
> +  if (loop_info.state > as_in_kernels_loop)
> +    /* Skip validating condition if we've already got an error.  */
> +    ;
> +  else if (cond == NULL_TREE)
> +    do_not_annotate_loop (&loop_info, as_missing_predicate, NULL_TREE);
> +  else if (TREE_CODE_CLASS (TREE_CODE (cond)) != tcc_comparison)
> +    do_not_annotate_loop (&loop_info, as_invalid_predicate, cond);
> +  else
> +    {
> +      tree limit_exp = NULL_TREE;
> +
> +      if (TREE_OPERAND (cond, 0) == decl)
> +	limit_exp = TREE_OPERAND (cond, 1);
> +      else if (TREE_OPERAND (cond, 1) == decl)
> +	limit_exp = TREE_OPERAND (cond, 0);
> +
> +      if (!limit_exp
> +	  || !end_test_ok_for_annotation (decl, limit_exp, &loop_info))
> +	do_not_annotate_loop (&loop_info, as_invalid_predicate, cond);
> +    }
> +
> +  /* Walk the body.  This will process any nested loops, so we have to do it
> +     even if we have already rejected this loop as a candidate for
> +     annotation.  */
> +  walk_tree (&FOR_BODY (node), annotate_loops_in_kernels_regions,
> +	     (void *) &loop_info, NULL);
> +
> +  if (loop_info.state == as_in_kernels_loop)
> +    {
> +      /* If the traversal of the loop and all nested loops didn't hit
> +	 any problems, attempt the actual transformation.  If it
> +	 succeeds, replace this node with the annotated loop.  */
> +      tree result = annotate_for_loop (node, decl, init, &loop_info);
> +      if (result != node)
> +	{
> +	  /* Success!  */
> +	  *nodeptr = result;
> +
> +	  if (unlink_prev)
> +	    /* We don't need the previous statement that we consumed
> +	       as an initializer in the new OMP_FOR any more.  */
> +	    tsi_delink (prev_tsi);
> +
> +	  if (fix_decl)
> +	    /* We no longer need the initializer expression on the
> +	       decl of the loop variable and don't want to duplicate
> +	       it.  The kernels conversion pass would interpret it as
> +	       a stray assignment in a gang-single region.  */
> +	    DECL_INITIAL (decl) = NULL_TREE;
> +
> +	  return;
> +	}
> +    }
> +
> +  /* If we got here, we have a FOR_STMT we could not convert to an
> +     OMP loop.  */
> +
> +  if (loop_info.state == as_invalid_return)
> +    /* This is diagnosed elsewhere as a hard error, so no warning is
> +       needed here.  */
> +    return;
> +
> +  /* Issue warnings about other problems.  */
> +  auto_diagnostic_group d;
> +  if (warning_at (EXPR_LOCATION (node),
> +		  OPT_Wopenacc_kernels_annotate_loops,
> +		  "loop cannot be annotated for OpenACC parallelization"))
> +    {
> +      location_t locus;
> +      if (loop_info.reason && EXPR_HAS_LOCATION (loop_info.reason))
> +	locus = EXPR_LOCATION (loop_info.reason);
> +      else
> +	locus = EXPR_LOCATION (node);
> +      switch (loop_info.state)
> +	{
> +	case as_invalid_variable_type:
> +	  inform (locus, "invalid type for iteration variable %qE",
> +		  loop_info.reason);
> +	  break;
> +	case as_missing_initializer:
> +	  inform (locus, "missing iteration variable initializer");
> +	  break;
> +	case as_invalid_initializer:
> +	  inform (locus, "unrecognized initializer");
> +	  break;
> +	case as_missing_predicate:
> +	  inform (locus, "missing controlling predicate");
> +	  break;
> +	case as_invalid_predicate:
> +	  inform (locus, "invalid controlling predicate");
> +	  break;
> +	case as_missing_increment:
> +	  inform (locus, "missing increment expression");
> +	  break;
> +	case as_invalid_increment:
> +	  inform (locus, "invalid increment expression");
> +	  break;
> +	case as_explicit_annotation:
> +	  inform (locus, "explicit OpenACC annotation in loop nest");
> +	  break;
> +	case as_invalid_control_flow:
> +	  inform (locus, "loop contains unstructured control flow");
> +	  break;
> +	case as_invalid_break:
> +	  inform (locus, "loop contains %<break%> statement");
> +	  break;
> +	case as_invalid_call:
> +	  inform (locus, "loop contains call to non-oacc function");
> +	  break;
> +	case as_invalid_modification:
> +	  inform (locus, "invalid modification of controlling variable");
> +	  break;
> +	default:
> +	  gcc_unreachable ();
> +	}
> +    }
> +}
> +
> +/* Traversal function for walk_tree.  Visit the tree, finding OpenACC
> +   kernels regions.  DATA is a pointer to the annotation_info struct for
> +   the enclosing context; its state is as_outer if we are outside of a
> +   kernels region.  If the traversal encounters a for loop inside a
> +   kernels region that is a candidate for parallelization, annotate it
> +   with OpenACC loop directives.  */
> +
> +static tree
> +annotate_loops_in_kernels_regions (tree *nodeptr, int *walk_subtrees,
> +				   void *data)
> +{
> +  tree node = *nodeptr;
> +  struct annotation_info *info = (struct annotation_info *) data;
> +  gcc_assert (info);
> +
> +  switch (TREE_CODE (node))
> +    {
> +    case OACC_KERNELS:
> +      /* Recursively process the body of the kernels region in a new info
> +	 scope.  */
> +      if (info->state == as_outer)
> +	{
> +	  struct annotation_info nested_info
> +	    = { NULL_TREE, NULL_TREE, true,
> +		as_in_kernels_region, NULL_TREE, info };
> +	  walk_tree (&OMP_BODY (node), annotate_loops_in_kernels_regions,
> +		     (void *) &nested_info, NULL);
> +	  *walk_subtrees = 0;
> +	}
> +      break;
> +
> +    case OACC_LOOP:
> +      /* Do not try to add automatic OpenACC annotations inside manually
> +	 annotated loops.  Presumably, the user avoided doing it on
> +	 purpose; for example, all available levels of parallelism may
> +	 have been used up.  However, assume that the combined construct
> +	 "#pragma acc kernels loop" means to try to process the whole
> +	 loop nest.
> +	 Note that a single OACC_LOOP construct represents an entire set
> +	 of collapsed loops so we do not have to deal explicitly with the
> +	 collapse clause here, as the Fortran front end does.  */
> +      if (info->state == as_in_kernels_region && OACC_LOOP_COMBINED (node))
> +	{
> +	  walk_tree (&OMP_BODY (node), annotate_loops_in_kernels_regions,
> +		     (void *) info, NULL);
> +	  *walk_subtrees = 0;
> +	}
> +      else
> +	{
> +	  struct annotation_info nested_info
> +	    = { NULL_TREE, NULL_TREE, false, as_explicit_annotation,
> +		node, info };
> +	  if (info->state >= as_in_kernels_region)
> +	    do_not_annotate_loop_nest (info, as_explicit_annotation,
> +				       node);
> +	  walk_tree (&OMP_BODY (node), annotate_loops_in_kernels_regions,
> +		     (void *) &nested_info, NULL);
> +	  *walk_subtrees = 0;
> +	}
> +      break;
> +
> +    case FOR_STMT:
> +      /* Try to annotate the loop if we are in a kernels region.
> +	 This will do a recursive traversal of the loop body in a new
> +	 info scope.  */
> +      if (info->state >= as_in_kernels_region)
> +	{
> +	  check_and_annotate_for_loop (nodeptr, NULL, info);
> +	  *walk_subtrees = 0;
> +	}
> +      break;
> +
> +    case LABEL_EXPR:
> +      /* Possibly unstructured control flow.  Unless we perform further
> +	 analyses, we must assume that such control flow may enter the
> +	 current loop.  In this case, we must not parallelize the loop.  */
> +      if (info->state >= as_in_kernels_loop
> +	  && TREE_USED (LABEL_EXPR_LABEL (node)))
> +	do_not_annotate_loop_nest (info, as_invalid_control_flow, node);
> +      break;
> +
> +    case GOTO_EXPR:
> +      /* Possibly unstructured control flow.  Unless we perform further
> +	 analyses, we must assume that such control flow may leave the
> +	 current loop.  In this case, we must not parallelize the loop.  */
> +      if (info->state >= as_in_kernels_loop)
> +	do_not_annotate_loop_nest (info, as_invalid_control_flow, node);
> +      break;
> +
> +    case BREAK_STMT:
> +      /* A break statement.  Whether or not this is valid depends on the
> +	 enclosing context.  */
> +      if (info->state >= as_in_kernels_loop && !info->break_ok)
> +	do_not_annotate_loop (info, as_invalid_break, node);
> +      break;
> +
> +    case RETURN_EXPR:
> +      /* A return leaves the entire loop nest.  */
> +      if (info->state >= as_in_kernels_loop)
> +	do_not_annotate_loop_nest (info, as_invalid_return, node);
> +      break;
> +
> +    case CALL_EXPR:
> +      /* Direct function calls to builtins and functions marked as
> +	 OpenACC routines are allowed.  Reject indirect calls or calls
> +	 to non-routines.  */
> +      if (info->state >= as_in_kernels_loop)
> +	{
> +	  tree fn = CALL_EXPR_FN (node), fn_decl = NULL_TREE;
> +	  if (fn != NULL_TREE && TREE_CODE (fn) == FUNCTION_DECL)
> +	    fn_decl = fn;
> +	  else if (fn != NULL_TREE && TREE_CODE (fn) == ADDR_EXPR)
> +	    {
> +	      tree fn_op = TREE_OPERAND (fn, 0);
> +	      if (fn_op != NULL_TREE && TREE_CODE (fn_op) == FUNCTION_DECL)
> +		fn_decl = fn_op;
> +	    }
> +	  if (fn_decl == NULL_TREE)
> +	    do_not_annotate_loop_nest (info, as_invalid_call, node);
> +	  else if (!fndecl_built_in_p (fn_decl, BUILT_IN_NORMAL)
> +		   && !lookup_attribute ("oacc function",
> +					 DECL_ATTRIBUTES (fn_decl)))
> +	    do_not_annotate_loop_nest (info, as_invalid_call, node);
> +	}
> +      break;
> +
> +    case MODIFY_EXPR:
> +      /* See if this assignment's LHS is one of the variables that must
> +	 not be modified in the loop body because they control termination
> +	 of the loop (or an enclosing loop in the nest).  */
> +      if (info->state >= as_in_kernels_loop)
> +	{
> +	  tree lhs = TREE_OPERAND (node, 0);
> +	  if (!is_local_var (lhs))
> +	    /* Early exit: This cannot be a variable we care about.  */
> +	    break;
> +	  /* Walk up the loop stack.  Invalidate the ones controlled by this
> +	     variable.  There may be several, if this variable is the common
> +	     iteration limit for several nested loops.  */
> +	  for (struct annotation_info *outer_loop = info; outer_loop != NULL;
> +	       outer_loop = outer_loop->next)
> +	    for (tree t = outer_loop->vars; t != NULL_TREE; t = TREE_CHAIN (t))
> +	      if (TREE_VALUE (t) == lhs)
> +		{
> +		  do_not_annotate_loop (outer_loop,
> +					as_invalid_modification,
> +					node);
> +		  break;
> +		}
> +	}
> +      break;
> +
> +    case SWITCH_STMT:
> +      /* Needs special handling to allow break in the body.  */
> +      if (info->state >= as_in_kernels_loop)
> +	{
> +	  bool save_break_ok = info->break_ok;
> +
> +	  walk_tree (&SWITCH_STMT_COND (node),
> +		     annotate_loops_in_kernels_regions,
> +		     (void *) info, NULL);
> +	  info->break_ok = true;
> +	  walk_tree (&SWITCH_STMT_BODY (node),
> +		     annotate_loops_in_kernels_regions,
> +		     (void *) info, NULL);
> +	  info->break_ok = save_break_ok;
> +	  *walk_subtrees = 0;
> +	}
> +      break;
> +
> +    case WHILE_STMT:
> +      /* Needs special handling to allow break in the body.  */
> +      if (info->state >= as_in_kernels_loop)
> +	{
> +	  bool save_break_ok = info->break_ok;
> +
> +	  walk_tree (&WHILE_COND (node), annotate_loops_in_kernels_regions,
> +		     (void *) info, NULL);
> +	  info->break_ok = true;
> +	  walk_tree (&WHILE_BODY (node), annotate_loops_in_kernels_regions,
> +		     (void *) info, NULL);
> +	  info->break_ok = save_break_ok;
> +	  *walk_subtrees = 0;
> +	}
> +      break;
> +
> +    case DO_STMT:
> +      /* Needs special handling to allow break in the body.  */
> +      if (info->state >= as_in_kernels_loop)
> +	{
> +	  bool save_break_ok = info->break_ok;
> +
> +	  walk_tree (&DO_COND (node), annotate_loops_in_kernels_regions,
> +		     (void *) info, NULL);
> +	  info->break_ok = true;
> +	  walk_tree (&DO_BODY (node), annotate_loops_in_kernels_regions,
> +		     (void *) info, NULL);
> +	  info->break_ok = save_break_ok;
> +	  *walk_subtrees = 0;
> +	}
> +      break;
> +
> +    case STATEMENT_LIST:
> +      /* We iterate over these explicitly so that we can track the previous
> +	 statement in the chain.  It may be the initializer for a following
> +	 FOR_STMT node.  */
> +      if (info->state >= as_in_kernels_region)
> +	{
> +	  tree_stmt_iterator i = tsi_start (node);
> +	  tree_stmt_iterator prev, *prev_tsi = NULL;
> +	  while (!tsi_end_p (i))
> +	    {
> +	      tree *stmtptr = tsi_stmt_ptr (i);
> +	      if (TREE_CODE (*stmtptr) == FOR_STMT)
> +		{
> +		  check_and_annotate_for_loop (stmtptr, prev_tsi, info);
> +		  *walk_subtrees = 0;
> +		}
> +	      else
> +		walk_tree (stmtptr, annotate_loops_in_kernels_regions,
> +			   (void *) info, NULL);
> +	      prev = i;
> +	      prev_tsi = &prev;
> +	      tsi_next (&i);
> +	    }
> +	  *walk_subtrees = 0;
> +	}
> +      break;
> +
> +    default:
> +      break;
> +    }
> +
> +  return NULL_TREE;
> +}
> +
> +/* Find for loops in OpenACC kernels regions that do not have OpenACC
> +   annotations but look like they might benefit from automatic
> +   parallelization.  Convert them from FOR_STMT to OMP_FOR nodes and
> +   add the equivalent of "#pragma acc loop auto" annotations for them.
> +   Assumes flag_openacc_kernels_annotate_loops is set.  */
> +
> +void
> +c_oacc_annotate_loops_in_kernels_regions (tree decl,
> +					  tree (*unwrap_fn) (tree))
> +{
> +  struct annotation_info info
> +    = { NULL_TREE, NULL_TREE, true, as_outer, NULL_TREE, NULL };
> +  lang_specific_unwrap_initializer = unwrap_fn;

This is a strange way to handle a hook: a static variable that is set to 
point to the same function every time this function is called.  I'd 
think either the pointer should be part of the annotation_info, or it 
should be a function declared in c-common.h with separate definitions in 
the C and C++ front ends.
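
A minimal sketch of the first option, with the hook carried in the info struct. It uses stand-in types rather than the real tree API, and every mock_ name is hypothetical, so take it as an illustration of the shape rather than a drop-in replacement:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for GCC's tree node; hypothetical, for illustration only.  */
struct mock_node
{
  int code;
  struct mock_node *op0;
};
typedef struct mock_node *mock_tree;

enum mock_code { MOCK_MODIFY_EXPR, MOCK_DECL_EXPR, MOCK_OTHER };

/* Per-walk state.  The language hook lives here instead of in a
   file-scope static that is reassigned on every call.  */
struct mock_annotation_info
{
  mock_tree (*unwrap_init) (mock_tree);
  struct mock_annotation_info *next;
};

/* Simplified analogue of c_unwrap_for_init: peel wrapper nodes until
   an assignment is found, or give up.  */
static mock_tree
mock_c_unwrap_for_init (mock_tree x)
{
  while (x != NULL)
    switch (x->code)
      {
      case MOCK_MODIFY_EXPR:
        return x;
      case MOCK_DECL_EXPR:
        x = x->op0;
        break;
      default:
        return NULL;
      }
  return NULL;
}

/* Open a nested scope that inherits the hook from its parent.  */
static struct mock_annotation_info
mock_push_scope (struct mock_annotation_info *parent)
{
  struct mock_annotation_info nested = { parent->unwrap_init, parent };
  return nested;
}

/* The walker reaches the hook through the info pointer it already has.  */
static mock_tree
mock_unwrap (struct mock_annotation_info *info, mock_tree init)
{
  return info->unwrap_init (init);
}
```

A nested scope, like nested_info in the OACC_LOOP case, would have to copy the pointer from its parent; the sketch models that with mock_push_scope.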

> +  walk_tree (&DECL_SAVED_TREE (decl), annotate_loops_in_kernels_regions,
> +	     (void *) &info, NULL);
> +}
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index c1d8fd3..b146aad 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -975,6 +975,10 @@ Wold-style-definition
>   C ObjC Var(warn_old_style_definition) Init(-1) Warning
>   Warn if an old-style parameter definition is used.
>   
> +Wopenacc-kernels-annotate-loops
> +C ObjC C++ ObjC++ Warning Var(warn_openacc_kernels_annotate_loops) Init(0)
> +Warn about loops in OpenACC kernels regions that cannot be parallelized.
> +
>   Wopenmp-simd
>   C C++ Var(warn_openmp_simd) Warning LangEnabledBy(C C++,Wall)
>   Warn if a simd directive is overridden by the vectorizer cost model.
> @@ -1744,6 +1748,10 @@ fopenacc-dim=
>   C ObjC C++ ObjC++ LTO Joined Var(flag_openacc_dims)
>   Specify default OpenACC compute dimensions.
>   
> +fopenacc-kernels-annotate-loops
> +C ObjC C++ ObjC++ LTO Optimization Var(flag_openacc_kernels_annotate_loops) Init(1)
> +Automatically parallelize unannotated loops in OpenACC kernels regions.
> +
>   fopenmp
>   C ObjC C++ ObjC++ LTO Var(flag_openmp)
>   Enable OpenMP (implies -frecursive in Fortran).
> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> index 8204db2..02b7f2f 100644
> --- a/gcc/c/c-decl.c
> +++ b/gcc/c/c-decl.c
> @@ -9885,6 +9885,29 @@ temp_pop_parm_decls (void)
>     pop_scope ();
>   }
>   \f
> +/* Function passed to c_oacc_annotate_loops_in_kernels_regions to do
> +   language-specific unwrapping of an initializer expression.  */
> +static tree
> +c_unwrap_for_init (tree x)
> +{
> +  if (!x)
> +    return NULL_TREE;
> +
> +  while (true)
> +    switch (TREE_CODE (x))
> +      {
> +      case MODIFY_EXPR:
> +      case VAR_DECL:
> +	return x;
> +
> +      case DECL_EXPR:
> +	x = TREE_OPERAND (x, 0);

For both C and C++, returning the DECL_EXPR seems better than returning 
the VAR_DECL, since the former clearly expresses initialization.
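
Roughly what I have in mind, sketched with stand-in types (all mock_ names are hypothetical). The unwrapper stops at the DECL_EXPR, and a caller that needs the variable takes it from operand 0, which is where DECL_EXPR_DECL finds it on the real node. Rejecting a bare VAR_DECL here is an assumption of the sketch, not something I have checked against the callers:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for GCC's tree node; hypothetical, for illustration only.  */
struct mock_node
{
  int code;
  struct mock_node *op0;
};
typedef struct mock_node *mock_tree;

enum mock_code
{
  MOCK_MODIFY_EXPR,
  MOCK_VAR_DECL,
  MOCK_DECL_EXPR,
  MOCK_OTHER
};

/* Return the node that expresses the initialization: either an
   assignment or a declaration statement.  Anything else is rejected.  */
static mock_tree
mock_unwrap_for_init (mock_tree x)
{
  if (x == NULL)
    return NULL;

  switch (x->code)
    {
    case MOCK_MODIFY_EXPR:
    case MOCK_DECL_EXPR:
      return x;
    default:
      return NULL;
    }
}

/* Recover the loop variable from either form.  Operand 0 is the
   left-hand side of an assignment and the declared variable of a
   declaration statement.  */
static mock_tree
mock_init_var (mock_tree init)
{
  if (init == NULL)
    return NULL;
  return init->op0;
}
```

The caller can then tell "int i = 0" apart from "i = 0" by the code of the returned node, which it cannot do if the unwrapper hands back the variable.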

> +	break;
> +
> +      default:
> +	return NULL_TREE;
> +      }
> +}
>   
>   /* Finish up a function declaration and compile that function
>      all the way to assembler language output.  Then free the storage
> @@ -9987,6 +10010,11 @@ finish_function (location_t end_loc)
>     if (warn_unused_parameter)
>       do_warn_unused_parameter (fndecl);
>   
> +  /* If requested, automatically annotate suitable loops in OpenACC kernels
> +     regions with OpenACC loop annotations to allow auto-parallelization.  */
> +  if (flag_openacc && flag_openacc_kernels_annotate_loops)
> +    c_oacc_annotate_loops_in_kernels_regions (fndecl, c_unwrap_for_init);
> +
>     /* Store the end of the function, so that we get good line number
>        info for the epilogue.  */
>     cfun->function_end_locus = end_loc;
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index 2e6775a..12fb4b7 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -16831,6 +16831,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
>   		    omp_clause_mask mask, tree *cclauses, bool *if_p)
>   {
>     bool is_parallel = ((mask >> PRAGMA_OACC_CLAUSE_REDUCTION) & 1) == 1;
> +  bool is_combined = (cclauses != NULL);
>   
>     strcat (p_name, " loop");
>     mask |= OACC_LOOP_CLAUSE_MASK;
> @@ -16849,6 +16850,8 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
>     tree block = c_begin_compound_stmt (true);
>     tree stmt = c_parser_omp_for_loop (loc, parser, OACC_LOOP, clauses, NULL,
>   				     if_p);
> +  if (stmt && stmt != error_mark_node)
> +    OACC_LOOP_COMBINED (stmt) = is_combined;
>     block = c_end_compound_stmt (loc, block, true);
>     add_stmt (block);
>   
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 31d6874..2d83c41 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -16960,6 +16960,45 @@ emit_coro_helper (tree helper)
>     expand_or_defer_fn (helper);
>   }
>   
> +
> +/* Function passed to c_oacc_annotate_loops_in_kernels_regions to do
> +   language-specific unwrapping of an initializer expression.  */
> +static tree
> +cp_unwrap_for_init (tree x)
> +{
> +  if (!x)
> +    return NULL_TREE;
> +
> +  while (true)
> +    switch (TREE_CODE (x))
> +      {
> +      case MODIFY_EXPR:
> +      case VAR_DECL:
> +	return x;
> +
> +      case CLEANUP_POINT_EXPR:
> +	x = TREE_OPERAND (x, 0);
> +	break;
> +
> +      case EXPR_STMT:
> +	x = TREE_OPERAND (x, 0);
> +	break;
> +
> +      case DECL_EXPR:
> +	x = TREE_OPERAND (x, 0);
> +	break;
> +
> +      case CONVERT_EXPR:
> +	if (TREE_TYPE (x) != void_type_node)
> +	  return NULL_TREE;
> +	x = TREE_OPERAND (x, 0);
> +	break;
> +
> +      default:
> +	return NULL_TREE;
> +      }
> +}
> +
>   /* Finish up a function declaration and compile that function
>      all the way to assembler language output.  The free the storage
>      for the function definition. INLINE_P is TRUE if we just
> @@ -17264,6 +17303,11 @@ finish_function (bool inline_p)
>         && !DECL_CLONED_FUNCTION_P (fndecl))
>       do_warn_unused_parameter (fndecl);
>   
> +  /* If requested, automatically annotate suitable loops in OpenACC kernels
> +     regions with OpenACC loop annotations to allow auto-parallelization.  */
> +  if (flag_openacc && flag_openacc_kernels_annotate_loops)
> +    c_oacc_annotate_loops_in_kernels_regions (fndecl, cp_unwrap_for_init);
> +
>     /* Genericize before inlining.  */
>     if (!processing_template_decl
>         && !DECL_IMMEDIATE_FUNCTION_P (fndecl)
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 9849e59..1a11f2b 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -41281,6 +41281,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
>   		     omp_clause_mask mask, tree *cclauses, bool *if_p)
>   {
>     bool is_parallel = ((mask >> PRAGMA_OACC_CLAUSE_REDUCTION) & 1) == 1;
> +  bool is_combined = (cclauses != NULL);
>   
>     strcat (p_name, " loop");
>     mask |= OACC_LOOP_CLAUSE_MASK;
> @@ -41299,6 +41300,8 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
>     tree block = begin_omp_structured_block ();
>     int save = cp_parser_begin_omp_structured_block (parser);
>     tree stmt = cp_parser_omp_for_loop (parser, OACC_LOOP, clauses, NULL, if_p);
> +  if (stmt && stmt != error_mark_node)
> +    OACC_LOOP_COMBINED (stmt) = is_combined;
>     cp_parser_end_omp_structured_block (parser, save);
>     add_stmt (finish_omp_structured_block (block));
>   
> diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
> index 107d39d..efdb393 100644
> --- a/gcc/cp/semantics.c
> +++ b/gcc/cp/semantics.c
> @@ -4900,6 +4900,10 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
>       length = mark_rvalue_use (length);
>     /* We need to reduce to real constant-values for checks below.  */
>     if (length)
> +    STRIP_NOPS (length);
> +  if (low_bound)
> +    STRIP_NOPS (low_bound);
> +  if (length)
>       length = fold_simple (length);
>     if (low_bound)
>       low_bound = fold_simple (low_bound);
> @@ -5204,6 +5208,11 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
>   	  tree low_bound = TREE_PURPOSE (t);
>   	  tree length = TREE_VALUE (t);
>   
> +	  if (length)
> +	    STRIP_NOPS (length);
> +	  if (low_bound)
> +	    STRIP_NOPS (low_bound);
> +
>   	  i--;
>   	  if (low_bound
>   	      && TREE_CODE (low_bound) == INTEGER_CST
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index bca8c85..af83a5f 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -348,7 +348,8 @@ Objective-C and Objective-C++ Dialects}.
>   -Wmissing-include-dirs  -Wmissing-noreturn  -Wno-missing-profile @gol
>   -Wno-multichar  -Wmultistatement-macros  -Wnonnull  -Wnonnull-compare @gol
>   -Wnormalized=@r{[}none@r{|}id@r{|}nfc@r{|}nfkc@r{]} @gol
> --Wnull-dereference  -Wno-odr  -Wopenmp-simd  @gol
> +-Wnull-dereference  -Wno-odr @gol
> +-Wopenacc-kernels-annotate-loops  -Wopenmp-simd @gol
>   -Wno-overflow  -Woverlength-strings  -Wno-override-init-side-effects @gol
>   -Wpacked  -Wno-packed-bitfield-compat  -Wpacked-not-aligned  -Wpadded @gol
>   -Wparentheses  -Wno-pedantic-ms-format @gol
> @@ -500,7 +501,8 @@ Objective-C and Objective-C++ Dialects}.
>   -fmerge-constants  -fmodulo-sched  -fmodulo-sched-allow-regmoves @gol
>   -fmove-loop-invariants  -fno-branch-count-reg @gol
>   -fno-defer-pop  -fno-fp-int-builtin-inexact  -fno-function-cse @gol
> --fno-guess-branch-probability  -fno-inline  -fno-math-errno  -fno-peephole @gol
> +-fno-guess-branch-probability  -fno-inline  -fno-math-errno @gol
> +-fno-openacc-kernels-annotate-loops  -fno-peephole @gol
>   -fno-peephole2  -fno-printf-return-value  -fno-sched-interblock @gol
>   -fno-sched-spec  -fno-signed-zeros @gol
>   -fno-toplevel-reorder  -fno-trapping-math  -fno-zero-initialized-in-bss @gol
> @@ -8305,6 +8307,13 @@ Do not warn about compile-time overflow in constant expressions.
>   Warn about One Definition Rule violations during link-time optimization.
>   Enabled by default.
>   
> +@item -Wopenacc-kernels-annotate-loops
> +@opindex Wopenacc-kernels-annotate-loops
> +@opindex Wno-openacc-kernels-annotate-loops
> +Warn about @code{for} (C/C++) or @code{DO} (Fortran) loops in OpenACC
> +kernels regions that cannot be automatically annotated for
> +parallelization with @option{-fopenacc-kernels-annotate-loops}.
> +
>   @item -Wopenmp-simd
>   @opindex Wopenmp-simd
>   @opindex Wno-openmp-simd
> @@ -13647,6 +13656,27 @@ approximation is enabled.  The default value is 2.
>   
>   @end table
>   
> +@item -fno-openacc-kernels-annotate-loops
> +@opindex fno-openacc-kernels-annotate-loops
> +@opindex fopenacc-kernels-annotate-loops
> +@cindex kernels regions, OpenACC
> +Disable automatic parallelization of unannotated loops in OpenACC
> +kernels regions.  The default is to attempt to add implicit
> +@code{acc loop auto} annotations to loops in kernels regions if
> +@option{-fopenacc} is enabled.
> +
> +Note that you can use @option{-Wopenacc-kernels-annotate-loops} to
> +diagnose @code{for} loops that cannot be automatically annotated
> +(@pxref{Warning Options}).  Reasons why automatic loop annotations
> +cannot be applied include premature exits, calls to functions without
> +an @code{acc routine} annotation, or unstructured control flow in
> +the loop body.  In C and C++, the loop variable initialization, end
> +test, and increment expressions must additionally conform to
> +restrictions similar to those for explicitly-annotated loops, and the
> +loop variable must not be otherwise modified in the body of the loop.
> +An explicit @code{acc loop} annotation disables automatic annotations
> +on any nested or containing loops.
> +
>   @end table
>   
>   @node Instrumentation Options
> diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
> index d4c4b2c..a6e2d0b 100644
> --- a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
> +++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c
> @@ -2,6 +2,7 @@
>      OpenACC kernels.  */
>   
>   /* { dg-additional-options "-O2" }
> +   { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
>      { dg-additional-options "-fopt-info-optimized-omp" }
>      { dg-additional-options "-fdump-tree-ompexp" }
>      { dg-additional-options "-fdump-tree-parloops1-all" }
> diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c
> index 16e9b9e..bb21c9c 100644
> --- a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c
> +++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c
> @@ -2,6 +2,7 @@
>      kernels.  */
>   
>   /* { dg-additional-options "-O2" }
> +   { dg-additional-options "-fno-openacc-kernels-annotate-loops" }
>      { dg-additional-options "-fopt-info-optimized-omp" }
>      { dg-additional-options "-fdump-tree-ompexp" }
>      { dg-additional-options "-fdump-tree-parloops1-all" }
> diff --git a/gcc/testsuite/c-c++-common/goacc/combined-directives.c b/gcc/testsuite/c-c++-common/goacc/combined-directives.c
> index c2a3c57..2519f23 100644
> --- a/gcc/testsuite/c-c++-common/goacc/combined-directives.c
> +++ b/gcc/testsuite/c-c++-common/goacc/combined-directives.c
> @@ -110,7 +110,7 @@ test ()
>   // { dg-final { scan-tree-dump-times "acc loop worker" 2 "gimple" } }
>   // { dg-final { scan-tree-dump-times "acc loop vector" 2 "gimple" } }
>   // { dg-final { scan-tree-dump-times "acc loop seq" 2 "gimple" } }
> -// { dg-final { scan-tree-dump-times "acc loop auto" 2 "gimple" } }
> +// { dg-final { scan-tree-dump-times "acc loop auto" 6 "gimple" } }
>   // { dg-final { scan-tree-dump-times "acc loop tile.2, 3" 2 "gimple" } }
>   // { dg-final { scan-tree-dump-times "acc loop independent private.i" 2 "gimple" } }
>   // { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } }
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c b/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
> index 0304254..c37152c 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-counter-var-redundant-load.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-dom3" } */
>   
>   #include <stdlib.h>
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
> index c475333..b1f4302 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
> index 8f7f415..e87aab3 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fopt-info-optimized-omp" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
> index c11d36f..2323857 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fopt-info-optimized-omp" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
> index acef6a1..adca30b 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
> index 75e2bb7..5f16085 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
> new file mode 100644
> index 0000000..c7b5ac8
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-1.c
> @@ -0,0 +1,26 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that all loops in the nest are annotated.  */
> +
> +void f (float a[16][16], float b[16][16], float c[16][16])
> +{
> +  int i, j, k;
> +
> +#pragma acc kernels copyin(a[0:16][0:16], b[0:16][0:16]) copyout(c[0:16][0:16])
> +  {
> +    for (i = 0; i < 16; i++) {
> +      for (j = 0; j < 16; j++) {
> +	float t = 0;
> +	for (k = 0; k < 16; k++)
> +	  t += a[i][k] * b[k][j];
> +	c[i][j] = t;
> +      }
> +    }
> +  }
> +
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 3 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
> new file mode 100644
> index 0000000..58b41d2
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-10.c
> @@ -0,0 +1,32 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a random goto in the body can't be annotated.  */
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
> +      {
> +	if (a[i] < 0)
> +	  {
> +	    t = 0;
> +	    goto bad;
> +	  }
> +	t += a[i] * b[i];
> +      }
> +  bad:
> +    ;
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
> new file mode 100644
> index 0000000..e9d2ef4
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-11.c
> @@ -0,0 +1,27 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a random label in the body triggers a warning.  */
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i = n - 1;
> +
> +#pragma acc kernels
> +  {
> +    goto spaghetti;
> +    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
> +      {
> +      spaghetti:
> +	t += a[i] * b[i];
> +      }
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
> new file mode 100644
> index 0000000..ba408bc
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-12.c
> @@ -0,0 +1,28 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that in a situation with nested loops, a problem that prevents
> +   annotation of the inner loop only still allows the outer loop to be
> +   annotated.  */
> +
> +float f (float *a, float *b, int n)
> +{
> +  float t = 0;
> +
> +#pragma acc kernels
> +  {
> +    for (int i = 0; i < n; i++)
> +      for (int j = 0; j <= i; j++)  /* { dg-warning "loop cannot be annotated" } */
> +       {
> +         if (a[i] < 0 || b[j] < 0)
> +           j = i;
> +         else
> +           t += a[i] * b[j];
> +       }
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
> new file mode 100644
> index 0000000..64433e8
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-13.c
> @@ -0,0 +1,27 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that in a situation with nested loops, a problem that prevents
> +   annotation of the outer loop only still allows the inner loop to be
> +   annotated.  */
> +
> +float f (float *a, float *b, int n)
> +{
> +  float t = 0;
> +
> +#pragma acc kernels
> +  {
> +    for (int i = 0; i < n; i++)	  /* { dg-warning "loop cannot be annotated" } */
> +      {
> +	if (a[i] < 0)
> +	  n = i;
> +	for (int j = 0; j <= i; j++)
> +	  t += a[i] * b[j];
> +      }
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
> new file mode 100644
> index 0000000..379e6ba
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-14.c
> @@ -0,0 +1,22 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that an explicit annotation on an outer loop suppresses annotation
> +   of inner loops, and produces a diagnostic.  */
> +
> +void f (float *a, float *b)
> +{
> +  float t = 0;
> +
> +#pragma acc kernels
> +  {
> +#pragma acc loop seq
> +    for (int l = 0; l < 20; l++)
> +      for (int m = 0; m < 20; m++)	/* { dg-warning "loop cannot be annotated" } */
> +        b[m] = a[m];
> +  }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
> new file mode 100644
> index 0000000..9a2a7ca
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-15.c
> @@ -0,0 +1,22 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that an explicit annotation on an inner loop suppresses annotation
> +   of outer loops, and produces a diagnostic.  */
> +
> +void f (float *a, float *b)
> +{
> +  float t = 0;
> +
> +#pragma acc kernels
> +  {
> +    for (int l = 0; l < 20; l++)	/* { dg-warning "loop cannot be annotated" } */
> +#pragma acc loop seq
> +      for (int m = 0; m < 20; m++)
> +        b[m] = a[m];
> +  }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
> new file mode 100644
> index 0000000..075f897
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-16.c
> @@ -0,0 +1,26 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a modification of the loop variable in the
> +   body cannot be annotated.  */
> +
> +float f (float *a, float *b, int n)
> +{
> +  float t = 0;
> +
> +#pragma acc kernels
> +  {
> +    for (int i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
> +      {
> +	if (a[i] < 0 || b[i] < 0)
> +	  i = n;
> +	else
> +	  t += a[i] * b[i];
> +      }
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
> new file mode 100644
> index 0000000..5076789
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-17.c
> @@ -0,0 +1,26 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a modification of the loop iteration count
> +   variable in the body cannot be annotated.  */
> +
> +float f (float *a, float *b, int n)
> +{
> +  float t = 0;
> +
> +#pragma acc kernels
> +  {
> +    for (int i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
> +      {
> +	if (a[i] < 0 || b[i] < 0)
> +	  n = i;
> +	else
> +	  t += a[i] * b[i];
> +      }
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
> new file mode 100644
> index 0000000..89ec644
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-18.c
> @@ -0,0 +1,18 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that "acc kernels loop" directive causes annotation of the entire
> +   loop nest.  */
> +
> +void f (float *a, float *b)
> +{
> +#pragma acc kernels loop
> +  for (int k = 0; k < 20; k++)
> +    for (int l = 0; l < 20; l++)
> +      for (int m = 0; m < 20; m++)
> +	b[m] = a[m];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 2 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
> new file mode 100644
> index 0000000..77a3b7a
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-19.c
> @@ -0,0 +1,19 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that "acc kernels loop" directive causes annotation of the entire
> +   loop nest in the presence of a collapse clause.  */
> +
> +void f (float *a, float *b)
> +{
> +#pragma acc kernels loop collapse(2)
> +  for (int k = 0; k < 20; k++)
> +    for (int l = 0; l < 20; l++)
> +      for (int m = 0; m < 20; m++)
> +	b[m] = a[m];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop collapse.2." 1 "original" } } */
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
> new file mode 100644
> index 0000000..9e0a946
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-2.c
> @@ -0,0 +1,21 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a variable bound can be annotated.  */
> +
> +float f (float *a, float *b, int n)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)
> +      t += a[i] * b[i];
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
> new file mode 100644
> index 0000000..5e3f028
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
> @@ -0,0 +1,23 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that calls to built-in functions don't inhibit kernels loop
> +   annotation.  */
> +
> +void foo (int n, int *input, int *out1, int *out2)
> +{
> +#pragma acc kernels
> +  {
> +    int i;
> +
> +    for (i = 0; i < n; i++)
> +      {
> +	out1[i] = __builtin_clz (input[i]);
> +	out2[i] = __builtin_popcount (input[i]);
> +      }
> +  }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
> new file mode 100644
> index 0000000..f87444e
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-21.c
> @@ -0,0 +1,42 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test for rejecting annotation on loops that have various subexpressions
> +   in the loop end test that are not loop-invariant.  */
> +
> +extern int g (int);
> +extern int x;
> +extern int gg (int, int) __attribute__ ((const));
> +
> +void f (float *a, float *b, int n)
> +{
> +
> +  int j;
> +#pragma acc kernels
> +  {
> +    /* Non-constant function call.  */
> +    for (int i = 0; i < g(n); i++)	/* { dg-warning "loop cannot be annotated" } */
> +      a[i] = b[i];
> +
> +    /* Global variable.  */
> +    for (int i = x; i < n + x; i++)	/* { dg-warning "loop cannot be annotated" } */
> +      a[i] = b[i];
> +
> +    /* Explicit reference to the loop variable.  */
> +    for (int i = 0; i < gg (i, n); i++)	/* { dg-warning "loop cannot be annotated" } */
> +      a[i] = b[i];
> +
> +    /* Reference to a variable that is modified in the body of the loop.  */
> +    j = 0;
> +    for (int i = 0; i < gg (j, n); i++)	/* { dg-warning "loop cannot be annotated" } */
> +      {
> +	a[i] = b[i];
> +	j = i;
> +      }
> +
> +  }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
> new file mode 100644
> index 0000000..6a5099d
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-22.c
> @@ -0,0 +1,41 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test for accepting annotation on loops that have various forms of
> +   loop-invariant expressions in their end test.  */
> +
> +extern const int x;
> +extern int g (int) __attribute__ ((const));
> +
> +void f (float *a, float *b, int n)
> +{
> +
> +  int j;
> +#pragma acc kernels
> +  {
> +    /* Reversed form of comparison.  */
> +    for (int i = 0; n >= i; i++)
> +      a[i] = b[i];
> +
> +    /* Constant function call.  */
> +    for (int i = 0; i < g(n); i++)
> +      a[i] = b[i];
> +
> +    /* Constant global variable.  */
> +    for (int i = 0; i < x; i++)
> +      a[i] = b[i];
> +
> +    /* Complicated expression involving conditionals, etc. */
> +    for (int i = 0; i < ((x == 4) ? (n << 2) : (n << 3)); i++)
> +      a[i] = b[i];
> +
> +    /* Reference to a local variable not modified in the loop.  */
> +    j = ((x == 4) ? (n << 2) : (n << 3));
> +    for (int i = 0; i < j; i++)
> +      a[i] = b[i];
> +  }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 5 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
> new file mode 100644
> index 0000000..f60070e
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-3.c
> @@ -0,0 +1,24 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a conditional in the body can be annotated.  */
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)
> +      if (a[i] > 0 && b[i] > 0)
> +	t += a[i] * b[i];
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
> new file mode 100644
> index 0000000..949871c
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-4.c
> @@ -0,0 +1,34 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a switch and break in the body can be annotated.  */
> +
> +#define n 16
> +
> +float f (float *a, float *b, int state)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)
> +      switch (state)
> +	{
> +	case 0:
> +	default:
> +	  t += a[i] * b[i];
> +	  break;
> +
> +	case 1:
> +	  if (a[i] > 0 && b[i] > 0)
> +	    t += a[i] * b[i];
> +	  break;
> +	}
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
> new file mode 100644
> index 0000000..03dfe8f
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-5.c
> @@ -0,0 +1,27 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a continue statement in the body can be annotated.  */
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)
> +      {
> +	if (a[i] < 0 || b[i] < 0)
> +	  continue;
> +	t += a[i] * b[i];
> +      }
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
> new file mode 100644
> index 0000000..ede6b3c
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-6.c
> @@ -0,0 +1,27 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a break statement in the body cannot be annotated.  */
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
> +      {
> +	if (a[i] < 0 || b[i] < 0)
> +	  break;
> +	t += a[i] * b[i];
> +      }
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
> new file mode 100644
> index 0000000..20ee299
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-7.c
> @@ -0,0 +1,26 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with a random function call in the body cannot be
> +   annotated.  */
> +
> +extern float g (float);
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)	/* { dg-warning "loop cannot be annotated" } */
> +      t += g (a[i] * b[i]);
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 0 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
> new file mode 100644
> index 0000000..796f048
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-8.c
> @@ -0,0 +1,27 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a loop with an openacc function call in the body can be
> +   annotated.  */
> +
> +#pragma acc routine worker
> +extern float g (float);
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)
> +      t += g (a[i] * b[i]);
> +  }
> +  return t;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "acc loop auto" 1 "original" } } */
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
> new file mode 100644
> index 0000000..048f1b0
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-9.c
> @@ -0,0 +1,26 @@
> +/* { dg-additional-options "-fopenacc -fopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-Wopenacc-kernels-annotate-loops" } */
> +/* { dg-additional-options "-fdump-tree-original" } */
> +/* { dg-do compile } */
> +
> +/* Test that a kernels loop with a return in the body triggers a hard
> +   error.  */
> +
> +#define n 16
> +
> +float f (float *a, float *b)
> +{
> +  float t = 0;
> +  int i;
> +
> +#pragma acc kernels
> +  {
> +    for (i = 0; i < n; i++)
> +      {
> +	if (a[i] < 0 || b[i] < 0)
> +	  return 0.0;	/* { dg-error "invalid branch" } */
> +	t += a[i] * b[i];
> +      }
> +  }
> +  return t;
> +}
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c
> index 7180021..9a97de6 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-2.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c
> index 0c9f833..31e8378 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit-2.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c
> index 0bd21b6..ad59155 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-enter-exit.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c
> index dd5a841..4acffef 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data-update.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c
> index a658182..327aa05 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-data.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
> index 73b469d..26c65fe 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
> @@ -1,5 +1,6 @@
>   /* { dg-additional-options "-O2" } */
>   /* { dg-additional-options "-g" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
> index 5592623..8955cf2 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
> index e86be1b..d88a61d 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
> index 2b0e186..5943d56 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop.c
> index 9619d53..ad525cd 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-loop.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c b/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
> index 69539b2..f799baf 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c b/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c
> index 81b0fee..b8093b5 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c b/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
> index 5921b88..105cbcf 100644
> --- a/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
> +++ b/gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
> @@ -1,4 +1,5 @@
>   /* { dg-additional-options "-O2" } */
> +/* { dg-additional-options "-fno-openacc-kernels-annotate-loops" } */
>   /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>   /* { dg-additional-options "-fdump-tree-optimized" } */
>   
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 9ec24a3..9ee9124 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -1492,6 +1492,11 @@ class auto_suppress_location_wrappers
>   #define OMP_TARGET_COMBINED(NODE) \
>     (OMP_TARGET_CHECK (NODE)->base.private_flag)
>   
> +/* True on an OACC_LOOP statement if it is part of a combined construct,
> +   for example "#pragma acc kernels loop".  */
> +#define OACC_LOOP_COMBINED(NODE) \
> +  (OACC_LOOP_CHECK (NODE)->base.private_flag)
> +
>   /* Memory order for OMP_ATOMIC*.  */
>   #define OMP_ATOMIC_MEMORY_ORDER(NODE) \
>     (TREE_RANGE_CHECK (NODE, OMP_ATOMIC, \
> 


end of thread, other threads:[~2020-10-22 20:22 UTC | newest]

Thread overview: 7+ messages
2020-09-09 17:53 [PATCH 0/2] [OpenACC] Kernels loop annotation Sandra Loosemore
2020-09-09 17:53 ` [PATCH 1/2] [OpenACC] Kernels loops annotation: C and C++ Sandra Loosemore
2020-10-22 20:21   ` Jason Merrill
2020-09-09 17:53 ` [PATCH 2/2] [OpenACC] Kernels loops annotation: Fortran Sandra Loosemore
2020-09-10 10:20 ` [PATCH 0/2] [OpenACC] Kernels loop annotation Richard Biener
2020-09-10 16:00   ` Sandra Loosemore
2020-10-20 16:34 ` Sandra Loosemore
