public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [gomp4 7/8] libgomp: work around missing pthread_attr_t on nvptx
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
                   ` (3 preceding siblings ...)
  2015-09-23 17:22 ` [gomp4 6/8] libgomp: provide stub bar.h on nvptx Alexander Monakov
@ 2015-09-23 17:22 ` Alexander Monakov
  2015-09-24  8:15   ` Jakub Jelinek
  2015-09-23 17:22 ` [gomp4 1/8] nvptx: remove assumption of OpenACC attrs presence Alexander Monakov
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

Although newlib headers define most pthreads types, pthread_attr_t is not
available.  Macro-replace it by 'void' to keep the prototype of
gomp_init_thread_affinity unchanged, and do not declare gomp_thread_attr.

	* libgomp.h: Define pthread_attr_t to void on NVPTX.
---
 libgomp/libgomp.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index d51b08b..f4255b4 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -510,8 +510,13 @@ static inline struct gomp_task_icv *gomp_icv (bool write)
     return &gomp_global_icv;
 }
 
+#ifdef __nvptx__
+/* pthread_attr_t is not provided by newlib on NVPTX.  */
+#define pthread_attr_t void
+#else
 /* The attributes to be used during thread creation.  */
 extern pthread_attr_t gomp_thread_attr;
+#endif
 
 /* Function prototypes.  */
 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 4/8] libgomp: minimal OpenMP support in plugin-nvptx.c
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
  2015-09-23 17:22 ` [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx Alexander Monakov
@ 2015-09-23 17:22 ` Alexander Monakov
  2015-09-24  7:34   ` Jakub Jelinek
  2015-09-23 17:22 ` [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC Alexander Monakov
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

This is a minimal patch for NVPTX OpenMP offloading, using Jakub's initial
implementation.  It allows to successfully run '#pragma omp target', without
any parallel execution: 1 team of 1 thread is spawned on the device, and
target regions with '#pragma omp parallel' will fail with a link error.

	* plugin/plugin-nvptx.c (nvptx_host2dev): Allow NULL 'nvthd'.
        (nvptx_dev2host): Ditto.
        (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400.
        (GOMP_OFFLOAD_run): New.
---
 libgomp/plugin/plugin-nvptx.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 52c49c7..a3eaafa 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1052,7 +1052,7 @@ nvptx_host2dev (void *d, const void *h, size_t s)
     GOMP_PLUGIN_fatal ("invalid size");
 
 #ifndef DISABLE_ASYNC
-  if (nvthd->current_stream != nvthd->ptx_dev->null_stream)
+  if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream)
     {
       CUevent *e;
 
@@ -1117,7 +1117,7 @@ nvptx_dev2host (void *h, const void *d, size_t s)
     GOMP_PLUGIN_fatal ("invalid size");
 
 #ifndef DISABLE_ASYNC
-  if (nvthd->current_stream != nvthd->ptx_dev->null_stream)
+  if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream)
     {
       CUevent *e;
 
@@ -1451,7 +1451,7 @@ GOMP_OFFLOAD_get_name (void)
 unsigned int
 GOMP_OFFLOAD_get_caps (void)
 {
-  return GOMP_OFFLOAD_CAP_OPENACC_200;
+  return GOMP_OFFLOAD_CAP_OPENACC_200 | GOMP_OFFLOAD_CAP_OPENMP_400;
 }
 
 int
@@ -1788,3 +1788,27 @@ GOMP_OFFLOAD_openacc_set_cuda_stream (int async, void *stream)
 {
   return nvptx_set_cuda_stream (async, stream);
 }
+
+void
+GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars)
+{
+  CUfunction function = ((struct targ_fn_descriptor *) tgt_fn)->fn;
+  CUresult r;
+  struct ptx_device *ptx_dev = ptx_devices[ord];
+  const char *maybe_abort_msg = "(perhaps abort was called)";
+  void *args = &tgt_vars;
+
+  r = cuLaunchKernel (function,
+		      1, 1, 1,
+		      1, 1, 1,
+		      0, ptx_dev->null_stream->stream, &args, 0);
+  if (r != CUDA_SUCCESS)
+    GOMP_PLUGIN_fatal ("cuLaunchKernel error: %s", cuda_error (r));
+
+  r = cuCtxSynchronize ();
+  if (r == CUDA_ERROR_LAUNCH_FAILED)
+    GOMP_PLUGIN_fatal ("cuCtxSynchronize error: %s %s\n", cuda_error (r),
+		       maybe_abort_msg);
+  else if (r != CUDA_SUCCESS)
+    GOMP_PLUGIN_fatal ("cuCtxSynchronize error: %s", cuda_error (r));
+}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 6/8] libgomp: provide stub bar.h on nvptx
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
                   ` (2 preceding siblings ...)
  2015-09-23 17:22 ` [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC Alexander Monakov
@ 2015-09-23 17:22 ` Alexander Monakov
  2015-09-24  8:09   ` Jakub Jelinek
  2015-09-23 17:22 ` [gomp4 7/8] libgomp: work around missing pthread_attr_t " Alexander Monakov
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

This stub header only provides empty struct gomp_barrier_t.  For now I've
punted on providing a minimally-correct implementation.

	* config/nvptx/bar.h: New file.
---
 libgomp/config/nvptx/bar.h | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 libgomp/config/nvptx/bar.h

diff --git a/libgomp/config/nvptx/bar.h b/libgomp/config/nvptx/bar.h
new file mode 100644
index 0000000..009d85f
--- /dev/null
+++ b/libgomp/config/nvptx/bar.h
@@ -0,0 +1,38 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This is an NVPTX specific implementation of a barrier synchronization
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation is a stub, for now.  */
+
+#ifndef GOMP_BARRIER_H
+#define GOMP_BARRIER_H 1
+
+typedef struct
+{
+} gomp_barrier_t;
+
+typedef unsigned int gomp_barrier_state_t;
+
+#endif /* GOMP_BARRIER_H */

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 1/8] nvptx: remove assumption of OpenACC attrs presence
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
                   ` (4 preceding siblings ...)
  2015-09-23 17:22 ` [gomp4 7/8] libgomp: work around missing pthread_attr_t " Alexander Monakov
@ 2015-09-23 17:22 ` Alexander Monakov
  2015-09-24  7:26   ` Jakub Jelinek
  2015-09-23 17:40 ` [gomp4 3/8] libgomp: provide target-to-host fallback diagnostic Alexander Monakov
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

This patch makes one OpenACC-specific path in nvptx_record_offload_symbol
optional.

	* config/nvptx/nvptx.c (nvptx_record_offload_symbol): Allow missing
        OpenACC attributes.
---
 gcc/config/nvptx/nvptx.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 53850a1..21c59ef 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4026,19 +4026,22 @@ nvptx_record_offload_symbol (tree decl)
 
     case FUNCTION_DECL:
       {
-	tree attr = get_oacc_fn_attrib (decl);
-	tree dims = TREE_VALUE (attr);
-	unsigned ix;
-	
 	fprintf (asm_out_file, "//:FUNC_MAP \"%s\"",
 		 IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)));
 
-	for (ix = 0; ix != GOMP_DIM_MAX; ix++, dims = TREE_CHAIN (dims))
+	tree attr = get_oacc_fn_attrib (decl);
+	if (attr)
 	  {
-	    int size = TREE_INT_CST_LOW (TREE_VALUE (dims));
+	    tree dims = TREE_VALUE (attr);
+	    unsigned ix;
 
-	    gcc_assert (!TREE_PURPOSE (dims));
-	    fprintf (asm_out_file, ", %#x", size);
+	    for (ix = 0; ix != GOMP_DIM_MAX; ix++, dims = TREE_CHAIN (dims))
+	    {
+	      int size = TREE_INT_CST_LOW (TREE_VALUE (dims));
+
+	      gcc_assert (!TREE_PURPOSE (dims));
+	      fprintf (asm_out_file, ", %#x", size);
+	    }
 	  }
 
 	fprintf (asm_out_file, "\n");

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
  2015-09-23 17:22 ` [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx Alexander Monakov
  2015-09-23 17:22 ` [gomp4 4/8] libgomp: minimal OpenMP support in plugin-nvptx.c Alexander Monakov
@ 2015-09-23 17:22 ` Alexander Monakov
  2015-09-24  7:29   ` Jakub Jelinek
  2015-09-23 17:22 ` [gomp4 6/8] libgomp: provide stub bar.h on nvptx Alexander Monakov
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

This patch allows to meaningfully invoke mkoffload with -fopenmp.  The check
for -fopenacc flag is specific to gomp4 branch: trunk does not have it.

	* config/nvptx/mkoffload.c (main): Do not check for -fopenacc.
---
 gcc/config/nvptx/mkoffload.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/config/nvptx/mkoffload.c b/gcc/config/nvptx/mkoffload.c
index 0114394..8c15686 100644
--- a/gcc/config/nvptx/mkoffload.c
+++ b/gcc/config/nvptx/mkoffload.c
@@ -468,15 +468,12 @@ main (int argc, char **argv)
       obstack_ptr_grow (&argv_obstack, str);
     }
 
-  bool fopenacc = false;
   for (int ix = 1; ix != argc; ix++)
     {
       if (!strcmp (argv[ix], "-v"))
 	verbose = true;
       else if (!strcmp (argv[ix], "-save-temps"))
 	save_temps = true;
-      else if (!strcmp (argv[ix], "-fopenacc"))
-	fopenacc = true;
 
       if (!strcmp (argv[ix], "-o") && ix + 1 != argc)
 	outname = argv[++ix];
@@ -491,8 +488,8 @@ main (int argc, char **argv)
     fatal_error (input_location, "cannot open '%s'", ptx_cfile_name);
 
   /* PR libgomp/65099: Currently, we only support offloading in 64-bit
-     configurations, and only for OpenACC offloading.  */
-  if (!target_ilp32 && fopenacc)
+     configurations.  */
+  if (!target_ilp32)
     {
       ptx_name = make_temp_file (".mkoffload");
       obstack_ptr_grow (&argv_obstack, "-o");

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
@ 2015-09-23 17:22 ` Alexander Monakov
  2015-09-24  8:15   ` Jakub Jelinek
  2015-09-23 17:22 ` [gomp4 4/8] libgomp: minimal OpenMP support in plugin-nvptx.c Alexander Monakov
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

This patch ports env.c to NVPTX.  It drops all environment parsing routines
since there's no "environment" on the device.  For now, the useful effect of
the patch is providing 'omp_is_initial_device' to distinguish host execution
from target execution in user code.

Several functions use gomp_icv, which is not adjusted for NVPTX and thus will
try to use EMUTLS.  The intended way forward is to provide a custom
implementation of gomp_icv on NVPTX, likely via pre-allocating storage prior
to spawning a team.

	* config/nvptx/env.c: New file.
---
 libgomp/config/nvptx/env.c | 219 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 219 insertions(+)

diff --git a/libgomp/config/nvptx/env.c b/libgomp/config/nvptx/env.c
index e69de29..f964b29 100644
--- a/libgomp/config/nvptx/env.c
+++ b/libgomp/config/nvptx/env.c
@@ -0,0 +1,219 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file defines the OpenMP internal control variables.  There is
+   no environment on the accelerator, so the variables can be changed
+   only via OpenMP API in target regions.  */
+
+#include "libgomp.h"
+#include "libgomp_f.h"
+
+#include <limits.h>
+
+struct gomp_task_icv gomp_global_icv = {
+  .nthreads_var = 1,
+  .thread_limit_var = UINT_MAX,
+  .run_sched_var = GFS_DYNAMIC,
+  .run_sched_modifier = 1,
+  .default_device_var = 0,
+  .dyn_var = false,
+  .nest_var = false,
+  .bind_var = omp_proc_bind_false,
+  .target_data = NULL
+};
+
+unsigned long gomp_max_active_levels_var = INT_MAX;
+unsigned long gomp_available_cpus = 1, gomp_managed_threads = 1;
+unsigned long long gomp_spin_count_var, gomp_throttled_spin_count_var;
+unsigned long *gomp_nthreads_var_list, gomp_nthreads_var_list_len;
+char *gomp_bind_var_list;
+unsigned long gomp_bind_var_list_len;
+void **gomp_places_list;
+unsigned long gomp_places_list_len;
+int gomp_debug_var;
+
+void
+omp_set_num_threads (int n)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->nthreads_var = (n > 0 ? n : 1);
+}
+
+void
+omp_set_dynamic (int val)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->dyn_var = val;
+}
+
+int
+omp_get_dynamic (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->dyn_var;
+}
+
+void
+omp_set_nested (int val)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->nest_var = val;
+}
+
+int
+omp_get_nested (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->nest_var;
+}
+
+void
+omp_set_schedule (omp_sched_t kind, int modifier)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  switch (kind)
+    {
+    case omp_sched_static:
+      if (modifier < 1)
+	modifier = 0;
+      icv->run_sched_modifier = modifier;
+      break;
+    case omp_sched_dynamic:
+    case omp_sched_guided:
+      if (modifier < 1)
+	modifier = 1;
+      icv->run_sched_modifier = modifier;
+      break;
+    case omp_sched_auto:
+      break;
+    default:
+      return;
+    }
+  icv->run_sched_var = kind;
+}
+
+void
+omp_get_schedule (omp_sched_t *kind, int *modifier)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  *kind = icv->run_sched_var;
+  *modifier = icv->run_sched_modifier;
+}
+
+int
+omp_get_max_threads (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->nthreads_var;
+}
+
+int
+omp_get_thread_limit (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->thread_limit_var > INT_MAX ? INT_MAX : icv->thread_limit_var;
+}
+
+void
+omp_set_max_active_levels (int max_levels)
+{
+  if (max_levels >= 0)
+    gomp_max_active_levels_var = max_levels;
+}
+
+int
+omp_get_max_active_levels (void)
+{
+  return gomp_max_active_levels_var;
+}
+
+int
+omp_get_cancellation (void)
+{
+  return 0;
+}
+
+omp_proc_bind_t
+omp_get_proc_bind (void)
+{
+  return omp_proc_bind_false;
+}
+
+void
+omp_set_default_device (int device_num __attribute__((unused)))
+{
+}
+
+int
+omp_get_default_device (void)
+{
+  return 0;
+}
+
+int
+omp_get_num_devices (void)
+{
+  return 0;
+}
+
+int
+omp_get_num_teams (void)
+{
+  /* FORNOW.  */
+  return 1;
+}
+
+int
+omp_get_team_num (void)
+{
+  /* FORNOW.  */
+  return 0;
+}
+
+int
+omp_is_initial_device (void)
+{
+  /* PTX is an accelerator-only target.  */
+  return 0;
+}
+
+ialias (omp_set_dynamic)
+ialias (omp_set_nested)
+ialias (omp_set_num_threads)
+ialias (omp_get_dynamic)
+ialias (omp_get_nested)
+ialias (omp_set_schedule)
+ialias (omp_get_schedule)
+ialias (omp_get_max_threads)
+ialias (omp_get_thread_limit)
+ialias (omp_set_max_active_levels)
+ialias (omp_get_max_active_levels)
+ialias (omp_get_cancellation)
+ialias (omp_get_proc_bind)
+ialias (omp_set_default_device)
+ialias (omp_get_default_device)
+ialias (omp_get_num_devices)
+ialias (omp_get_num_teams)
+ialias (omp_get_team_num)
+ialias (omp_is_initial_device)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 3/8] libgomp: provide target-to-host fallback diagnostic
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
                   ` (5 preceding siblings ...)
  2015-09-23 17:22 ` [gomp4 1/8] nvptx: remove assumption of OpenACC attrs presence Alexander Monakov
@ 2015-09-23 17:40 ` Alexander Monakov
  2015-09-24  7:33   ` Jakub Jelinek
  2015-09-23 17:43 ` [gomp4 5/8] libgomp: provide sem.h, mutex.h, ptrlock.h on nvptx Alexander Monakov
  2015-09-23 18:44 ` [gomp4 0/8] NVPTX: initial OpenMP offloading Bernd Schmidt
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:40 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

This patch allows to see when target regions are executed on host with
GOMP_DEBUG=1 in the environment.

	* target.c (GOMP_target): Use gomp_debug on fallback path.
---
 libgomp/target.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libgomp/target.c b/libgomp/target.c
index 6ca80ad..1cc2098 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1008,6 +1008,7 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
       || !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
     {
       /* Host fallback.  */
+      gomp_debug (0, "%s: target region executing on host\n", __FUNCTION__);
       struct gomp_thread old_thr, *thr = gomp_thread ();
       old_thr = *thr;
       memset (thr, '\0', sizeof (*thr));

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 0/8] NVPTX: initial OpenMP offloading
@ 2015-09-23 17:43 Alexander Monakov
  2015-09-23 17:22 ` [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx Alexander Monakov
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

Hello,

This patch series implements some minimally required changes to have OpenMP
offloading working for NVPTX target on the gomp4 branch.  '#pragma omp target'
and data updates should work, but all parallel execution functionality remains
stubbed out (uses of '#pragma omp parallel' in target regions yield a link
error).

I'd like to get feedback on the patches, and approval for the gomp-4_0-branch
where possible.

Patches 1-2 unbreak compilation with offloading, patch 4 allows to invoke a
target region on the accelerator, patches 5-8 unbreak libgomp.h and allow
env.c to be compiled for the accelerator.

  nvptx: remove assumption of OpenACC attrs presence
  nvptx mkoffload: do not restrict to OpenACC
  libgomp: provide target-to-host fallback diagnostic
  libgomp: minimal OpenMP support in plugin-nvptx.c
  libgomp: provide sem.h, mutex.h, ptrlock.h on nvptx
  libgomp: provide stub bar.h on nvptx
  libgomp: work around missing pthread_attr_t on nvptx
  libgomp: provide ICVs via env.c on nvptx

 gcc/config/nvptx/mkoffload.c   |   7 +-
 gcc/config/nvptx/nvptx.c       |  19 ++--
 libgomp/config/nvptx/bar.h     |  38 +++++++
 libgomp/config/nvptx/env.c     | 219 +++++++++++++++++++++++++++++++++++++++++
 libgomp/config/nvptx/mutex.h   |  67 +++++++++++++
 libgomp/config/nvptx/ptrlock.h |  73 ++++++++++++++
 libgomp/config/nvptx/sem.h     |  65 ++++++++++++
 libgomp/libgomp.h              |   5 +
 libgomp/plugin/plugin-nvptx.c  |  30 +++++-
 libgomp/target.c               |   1 +
 10 files changed, 508 insertions(+), 16 deletions(-)
 create mode 100644 libgomp/config/nvptx/bar.h
 create mode 100644 libgomp/config/nvptx/mutex.h
 create mode 100644 libgomp/config/nvptx/ptrlock.h
 create mode 100644 libgomp/config/nvptx/sem.h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [gomp4 5/8] libgomp: provide sem.h, mutex.h, ptrlock.h on nvptx
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
                   ` (6 preceding siblings ...)
  2015-09-23 17:40 ` [gomp4 3/8] libgomp: provide target-to-host fallback diagnostic Alexander Monakov
@ 2015-09-23 17:43 ` Alexander Monakov
  2015-09-24  7:43   ` Jakub Jelinek
  2015-09-23 18:44 ` [gomp4 0/8] NVPTX: initial OpenMP offloading Bernd Schmidt
  8 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 17:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan, Alexander Monakov

This patch provides minimal non-stub implementations for libgomp
mutex/ptrlock/semaphore, using atomic ops and busy waiting.  The goal here is
to at least provide stub struct declarations necessary to unbreak libgomp.h.

Atomics with busy waiting seems to be the only way to provide such primitives
for inter-team synchronizations, but for intra-team ops a more efficient
implementation may be possible.

(all functionality is unused since consumers are stubbed out in config/nvptx)

	* config/nvptx/mutex.h: New file.
	* config/nvptx/ptrlock.h: New file.
	* config/nvptx/sem.h: New file.
---
 libgomp/config/nvptx/mutex.h   | 67 ++++++++++++++++++++++++++++++++++++++
 libgomp/config/nvptx/ptrlock.h | 73 ++++++++++++++++++++++++++++++++++++++++++
 libgomp/config/nvptx/sem.h     | 65 +++++++++++++++++++++++++++++++++++++
 3 files changed, 205 insertions(+)
 create mode 100644 libgomp/config/nvptx/mutex.h
 create mode 100644 libgomp/config/nvptx/ptrlock.h
 create mode 100644 libgomp/config/nvptx/sem.h

diff --git a/libgomp/config/nvptx/mutex.h b/libgomp/config/nvptx/mutex.h
new file mode 100644
index 0000000..a98d5a9
--- /dev/null
+++ b/libgomp/config/nvptx/mutex.h
@@ -0,0 +1,67 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by Alexander Monakov <amonakov@ispras.ru>
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This is an NVPTX specific implementation of a mutex synchronization
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation uses atomic instructions and busy waiting.  */
+
+#ifndef GOMP_MUTEX_H
+#define GOMP_MUTEX_H 1
+
+typedef int gomp_mutex_t;
+
+#define GOMP_MUTEX_INIT_0 1
+
+static inline void
+gomp_mutex_init (gomp_mutex_t *mutex)
+{
+  *mutex = 0;
+}
+
+static inline void
+gomp_mutex_destroy (gomp_mutex_t *mutex)
+{
+}
+
+static inline void
+gomp_mutex_lock (gomp_mutex_t *mutex)
+{
+  int value = __atomic_load_n (mutex, MEMMODEL_ACQUIRE);
+  for (;;)
+    {
+      while (value == 0)
+	value = __atomic_load_n (mutex, MEMMODEL_ACQUIRE);
+      if (__atomic_compare_exchange_n (mutex, &value, 1, false,
+				       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
+	return;
+    }
+}
+
+static inline void
+gomp_mutex_unlock (gomp_mutex_t *mutex)
+{
+  __atomic_store_n (mutex, 0, MEMMODEL_RELEASE);
+}
+#endif /* GOMP_MUTEX_H */
diff --git a/libgomp/config/nvptx/ptrlock.h b/libgomp/config/nvptx/ptrlock.h
new file mode 100644
index 0000000..c4ff033
--- /dev/null
+++ b/libgomp/config/nvptx/ptrlock.h
@@ -0,0 +1,73 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by Alexander Monakov <amonakov@ispras.ru>
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This is an NVPTX specific implementation of a mutex synchronization
+   mechanism for libgomp.  This type is private to the library.  This
+   implementation uses atomic instructions and busy waiting.
+
+   A ptrlock has four states:
+   0/NULL Initial
+   1      Owned by me, I get to write a pointer to ptrlock.
+   2      Some thread is waiting on the ptrlock.
+   >2     Ptrlock contains a valid pointer.
+   It is not valid to gain the ptrlock and then write a NULL to it.  */
+
+#ifndef GOMP_PTRLOCK_H
+#define GOMP_PTRLOCK_H 1
+
+typedef void *gomp_ptrlock_t;
+
+static inline void gomp_ptrlock_init (gomp_ptrlock_t *ptrlock, void *ptr)
+{
+  *ptrlock = ptr;
+}
+
+static inline void *gomp_ptrlock_get (gomp_ptrlock_t *ptrlock)
+{
+  uintptr_t v = (uintptr_t) __atomic_load_n (ptrlock, MEMMODEL_ACQUIRE);
+  if (v > 2)
+    return (void *) v;
+
+  if (v == 0
+      && __atomic_compare_exchange_n (ptrlock, &v, 1, false,
+				      MEMMODEL_ACQUIRE, MEMMODEL_ACQUIRE))
+    return NULL;
+
+  while (v == 1)
+    v = (uintptr_t) __atomic_load_n (ptrlock, MEMMODEL_ACQUIRE);
+
+  return (void *) v;
+}
+
+static inline void gomp_ptrlock_set (gomp_ptrlock_t *ptrlock, void *ptr)
+{
+  __atomic_store_n (ptrlock, ptr, MEMMODEL_RELEASE);
+}
+
+static inline void gomp_ptrlock_destroy (gomp_ptrlock_t *ptrlock)
+{
+}
+
+#endif /* GOMP_PTRLOCK_H */
diff --git a/libgomp/config/nvptx/sem.h b/libgomp/config/nvptx/sem.h
new file mode 100644
index 0000000..c29ebac
--- /dev/null
+++ b/libgomp/config/nvptx/sem.h
@@ -0,0 +1,65 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by Alexander Monakov <amonakov@ispras.ru>
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This is an NVPTX specific implementation of a semaphore synchronization
+   mechanism for libgomp.  This type is private to the library.  This
+   semaphore implementation uses atomic instructions and busy waiting.  */
+
+#ifndef GOMP_SEM_H
+#define GOMP_SEM_H 1
+
+typedef int gomp_sem_t;
+
+static inline void
+gomp_sem_init (gomp_sem_t *sem, int value)
+{
+  *sem = value;
+}
+
+static inline void
+gomp_sem_destroy (gomp_sem_t *sem)
+{
+}
+
+static inline void
+gomp_sem_wait (gomp_sem_t *sem)
+{
+  int count = __atomic_load_n (sem, MEMMODEL_ACQUIRE);
+  for (;;)
+    {
+      while (count == 0)
+	count = __atomic_load_n (sem, MEMMODEL_ACQUIRE);
+      if (__atomic_compare_exchange_n (sem, &count, count - 1, false,
+				       MEMMODEL_ACQUIRE, MEMMODEL_RELAXED))
+	return;
+    }
+}
+
+static inline void
+gomp_sem_post (gomp_sem_t *sem)
+{
+  (void) __atomic_add_fetch (sem, 1, MEMMODEL_RELEASE);
+}
+#endif /* GOMP_SEM_H */

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 0/8] NVPTX: initial OpenMP offloading
  2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
                   ` (7 preceding siblings ...)
  2015-09-23 17:43 ` [gomp4 5/8] libgomp: provide sem.h, mutex.h, ptrlock.h on nvptx Alexander Monakov
@ 2015-09-23 18:44 ` Bernd Schmidt
  2015-09-23 21:44   ` Alexander Monakov
  8 siblings, 1 reply; 27+ messages in thread
From: Bernd Schmidt @ 2015-09-23 18:44 UTC (permalink / raw)
  To: Alexander Monakov, gcc-patches; +Cc: Jakub Jelinek, Arutyun Avetisyan

On 09/23/2015 07:22 PM, Alexander Monakov wrote:
> This patch series implements some minimally required changes to have OpenMP
> offloading working for NVPTX target on the gomp4 branch.  '#pragma omp target'
> and data updates should work, but all parallel execution functionality remains
> stubbed out (uses of '#pragma omp parallel' in target regions yield a link
> error).
>
> I'd like to get feedback on the patches, and approval for the gomp-4_0-branch
> where possible.

I have two major concerns here. Can I ask you how much experience you 
have with GPU programming and ptx? These patches provide stub 
functionality, which is easy enough, but I can't tell whether there's a 
credible plan to provide a full implementation. GPUs really need a 
different programming model than normal CPUs, which is something I 
learned the hard way, and I'm not terribly optimistic about porting 
libgomp to ptx. (I may be wrong.)

In one patch you mention newlib pthread type definitions - are you aware 
that there is no real pthreads implementation in the ptx newlib? The ptx 
newlib is really only provided for a minimal subset of libc functionality.

My other concern would be not to approve changes to the gomp-4_0-branch 
that could derail or slow down the effort to implement OpenACC, which 
has a much better chance of being in gcc-6 than this effort. You might 
want to make a private branch for your work.


Bernd

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 0/8] NVPTX: initial OpenMP offloading
  2015-09-23 18:44 ` [gomp4 0/8] NVPTX: initial OpenMP offloading Bernd Schmidt
@ 2015-09-23 21:44   ` Alexander Monakov
  2015-09-24  7:26     ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-23 21:44 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, Jakub Jelinek, Arutyun Avetisyan

On Wed, 23 Sep 2015, Bernd Schmidt wrote:
> I have two major concerns here. Can I ask you how much experience you have
> with GPU programming and ptx?

I'd say I have a good understanding of the programming model and nvidia
hardware architecture, having used CUDA tools and payed attention to
r&d news for a few years.  I've discussed with OpenACC and HSA folks at the
GNU Cauldron my plans to take on this work, and I hope they can acknowledge
that I at least seemed to have a clue :)

> These patches provide stub functionality, which
> is easy enough, but I can't tell whether there's a credible plan to provide a
> full implementation. GPUs really need a different programming model than
> normal CPUs, which is something I learned the hard way, and I'm not terribly
> optimistic about porting libgomp to ptx. (I may be wrong.)

Right, libgomp running on ptx would have to do many things differently from
how it does now (and some drop entirely, like affinity).  Thankfully it can be
implemented piecemeal in config/nvptx, without #ifdef butchery in the primary
source files.  The plan towards providing a full implementation is thus to
work my way incrementally over GOMP_nn api, allowing '#pragma omp parallel' to
link successfully, then 'for', 'teams' and so on.  For 'parallel' the
intention is to either have prestarted idle threads in teams if possible, or
start another kernel via dynamic parallelism.  Exact details are to be worked
out -- I'd like to avoid introducing a hard dependency on dynamic parallelism.

> In one patch you mention newlib pthread type definitions - are you aware that
> there is no real pthreads implementation in the ptx newlib? The ptx newlib is
> really only provided for a minimal subset of libc functionality.

Sure, I'm aware.  The point was to make libgomp.h valid to be included into
the rest of to-be-ported source files, keeping modifications to it to a
minimum.  If the idea is that relying on #include <pthread.h> available on
nvptx in the first place is too much of a hack, we can discuss alternatives :)

> My other concern would be not to approve changes to the gomp-4_0-branch that
> could derail or slow down the effort to implement OpenACC, which has a much
> better chance of being in gcc-6 than this effort. You might want to make a
> private branch for your work.

I'm unclear how this work might hurt the OpenACC efforts, and in any case I
intend to be careful.  I don't imagine there will be conflicting requirements
to source code changes along the way.  In defense of the idea of working on
gomp4 branch, I expect that interleaving OpenACC and OpenMP work on a common
branch will cause less pain in case of inadvertent breakage than a merge
afterward.  Jakub, since you suggested submitting for gomp-4_0-branch, what's
your recommendation here?

Thanks
Alexander

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 1/8] nvptx: remove assumption of OpenACC attrs presence
  2015-09-23 17:22 ` [gomp4 1/8] nvptx: remove assumption of OpenACC attrs presence Alexander Monakov
@ 2015-09-24  7:26   ` Jakub Jelinek
  2015-09-24  7:27     ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  7:26 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan, bernds, Nathan Sidwell

On Wed, Sep 23, 2015 at 08:22:15PM +0300, Alexander Monakov wrote:
> This patch makes one OpenACC-specific path in nvptx_record_offload_symbol
> optional.
> 
> 	* config/nvptx/nvptx.c (nvptx_record_offload_symbol): Allow missing
>         OpenACC attributes.

LGTM, but as it is a nvptx backend change, please check with the nvptx
maintainers (Bernd/Nathan), and for the whole patch series, please wait for
Thomas/Nathan if they are ok with having the stuff on their branch.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 0/8] NVPTX: initial OpenMP offloading
  2015-09-23 21:44   ` Alexander Monakov
@ 2015-09-24  7:26     ` Jakub Jelinek
  2015-09-24 14:31       ` Nathan Sidwell
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  7:26 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: Bernd Schmidt, gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 11:24:16PM +0300, Alexander Monakov wrote:
> > These patches provide stub functionality, which
> > is easy enough, but I can't tell whether there's a credible plan to provide a
> > full implementation. GPUs really need a different programming model than
> > normal CPUs, which is something I learned the hard way, and I'm not terribly
> > optimistic about porting libgomp to ptx. (I may be wrong.)
> 
> Right, libgomp running on ptx would have to do many things differently from
> how it does now (and some drop entirely, like affinity).  Thankfully it can be

Sure, affinity doesn't have to be supported.  And, eventually some
simpler constructs can be e.g. inlined by the compiler if it is desirable.
Some constructs like tasking though are just too complex to handle them
without sharing code in the library.  Static scheduling loops are already
expanded inline by the compiler except for ordered loops (which are again
hard to handle without library side), other scheduling kinds IMHO just can
be shared with the CPU implementation, etc.

> implemented piecemeal in config/nvptx, without #ifdef butchery in the primary
> source files.  The plan towards providing a full implementation is thus to

We really don't need to avoid all #ifdef stuff, just keep it to a reasonable
maintanable level.

> > In one patch you mention newlib pthread type definitions - are you aware that
> > there is no real pthreads implementation in the ptx newlib? The ptx newlib is
> > really only provided for a minimal subset of libc functionality.
> 
> Sure, I'm aware.  The point was to make libgomp.h valid to be included into
> the rest of to-be-ported source files, keeping modifications to it to a
> minimum.  If the idea is that relying on #include <pthread.h> available on
> nvptx in the first place is too much of a hack, we can discuss alternatives :)

I'd say for e.g. libgomp.h it is acceptable to use what I've posted in
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01418.html, so HAVE_PTHREAD_H
and LIBGOMP_USE_PTHREAD guards.  It is likely some other offloading target
in the future (somebody has been talking about e.g. ARM offloading to
Epiphany (Parallella board)) will have the same need (i.e. no pthreads, and
either a dummy pthread.h around, or not at all).
Plus of course we need NVPTX version of gomp_thread (), that can be guarded
with __nvptx__ ifdef (if the implementation is small, but I'd hope it is,
some CTA local pointer and pointer arithmetics - indexed by %tid.x / WRAP_SZ
or something similar.

> > My other concern would be not to approve changes to the gomp-4_0-branch that
> > could derail or slow down the effort to implement OpenACC, which has a much
> > better chance of being in gcc-6 than this effort. You might want to make a
> > private branch for your work.
> 
> I'm unclear how this work might hurt the OpenACC efforts, and in any case I
> intend to be careful.  I don't imagine there will be conflicting requirements
> to source code changes along the way.  In defense of the idea of working on
> gomp4 branch, I expect that interleaving OpenACC and OpenMP work on a common
> branch will cause less pain in case of inadvertent breakage than a merge
> afterward.  Jakub, since you suggested submitting for gomp-4_0-branch, what's
> your recommendation here?

My suggestion for this to be added to gomp-4_0-branch rather than e.g.
gomp-4_1-branch or trunk directly is that even at the beginning it has some
dependencies on the stuff that has not been merged into trunk yet, in
particular the nvptx changes to libgomp that are on the branch and the code
to link libgcc and/or libgomp statically into the nvptx offloaded chunks.

Once those pieces are merged into trunk, obviously it could be developed on
some other branch, but I'd hope none of the changes actually can be
problematic to the OpenACC effort, OpenACC uses from the libgomp only a
minimum files and that I bet is not going to change too much with the
patches.

As for merging plans, the OpenMP 4.1 standard is approaching its final form
quickly, so I expect to merge gomp-4_1-branch to trunk around October 15th.
It would be nice if the gomp-4_0-branch stuff (at least the parts
Thomas/Nathan want to see in GCC 6) were in the process of being merged
shortly after that (I know I'm behind with patch review and am very sorry
for that, will try to find more time for that in the second half of October
and early November).  As for this NVPTX OpenMP 4.1 port, I'd say it really
depends on how invasive it is to other parts of the compiler.  Parts of it
that can't destabilize OpenMP 4.1 host or XeonPhi/XeonPhi-emul nor OpenACC
support can go even during stage3 (of course on a case by case basis).

So I'd like to ask Thomas/Nathan if they are ok with this stuff being on
the gomp-4_0-branch for now, once all the prerequisities it needs are on the
trunk, it can go into its own branch.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 1/8] nvptx: remove assumption of OpenACC attrs presence
  2015-09-24  7:26   ` Jakub Jelinek
@ 2015-09-24  7:27     ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  7:27 UTC (permalink / raw)
  To: Alexander Monakov
  Cc: gcc-patches, Arutyun Avetisyan, Bernd Schmidt, Nathan Sidwell

On Wed, Sep 23, 2015 at 08:22:15PM +0300, Alexander Monakov wrote:
> This patch makes one OpenACC-specific path in nvptx_record_offload_symbol
> optional.
> 
> 	* config/nvptx/nvptx.c (nvptx_record_offload_symbol): Allow missing
>         OpenACC attributes.

LGTM, but as it is a nvptx backend change, please check with the nvptx
maintainers (Bernd/Nathan), and for the whole patch series, please wait for
Thomas/Nathan if they are ok with having the stuff on their branch.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC
  2015-09-23 17:22 ` [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC Alexander Monakov
@ 2015-09-24  7:29   ` Jakub Jelinek
  2015-10-02 19:29     ` [PR target/67822] OpenMP offloading to nvptx fails (was: [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC) Thomas Schwinge
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  7:29 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 08:22:16PM +0300, Alexander Monakov wrote:
> This patch allows to meaningfully invoke mkoffload with -fopenmp.  The check
> for -fopenacc flag is specific to gomp4 branch: trunk does not have it.
> 
> 	* config/nvptx/mkoffload.c (main): Do not check for -fopenacc.
> ---
>  gcc/config/nvptx/mkoffload.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)

LGTM.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 3/8] libgomp: provide target-to-host fallback diagnostic
  2015-09-23 17:40 ` [gomp4 3/8] libgomp: provide target-to-host fallback diagnostic Alexander Monakov
@ 2015-09-24  7:33   ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  7:33 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 08:22:17PM +0300, Alexander Monakov wrote:
> This patch allows to see when target regions are executed on host with
> GOMP_DEBUG=1 in the environment.
> 
> 	* target.c (GOMP_target): Use gomp_debug on fallback path.
> ---
>  libgomp/target.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/libgomp/target.c b/libgomp/target.c
> index 6ca80ad..1cc2098 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -1008,6 +1008,7 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
>        || !(devicep->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400))
>      {
>        /* Host fallback.  */
> +      gomp_debug (0, "%s: target region executing on host\n", __FUNCTION__);
>        struct gomp_thread old_thr, *thr = gomp_thread ();
>        old_thr = *thr;
>        memset (thr, '\0', sizeof (*thr));

Ok.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 4/8] libgomp: minimal OpenMP support in plugin-nvptx.c
  2015-09-23 17:22 ` [gomp4 4/8] libgomp: minimal OpenMP support in plugin-nvptx.c Alexander Monakov
@ 2015-09-24  7:34   ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  7:34 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 08:22:18PM +0300, Alexander Monakov wrote:
> This is a minimal patch for NVPTX OpenMP offloading, using Jakub's initial
> implementation.  It allows to successfully run '#pragma omp target', without
> any parallel execution: 1 team of 1 thread is spawned on the device, and
> target regions with '#pragma omp parallel' will fail with a link error.
> 
> 	* plugin/plugin-nvptx.c (nvptx_host2dev): Allow NULL 'nvthd'.
>         (nvptx_dev2host): Ditto.
>         (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400.
>         (GOMP_OFFLOAD_run): New.

Ok.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 5/8] libgomp: provide sem.h, mutex.h, ptrlock.h on nvptx
  2015-09-23 17:43 ` [gomp4 5/8] libgomp: provide sem.h, mutex.h, ptrlock.h on nvptx Alexander Monakov
@ 2015-09-24  7:43   ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  7:43 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 08:22:19PM +0300, Alexander Monakov wrote:
> This patch provides minimal non-stub implementations for libgomp
> mutex/ptrlock/semaphore, using atomic ops and busy waiting.  The goal here is
> to at least provide stub struct declarations necessary to unbreak libgomp.h.
> 
> Atomics with busy waiting seems to be the only way to provide such primitives
> for inter-team synchronizations, but for intra-team ops a more efficient
> implementation may be possible.

I expect almost all the synchronization primitives can be just intra-team,
the only possible exception (though not required by the standard) would be
the locks used in atomic.c I'd say.  But I guess this is ok for now as the
first step.
> 
> (all functionality is unused since consumers are stubbed out in config/nvptx)
> 
> 	* config/nvptx/mutex.h: New file.
> 	* config/nvptx/ptrlock.h: New file.
> 	* config/nvptx/sem.h: New file.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 6/8] libgomp: provide stub bar.h on nvptx
  2015-09-23 17:22 ` [gomp4 6/8] libgomp: provide stub bar.h on nvptx Alexander Monakov
@ 2015-09-24  8:09   ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  8:09 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 08:22:20PM +0300, Alexander Monakov wrote:
> This stub header only provides empty struct gomp_barrier_t.  For now I've
> punted on providing a minimally-correct implementation.
> 
> 	* config/nvptx/bar.h: New file.
> ---
>  libgomp/config/nvptx/bar.h | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>  create mode 100644 libgomp/config/nvptx/bar.h

Ok (barrier is complicated by the need to handle explicit tasks and
cancellation), so it will not be just bar.sync insn alone (bet bar.arrive
followed by task handling/cancellation checking and finally bar.sync or
so?).

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx
  2015-09-23 17:22 ` [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx Alexander Monakov
@ 2015-09-24  8:15   ` Jakub Jelinek
  2015-09-24 13:25     ` Alexander Monakov
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  8:15 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 08:22:22PM +0300, Alexander Monakov wrote:
> This patch ports env.c to NVPTX.  It drops all environment parsing routines
> since there's no "environment" on the device.  For now, the useful effect of
> the patch is providing 'omp_is_initial_device' to distinguish host execution
> from target execution in user code.
> 
> Several functions use gomp_icv, which is not adjusted for NVPTX and thus will
> try to use EMUTLS.  The intended way forward is to provide a custom
> implementation of gomp_icv on NVPTX, likely via pre-allocating storage prior
> to spawning a team.
> 
> 	* config/nvptx/env.c: New file.

I don't like this, there is just too much code duplication in this case and
it is going to be a maintainance nightmare going forward (e.g.
gomp-4_1-branch adds further functions, etc.).
I'd suggest split the toplevel env.c into two files, icv.c which would
contain the global variables and most of the small API functions, and env.c
which would contain the global constructor, env var parsing, printing and
perhaps omp_is_initial_device ().  Then nvptx.c would use the toplevel icv.c
and provide its own env.c with just omp_is_initial_device () (which of
course eventually can be inlined by the compiler on NVPTX target or perhaps
any ACCEL_COMPILER, but we need to provide a library version anyway, you can
take address of the function etc.).

Are you ok with that?

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 7/8] libgomp: work around missing pthread_attr_t on nvptx
  2015-09-23 17:22 ` [gomp4 7/8] libgomp: work around missing pthread_attr_t " Alexander Monakov
@ 2015-09-24  8:15   ` Jakub Jelinek
  2015-09-24 15:33     ` Alexander Monakov
  0 siblings, 1 reply; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24  8:15 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Wed, Sep 23, 2015 at 08:22:21PM +0300, Alexander Monakov wrote:
> Although newlib headers define most pthreads types, pthread_attr_t is not
> available.  Macro-replace it by 'void' to keep the prototype of
> gomp_init_thread_affinity unchanged, and do not declare gomp_thread_attr.
> 
> 	* libgomp.h: Define pthread_attr_t to void on NVPTX.
> ---
>  libgomp/libgomp.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
> index d51b08b..f4255b4 100644
> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h
> @@ -510,8 +510,13 @@ static inline struct gomp_task_icv *gomp_icv (bool write)
>      return &gomp_global_icv;
>  }
>  
> +#ifdef __nvptx__
> +/* pthread_attr_t is not provided by newlib on NVPTX.  */
> +#define pthread_attr_t void
> +#else
>  /* The attributes to be used during thread creation.  */
>  extern pthread_attr_t gomp_thread_attr;
> +#endif
>  
>  /* Function prototypes.  */
>  

I'd prefer here the https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01418.html
changes to libgomp.h and associated configury changes.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx
  2015-09-24  8:15   ` Jakub Jelinek
@ 2015-09-24 13:25     ` Alexander Monakov
  2015-09-24 13:45       ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-24 13:25 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Arutyun Avetisyan

On Thu, 24 Sep 2015, Jakub Jelinek wrote:

> On Wed, Sep 23, 2015 at 08:22:22PM +0300, Alexander Monakov wrote:
> > This patch ports env.c to NVPTX.  It drops all environment parsing routines
> > since there's no "environment" on the device.  For now, the useful effect of
> > the patch is providing 'omp_is_initial_device' to distinguish host execution
> > from target execution in user code.
> > 
> > Several functions use gomp_icv, which is not adjusted for NVPTX and thus will
> > try to use EMUTLS.  The intended way forward is to provide a custom
> > implementation of gomp_icv on NVPTX, likely via pre-allocating storage prior
> > to spawning a team.
> > 
> > 	* config/nvptx/env.c: New file.
> 
> I don't like this, there is just too much code duplication in this case and
> it is going to be a maintainance nightmare going forward (e.g.
> gomp-4_1-branch adds further functions, etc.).
> I'd suggest split the toplevel env.c into two files, icv.c which would
> contain the global variables and most of the small API functions, and env.c
> which would contain the global constructor, env var parsing, printing and
> perhaps omp_is_initial_device ().  Then nvptx.c would use the toplevel icv.c
> and provide its own env.c with just omp_is_initial_device () (which of
> course eventually can be inlined by the compiler on NVPTX target or perhaps
> any ACCEL_COMPILER, but we need to provide a library version anyway, you can
> take address of the function etc.).
> 
> Are you ok with that?

Definitely, thanks for the suggestion!  While implementing that, I considered
that it should be more natural to keep only env processing in env.c, and split
device-related functionality in another file, icv-device.c.  That way, nvptx
can keep a zero-sized env.c, use generic icv.c, and provide its overrides in
icv-device.c.  If that's too fancy I can revert to your suggested approach.
How does the following patch look?


[gomp4] libgomp: split ICV functionality out of env.c

Split env.c, leaving only processing of environment variables in the original
file.  Move most of ICV definitions and associated API entrypoints into icv.c,
except target-related API entrypoints, which are moved into icv-device.c.  The
intention is to allow offload-only architectures to use the generic icv.c.

	* Makefile.am (libgomp_la_SOURCES): Add icv.c and icv-device.c.
        * Makefile.in: Regenerate.
        * env.c: Split out ICV definitions into...
        * icv.c: ...here (new file) and...
        * icv-device.c: ...here. New file.
---
 libgomp/Makefile.am  |  16 ++--
 libgomp/Makefile.in  |  34 +++++----
 libgomp/env.c        | 204 +--------------------------------------------------
 libgomp/icv-device.c |  78 ++++++++++++++++++++
 libgomp/icv.c        | 181 +++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 291 insertions(+), 222 deletions(-)
 create mode 100644 libgomp/icv-device.c
 create mode 100644 libgomp/icv.c

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 5411278..b3a09b0 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -58,12 +58,12 @@ libgomp_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
 libgomp_la_DEPENDENCIES = $(libgomp_version_dep)
 libgomp_la_LINK = $(LINK) $(libgomp_la_LDFLAGS)
 
-libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
-	iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c single.c \
-	task.c team.c work.c lock.c mutex.c proc.c sem.c bar.c ptrlock.c \
-	time.c fortran.c affinity.c target.c splay-tree.c libgomp-plugin.c \
-	oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c oacc-async.c \
-	oacc-plugin.c oacc-cuda.c
+libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c icv.c \
+	icv-device.c iter.c iter_ull.c loop.c loop_ull.c ordered.c parallel.c \
+	sections.c single.c task.c team.c work.c lock.c mutex.c proc.c sem.c \
+	bar.c ptrlock.c time.c fortran.c affinity.c target.c splay-tree.c \
+	libgomp-plugin.c oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c \
+	oacc-async.c oacc-plugin.c oacc-cuda.c
 
 include $(top_srcdir)/plugin/Makefrag.am
 
@@ -95,6 +95,10 @@ fortran.lo: libgomp_f.h
 fortran.o: libgomp_f.h
 env.lo: libgomp_f.h
 env.o: libgomp_f.h
+icv.lo: libgomp_f.h
+icv.o: libgomp_f.h
+icv-device.lo: libgomp_f.h
+icv-device.o: libgomp_f.h
 
 
 # Automake Documentation:
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 79745ce..e2e0e42 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -162,13 +162,13 @@ libgomp_plugin_nvptx_la_LINK = $(LIBTOOL) --tag=CC \
 libgomp_la_LIBADD =
 @USE_FORTRAN_TRUE@am__objects_1 = openacc.lo
 am_libgomp_la_OBJECTS = alloc.lo barrier.lo critical.lo env.lo \
-	error.lo iter.lo iter_ull.lo loop.lo loop_ull.lo ordered.lo \
-	parallel.lo sections.lo single.lo task.lo team.lo work.lo \
-	lock.lo mutex.lo proc.lo sem.lo bar.lo ptrlock.lo time.lo \
-	fortran.lo affinity.lo target.lo splay-tree.lo \
-	libgomp-plugin.lo oacc-parallel.lo oacc-host.lo oacc-init.lo \
-	oacc-mem.lo oacc-async.lo oacc-plugin.lo oacc-cuda.lo \
-	$(am__objects_1)
+	error.lo icv.lo icv-device.lo iter.lo iter_ull.lo loop.lo \
+	loop_ull.lo ordered.lo parallel.lo sections.lo single.lo \
+	task.lo team.lo work.lo lock.lo mutex.lo proc.lo sem.lo bar.lo \
+	ptrlock.lo time.lo fortran.lo affinity.lo target.lo \
+	splay-tree.lo libgomp-plugin.lo oacc-parallel.lo oacc-host.lo \
+	oacc-init.lo oacc-mem.lo oacc-async.lo oacc-plugin.lo \
+	oacc-cuda.lo $(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 DEFAULT_INCLUDES = -I.@am__isrc@
 depcomp = $(SHELL) $(top_srcdir)/../depcomp
@@ -409,13 +409,13 @@ libgomp_la_LDFLAGS = $(libgomp_version_info) $(libgomp_version_script) \
 
 libgomp_la_DEPENDENCIES = $(libgomp_version_dep)
 libgomp_la_LINK = $(LINK) $(libgomp_la_LDFLAGS)
-libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c iter.c \
-	iter_ull.c loop.c loop_ull.c ordered.c parallel.c sections.c \
-	single.c task.c team.c work.c lock.c mutex.c proc.c sem.c \
-	bar.c ptrlock.c time.c fortran.c affinity.c target.c \
-	splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c \
-	oacc-init.c oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c \
-	$(am__append_2)
+libgomp_la_SOURCES = alloc.c barrier.c critical.c env.c error.c icv.c \
+	icv-device.c iter.c iter_ull.c loop.c loop_ull.c ordered.c \
+	parallel.c sections.c single.c task.c team.c work.c lock.c \
+	mutex.c proc.c sem.c bar.c ptrlock.c time.c fortran.c \
+	affinity.c target.c splay-tree.c libgomp-plugin.c \
+	oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c \
+	oacc-async.c oacc-plugin.c oacc-cuda.c $(am__append_2)
 
 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
@@ -572,6 +572,8 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/env.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/error.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fortran.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/icv-device.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/icv.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter_ull.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp-plugin.Plo@am__quote@
@@ -1253,6 +1255,10 @@ fortran.lo: libgomp_f.h
 fortran.o: libgomp_f.h
 env.lo: libgomp_f.h
 env.o: libgomp_f.h
+icv.lo: libgomp_f.h
+icv.o: libgomp_f.h
+icv-device.lo: libgomp_f.h
+icv-device.o: libgomp_f.h
 
 all-local: $(STAMP_GENINSRC)
 
diff --git a/libgomp/env.c b/libgomp/env.c
index 6b5e963..2095a18 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -23,8 +23,8 @@
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-/* This file defines the OpenMP internal control variables, and arranges
-   for them to be initialized from environment variables at startup.  */
+/* This file arranges for OpenMP internal control variables to be initialized
+   from environment variables at startup.  */
 
 #include "libgomp.h"
 #include "libgomp_f.h"
@@ -54,34 +54,6 @@
 # define strtoull(ptr, eptr, base) strtoul (ptr, eptr, base)
 #endif
 
-struct gomp_task_icv gomp_global_icv = {
-  .nthreads_var = 1,
-  .thread_limit_var = UINT_MAX,
-  .run_sched_var = GFS_DYNAMIC,
-  .run_sched_modifier = 1,
-  .default_device_var = 0,
-  .dyn_var = false,
-  .nest_var = false,
-  .bind_var = omp_proc_bind_false,
-  .target_data = NULL
-};
-
-unsigned long gomp_max_active_levels_var = INT_MAX;
-bool gomp_cancel_var = false;
-#ifndef HAVE_SYNC_BUILTINS
-gomp_mutex_t gomp_managed_threads_lock;
-#endif
-unsigned long gomp_available_cpus = 1, gomp_managed_threads = 1;
-unsigned long long gomp_spin_count_var, gomp_throttled_spin_count_var;
-unsigned long *gomp_nthreads_var_list, gomp_nthreads_var_list_len;
-char *gomp_bind_var_list;
-unsigned long gomp_bind_var_list_len;
-void **gomp_places_list;
-unsigned long gomp_places_list_len;
-int gomp_debug_var;
-char *goacc_device_type;
-int goacc_device_num;
-
 /* Parse the OMP_SCHEDULE environment variable.  */
 
 static void
@@ -1297,175 +1269,3 @@ initialize_env (void)
 
   goacc_runtime_initialize ();
 }
-
-\f
-/* The public OpenMP API routines that access these variables.  */
-
-void
-omp_set_num_threads (int n)
-{
-  struct gomp_task_icv *icv = gomp_icv (true);
-  icv->nthreads_var = (n > 0 ? n : 1);
-}
-
-void
-omp_set_dynamic (int val)
-{
-  struct gomp_task_icv *icv = gomp_icv (true);
-  icv->dyn_var = val;
-}
-
-int
-omp_get_dynamic (void)
-{
-  struct gomp_task_icv *icv = gomp_icv (false);
-  return icv->dyn_var;
-}
-
-void
-omp_set_nested (int val)
-{
-  struct gomp_task_icv *icv = gomp_icv (true);
-  icv->nest_var = val;
-}
-
-int
-omp_get_nested (void)
-{
-  struct gomp_task_icv *icv = gomp_icv (false);
-  return icv->nest_var;
-}
-
-void
-omp_set_schedule (omp_sched_t kind, int modifier)
-{
-  struct gomp_task_icv *icv = gomp_icv (true);
-  switch (kind)
-    {
-    case omp_sched_static:
-      if (modifier < 1)
-	modifier = 0;
-      icv->run_sched_modifier = modifier;
-      break;
-    case omp_sched_dynamic:
-    case omp_sched_guided:
-      if (modifier < 1)
-	modifier = 1;
-      icv->run_sched_modifier = modifier;
-      break;
-    case omp_sched_auto:
-      break;
-    default:
-      return;
-    }
-  icv->run_sched_var = kind;
-}
-
-void
-omp_get_schedule (omp_sched_t *kind, int *modifier)
-{
-  struct gomp_task_icv *icv = gomp_icv (false);
-  *kind = icv->run_sched_var;
-  *modifier = icv->run_sched_modifier;
-}
-
-int
-omp_get_max_threads (void)
-{
-  struct gomp_task_icv *icv = gomp_icv (false);
-  return icv->nthreads_var;
-}
-
-int
-omp_get_thread_limit (void)
-{
-  struct gomp_task_icv *icv = gomp_icv (false);
-  return icv->thread_limit_var > INT_MAX ? INT_MAX : icv->thread_limit_var;
-}
-
-void
-omp_set_max_active_levels (int max_levels)
-{
-  if (max_levels >= 0)
-    gomp_max_active_levels_var = max_levels;
-}
-
-int
-omp_get_max_active_levels (void)
-{
-  return gomp_max_active_levels_var;
-}
-
-int
-omp_get_cancellation (void)
-{
-  return gomp_cancel_var;
-}
-
-omp_proc_bind_t
-omp_get_proc_bind (void)
-{
-  struct gomp_task_icv *icv = gomp_icv (false);
-  return icv->bind_var;
-}
-
-void
-omp_set_default_device (int device_num)
-{
-  struct gomp_task_icv *icv = gomp_icv (true);
-  icv->default_device_var = device_num >= 0 ? device_num : 0;
-}
-
-int
-omp_get_default_device (void)
-{
-  struct gomp_task_icv *icv = gomp_icv (false);
-  return icv->default_device_var;
-}
-
-int
-omp_get_num_devices (void)
-{
-  return gomp_get_num_devices ();
-}
-
-int
-omp_get_num_teams (void)
-{
-  /* Hardcoded to 1 on host, MIC, HSAIL?  Maybe variable on PTX.  */
-  return 1;
-}
-
-int
-omp_get_team_num (void)
-{
-  /* Hardcoded to 0 on host, MIC, HSAIL?  Maybe variable on PTX.  */
-  return 0;
-}
-
-int
-omp_is_initial_device (void)
-{
-  /* Hardcoded to 1 on host, should be 0 on MIC, HSAIL, PTX.  */
-  return 1;
-}
-
-ialias (omp_set_dynamic)
-ialias (omp_set_nested)
-ialias (omp_set_num_threads)
-ialias (omp_get_dynamic)
-ialias (omp_get_nested)
-ialias (omp_set_schedule)
-ialias (omp_get_schedule)
-ialias (omp_get_max_threads)
-ialias (omp_get_thread_limit)
-ialias (omp_set_max_active_levels)
-ialias (omp_get_max_active_levels)
-ialias (omp_get_cancellation)
-ialias (omp_get_proc_bind)
-ialias (omp_set_default_device)
-ialias (omp_get_default_device)
-ialias (omp_get_num_devices)
-ialias (omp_get_num_teams)
-ialias (omp_get_team_num)
-ialias (omp_is_initial_device)
diff --git a/libgomp/icv-device.c b/libgomp/icv-device.c
new file mode 100644
index 0000000..d598478
--- /dev/null
+++ b/libgomp/icv-device.c
@@ -0,0 +1,78 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by Richard Henderson <rth@redhat.com>.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file defines OpenMP API entry points that accelerator targets are
+   expected to replace.  */
+
+#include "libgomp.h"
+#include "libgomp_f.h"
+
+void
+omp_set_default_device (int device_num)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->default_device_var = device_num >= 0 ? device_num : 0;
+}
+
+int
+omp_get_default_device (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->default_device_var;
+}
+
+int
+omp_get_num_devices (void)
+{
+  return gomp_get_num_devices ();
+}
+
+int
+omp_get_num_teams (void)
+{
+  /* Hardcoded to 1 on host, MIC, HSAIL?  Maybe variable on PTX.  */
+  return 1;
+}
+
+int
+omp_get_team_num (void)
+{
+  /* Hardcoded to 0 on host, MIC, HSAIL?  Maybe variable on PTX.  */
+  return 0;
+}
+
+int
+omp_is_initial_device (void)
+{
+  /* Hardcoded to 1 on host, should be 0 on MIC, HSAIL, PTX.  */
+  return 1;
+}
+
+ialias (omp_set_default_device)
+ialias (omp_get_default_device)
+ialias (omp_get_num_devices)
+ialias (omp_get_num_teams)
+ialias (omp_get_team_num)
+ialias (omp_is_initial_device)
diff --git a/libgomp/icv.c b/libgomp/icv.c
new file mode 100644
index 0000000..6229cd2
--- /dev/null
+++ b/libgomp/icv.c
@@ -0,0 +1,181 @@
+/* Copyright (C) 2015 Free Software Foundation, Inc.
+   Contributed by Richard Henderson <rth@redhat.com>.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file defines the OpenMP internal control variables and associated
+   OpenMP API entry points.  */
+
+#include "libgomp.h"
+#include "libgomp_f.h"
+#include <limits.h>
+
+struct gomp_task_icv gomp_global_icv = {
+  .nthreads_var = 1,
+  .thread_limit_var = UINT_MAX,
+  .run_sched_var = GFS_DYNAMIC,
+  .run_sched_modifier = 1,
+  .default_device_var = 0,
+  .dyn_var = false,
+  .nest_var = false,
+  .bind_var = omp_proc_bind_false,
+  .target_data = NULL
+};
+
+unsigned long gomp_max_active_levels_var = INT_MAX;
+bool gomp_cancel_var = false;
+#ifndef HAVE_SYNC_BUILTINS
+gomp_mutex_t gomp_managed_threads_lock;
+#endif
+unsigned long gomp_available_cpus = 1, gomp_managed_threads = 1;
+unsigned long long gomp_spin_count_var, gomp_throttled_spin_count_var;
+unsigned long *gomp_nthreads_var_list, gomp_nthreads_var_list_len;
+char *gomp_bind_var_list;
+unsigned long gomp_bind_var_list_len;
+void **gomp_places_list;
+unsigned long gomp_places_list_len;
+int gomp_debug_var;
+char *goacc_device_type;
+int goacc_device_num;
+
+void
+omp_set_num_threads (int n)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->nthreads_var = (n > 0 ? n : 1);
+}
+
+void
+omp_set_dynamic (int val)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->dyn_var = val;
+}
+
+int
+omp_get_dynamic (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->dyn_var;
+}
+
+void
+omp_set_nested (int val)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->nest_var = val;
+}
+
+int
+omp_get_nested (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->nest_var;
+}
+
+void
+omp_set_schedule (omp_sched_t kind, int modifier)
+{
+  struct gomp_task_icv *icv = gomp_icv (true);
+  switch (kind)
+    {
+    case omp_sched_static:
+      if (modifier < 1)
+	modifier = 0;
+      icv->run_sched_modifier = modifier;
+      break;
+    case omp_sched_dynamic:
+    case omp_sched_guided:
+      if (modifier < 1)
+	modifier = 1;
+      icv->run_sched_modifier = modifier;
+      break;
+    case omp_sched_auto:
+      break;
+    default:
+      return;
+    }
+  icv->run_sched_var = kind;
+}
+
+void
+omp_get_schedule (omp_sched_t *kind, int *modifier)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  *kind = icv->run_sched_var;
+  *modifier = icv->run_sched_modifier;
+}
+
+int
+omp_get_max_threads (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->nthreads_var;
+}
+
+int
+omp_get_thread_limit (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->thread_limit_var > INT_MAX ? INT_MAX : icv->thread_limit_var;
+}
+
+void
+omp_set_max_active_levels (int max_levels)
+{
+  if (max_levels >= 0)
+    gomp_max_active_levels_var = max_levels;
+}
+
+int
+omp_get_max_active_levels (void)
+{
+  return gomp_max_active_levels_var;
+}
+
+int
+omp_get_cancellation (void)
+{
+  return gomp_cancel_var;
+}
+
+omp_proc_bind_t
+omp_get_proc_bind (void)
+{
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->bind_var;
+}
+
+ialias (omp_set_dynamic)
+ialias (omp_set_nested)
+ialias (omp_set_num_threads)
+ialias (omp_get_dynamic)
+ialias (omp_get_nested)
+ialias (omp_set_schedule)
+ialias (omp_get_schedule)
+ialias (omp_get_max_threads)
+ialias (omp_get_thread_limit)
+ialias (omp_set_max_active_levels)
+ialias (omp_get_max_active_levels)
+ialias (omp_get_cancellation)
+ialias (omp_get_proc_bind)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx
  2015-09-24 13:25     ` Alexander Monakov
@ 2015-09-24 13:45       ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24 13:45 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Thu, Sep 24, 2015 at 04:15:28PM +0300, Alexander Monakov wrote:
> Definitely, thanks for the suggestion!  While implementing that, I considered
> that it should be more natural to keep only env processing in env.c, and split
> device-related functionality in another file, icv-device.c.  That way, nvptx
> can keep a zero-sized env.c, use generic icv.c, and provide its overrides in
> icv-device.c.  If that's too fancy I can revert to your suggested approach.
> How does the following patch look?

icv-device.c looks reasonable.  Note, the wording is that it is UB if (some
of those) functions are called from target regions.  That means the routines
still should be defined somewhere, but can be just stubbed.

> [gomp4] libgomp: split ICV functionality out of env.c
> 
> Split env.c, leaving only processing of environment variables in the original
> file.  Move most of ICV definitions and associated API entrypoints into icv.c,
> except target-related API entrypoints, which are moved into icv-device.c.  The
> intention is to allow offload-only architectures to use the generic icv.c.
> 
> 	* Makefile.am (libgomp_la_SOURCES): Add icv.c and icv-device.c.
>         * Makefile.in: Regenerate.
>         * env.c: Split out ICV definitions into...
>         * icv.c: ...here (new file) and...
>         * icv-device.c: ...here. New file.

LGTM, except:

> @@ -95,6 +95,10 @@ fortran.lo: libgomp_f.h
>  fortran.o: libgomp_f.h
>  env.lo: libgomp_f.h
>  env.o: libgomp_f.h
> +icv.lo: libgomp_f.h
> +icv.o: libgomp_f.h
> +icv-device.lo: libgomp_f.h
> +icv-device.o: libgomp_f.h

You don't really want this, it is enough to include it in env.c only.

> +/* This file defines OpenMP API entry points that accelerator targets are
> +   expected to replace.  */
> +
> +#include "libgomp.h"
> +#include "libgomp_f.h"

And please leave out the libgomp_f.h include here.

> --- /dev/null
> +++ b/libgomp/icv.c
> @@ -0,0 +1,181 @@
> +/* Copyright (C) 2015 Free Software Foundation, Inc.

I'd say as the file is a copy of the source of env.c that originates back to
2005, it should be 2005-2015 (both files).

> +/* This file defines the OpenMP internal control variables and associated
> +   OpenMP API entry points.  */
> +
> +#include "libgomp.h"
> +#include "libgomp_f.h"

Like above.

Just to make sure, ChangeLog entries on the gomp-4*-branch branches go into
ChangeLog.gomp.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 0/8] NVPTX: initial OpenMP offloading
  2015-09-24  7:26     ` Jakub Jelinek
@ 2015-09-24 14:31       ` Nathan Sidwell
  0 siblings, 0 replies; 27+ messages in thread
From: Nathan Sidwell @ 2015-09-24 14:31 UTC (permalink / raw)
  To: Jakub Jelinek, Alexander Monakov
  Cc: Bernd Schmidt, gcc-patches, Arutyun Avetisyan

On 09/24/15 03:21, Jakub Jelinek wrote:

> So I'd like to ask Thomas/Nathan if they are ok with this stuff being on
> the gomp-4_0-branch for now, once all the prerequisities it needs are on the
> trunk, it can go into its own branch.

Let Thomas & I think about it.  Now that the new launch API is approved (working 
on the changes requested).  I expect to merge another chunk of ptx-specific bits 
regarding mkoffloads and friends next week[*].  That might solve the branch 
dependency problem Jakub discusses.


nathan

[*] I think they'll be patches I can self approve, given what they'll be affecting.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 7/8] libgomp: work around missing pthread_attr_t on nvptx
  2015-09-24  8:15   ` Jakub Jelinek
@ 2015-09-24 15:33     ` Alexander Monakov
  2015-09-24 16:13       ` Jakub Jelinek
  0 siblings, 1 reply; 27+ messages in thread
From: Alexander Monakov @ 2015-09-24 15:33 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Arutyun Avetisyan

> I'd prefer here the https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01418.html
> changes to libgomp.h and associated configury changes.

OK, like the following?

[gomp4] libgomp: guard pthreads usage by LIBGOMP_USE_PTHREADS

This allows to avoid referencing pthread types and functions on nvptx.

	* configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it...
        (LIBGOMP_USE_PTHREADS): ...here; new define.
        * configure: Regenerate.
        * config.h.in: Likewise.
        * libgomp.h: Guard pthread.h inclusion.
        (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS.
        (gomp_init_thread_affinity): Ditto.
---
 libgomp/config.h.in  | 3 +++
 libgomp/configure    | 7 +++++++
 libgomp/configure.ac | 6 ++++++
 libgomp/libgomp.h    | 6 ++++++
 4 files changed, 22 insertions(+)

diff --git a/libgomp/config.h.in b/libgomp/config.h.in
index 7685bfb..ba64fd7 100644
--- a/libgomp/config.h.in
+++ b/libgomp/config.h.in
@@ -91,6 +91,9 @@
 /* Define to 1 if GNU symbol versioning is used for libgomp. */
 #undef LIBGOMP_GNU_SYMBOL_VERSIONING
 
+/* Define to 1 if libgomp should use POSIX threads. */
+#undef LIBGOMP_USE_PTHREADS
+
 /* Define to the sub-directory in which libtool stores uninstalled libraries.
    */
 #undef LT_OBJDIR
diff --git a/libgomp/configure b/libgomp/configure
index 7407b4c..de87d4a 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -15043,6 +15043,7 @@ case "$host" in
     ;;
   nvptx*-*-*)
     # NVPTX does not support Pthreads, has its own code replacement.
+    libgomp_use_pthreads=no
     ;;
   *)
     # Check to see if -pthread or -lpthread is needed.  Prefer the former.
@@ -15088,6 +15089,12 @@ rm -f core conftest.err conftest.$ac_objext \
     conftest$ac_exeext conftest.$ac_ext
 esac
 
+if test x$libgomp_use_pthreads != xno; then
+
+$as_echo "#define LIBGOMP_USE_PTHREADS 1" >>confdefs.h
+
+fi
+
 # Plugins for offload execution, configure.ac fragment.  -*- mode: autoconf -*-
 #
 # Copyright (C) 2014-2015 Free Software Foundation, Inc.
diff --git a/libgomp/configure.ac b/libgomp/configure.ac
index b1696d0..3bce745 100644
--- a/libgomp/configure.ac
+++ b/libgomp/configure.ac
@@ -181,6 +181,7 @@ case "$host" in
     ;;
   nvptx*-*-*)
     # NVPTX does not support Pthreads, has its own code replacement.
+    libgomp_use_pthreads=no
     ;;
   *)
     # Check to see if -pthread or -lpthread is needed.  Prefer the former.
@@ -202,6 +203,11 @@ case "$host" in
        [AC_MSG_ERROR([Pthreads are required to build libgomp])])])
 esac
 
+if test x$libgomp_use_pthreads != xno; then
+  AC_DEFINE(LIBGOMP_USE_PTHREADS, 1,
+            [Define to 1 if libgomp should use POSIX threads.])
+fi
+
 m4_include([plugin/configfrag.ac])
 
 # Check for functions needed.
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index d51b08b..1454adf 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -40,7 +40,9 @@
 #include "gstdint.h"
 #include "libgomp-plugin.h"
 
+#ifdef HAVE_PTHREAD_H
 #include <pthread.h>
+#endif
 #include <stdbool.h>
 #include <stdlib.h>
 #include <stdarg.h>
@@ -510,15 +512,19 @@ static inline struct gomp_task_icv *gomp_icv (bool write)
     return &gomp_global_icv;
 }
 
+#ifdef LIBGOMP_USE_PTHREADS
 /* The attributes to be used during thread creation.  */
 extern pthread_attr_t gomp_thread_attr;
+#endif
 
 /* Function prototypes.  */
 
 /* affinity.c */
 
 extern void gomp_init_affinity (void);
+#ifdef LIBGOMP_USE_PTHREADS
 extern void gomp_init_thread_affinity (pthread_attr_t *, unsigned int);
+#endif
 extern void **gomp_affinity_alloc (unsigned long, bool);
 extern void gomp_affinity_init_place (void *);
 extern bool gomp_affinity_add_cpus (void *, unsigned long, unsigned long,
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [gomp4 7/8] libgomp: work around missing pthread_attr_t on nvptx
  2015-09-24 15:33     ` Alexander Monakov
@ 2015-09-24 16:13       ` Jakub Jelinek
  0 siblings, 0 replies; 27+ messages in thread
From: Jakub Jelinek @ 2015-09-24 16:13 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: gcc-patches, Arutyun Avetisyan

On Thu, Sep 24, 2015 at 06:18:10PM +0300, Alexander Monakov wrote:
> > I'd prefer here the https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01418.html
> > changes to libgomp.h and associated configury changes.
> 
> OK, like the following?
> 
> [gomp4] libgomp: guard pthreads usage by LIBGOMP_USE_PTHREADS
> 
> This allows to avoid referencing pthread types and functions on nvptx.
> 
> 	* configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it...
>         (LIBGOMP_USE_PTHREADS): ...here; new define.
>         * configure: Regenerate.
>         * config.h.in: Likewise.
>         * libgomp.h: Guard pthread.h inclusion.
>         (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS.
>         (gomp_init_thread_affinity): Ditto.

Yeah, thanks.

	Jakub

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PR target/67822] OpenMP offloading to nvptx fails (was: [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC)
  2015-09-24  7:29   ` Jakub Jelinek
@ 2015-10-02 19:29     ` Thomas Schwinge
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Schwinge @ 2015-10-02 19:29 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Alexander Monakov
  Cc: Arutyun Avetisyan, Bernd Schmidt, nathan

[-- Attachment #1: Type: text/plain, Size: 2967 bytes --]

Hi!

On Thu, 24 Sep 2015 09:25:54 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Sep 23, 2015 at 08:22:16PM +0300, Alexander Monakov wrote:
> > This patch allows to meaningfully invoke mkoffload with -fopenmp.  The check
> > for -fopenacc flag is specific to gomp4 branch: trunk does not have it.
> > 
> > 	* config/nvptx/mkoffload.c (main): Do not check for -fopenacc.
> 
> LGTM.

No; see <https://gcc.gnu.org/PR67822>.  I installed the following on
trunk in r228414 (as obvious), "inverting" the -fopenacc check that we
currently have on gomp-4_0-branch to instead skip generating an
offloading image if -fopenmp is specified.  You're welcome to work on the
PR, of course.

commit cba229716b3a369e96c71189d98a46ca3ada2717
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri Oct 2 19:27:30 2015 +0000

    [PR target/67822] OpenMP offloading to nvptx fails
    
    	gcc/
    	PR target/67822
    	* config/nvptx/mkoffload.c (main): Scan the argument vector for
    	-fopenmp, and skip generating an offloading image if specified.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@228414 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog                | 6 ++++++
 gcc/config/nvptx/mkoffload.c | 7 +++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index 3f7561a..1317a2d 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,9 @@
+2015-10-02  Thomas Schwinge  <thomas@codesourcery.com>
+
+	PR target/67822
+	* config/nvptx/mkoffload.c (main): Scan the argument vector for
+	-fopenmp, and skip generating an offloading image if specified.
+
 2015-10-02  Uros Bizjak  <ubizjak@gmail.com>
 
 	* system.h (ROUND_UP): New macro definition.
diff --git gcc/config/nvptx/mkoffload.c gcc/config/nvptx/mkoffload.c
index 926c82b..69eb4ea 100644
--- gcc/config/nvptx/mkoffload.c
+++ gcc/config/nvptx/mkoffload.c
@@ -1030,6 +1030,7 @@ main (int argc, char **argv)
   expandargv (&argc, &argv);
 
   /* Scan the argument vector.  */
+  bool fopenmp = false;
   for (int i = 1; i < argc; i++)
     {
 #define STR "-foffload-abi="
@@ -1044,6 +1045,8 @@ main (int argc, char **argv)
 			 "unrecognizable argument of option " STR);
 	}
 #undef STR
+      else if (strcmp (argv[i], "-fopenmp") == 0)
+	fopenmp = true;
       else if (strcmp (argv[i], "-v") == 0)
 	verbose = true;
     }
@@ -1082,8 +1085,8 @@ main (int argc, char **argv)
     fatal_error (input_location, "cannot open '%s'", ptx_cfile_name);
 
   /* PR libgomp/65099: Currently, we only support offloading in 64-bit
-     configurations.  */
-  if (offload_abi == OFFLOAD_ABI_LP64)
+     configurations.  PR target/67822: OpenMP offloading to nvptx fails.  */
+  if (offload_abi == OFFLOAD_ABI_LP64 && !fopenmp)
     {
       ptx_name = make_temp_file (".mkoffload");
       obstack_ptr_grow (&argv_obstack, "-o");


Grüße,
 Thomas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2015-10-02 19:29 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-23 17:43 [gomp4 0/8] NVPTX: initial OpenMP offloading Alexander Monakov
2015-09-23 17:22 ` [gomp4 8/8] libgomp: provide ICVs via env.c on nvptx Alexander Monakov
2015-09-24  8:15   ` Jakub Jelinek
2015-09-24 13:25     ` Alexander Monakov
2015-09-24 13:45       ` Jakub Jelinek
2015-09-23 17:22 ` [gomp4 4/8] libgomp: minimal OpenMP support in plugin-nvptx.c Alexander Monakov
2015-09-24  7:34   ` Jakub Jelinek
2015-09-23 17:22 ` [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC Alexander Monakov
2015-09-24  7:29   ` Jakub Jelinek
2015-10-02 19:29     ` [PR target/67822] OpenMP offloading to nvptx fails (was: [gomp4 2/8] nvptx mkoffload: do not restrict to OpenACC) Thomas Schwinge
2015-09-23 17:22 ` [gomp4 6/8] libgomp: provide stub bar.h on nvptx Alexander Monakov
2015-09-24  8:09   ` Jakub Jelinek
2015-09-23 17:22 ` [gomp4 7/8] libgomp: work around missing pthread_attr_t " Alexander Monakov
2015-09-24  8:15   ` Jakub Jelinek
2015-09-24 15:33     ` Alexander Monakov
2015-09-24 16:13       ` Jakub Jelinek
2015-09-23 17:22 ` [gomp4 1/8] nvptx: remove assumption of OpenACC attrs presence Alexander Monakov
2015-09-24  7:26   ` Jakub Jelinek
2015-09-24  7:27     ` Jakub Jelinek
2015-09-23 17:40 ` [gomp4 3/8] libgomp: provide target-to-host fallback diagnostic Alexander Monakov
2015-09-24  7:33   ` Jakub Jelinek
2015-09-23 17:43 ` [gomp4 5/8] libgomp: provide sem.h, mutex.h, ptrlock.h on nvptx Alexander Monakov
2015-09-24  7:43   ` Jakub Jelinek
2015-09-23 18:44 ` [gomp4 0/8] NVPTX: initial OpenMP offloading Bernd Schmidt
2015-09-23 21:44   ` Alexander Monakov
2015-09-24  7:26     ` Jakub Jelinek
2015-09-24 14:31       ` Nathan Sidwell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).