public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [gomp4] acc_on_device
@ 2015-10-29 20:28 Nathan Sidwell
  2015-10-30  8:40 ` Thomas Schwinge
  2016-01-04 18:15 ` [gomp4] Fix acc_on_device for C++ Nathan Sidwell
  0 siblings, 2 replies; 13+ messages in thread
From: Nathan Sidwell @ 2015-10-29 20:28 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1158 bytes --]

I've  committed this to gomp4 branch.  It resolves a problem with 
builtin_acc_on_device and C++.  The builtin's arg is  an int, but the 
acc_on_device fn shoud take enum acc_device_t.  In C++ a prototype
  int Foo (enum X);
fails to match up with a builtin of type
  int Foo (int);

We'd worked around this on gomp4 by making the fn decl in openacc.h take an int. 
  This patch resolves things in a different manner, preserving the expected type 
in the header file.

For C, we simply declare it as having enum type, and C matches it up with the 
builtin.   For C++ we declare it as taking an int,  which then matches the 
builtin too.  We also provide an inline forwarding function, taking the enum type.

Because I;m paranoid, I added entries to the enum, to ensure it's layout 
compatible with int.

The test cases in the gcc testsuite were hiding the problem by providing part of 
openacc.h in the test directory, and this  had diverged from the openacc.h we 
actually have.  I deleted those tests and inserted one in the libgomp testsuite, 
which correctly picks up the openacc.h of the tool under test, (rather than one 
in system includes).


nathan

[-- Attachment #2: gomp4-accondev.patch --]
[-- Type: text/x-patch, Size: 7750 bytes --]

2015-10-29  Nathan Sidwell  <nathan@codesourcery.com>

	libgomp/
	* openacc.h (enum acc_device_t): Ensure layout compatibility.
	(acc_on_device): Declare compatible with builtin and provide C++
	wrapper.
	* oacc-init.c (acc_on_device): Change arg type.
	* config/nvptx/oacc-init.c (acc_on_device): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: New.

	gcc/testsuite/
	* c-c++-common/goacc/acc_on_device-2-off.c: Delete.
	* c-c++-common/goacc/acc_on_device-2.c: Delete.
	* c-c++-common/goacc/kernels-acc-on-device-2.c: Delete.
	* c-c++-common/goacc/kernels-acc-on-device.c: Delete.
	* c-c++-common/goacc/openacc.h: Delete.

Index: libgomp/oacc-init.c
===================================================================
--- libgomp/oacc-init.c	(revision 229502)
+++ libgomp/oacc-init.c	(working copy)
@@ -646,7 +646,7 @@ ialias (acc_set_device_num)
    this, rather than generating infinitely recursive code.  */
 
 int __attribute__ ((__optimize__ ("O2")))
-acc_on_device (int dev)
+acc_on_device (acc_device_t dev)
 {
   return __builtin_acc_on_device (dev);
 }
Index: libgomp/openacc.h
===================================================================
--- libgomp/openacc.h	(revision 229502)
+++ libgomp/openacc.h	(working copy)
@@ -56,7 +56,10 @@ typedef enum acc_device_t
     /* acc_device_host_nonshm = 3 removed.  */
     acc_device_not_host = 4,
     acc_device_nvidia = 5,
-    _ACC_device_hwm
+    _ACC_device_hwm,
+    /* Ensure enumeration is layout compatible with int.  */
+    _ACC_highest = __INT_MAX__,
+    _ACC_neg = -1
   } acc_device_t;
 
 typedef enum acc_async_t
@@ -79,11 +82,11 @@ void acc_wait_all (void) __GOACC_NOTHROW
 void acc_wait_all_async (int) __GOACC_NOTHROW;
 void acc_init (acc_device_t) __GOACC_NOTHROW;
 void acc_shutdown (acc_device_t) __GOACC_NOTHROW;
-/* Library function declaration.  Although it should take an
-   acc_device_t argument, that causes problems with matching the
-   builtin, which takes an int (to avoid declaring the enumeration
-   inside the compiler).  */
-int acc_on_device (int) __GOACC_NOTHROW;
+#ifdef __cplusplus
+int acc_on_device (int __arg) __GOACC_NOTHROW;
+#else
+int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW;
+#endif
 void *acc_malloc (size_t) __GOACC_NOTHROW;
 void acc_free (void *) __GOACC_NOTHROW;
 /* Some of these would be more correct with const qualifiers, but
@@ -117,6 +120,10 @@ int acc_set_cuda_stream (int, void *) __
 
 #ifdef __cplusplus
 }
+inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
+{
+  return acc_on_device ((int) __arg);
+}
 #endif
 
 #endif /* _OPENACC_H */
Index: libgomp/config/nvptx/oacc-init.c
===================================================================
--- libgomp/config/nvptx/oacc-init.c	(revision 229502)
+++ libgomp/config/nvptx/oacc-init.c	(working copy)
@@ -36,7 +36,7 @@
    this, rather than generating infinitely recursive code.  */
 
 int __attribute__ ((__optimize__ ("O2")))
-acc_on_device (int dev)
+acc_on_device (acc_device_t dev)
 {
   return __builtin_acc_on_device (dev);
 }
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+#include <openacc.h>
+
+int Foo (acc_device_t x)
+{
+  return acc_on_device (x);
+}
+
+/* { dg-final { scan-assembler-not "acc_on_device" } } */
Index: gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c	(revision 229535)
+++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c	(working copy)
@@ -1,17 +0,0 @@
-/* Have to enable optimizations, as otherwise builtins won't be expanded.  */
-/* { dg-additional-options "-O -fdump-rtl-expand -fno-openacc" } */
-
-/* Duplicate parts of libgomp/openacc.h, because we can't include it here.  */
-
-#include "openacc.h"
-
-int
-f (void)
-{
-  const acc_device_t dev = acc_device_X;
-  return acc_on_device (dev);
-}
-
-/* Without -fopenacc, we're expecting one call.
-   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 1 "expand" } } */
-
Index: gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c	(revision 229535)
+++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c	(working copy)
@@ -1,19 +0,0 @@
-/* Have to enable optimizations, as otherwise builtins won't be expanded.  */
-/* { dg-additional-options "-O -fdump-rtl-expand" } */
-
-#include "openacc.h"
-
-int
-f (void)
-{
-  const acc_device_t dev = acc_device_X;
-  return acc_on_device (dev);
-}
-
-/* With -fopenacc, we're expecting the builtin to be expanded, so no calls.
-   TODO: in C++, the use of enum acc_device_t for acc_on_device's parameter
-   perturbs expansion as a builtin, which expects an int parameter.  It's fine
-   when changing acc_device_t to plain int, but that's not necessarily what a
-   user will be doing.
-
-   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 0 "expand" { xfail c++ } } } */
Index: gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device-2.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device-2.c	(revision 229535)
+++ gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device-2.c	(working copy)
@@ -1,37 +0,0 @@
-/* { dg-additional-options "-O2" } */
-
-#include "openacc.h"
-
-#define N 32
-
-void
-foo (float *a, float *b)
-{
-#pragma acc kernels copyin(a[0:N]) copyout(b[0:N])
-  {
-    int ii;
-    int on_host = acc_on_device (acc_device_X);
-
-    for (ii = 0; ii < N; ii++)
-      {
-	if (on_host)
-	  b[ii] = a[ii] + 1;
-	else
-	  b[ii] = a[ii];
-      }
-  }
-
-#pragma acc kernels copyin(a[0:N]) copyout(b[0:N])
-  {
-    int ii;
-    int on_host = acc_on_device (acc_device_X);
-
-    for (ii = 0; ii < N; ii++)
-      {
-	if (on_host)
-	  b[ii] = a[ii] + 2;
-	else
-	  b[ii] = a[ii];
-      }
-  }
-}
Index: gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device.c	(revision 229535)
+++ gcc/testsuite/c-c++-common/goacc/kernels-acc-on-device.c	(working copy)
@@ -1,35 +0,0 @@
-/* { dg-additional-options "-O2" } */
-
-#include "openacc.h"
-
-#define N 32
-
-void
-foo (float *a, float *b)
-{
-#pragma acc kernels copyin(a[0:N]) copyout(b[0:N])
-  {
-    int ii;
-
-    for (ii = 0; ii < N; ii++)
-      {
-	if (acc_on_device (acc_device_X))
-	  b[ii] = a[ii] + 1;
-	else
-	  b[ii] = a[ii];
-      }
-  }
-
-#pragma acc kernels copyin(a[0:N]) copyout(b[0:N])
-  {
-    int ii;
-
-    for (ii = 0; ii < N; ii++)
-      {
-	if (acc_on_device (acc_device_X))
-	  b[ii] = a[ii] + 2;
-	else
-	  b[ii] = a[ii];
-      }
-  }
-}
Index: gcc/testsuite/c-c++-common/goacc/openacc.h
===================================================================
--- gcc/testsuite/c-c++-common/goacc/openacc.h	(revision 229535)
+++ gcc/testsuite/c-c++-common/goacc/openacc.h	(working copy)
@@ -1,18 +0,0 @@
-#if __cplusplus
-extern "C" {
-#endif
-
-#if __cplusplus >= 201103
-# define __GOACC_NOTHROW noexcept
-#elif __cplusplus
-# define __GOACC_NOTHROW throw ()
-#else /* Not C++ */
-# define __GOACC_NOTHROW __attribute__ ((__nothrow__))
-#endif
-
-typedef enum acc_device_t { acc_device_X = 123 } acc_device_t;
-int acc_on_device (int) __GOACC_NOTHROW;
-
-#if __cplusplus
-}
-#endif

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [openacc] on_device fix
@ 2015-10-30  0:27     ` Nathan Sidwell
  2017-05-23 15:34       ` Make the OpenACC C++ acc_on_device wrapper "always inline" (was: [openacc] on_device fix) Thomas Schwinge
  0 siblings, 1 reply; 13+ messages in thread
From: Nathan Sidwell @ 2015-10-30  0:27 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1927 bytes --]

acc_on_device and it's builtin had a conflict.  The function formally takes an 
enum argument, but the builtin takes an int -- primarily to avoid the compiler 
having to generate the enum  type internally.

This works fine for C,  where the external declaration of the function (in 
openacc.h) matches up with the builtin, and we optimize the builtin as expected.

It fails for C++ where the builtin doesn't match the declaration in the header. 
  We end up with emitting a call to acc_on_device,  which is resolved by 
libgomp.  Unfortunately that means we fail to optimize.

We could resolve this in a number of ways

1) make the header file's declaration have an int argument.

2) make the header file have an inline definition  fowarding to a differently 
named function with an int argument, that matched a renamed builtin

3) Do what this patch does.

Option 1 would be visible in the type system, if someone took the address of the 
function (I'm not sure why they'd do that).  We used this variant on the gomp4 
branch for a long time.

Option  2 requires changing the symbols exported from libgomp.   Instead of 
exporting acc_on_device we'd need to export __acc_on_device or something.  And 
we'd need to provide a backwards compatible entry point named acc_on_device anyway.

Option 3 leaves things unchanged for C --  declare a function with an enum arg. 
  But for C++ we the extern "C" declaration takes an int -- and therefore 
matches the builtin.  We insert an inline wrapper that takes an enum argument. 
Because of C++'s overload resolution both the wrapper and the int-taking 
declaration can have the same source name.

We require the enum to be layout compatible  with int -- this was an artifact of 
the earlier implementation anyway.  I added enumeration values to acc_device_t 
to enforce that, just in case someone tries to compile their openacc code with 
-fshort-enums.

Committed to trunk.

nathan

[-- Attachment #2: trunk-ondev.patch --]
[-- Type: text/x-patch, Size: 5146 bytes --]

2015-10-29  Nathan Sidwell  <nathan@codesourcery.com>

	gcc/
	* openacc.h (enum acc_device_t): Reformat. Ensure layout
	compatibility.
	(enum acc_async_t): Reformat.
	(acc_on_device): Declare compatible with builtin and provide C++
	wrapper.
	* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: New.

	gcc/testsuite/
	* c-c++-common/goacc/acc_on_device-2-off.c: Delete.
	* c-c++-common/goacc/acc_on_device-2.c: Delete.

Index: libgomp/openacc.h
===================================================================
--- libgomp/openacc.h	(revision 229535)
+++ libgomp/openacc.h	(working copy)
@@ -47,24 +47,25 @@ extern "C" {
 #endif
 
 /* Types */
-typedef enum acc_device_t
-  {
-    /* Keep in sync with include/gomp-constants.h.  */
-    acc_device_none = 0,
-    acc_device_default = 1,
-    acc_device_host = 2,
-    /* acc_device_host_nonshm = 3 removed.  */
-    acc_device_not_host = 4,
-    acc_device_nvidia = 5,
-    _ACC_device_hwm
-  } acc_device_t;
-
-typedef enum acc_async_t
-  {
-    /* Keep in sync with include/gomp-constants.h.  */
-    acc_async_noval = -1,
-    acc_async_sync  = -2
-  } acc_async_t;
+typedef enum acc_device_t {
+  /* Keep in sync with include/gomp-constants.h.  */
+  acc_device_none = 0,
+  acc_device_default = 1,
+  acc_device_host = 2,
+  /* acc_device_host_nonshm = 3 removed.  */
+  acc_device_not_host = 4,
+  acc_device_nvidia = 5,
+  _ACC_device_hwm,
+  /* Ensure enumeration is layout compatible with int.  */
+  _ACC_highest = __INT_MAX__,
+  _ACC_neg = -1
+} acc_device_t;
+
+typedef enum acc_async_t {
+  /* Keep in sync with include/gomp-constants.h.  */
+  acc_async_noval = -1,
+  acc_async_sync  = -2
+} acc_async_t;
 
 int acc_get_num_devices (acc_device_t) __GOACC_NOTHROW;
 void acc_set_device_type (acc_device_t) __GOACC_NOTHROW;
@@ -79,7 +80,11 @@ void acc_wait_all (void) __GOACC_NOTHROW
 void acc_wait_all_async (int) __GOACC_NOTHROW;
 void acc_init (acc_device_t) __GOACC_NOTHROW;
 void acc_shutdown (acc_device_t) __GOACC_NOTHROW;
-int acc_on_device (acc_device_t) __GOACC_NOTHROW;
+#ifdef __cplusplus
+int acc_on_device (int __arg) __GOACC_NOTHROW;
+#else
+int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW;
+#endif
 void *acc_malloc (size_t) __GOACC_NOTHROW;
 void acc_free (void *) __GOACC_NOTHROW;
 /* Some of these would be more correct with const qualifiers, but
@@ -113,6 +118,13 @@ int acc_set_cuda_stream (int, void *) __
 
 #ifdef __cplusplus
 }
+
+/* Forwarding function with correctly typed arg.  */
+
+inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
+{
+  return acc_on_device ((int) __arg);
+}
 #endif
 
 #endif /* _OPENACC_H */
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2" } */
+
+#include <openacc.h>
+
+int Foo (acc_device_t x)
+{
+  return acc_on_device (x);
+}
+
+/* { dg-final { scan-assembler-not "acc_on_device" } } */
Index: gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c	(revision 229535)
+++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2-off.c	(working copy)
@@ -1,24 +0,0 @@
-/* Have to enable optimizations, as otherwise builtins won't be expanded.  */
-/* { dg-additional-options "-O -fdump-rtl-expand -fno-openacc" } */
-
-#if __cplusplus
-extern "C" {
-#endif
-
-typedef enum acc_device_t { acc_device_X = 123 } acc_device_t;
-extern int acc_on_device (acc_device_t);
-
-#if __cplusplus
-}
-#endif
-
-int
-f (void)
-{
-  const acc_device_t dev = acc_device_X;
-  return acc_on_device (dev);
-}
-
-/* Without -fopenacc, we're expecting one call.
-   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 1 "expand" } } */
-
Index: gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c	(revision 229535)
+++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c	(working copy)
@@ -1,28 +0,0 @@
-/* Have to enable optimizations, as otherwise builtins won't be expanded.  */
-/* { dg-additional-options "-O -fdump-rtl-expand" } */
-
-#if __cplusplus
-extern "C" {
-#endif
-
-typedef enum acc_device_t { acc_device_X = 123 } acc_device_t;
-extern int acc_on_device (acc_device_t);
-
-#if __cplusplus
-}
-#endif
-
-int
-f (void)
-{
-  const acc_device_t dev = acc_device_X;
-  return acc_on_device (dev);
-}
-
-/* With -fopenacc, we're expecting the builtin to be expanded, so no calls.
-   TODO: in C++, even under extern "C", the use of enum for acc_device_t
-   perturbs expansion as a builtin, which expects an int parameter.  It's fine
-   when changing acc_device_t to plain int, but that's not what we're doing in
-   <openacc.h>.
-
-   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 0 "expand" { xfail c++ } } } */

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [gomp4] acc_on_device
  2015-10-29 20:28 [gomp4] acc_on_device Nathan Sidwell
@ 2015-10-30  8:40 ` Thomas Schwinge
  2015-10-30  8:54   ` Thomas Schwinge
  2016-01-04 18:15 ` [gomp4] Fix acc_on_device for C++ Nathan Sidwell
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas Schwinge @ 2015-10-30  8:40 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 2649 bytes --]

Hi!

On Thu, 29 Oct 2015 13:15:13 -0700, Nathan Sidwell <nathan@acm.org> wrote:
> I've  committed this to gomp4 branch.  It resolves a problem with 
> builtin_acc_on_device and C++.

> The test cases in the gcc testsuite were hiding the problem by providing part of 
> openacc.h in the test directory, and this  had diverged from the openacc.h we 
> actually have.  I deleted those tests and inserted one in the libgomp testsuite, 
> which correctly picks up the openacc.h of the tool under test, (rather than one 
> in system includes).

The idea had been to test the compiler handling of the acc_on_device
builtin in the compiler testsuite, but yes, having to duplicate parts of
openacc.h was ugly.


> --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(revision 0)
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(working copy)

In r229568 committed to gomp-4_0-branch as obvious:

commit e2c1427d60ffcc9183fbd5a0996dfe98c7219dc5
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri Oct 30 08:29:54 2015 +0000

    De-duplicate testsuite file
    
    	libgomp/
    	* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c:
    	De-duplicate file.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229568 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp                                      |  5 +++++
 libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c | 12 ------------
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index ddbcdee..89f57ef 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2015-10-30  Thomas Schwinge  <thomas@codesourcery.com>
+
+	* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c:
+	De-duplicate file.
+
 2015-10-29  Nathan Sidwell  <nathan@codesourcery.com>
 
 	* openacc.h (enum acc_device_t): Ensure layout compatibility.
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
index 0f73aeb..c1eed0e 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
@@ -1,15 +1,3 @@
-/* { dg-do compile } */
-/* { dg-additional-options "-O2" } */
-
-#include <openacc.h>
-
-int Foo (acc_device_t x)
-{
-  return acc_on_device (x);
-}
-
-/* { dg-final { scan-assembler-not "acc_on_device" } } */
-/* { dg-do compile } */
 /* { dg-additional-options "-O2" } */
 
 #include <openacc.h>


Grüße
 Thomas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [gomp4] acc_on_device
  2015-10-30  8:40 ` Thomas Schwinge
@ 2015-10-30  8:54   ` Thomas Schwinge
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Schwinge @ 2015-10-30  8:54 UTC (permalink / raw)
  To: GCC Patches; +Cc: Nathan Sidwell

[-- Attachment #1: Type: text/plain, Size: 2018 bytes --]

Hi!

On Fri, 30 Oct 2015 09:31:52 +0100, I wrote:
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(revision 0)
> > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(working copy)
> 
> In r229568 committed to gomp-4_0-branch as obvious:
> 
> commit e2c1427d60ffcc9183fbd5a0996dfe98c7219dc5
> Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date:   Fri Oct 30 08:29:54 2015 +0000
> 
>     De-duplicate testsuite file

Chopped too much; in r229570 committed to gomp-4_0-branch as obvious:

commit 07e6f70f45dc4bbe343a972bce05aee8e0897e2e
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri Oct 30 08:39:02 2015 +0000

    De-duplicate testsuite file: restore dg-do compile directive
    
    	* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Restore
    	dg-do compile directive.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@229570 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp                                      | 3 +++
 libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c | 1 +
 2 files changed, 4 insertions(+)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 89f57ef..ba33e02 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,8 @@
 2015-10-30  Thomas Schwinge  <thomas@codesourcery.com>
 
+	* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Restore
+	dg-do compile directive.
+
 	* testsuite/libgomp.oacc-c-c++-common/acc-on-device.c:
 	De-duplicate file.
 
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
index c1eed0e..88c000e 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
@@ -1,3 +1,4 @@
+/* { dg-do compile } */
 /* { dg-additional-options "-O2" } */
 
 #include <openacc.h>


Grüße
 Thomas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [gomp4] Fix acc_on_device for C++
@ 2016-01-04 18:15 ` Nathan Sidwell
  2016-01-06 15:48   ` [openacc] fix unoptimized acc_on_device Nathan Sidwell
  0 siblings, 1 reply; 13+ messages in thread
From: Nathan Sidwell @ 2016-01-04 18:15 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 187 bytes --]

This patch fixes acc_on_device's C++ wrapper when compiling at -O0.  The wrapper 
isn't inlined, and we need to mark the function as needing emission by the 
device compiler too.

nathan

[-- Attachment #2: gomp4-ondev.patch --]
[-- Type: text/x-patch, Size: 1287 bytes --]

2016-01-04  Nathan Sidwell  <nathan@codesourcery.com>

	* openacc.c (acc_on_device): Add routine pragma for C++ wrapper.
	* testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: New.

Index: libgomp/openacc.h
===================================================================
--- libgomp/openacc.h	(revision 232058)
+++ libgomp/openacc.h	(working copy)
@@ -121,6 +121,7 @@ int acc_set_cuda_stream (int, void *) __
 
 /* Forwarding function with correctly typed arg.  */
 
+#pragma acc routine seq
 inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
 {
   return acc_on_device ((int) __arg);
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c	(working copy)
@@ -0,0 +1,23 @@
+/* { dg-additional-options "-O0" } */
+
+#include <openacc.h>
+
+/* acc_on_device might not be folded at -O0, but it should work. */
+
+int main ()
+{
+  int dev;
+  
+#pragma acc parallel copyout (dev)
+  {
+    dev = acc_on_device (acc_device_not_host);
+  }
+
+  int expect = 1;
+  
+#if  ACC_DEVICE_TYPE_host
+  expect = 0;
+#endif
+  
+  return dev != expect;
+}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [openacc] fix unoptimized acc_on_device
@ 2016-01-06 15:48   ` Nathan Sidwell
  2015-10-30  0:27     ` [openacc] on_device fix Nathan Sidwell
  2016-01-06 19:04     ` [openacc] fix unoptimized acc_on_device Jakub Jelinek
  0 siblings, 2 replies; 13+ messages in thread
From: Nathan Sidwell @ 2016-01-06 15:48 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 255 bytes --]

I've committed this to trunk.  C++ needs a wrapper function to deal with the 
enumerated type argument.   Usually that's inlined, but when not optimizing we 
need to emit a definition of the wrapper.  That means marking it as an openacc 
routine.

nathan

[-- Attachment #2: trunk-on-dev.patch --]
[-- Type: text/x-patch, Size: 1230 bytes --]

2016-01-06  Nathan Sidwell  <nathan@acm.org>

	* openacc.c (acc_on_device): Add routine pragma for C++ wrapper.
	* testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: New.

Index: openacc.h
===================================================================
--- openacc.h	(revision 232103)
+++ openacc.h	(working copy)
@@ -121,6 +121,7 @@ int acc_set_cuda_stream (int, void *) __
 
 /* Forwarding function with correctly typed arg.  */
 
+#pragma acc routine seq
 inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
 {
   return acc_on_device ((int) __arg);
Index: testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
===================================================================
--- testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c	(revision 0)
+++ testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c	(working copy)
@@ -0,0 +1,23 @@
+/* { dg-additional-options "-O0" } */
+
+#include <openacc.h>
+
+/* acc_on_device might not be folded at -O0, but it should work. */
+
+int main ()
+{
+  int dev;
+  
+#pragma acc parallel copyout (dev)
+  {
+    dev = acc_on_device (acc_device_not_host);
+  }
+
+  int expect = 1;
+  
+#if  ACC_DEVICE_TYPE_host
+  expect = 0;
+#endif
+  
+  return dev != expect;
+}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [openacc] fix unoptimized acc_on_device
  2016-01-06 15:48   ` [openacc] fix unoptimized acc_on_device Nathan Sidwell
  2015-10-30  0:27     ` [openacc] on_device fix Nathan Sidwell
@ 2016-01-06 19:04     ` Jakub Jelinek
  1 sibling, 0 replies; 13+ messages in thread
From: Jakub Jelinek @ 2016-01-06 19:04 UTC (permalink / raw)
  To: Nathan Sidwell; +Cc: GCC Patches

On Wed, Jan 06, 2016 at 10:48:52AM -0500, Nathan Sidwell wrote:
> I've committed this to trunk.  C++ needs a wrapper function to deal with the
> enumerated type argument.   Usually that's inlined, but when not optimizing
> we need to emit a definition of the wrapper.  That means marking it as an
> openacc routine.
> 
> nathan

> 2016-01-06  Nathan Sidwell  <nathan@acm.org>
> 
> 	* openacc.c (acc_on_device): Add routine pragma for C++ wrapper.

s/\.c/\.h/

> 	* testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: New.

Otherwise LGTM.

	Jakub

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Make the OpenACC C++ acc_on_device wrapper "always inline" (was: [openacc] on_device fix)
  2015-10-30  0:27     ` [openacc] on_device fix Nathan Sidwell
@ 2017-05-23 15:34       ` Thomas Schwinge
  2017-05-30 12:36         ` [PING] Make the OpenACC C++ acc_on_device wrapper "always inline" Thomas Schwinge
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Schwinge @ 2017-05-23 15:34 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches, Jakub Jelinek

Hi!

On Thu, 29 Oct 2015 17:22:46 -0700, Nathan Sidwell <nathan@acm.org> wrote:
> acc_on_device and it's builtin had a conflict.  The function formally takes an 
> enum argument, but the builtin takes an int -- primarily to avoid the compiler 
> having to generate the enum  type internally.
> 
> This works fine for C,  where the external declaration of the function (in 
> openacc.h) matches up with the builtin, and we optimize the builtin as expected.
> 
> It fails for C++ where the builtin doesn't match the declaration in the header. 
>   We end up with emitting a call to acc_on_device,  which is resolved by 
> libgomp.  Unfortunately that means we fail to optimize.  [...]

> [Nathan's trunk r229562] leaves things unchanged for C --  declare a function with an enum arg. 
>   But for C++ we the extern "C" declaration takes an int -- and therefore 
> matches the builtin.  We insert an inline wrapper that takes an enum argument. 
> Because of C++'s overload resolution both the wrapper and the int-taking 
> declaration can have the same source name.

> --- libgomp/openacc.h	(revision 229535)
> +++ libgomp/openacc.h	(working copy)

> -int acc_on_device (acc_device_t) __GOACC_NOTHROW;
> +#ifdef __cplusplus
> +int acc_on_device (int __arg) __GOACC_NOTHROW;
> +#else
> +int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW;
> +#endif

>  #ifdef __cplusplus
>  }
> +
> +/* Forwarding function with correctly typed arg.  */
> +
> +inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
> +{
> +  return acc_on_device ((int) __arg);
> +}
>  #endif

> --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(revision 0)
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(working copy)
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include <openacc.h>
> +
> +int Foo (acc_device_t x)
> +{
> +  return acc_on_device (x);
> +}
> +
> +/* { dg-final { scan-assembler-not "acc_on_device" } } */

As a user, I'd expect that when compiling such code with "-O0" instead of
"-O2", but adding "__attribute__ ((optimize ("O2")))" to "Foo", that I'd
then get "acc_on_device" expanded as a builtin, and no calls to the
"acc_on_device library function.  In C++ that is currently not working,
because the "Forwarding function with correctly typed arg" (cited above)
doesn't "inherit" that "optimize" attribute.  Making that one "always
inline" resolves the problem.  Also I cleaned up and extended testing
some more.  OK for trunk?

commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Tue May 23 13:21:14 2017 +0200

    Make the OpenACC C++ acc_on_device wrapper "always inline"
    
            libgomp/
            * openacc.h [__cplusplus] (acc_on_device): Mark as "always
            inline".
            * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: Remove
            file; test cases already present...
            * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: ... in
            this file.  Update.
            * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Remove
            file; test cases now present...
            * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: ... in
            this new file.
            * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Update.
---
 libgomp/openacc.h                                  |  3 +-
 .../libgomp.oacc-c-c++-common/acc-on-device-2.c    | 22 -------------
 .../libgomp.oacc-c-c++-common/acc-on-device.c      | 12 -------
 .../libgomp.oacc-c-c++-common/acc_on_device-1.c    | 38 +++++++++++++---------
 .../libgomp.oacc-c-c++-common/acc_on_device-2.c    | 21 ++++++++++++
 .../libgomp.oacc-c-c++-common/parallel-dims.c      | 14 ++++----
 6 files changed, 52 insertions(+), 58 deletions(-)

diff --git libgomp/openacc.h libgomp/openacc.h
index 137e2c1..266f559 100644
--- libgomp/openacc.h
+++ libgomp/openacc.h
@@ -121,7 +121,8 @@ int acc_set_cuda_stream (int, void *) __GOACC_NOTHROW;
 /* Forwarding function with correctly typed arg.  */
 
 #pragma acc routine seq
-inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
+inline __attribute__ ((__always_inline__)) int
+acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
 {
   return acc_on_device ((int) __arg);
 }
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
deleted file mode 100644
index bfcb67d..0000000
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
+++ /dev/null
@@ -1,22 +0,0 @@
-/* Test the acc_on_device library function. */
-/* { dg-additional-options "-fno-builtin-acc_on_device" } */
-
-#include <openacc.h>
-
-int main ()
-{
-  int dev;
-  
-#pragma acc parallel copyout (dev)
-  {
-    dev = acc_on_device (acc_device_not_host);
-  }
-
-  int expect = 1;
-  
-#if  ACC_DEVICE_TYPE_host
-  expect = 0;
-#endif
-  
-  return dev != expect;
-}
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
deleted file mode 100644
index e0d8710..0000000
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
+++ /dev/null
@@ -1,12 +0,0 @@
-/* { dg-do compile } */
-/* We don't expect this to work with optimizations disabled.
-   { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
-
-#include <openacc.h>
-
-int Foo (acc_device_t x)
-{
-  return acc_on_device (x);
-}
-
-/* { dg-final { scan-assembler-not "acc_on_device" } } */
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
index 8112745..eb962e4 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
@@ -1,6 +1,9 @@
 /* Disable the acc_on_device builtin; we want to test the libgomp library
    function.  */
+/* { dg-additional-options "-DACC_ON_DEVICE=acc_on_device" } */
 /* { dg-additional-options "-fno-builtin-acc_on_device" } */
+/* { dg-additional-options "-fdump-rtl-expand" }
+   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 12 "expand" } } */
 
 #include <stdlib.h>
 #include <openacc.h>
@@ -11,13 +14,13 @@ main (int argc, char *argv[])
   /* Host.  */
 
   {
-    if (!acc_on_device (acc_device_none))
+    if (!ACC_ON_DEVICE (acc_device_none))
       abort ();
-    if (!acc_on_device (acc_device_host))
+    if (!ACC_ON_DEVICE (acc_device_host))
       abort ();
-    if (acc_on_device (acc_device_not_host))
+    if (ACC_ON_DEVICE (acc_device_not_host))
       abort ();
-    if (acc_on_device (acc_device_nvidia))
+    if (ACC_ON_DEVICE (acc_device_nvidia))
       abort ();
   }
 
@@ -26,39 +29,44 @@ main (int argc, char *argv[])
 
 #pragma acc parallel if(0)
   {
-    if (!acc_on_device (acc_device_none))
+    if (!ACC_ON_DEVICE (acc_device_none))
       abort ();
-    if (!acc_on_device (acc_device_host))
+    if (!ACC_ON_DEVICE (acc_device_host))
       abort ();
-    if (acc_on_device (acc_device_not_host))
+    if (ACC_ON_DEVICE (acc_device_not_host))
       abort ();
-    if (acc_on_device (acc_device_nvidia))
+    if (ACC_ON_DEVICE (acc_device_nvidia))
       abort ();
   }
 
 
-#if !ACC_DEVICE_TYPE_host
+  int on_host_p;
+#if ACC_DEVICE_TYPE_host
+  on_host_p = 1;
+#else
+  on_host_p = 0;
+#endif
 
   /* Offloaded.  */
 
 #pragma acc parallel
   {
-    if (acc_on_device (acc_device_none))
+    if (on_host_p != ACC_ON_DEVICE (acc_device_none))
       abort ();
-    if (acc_on_device (acc_device_host))
+    if (on_host_p != ACC_ON_DEVICE (acc_device_host))
       abort ();
-    if (!acc_on_device (acc_device_not_host))
+    if (on_host_p == ACC_ON_DEVICE (acc_device_not_host))
       abort ();
+
 #if ACC_DEVICE_TYPE_nvidia
-    if (!acc_on_device (acc_device_nvidia))
+    if (!ACC_ON_DEVICE (acc_device_nvidia))
       abort ();
 #else
-    if (acc_on_device (acc_device_nvidia))
+    if (ACC_ON_DEVICE (acc_device_nvidia))
       abort ();
 #endif
   }
 
-#endif
 
   return 0;
 }
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c
new file mode 100644
index 0000000..c3b3378
--- /dev/null
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c
@@ -0,0 +1,21 @@
+/* With the acc_on_device builtin enabled, we don't expect any calls to the
+   libgomp library function.  */
+/* { dg-additional-options "-fdump-rtl-expand" }
+   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 0 "expand" } } */
+
+#include <openacc.h>
+
+#ifdef __OPTIMIZE__
+# define ACC_ON_DEVICE acc_on_device
+#else
+/* Without optimizations enabled, we're not expecting the acc_on_device builtin
+   to be used, so use here a "-O2" wrapper.  */
+#pragma acc routine seq
+static int __attribute__ ((optimize ("O2")))
+ACC_ON_DEVICE (acc_device_t arg)
+{
+  return acc_on_device (arg);
+}
+#endif
+
+#include "acc_on_device-1.c"
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
index 8308f7c..1c48ab3 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
@@ -4,14 +4,12 @@
 #include <limits.h>
 #include <openacc.h>
 
-/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
-   not behaving as expected for -O0.  */
 #pragma acc routine seq
 static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
 {
-  if (acc_on_device ((int) acc_device_host))
+  if (acc_on_device (acc_device_host))
     return 0;
-  else if (acc_on_device ((int) acc_device_nvidia))
+  else if (acc_on_device (acc_device_nvidia))
     {
       unsigned int r;
       asm volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (r));
@@ -24,9 +22,9 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
 #pragma acc routine seq
 static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
 {
-  if (acc_on_device ((int) acc_device_host))
+  if (acc_on_device (acc_device_host))
     return 0;
-  else if (acc_on_device ((int) acc_device_nvidia))
+  else if (acc_on_device (acc_device_nvidia))
     {
       unsigned int r;
       asm volatile ("mov.u32 %0,%%tid.y;" : "=r" (r));
@@ -39,9 +37,9 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
 #pragma acc routine seq
 static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
 {
-  if (acc_on_device ((int) acc_device_host))
+  if (acc_on_device (acc_device_host))
     return 0;
-  else if (acc_on_device ((int) acc_device_nvidia))
+  else if (acc_on_device (acc_device_nvidia))
     {
       unsigned int r;
       asm volatile ("mov.u32 %0,%%tid.x;" : "=r" (r));


Grüße
 Thomas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PING] Make the OpenACC C++ acc_on_device wrapper "always inline"
  2017-05-23 15:34       ` Make the OpenACC C++ acc_on_device wrapper "always inline" (was: [openacc] on_device fix) Thomas Schwinge
@ 2017-05-30 12:36         ` Thomas Schwinge
  2017-06-06  6:35           ` [PING] " Thomas Schwinge
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Schwinge @ 2017-05-30 12:36 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches, Jakub Jelinek

Hi!

Ping.

On Tue, 23 May 2017 17:31:11 +0200, I wrote:
> On Thu, 29 Oct 2015 17:22:46 -0700, Nathan Sidwell <nathan@acm.org> wrote:
> > acc_on_device and it's builtin had a conflict.  The function formally takes an 
> > enum argument, but the builtin takes an int -- primarily to avoid the compiler 
> > having to generate the enum  type internally.
> > 
> > This works fine for C,  where the external declaration of the function (in 
> > openacc.h) matches up with the builtin, and we optimize the builtin as expected.
> > 
> > It fails for C++ where the builtin doesn't match the declaration in the header. 
> >   We end up with emitting a call to acc_on_device,  which is resolved by 
> > libgomp.  Unfortunately that means we fail to optimize.  [...]
> 
> > [Nathan's trunk r229562] leaves things unchanged for C --  declare a function with an enum arg. 
> >   But for C++ we the extern "C" declaration takes an int -- and therefore 
> > matches the builtin.  We insert an inline wrapper that takes an enum argument. 
> > Because of C++'s overload resolution both the wrapper and the int-taking 
> > declaration can have the same source name.
> 
> > --- libgomp/openacc.h	(revision 229535)
> > +++ libgomp/openacc.h	(working copy)
> 
> > -int acc_on_device (acc_device_t) __GOACC_NOTHROW;
> > +#ifdef __cplusplus
> > +int acc_on_device (int __arg) __GOACC_NOTHROW;
> > +#else
> > +int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW;
> > +#endif
> 
> >  #ifdef __cplusplus
> >  }
> > +
> > +/* Forwarding function with correctly typed arg.  */
> > +
> > +inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
> > +{
> > +  return acc_on_device ((int) __arg);
> > +}
> >  #endif
> 
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(revision 0)
> > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(working copy)
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-O2" } */
> > +
> > +#include <openacc.h>
> > +
> > +int Foo (acc_device_t x)
> > +{
> > +  return acc_on_device (x);
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "acc_on_device" } } */
> 
> As a user, I'd expect that when compiling such code with "-O0" instead of
> "-O2", but adding "__attribute__ ((optimize ("O2")))" to "Foo", that I'd
> then get "acc_on_device" expanded as a builtin, and no calls to the
> "acc_on_device library function.  In C++ that is currently not working,
> because the "Forwarding function with correctly typed arg" (cited above)
> doesn't "inherit" that "optimize" attribute.  Making that one "always
> inline" resolves the problem.  Also I cleaned up and extended testing
> some more.  OK for trunk?
> 
> commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
> Author: Thomas Schwinge <thomas@codesourcery.com>
> Date:   Tue May 23 13:21:14 2017 +0200
> 
>     Make the OpenACC C++ acc_on_device wrapper "always inline"
>     
>             libgomp/
>             * openacc.h [__cplusplus] (acc_on_device): Mark as "always
>             inline".
>             * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: Remove
>             file; test cases already present...
>             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: ... in
>             this file.  Update.
>             * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Remove
>             file; test cases now present...
>             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: ... in
>             this new file.
>             * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Update.
> ---
>  libgomp/openacc.h                                  |  3 +-
>  .../libgomp.oacc-c-c++-common/acc-on-device-2.c    | 22 -------------
>  .../libgomp.oacc-c-c++-common/acc-on-device.c      | 12 -------
>  .../libgomp.oacc-c-c++-common/acc_on_device-1.c    | 38 +++++++++++++---------
>  .../libgomp.oacc-c-c++-common/acc_on_device-2.c    | 21 ++++++++++++
>  .../libgomp.oacc-c-c++-common/parallel-dims.c      | 14 ++++----
>  6 files changed, 52 insertions(+), 58 deletions(-)
> 
> diff --git libgomp/openacc.h libgomp/openacc.h
> index 137e2c1..266f559 100644
> --- libgomp/openacc.h
> +++ libgomp/openacc.h
> @@ -121,7 +121,8 @@ int acc_set_cuda_stream (int, void *) __GOACC_NOTHROW;
>  /* Forwarding function with correctly typed arg.  */
>  
>  #pragma acc routine seq
> -inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
> +inline __attribute__ ((__always_inline__)) int
> +acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
>  {
>    return acc_on_device ((int) __arg);
>  }
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
> deleted file mode 100644
> index bfcb67d..0000000
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
> +++ /dev/null
> @@ -1,22 +0,0 @@
> -/* Test the acc_on_device library function. */
> -/* { dg-additional-options "-fno-builtin-acc_on_device" } */
> -
> -#include <openacc.h>
> -
> -int main ()
> -{
> -  int dev;
> -  
> -#pragma acc parallel copyout (dev)
> -  {
> -    dev = acc_on_device (acc_device_not_host);
> -  }
> -
> -  int expect = 1;
> -  
> -#if  ACC_DEVICE_TYPE_host
> -  expect = 0;
> -#endif
> -  
> -  return dev != expect;
> -}
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
> deleted file mode 100644
> index e0d8710..0000000
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
> +++ /dev/null
> @@ -1,12 +0,0 @@
> -/* { dg-do compile } */
> -/* We don't expect this to work with optimizations disabled.
> -   { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
> -
> -#include <openacc.h>
> -
> -int Foo (acc_device_t x)
> -{
> -  return acc_on_device (x);
> -}
> -
> -/* { dg-final { scan-assembler-not "acc_on_device" } } */
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
> index 8112745..eb962e4 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
> @@ -1,6 +1,9 @@
>  /* Disable the acc_on_device builtin; we want to test the libgomp library
>     function.  */
> +/* { dg-additional-options "-DACC_ON_DEVICE=acc_on_device" } */
>  /* { dg-additional-options "-fno-builtin-acc_on_device" } */
> +/* { dg-additional-options "-fdump-rtl-expand" }
> +   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 12 "expand" } } */
>  
>  #include <stdlib.h>
>  #include <openacc.h>
> @@ -11,13 +14,13 @@ main (int argc, char *argv[])
>    /* Host.  */
>  
>    {
> -    if (!acc_on_device (acc_device_none))
> +    if (!ACC_ON_DEVICE (acc_device_none))
>        abort ();
> -    if (!acc_on_device (acc_device_host))
> +    if (!ACC_ON_DEVICE (acc_device_host))
>        abort ();
> -    if (acc_on_device (acc_device_not_host))
> +    if (ACC_ON_DEVICE (acc_device_not_host))
>        abort ();
> -    if (acc_on_device (acc_device_nvidia))
> +    if (ACC_ON_DEVICE (acc_device_nvidia))
>        abort ();
>    }
>  
> @@ -26,39 +29,44 @@ main (int argc, char *argv[])
>  
>  #pragma acc parallel if(0)
>    {
> -    if (!acc_on_device (acc_device_none))
> +    if (!ACC_ON_DEVICE (acc_device_none))
>        abort ();
> -    if (!acc_on_device (acc_device_host))
> +    if (!ACC_ON_DEVICE (acc_device_host))
>        abort ();
> -    if (acc_on_device (acc_device_not_host))
> +    if (ACC_ON_DEVICE (acc_device_not_host))
>        abort ();
> -    if (acc_on_device (acc_device_nvidia))
> +    if (ACC_ON_DEVICE (acc_device_nvidia))
>        abort ();
>    }
>  
>  
> -#if !ACC_DEVICE_TYPE_host
> +  int on_host_p;
> +#if ACC_DEVICE_TYPE_host
> +  on_host_p = 1;
> +#else
> +  on_host_p = 0;
> +#endif
>  
>    /* Offloaded.  */
>  
>  #pragma acc parallel
>    {
> -    if (acc_on_device (acc_device_none))
> +    if (on_host_p != ACC_ON_DEVICE (acc_device_none))
>        abort ();
> -    if (acc_on_device (acc_device_host))
> +    if (on_host_p != ACC_ON_DEVICE (acc_device_host))
>        abort ();
> -    if (!acc_on_device (acc_device_not_host))
> +    if (on_host_p == ACC_ON_DEVICE (acc_device_not_host))
>        abort ();
> +
>  #if ACC_DEVICE_TYPE_nvidia
> -    if (!acc_on_device (acc_device_nvidia))
> +    if (!ACC_ON_DEVICE (acc_device_nvidia))
>        abort ();
>  #else
> -    if (acc_on_device (acc_device_nvidia))
> +    if (ACC_ON_DEVICE (acc_device_nvidia))
>        abort ();
>  #endif
>    }
>  
> -#endif
>  
>    return 0;
>  }
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c
> new file mode 100644
> index 0000000..c3b3378
> --- /dev/null
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c
> @@ -0,0 +1,21 @@
> +/* With the acc_on_device builtin enabled, we don't expect any calls to the
> +   libgomp library function.  */
> +/* { dg-additional-options "-fdump-rtl-expand" }
> +   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 0 "expand" } } */
> +
> +#include <openacc.h>
> +
> +#ifdef __OPTIMIZE__
> +# define ACC_ON_DEVICE acc_on_device
> +#else
> +/* Without optimizations enabled, we're not expecting the acc_on_device builtin
> +   to be used, so use here a "-O2" wrapper.  */
> +#pragma acc routine seq
> +static int __attribute__ ((optimize ("O2")))
> +ACC_ON_DEVICE (acc_device_t arg)
> +{
> +  return acc_on_device (arg);
> +}
> +#endif
> +
> +#include "acc_on_device-1.c"
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
> index 8308f7c..1c48ab3 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
> @@ -4,14 +4,12 @@
>  #include <limits.h>
>  #include <openacc.h>
>  
> -/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
> -   not behaving as expected for -O0.  */
>  #pragma acc routine seq
>  static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
>  {
> -  if (acc_on_device ((int) acc_device_host))
> +  if (acc_on_device (acc_device_host))
>      return 0;
> -  else if (acc_on_device ((int) acc_device_nvidia))
> +  else if (acc_on_device (acc_device_nvidia))
>      {
>        unsigned int r;
>        asm volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (r));
> @@ -24,9 +22,9 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
>  #pragma acc routine seq
>  static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
>  {
> -  if (acc_on_device ((int) acc_device_host))
> +  if (acc_on_device (acc_device_host))
>      return 0;
> -  else if (acc_on_device ((int) acc_device_nvidia))
> +  else if (acc_on_device (acc_device_nvidia))
>      {
>        unsigned int r;
>        asm volatile ("mov.u32 %0,%%tid.y;" : "=r" (r));
> @@ -39,9 +37,9 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
>  #pragma acc routine seq
>  static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
>  {
> -  if (acc_on_device ((int) acc_device_host))
> +  if (acc_on_device (acc_device_host))
>      return 0;
> -  else if (acc_on_device ((int) acc_device_nvidia))
> +  else if (acc_on_device (acc_device_nvidia))
>      {
>        unsigned int r;
>        asm volatile ("mov.u32 %0,%%tid.x;" : "=r" (r));


Grüße
 Thomas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PING] [PING] Make the OpenACC C++ acc_on_device wrapper "always inline"
  2017-05-30 12:36         ` [PING] Make the OpenACC C++ acc_on_device wrapper "always inline" Thomas Schwinge
@ 2017-06-06  6:35           ` Thomas Schwinge
  2017-06-06  6:58             ` Jakub Jelinek
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Schwinge @ 2017-06-06  6:35 UTC (permalink / raw)
  To: Nathan Sidwell, GCC Patches, Jakub Jelinek

Hi!

Ping.

On Tue, 30 May 2017 14:35:29 +0200, I wrote:
> Ping.
> 
> On Tue, 23 May 2017 17:31:11 +0200, I wrote:
> > On Thu, 29 Oct 2015 17:22:46 -0700, Nathan Sidwell <nathan@acm.org> wrote:
> > > acc_on_device and it's builtin had a conflict.  The function formally takes an 
> > > enum argument, but the builtin takes an int -- primarily to avoid the compiler 
> > > having to generate the enum  type internally.
> > > 
> > > This works fine for C,  where the external declaration of the function (in 
> > > openacc.h) matches up with the builtin, and we optimize the builtin as expected.
> > > 
> > > It fails for C++ where the builtin doesn't match the declaration in the header. 
> > >   We end up with emitting a call to acc_on_device,  which is resolved by 
> > > libgomp.  Unfortunately that means we fail to optimize.  [...]
> > 
> > > [Nathan's trunk r229562] leaves things unchanged for C --  declare a function with an enum arg. 
> > >   But for C++ we the extern "C" declaration takes an int -- and therefore 
> > > matches the builtin.  We insert an inline wrapper that takes an enum argument. 
> > > Because of C++'s overload resolution both the wrapper and the int-taking 
> > > declaration can have the same source name.
> > 
> > > --- libgomp/openacc.h	(revision 229535)
> > > +++ libgomp/openacc.h	(working copy)
> > 
> > > -int acc_on_device (acc_device_t) __GOACC_NOTHROW;
> > > +#ifdef __cplusplus
> > > +int acc_on_device (int __arg) __GOACC_NOTHROW;
> > > +#else
> > > +int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW;
> > > +#endif
> > 
> > >  #ifdef __cplusplus
> > >  }
> > > +
> > > +/* Forwarding function with correctly typed arg.  */
> > > +
> > > +inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
> > > +{
> > > +  return acc_on_device ((int) __arg);
> > > +}
> > >  #endif
> > 
> > > --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(revision 0)
> > > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c	(working copy)
> > > @@ -0,0 +1,12 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-additional-options "-O2" } */
> > > +
> > > +#include <openacc.h>
> > > +
> > > +int Foo (acc_device_t x)
> > > +{
> > > +  return acc_on_device (x);
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler-not "acc_on_device" } } */
> > 
> > As a user, I'd expect that when compiling such code with "-O0" instead of
> > "-O2", but adding "__attribute__ ((optimize ("O2")))" to "Foo", that I'd
> > then get "acc_on_device" expanded as a builtin, and no calls to the
> > "acc_on_device library function.  In C++ that is currently not working,
> > because the "Forwarding function with correctly typed arg" (cited above)
> > doesn't "inherit" that "optimize" attribute.  Making that one "always
> > inline" resolves the problem.  Also I cleaned up and extended testing
> > some more.  OK for trunk?
> > 
> > commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
> > Author: Thomas Schwinge <thomas@codesourcery.com>
> > Date:   Tue May 23 13:21:14 2017 +0200
> > 
> >     Make the OpenACC C++ acc_on_device wrapper "always inline"
> >     
> >             libgomp/
> >             * openacc.h [__cplusplus] (acc_on_device): Mark as "always
> >             inline".
> >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: Remove
> >             file; test cases already present...
> >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: ... in
> >             this file.  Update.
> >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Remove
> >             file; test cases now present...
> >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: ... in
> >             this new file.
> >             * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Update.
> > ---
> >  libgomp/openacc.h                                  |  3 +-
> >  .../libgomp.oacc-c-c++-common/acc-on-device-2.c    | 22 -------------
> >  .../libgomp.oacc-c-c++-common/acc-on-device.c      | 12 -------
> >  .../libgomp.oacc-c-c++-common/acc_on_device-1.c    | 38 +++++++++++++---------
> >  .../libgomp.oacc-c-c++-common/acc_on_device-2.c    | 21 ++++++++++++
> >  .../libgomp.oacc-c-c++-common/parallel-dims.c      | 14 ++++----
> >  6 files changed, 52 insertions(+), 58 deletions(-)
> > 
> > diff --git libgomp/openacc.h libgomp/openacc.h
> > index 137e2c1..266f559 100644
> > --- libgomp/openacc.h
> > +++ libgomp/openacc.h
> > @@ -121,7 +121,8 @@ int acc_set_cuda_stream (int, void *) __GOACC_NOTHROW;
> >  /* Forwarding function with correctly typed arg.  */
> >  
> >  #pragma acc routine seq
> > -inline int acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
> > +inline __attribute__ ((__always_inline__)) int
> > +acc_on_device (acc_device_t __arg) __GOACC_NOTHROW
> >  {
> >    return acc_on_device ((int) __arg);
> >  }
> > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
> > deleted file mode 100644
> > index bfcb67d..0000000
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c
> > +++ /dev/null
> > @@ -1,22 +0,0 @@
> > -/* Test the acc_on_device library function. */
> > -/* { dg-additional-options "-fno-builtin-acc_on_device" } */
> > -
> > -#include <openacc.h>
> > -
> > -int main ()
> > -{
> > -  int dev;
> > -  
> > -#pragma acc parallel copyout (dev)
> > -  {
> > -    dev = acc_on_device (acc_device_not_host);
> > -  }
> > -
> > -  int expect = 1;
> > -  
> > -#if  ACC_DEVICE_TYPE_host
> > -  expect = 0;
> > -#endif
> > -  
> > -  return dev != expect;
> > -}
> > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
> > deleted file mode 100644
> > index e0d8710..0000000
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc-on-device.c
> > +++ /dev/null
> > @@ -1,12 +0,0 @@
> > -/* { dg-do compile } */
> > -/* We don't expect this to work with optimizations disabled.
> > -   { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
> > -
> > -#include <openacc.h>
> > -
> > -int Foo (acc_device_t x)
> > -{
> > -  return acc_on_device (x);
> > -}
> > -
> > -/* { dg-final { scan-assembler-not "acc_on_device" } } */
> > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
> > index 8112745..eb962e4 100644
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
> > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c
> > @@ -1,6 +1,9 @@
> >  /* Disable the acc_on_device builtin; we want to test the libgomp library
> >     function.  */
> > +/* { dg-additional-options "-DACC_ON_DEVICE=acc_on_device" } */
> >  /* { dg-additional-options "-fno-builtin-acc_on_device" } */
> > +/* { dg-additional-options "-fdump-rtl-expand" }
> > +   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 12 "expand" } } */
> >  
> >  #include <stdlib.h>
> >  #include <openacc.h>
> > @@ -11,13 +14,13 @@ main (int argc, char *argv[])
> >    /* Host.  */
> >  
> >    {
> > -    if (!acc_on_device (acc_device_none))
> > +    if (!ACC_ON_DEVICE (acc_device_none))
> >        abort ();
> > -    if (!acc_on_device (acc_device_host))
> > +    if (!ACC_ON_DEVICE (acc_device_host))
> >        abort ();
> > -    if (acc_on_device (acc_device_not_host))
> > +    if (ACC_ON_DEVICE (acc_device_not_host))
> >        abort ();
> > -    if (acc_on_device (acc_device_nvidia))
> > +    if (ACC_ON_DEVICE (acc_device_nvidia))
> >        abort ();
> >    }
> >  
> > @@ -26,39 +29,44 @@ main (int argc, char *argv[])
> >  
> >  #pragma acc parallel if(0)
> >    {
> > -    if (!acc_on_device (acc_device_none))
> > +    if (!ACC_ON_DEVICE (acc_device_none))
> >        abort ();
> > -    if (!acc_on_device (acc_device_host))
> > +    if (!ACC_ON_DEVICE (acc_device_host))
> >        abort ();
> > -    if (acc_on_device (acc_device_not_host))
> > +    if (ACC_ON_DEVICE (acc_device_not_host))
> >        abort ();
> > -    if (acc_on_device (acc_device_nvidia))
> > +    if (ACC_ON_DEVICE (acc_device_nvidia))
> >        abort ();
> >    }
> >  
> >  
> > -#if !ACC_DEVICE_TYPE_host
> > +  int on_host_p;
> > +#if ACC_DEVICE_TYPE_host
> > +  on_host_p = 1;
> > +#else
> > +  on_host_p = 0;
> > +#endif
> >  
> >    /* Offloaded.  */
> >  
> >  #pragma acc parallel
> >    {
> > -    if (acc_on_device (acc_device_none))
> > +    if (on_host_p != ACC_ON_DEVICE (acc_device_none))
> >        abort ();
> > -    if (acc_on_device (acc_device_host))
> > +    if (on_host_p != ACC_ON_DEVICE (acc_device_host))
> >        abort ();
> > -    if (!acc_on_device (acc_device_not_host))
> > +    if (on_host_p == ACC_ON_DEVICE (acc_device_not_host))
> >        abort ();
> > +
> >  #if ACC_DEVICE_TYPE_nvidia
> > -    if (!acc_on_device (acc_device_nvidia))
> > +    if (!ACC_ON_DEVICE (acc_device_nvidia))
> >        abort ();
> >  #else
> > -    if (acc_on_device (acc_device_nvidia))
> > +    if (ACC_ON_DEVICE (acc_device_nvidia))
> >        abort ();
> >  #endif
> >    }
> >  
> > -#endif
> >  
> >    return 0;
> >  }
> > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c
> > new file mode 100644
> > index 0000000..c3b3378
> > --- /dev/null
> > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c
> > @@ -0,0 +1,21 @@
> > +/* With the acc_on_device builtin enabled, we don't expect any calls to the
> > +   libgomp library function.  */
> > +/* { dg-additional-options "-fdump-rtl-expand" }
> > +   { dg-final { scan-rtl-dump-times "\\\(call \[^\\n\]* acc_on_device" 0 "expand" } } */
> > +
> > +#include <openacc.h>
> > +
> > +#ifdef __OPTIMIZE__
> > +# define ACC_ON_DEVICE acc_on_device
> > +#else
> > +/* Without optimizations enabled, we're not expecting the acc_on_device builtin
> > +   to be used, so use here a "-O2" wrapper.  */
> > +#pragma acc routine seq
> > +static int __attribute__ ((optimize ("O2")))
> > +ACC_ON_DEVICE (acc_device_t arg)
> > +{
> > +  return acc_on_device (arg);
> > +}
> > +#endif
> > +
> > +#include "acc_on_device-1.c"
> > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
> > index 8308f7c..1c48ab3 100644
> > --- libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
> > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
> > @@ -4,14 +4,12 @@
> >  #include <limits.h>
> >  #include <openacc.h>
> >  
> > -/* TODO: "(int) acc_device_*" casts because of the C++ acc_on_device wrapper
> > -   not behaving as expected for -O0.  */
> >  #pragma acc routine seq
> >  static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
> >  {
> > -  if (acc_on_device ((int) acc_device_host))
> > +  if (acc_on_device (acc_device_host))
> >      return 0;
> > -  else if (acc_on_device ((int) acc_device_nvidia))
> > +  else if (acc_on_device (acc_device_nvidia))
> >      {
> >        unsigned int r;
> >        asm volatile ("mov.u32 %0,%%ctaid.x;" : "=r" (r));
> > @@ -24,9 +22,9 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_gang ()
> >  #pragma acc routine seq
> >  static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
> >  {
> > -  if (acc_on_device ((int) acc_device_host))
> > +  if (acc_on_device (acc_device_host))
> >      return 0;
> > -  else if (acc_on_device ((int) acc_device_nvidia))
> > +  else if (acc_on_device (acc_device_nvidia))
> >      {
> >        unsigned int r;
> >        asm volatile ("mov.u32 %0,%%tid.y;" : "=r" (r));
> > @@ -39,9 +37,9 @@ static unsigned int __attribute__ ((optimize ("O2"))) acc_worker ()
> >  #pragma acc routine seq
> >  static unsigned int __attribute__ ((optimize ("O2"))) acc_vector ()
> >  {
> > -  if (acc_on_device ((int) acc_device_host))
> > +  if (acc_on_device (acc_device_host))
> >      return 0;
> > -  else if (acc_on_device ((int) acc_device_nvidia))
> > +  else if (acc_on_device (acc_device_nvidia))
> >      {
> >        unsigned int r;
> >        asm volatile ("mov.u32 %0,%%tid.x;" : "=r" (r));


Grüße
 Thomas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PING] [PING] Make the OpenACC C++ acc_on_device wrapper "always inline"
  2017-06-06  6:35           ` [PING] " Thomas Schwinge
@ 2017-06-06  6:58             ` Jakub Jelinek
  2017-06-06 11:16               ` Thomas Schwinge
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Jelinek @ 2017-06-06  6:58 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Nathan Sidwell, GCC Patches

On Tue, Jun 06, 2017 at 08:35:40AM +0200, Thomas Schwinge wrote:
> > > commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
> > > Author: Thomas Schwinge <thomas@codesourcery.com>
> > > Date:   Tue May 23 13:21:14 2017 +0200
> > > 
> > >     Make the OpenACC C++ acc_on_device wrapper "always inline"
> > >     
> > >             libgomp/
> > >             * openacc.h [__cplusplus] (acc_on_device): Mark as "always
> > >             inline".
> > >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: Remove
> > >             file; test cases already present...
> > >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: ... in
> > >             this file.  Update.
> > >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Remove
> > >             file; test cases now present...
> > >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: ... in
> > >             this new file.
> > >             * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Update.

I don't like this very much.
Can't you instead just turn the builtin into BT_FN_INT_VAR and diagnose
during folding if it has no or 2+ arguments or if the argument is not type
compatible with int?

	Jakub

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PING] [PING] Make the OpenACC C++ acc_on_device wrapper "always inline"
  2017-06-06  6:58             ` Jakub Jelinek
@ 2017-06-06 11:16               ` Thomas Schwinge
  2017-06-06 11:20                 ` Jakub Jelinek
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Schwinge @ 2017-06-06 11:16 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Nathan Sidwell, GCC Patches

Hi Jakub!

On Tue, 6 Jun 2017 08:58:21 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jun 06, 2017 at 08:35:40AM +0200, Thomas Schwinge wrote:
> > > > commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
> > > > Author: Thomas Schwinge <thomas@codesourcery.com>
> > > > Date:   Tue May 23 13:21:14 2017 +0200
> > > > 
> > > >     Make the OpenACC C++ acc_on_device wrapper "always inline"
> > > >     
> > > >             libgomp/
> > > >             * openacc.h [__cplusplus] (acc_on_device): Mark as "always
> > > >             inline".
> > > >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: Remove
> > > >             file; test cases already present...
> > > >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: ... in
> > > >             this file.  Update.
> > > >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Remove
> > > >             file; test cases now present...
> > > >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: ... in
> > > >             this new file.
> > > >             * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Update.
> 
> I don't like this very much.

Thanks for having a look.  Would you please clarify whether "this"
applies to my "always inline" changes and testing additions that you
quoted, or rather to the C++ "acc_on_device" wrapper function as it is
currently present?

> Can't you instead just turn the builtin into BT_FN_INT_VAR and diagnose
> during folding if it has no or 2+ arguments or if the argument is not type
> compatible with int?

Thanks for the suggestion, I'll look into that!

In terms of incremental progress, do you oppose that I commit my existing
patch now, and then rework the builtin in a later patch?


Grüße
 Thomas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PING] [PING] Make the OpenACC C++ acc_on_device wrapper "always inline"
  2017-06-06 11:16               ` Thomas Schwinge
@ 2017-06-06 11:20                 ` Jakub Jelinek
  0 siblings, 0 replies; 13+ messages in thread
From: Jakub Jelinek @ 2017-06-06 11:20 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Nathan Sidwell, GCC Patches

On Tue, Jun 06, 2017 at 01:16:03PM +0200, Thomas Schwinge wrote:
> On Tue, 6 Jun 2017 08:58:21 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Tue, Jun 06, 2017 at 08:35:40AM +0200, Thomas Schwinge wrote:
> > > > > commit 9cc3a384c17e9f692f7864c604d2e2f9fbf0bac9
> > > > > Author: Thomas Schwinge <thomas@codesourcery.com>
> > > > > Date:   Tue May 23 13:21:14 2017 +0200
> > > > > 
> > > > >     Make the OpenACC C++ acc_on_device wrapper "always inline"
> > > > >     
> > > > >             libgomp/
> > > > >             * openacc.h [__cplusplus] (acc_on_device): Mark as "always
> > > > >             inline".
> > > > >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device-2.c: Remove
> > > > >             file; test cases already present...
> > > > >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: ... in
> > > > >             this file.  Update.
> > > > >             * testsuite/libgomp.oacc-c-c++-common/acc-on-device.c: Remove
> > > > >             file; test cases now present...
> > > > >             * testsuite/libgomp.oacc-c-c++-common/acc_on_device-2.c: ... in
> > > > >             this new file.
> > > > >             * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Update.
> > 
> > I don't like this very much.
> 
> Thanks for having a look.  Would you please clarify whether "this"
> applies to my "always inline" changes and testing additions that you
> quoted, or rather to the C++ "acc_on_device" wrapper function as it is
> currently present?

The C++ acc_on_device wrapper altogether, though of course always inline on
it doesn't sound right either (what if you want to take acc_on_device
address?).

> > Can't you instead just turn the builtin into BT_FN_INT_VAR and diagnose
> > during folding if it has no or 2+ arguments or if the argument is not type
> > compatible with int?
> 
> Thanks for the suggestion, I'll look into that!
> 
> In terms of incremental progress, do you oppose that I commit my existing
> patch now, and then rework the builtin in a later patch?

We are in stage1 and this doesn't seem to be a blocker, I think it is better
to do it right, no need to do it incrementally.

	Jakub

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-06-06 11:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-29 20:28 [gomp4] acc_on_device Nathan Sidwell
2015-10-30  8:40 ` Thomas Schwinge
2015-10-30  8:54   ` Thomas Schwinge
2016-01-04 18:15 ` [gomp4] Fix acc_on_device for C++ Nathan Sidwell
2016-01-06 15:48   ` [openacc] fix unoptimized acc_on_device Nathan Sidwell
2015-10-30  0:27     ` [openacc] on_device fix Nathan Sidwell
2017-05-23 15:34       ` Make the OpenACC C++ acc_on_device wrapper "always inline" (was: [openacc] on_device fix) Thomas Schwinge
2017-05-30 12:36         ` [PING] Make the OpenACC C++ acc_on_device wrapper "always inline" Thomas Schwinge
2017-06-06  6:35           ` [PING] " Thomas Schwinge
2017-06-06  6:58             ` Jakub Jelinek
2017-06-06 11:16               ` Thomas Schwinge
2017-06-06 11:20                 ` Jakub Jelinek
2016-01-06 19:04     ` [openacc] fix unoptimized acc_on_device Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).