From: Cesar Philippidis <cesar@codesourcery.com>
To: Fortran List <fortran@gcc.gnu.org>,
	"gcc-patches@gcc.gnu.org"	<gcc-patches@gcc.gnu.org>
Subject: [gomp4] Extend libgomp's fortran test coverage of host_data
Date: Thu, 11 Aug 2016 22:39:00 -0000	[thread overview]
Message-ID: <5113e7e2-25c8-e31a-73f3-9c9b2c6edc79@codesourcery.com> (raw)

This patch ports libgomp.oacc-c-c++-common/host_data-1.c to Fortran.
Fortunately, the existing Fortran host_data infrastructure was already
in place, so all I had to do was port over the calls to Nvidia's CUDA
BLAS library.

There are a couple of details to consider when using CUDA BLAS from
gfortran. If you want to use Nvidia's wrapper functions written in C to
set up the appropriate CUDA device contexts, use the thunking variants
of the functions described here
<http://docs.nvidia.com/cuda/cublas/#appendix-b-cublas-fortran-bindings>.
Otherwise, it's much easier to let gfortran's OpenACC runtime manage the
data mappings and use the host_data construct to pass the device
pointers to the CUDA BLAS library calls.
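
In outline, the host_data approach looks like this (a minimal sketch
taken from host_data-2.f90 in the patch below; it assumes x and y are
mapped onto the device and cublassaxpy has a bind(c) interface):

  !$acc data copyin (x) copy (y)
  !$acc host_data use_device (x, y)
  ! Within the host_data region, x and y refer to the device copies,
  ! so they can be passed directly to the CUDA BLAS call.
  call cublassaxpy (N, a, x, 1, y, 1)
  !$acc end host_data
  !$acc end data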

Calling the actual CUDA BLAS functions is already well documented here
<https://gcc.gnu.org/onlinedocs/gfortran/Interoperability-with-C.html>.
In short, each library call needs an explicit Fortran interface declared
with bind(c). The function I tested in host_data-2.f90 is cublasSaxpy;
interfaces for other functions can be written the same way as needed.
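
For reference, this is the bind(c) interface for cublasSaxpy from
host_data-2.f90 (copied from the patch below):

  interface
     subroutine cublassaxpy(N, alpha, x, incx, y, incy) bind(c, name="cublasSaxpy")
       use iso_c_binding
       integer(kind=c_int), value :: N
       real*4, value :: alpha
       type(*), dimension(*) :: x
       integer(kind=c_int), value :: incx
       type(*), dimension(*) :: y
       integer(kind=c_int), value :: incy
     end subroutine cublassaxpy
  end interface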

I've applied this patch to gomp-4_0-branch.

Cesar

[-- Attachment #2: gomp4-fortran-host_data.diff --]
[-- Type: text/x-patch, Size: 2781 bytes --]

2016-08-11  Cesar Philippidis  <cesar@codesourcery.com>

	libgomp/
	* testsuite/libgomp.oacc-fortran/host_data-1.f90: Remove stale xfail.
	* testsuite/libgomp.oacc-fortran/host_data-2.f90: New test.


diff --git a/libgomp/testsuite/libgomp.oacc-fortran/host_data-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/host_data-1.f90
index 497b0f7..69a491d 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/host_data-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/host_data-1.f90
@@ -1,9 +1,6 @@
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
 
-! { dg-xfail-if "TODO" { *-*-* } }
-! { dg-excess-errors "TODO" }
-
 program test
   implicit none
 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/host_data-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/host_data-2.f90
new file mode 100644
index 0000000..68e14e3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/host_data-2.f90
@@ -0,0 +1,85 @@
+! Test host_data interoperability with CUDA blas.
+
+! { dg-do run { target openacc_nvidia_accel_selected } }
+! { dg-additional-options "-lcublas" }
+
+program test
+  implicit none
+
+  integer, parameter :: N = 10
+  integer :: i
+  real*4 :: x_ref(N), y_ref(N), x(N), y(N), a
+  
+  interface
+     subroutine cublassaxpy(N, alpha, x, incx, y, incy) bind(c, name="cublasSaxpy")
+       use iso_c_binding
+       integer(kind=c_int), value :: N
+       real*4, value :: alpha
+       type(*), dimension(*) :: x
+       integer(kind=c_int), value :: incx
+       type(*), dimension(*) :: y
+       integer(kind=c_int), value :: incy
+     end subroutine cublassaxpy
+  end interface
+
+  a = 2.0
+
+  do i = 1, N
+     x(i) = 4.0 * i
+     y(i) = 3.0
+     x_ref(i) = x(i)
+     y_ref(i) = y(i)
+  end do
+
+  call saxpy (N, a, x_ref, y_ref)
+  
+  !$acc data copyin (x) copy (y)
+  !$acc host_data use_device (x, y)
+  call cublassaxpy(N, a, x, 1, y, 1)
+  !$acc end host_data
+  !$acc end data
+  
+  do i = 1, N
+     if (y(i) .ne. y_ref(i)) call abort
+  end do
+
+  !$acc data create (x) copyout (y)
+  !$acc parallel loop
+  do i = 1, N
+     y(i) = 3.0
+  end do
+  !$acc end parallel loop
+
+  !$acc host_data use_device (x, y)
+  call cublassaxpy(N, a, x, 1, y, 1)
+  !$acc end host_data
+  !$acc end data
+
+  do i = 1, N
+     if (y(i) .ne. y_ref(i)) call abort
+  end do
+
+  y(:) = 3.0
+  
+  !$acc data copyin (x) copyin (a) copy (y)
+  !$acc parallel present (x) pcopy (y) present (a)
+  call saxpy (N, a, x, y)
+  !$acc end parallel
+  !$acc end data
+
+  do i = 1, N
+     if (y(i) .ne. y_ref(i)) call abort
+  end do
+end program test
+
+subroutine saxpy (nn, aa, xx, yy)
+  integer :: nn
+  real*4 :: aa, xx(nn), yy(nn)
+  integer i
+  real*4 :: t
+  !$acc routine
+
+  do i = 1, nn
+    yy(i) = yy(i) + aa * xx(i)
+  end do
+end subroutine saxpy
