Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
From: Thomas Schwinge
To: Jakub Jelinek
CC: Julian Brown, Bernd Schmidt, Tobias Burnus, Ilya Verbin
Subject: Re: [WIP] OpenMP 4 NVPTX support
In-Reply-To: <20150421155839.GZ1725@tucnak.redhat.com>
Date: Wed, 22 Jul 2015 16:13:00 -0000
Message-ID: <87twswyy9n.fsf@schwinge.name>

Hi!

On Tue, 21 Apr 2015 17:58:39 +0200, Jakub Jelinek wrote:
> Attached is a minimal patch to get at least a trivial OpenMP 4.0 testcase
> offloading to NVPTX (the first patch).  The second patch is WIP, just first
> few needed changes to make libgomp to build for NVPTX (several weeks of work
> at least).

We're not specifically working on making nvptx offloading work for
OpenMP, but OpenACC offloading also requires a tiny bit of code to be
shipped in the offloading device's runtime library: code that
conceptually belongs in libgomp.  (On gomp-4_0-branch, it currently
lives in libgcc because that was easier to do.)  As I found out,
building a "dummy" (empty) libgomp for nvptx is not actually difficult.
In addition to your second patch (U2; quoted at the end of this email),
we'll need the following:

commit ea5213c1eb6e525f64aa103312e8e0ac88048122
Author: Thomas Schwinge
Date:   Wed Jul 22 12:12:41 2015 +0200

    Empty libgomp for nvptx

        $ mkdir libgomp/config/nvptx
        $ cp libgomp/config/{linux,nvptx}/omp-lock.h
        $ for f in libgomp{,/config/linux,/config/posix}/*.c; do touch libgomp/config/nvptx/"$(basename "$f")"; done
---
 libgomp/config/nvptx/affinity.c       |  0
 libgomp/config/nvptx/alloc.c          |  0
 libgomp/config/nvptx/bar.c            |  0
 libgomp/config/nvptx/barrier.c        |  0
 libgomp/config/nvptx/critical.c       |  0
 libgomp/config/nvptx/env.c            |  0
 libgomp/config/nvptx/error.c          |  0
 libgomp/config/nvptx/fortran.c        |  0
 libgomp/config/nvptx/iter.c           |  0
 libgomp/config/nvptx/iter_ull.c       |  0
 libgomp/config/nvptx/libgomp-plugin.c |  0
 libgomp/config/nvptx/lock.c           |  0
 libgomp/config/nvptx/loop.c           |  0
 libgomp/config/nvptx/loop_ull.c       |  0
 libgomp/config/nvptx/mutex.c          |  0
 libgomp/config/nvptx/oacc-async.c     |  0
 libgomp/config/nvptx/oacc-cuda.c      |  0
 libgomp/config/nvptx/oacc-host.c      |  0
 libgomp/config/nvptx/oacc-init.c      |  0
 libgomp/config/nvptx/oacc-mem.c       |  0
 libgomp/config/nvptx/oacc-parallel.c  |  0
 libgomp/config/nvptx/oacc-plugin.c    |  0
 libgomp/config/nvptx/omp-lock.h       | 12 ++++++++++++
 libgomp/config/nvptx/ordered.c        |  0
 libgomp/config/nvptx/parallel.c       |  0
 libgomp/config/nvptx/proc.c           |  0
 libgomp/config/nvptx/ptrlock.c        |  0
 libgomp/config/nvptx/sections.c       |  0
 libgomp/config/nvptx/sem.c            |  0
 libgomp/config/nvptx/single.c         |  0
 libgomp/config/nvptx/splay-tree.c     |  0
 libgomp/config/nvptx/target.c         |  0
 libgomp/config/nvptx/task.c           |  0
 libgomp/config/nvptx/team.c           |  0
 libgomp/config/nvptx/time.c           |  0
 libgomp/config/nvptx/work.c           |  0
 36 files changed, 12 insertions(+)

diff --git libgomp/config/nvptx/affinity.c libgomp/config/nvptx/affinity.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/alloc.c libgomp/config/nvptx/alloc.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/bar.c libgomp/config/nvptx/bar.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/barrier.c libgomp/config/nvptx/barrier.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/critical.c libgomp/config/nvptx/critical.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/env.c libgomp/config/nvptx/env.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/error.c libgomp/config/nvptx/error.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/fortran.c libgomp/config/nvptx/fortran.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/iter.c libgomp/config/nvptx/iter.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/iter_ull.c libgomp/config/nvptx/iter_ull.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/libgomp-plugin.c libgomp/config/nvptx/libgomp-plugin.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/lock.c libgomp/config/nvptx/lock.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/loop.c libgomp/config/nvptx/loop.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/loop_ull.c libgomp/config/nvptx/loop_ull.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/mutex.c libgomp/config/nvptx/mutex.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/oacc-async.c libgomp/config/nvptx/oacc-async.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/oacc-cuda.c libgomp/config/nvptx/oacc-cuda.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/oacc-host.c libgomp/config/nvptx/oacc-host.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/oacc-init.c libgomp/config/nvptx/oacc-init.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/oacc-mem.c libgomp/config/nvptx/oacc-mem.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/oacc-parallel.c libgomp/config/nvptx/oacc-parallel.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/oacc-plugin.c libgomp/config/nvptx/oacc-plugin.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/omp-lock.h libgomp/config/nvptx/omp-lock.h
new file mode 100644
index 0000000..2ca7c5e
--- /dev/null
+++ libgomp/config/nvptx/omp-lock.h
@@ -0,0 +1,12 @@
+/* This header is used during the build process to find the size and
+   alignment of the public OpenMP locks, so that we can export data
+   structures without polluting the namespace.
+
+   When using the Linux futex primitive, non-recursive locks require
+   one int.  Recursive locks require we identify the owning task
+   and so require in addition one int and a pointer.  */
+
+typedef int omp_lock_t;
+typedef struct { int lock, count; void *owner; } omp_nest_lock_t;
+typedef int omp_lock_25_t;
+typedef struct { int owner, count; } omp_nest_lock_25_t;
diff --git libgomp/config/nvptx/ordered.c libgomp/config/nvptx/ordered.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/parallel.c libgomp/config/nvptx/parallel.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/proc.c libgomp/config/nvptx/proc.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/ptrlock.c libgomp/config/nvptx/ptrlock.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/sections.c libgomp/config/nvptx/sections.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/sem.c libgomp/config/nvptx/sem.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/single.c libgomp/config/nvptx/single.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/splay-tree.c libgomp/config/nvptx/splay-tree.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/target.c libgomp/config/nvptx/target.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/task.c libgomp/config/nvptx/task.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/team.c libgomp/config/nvptx/team.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/time.c libgomp/config/nvptx/time.c
new file mode 100644
index 0000000..e69de29
diff --git libgomp/config/nvptx/work.c libgomp/config/nvptx/work.c
new file mode 100644
index 0000000..e69de29

Next, on gomp-4_0-branch, we can move the libgcc code into libgomp:

commit d8d75d17630d7633be4f1733fd195a104cb2ccc4
Author: Thomas Schwinge
Date:   Wed Jul 22 13:05:16 2015 +0200

    [nvptx] Move GOMP stuff from libgcc to libgomp
---
 libgcc/config.host                       |  6 +---
 libgcc/config/nvptx/gomp-acc_on_device.c |  9 -----
 libgcc/config/nvptx/gomp-atomic.asm      | 37 ---------------------
 libgcc/config/nvptx/t-nvptx              | 11 ------
 libgomp/config/nvptx/critical.c          | 57 ++++++++++++++++++++++++++++++++
 libgomp/config/nvptx/oacc-init.c         | 35 ++++++++++++++++++++
 6 files changed, 93 insertions(+), 62 deletions(-)

diff --git libgcc/config.host libgcc/config.host
index ee7ce03..3a2c75d 100644
--- libgcc/config.host
+++ libgcc/config.host
@@ -1304,11 +1304,7 @@ mep*-*-*)
         ;;
 nvptx-*)
         tmake_file="$tmake_file nvptx/t-nvptx"
-        if test "x${enable_as_accelerator_for}" != x; then
-                extra_parts="crt0.o libgomp.a libgomp.spec"
-        else
-                extra_parts="crt0.o"
-        fi
+        extra_parts="crt0.o"
         ;;
 *)
         echo "*** Configuration ${host} not supported" 1>&2
diff --git libgcc/config/nvptx/gomp-acc_on_device.c libgcc/config/nvptx/gomp-acc_on_device.c
deleted file mode 100644
index e4278f9..0000000
--- libgcc/config/nvptx/gomp-acc_on_device.c
+++ /dev/null
@@ -1,9 +0,0 @@
-int acc_on_device(int d)
-{
-  return __builtin_acc_on_device(d);
-}
-
-int acc_on_device_h_(int *d)
-{
-  return acc_on_device(*d);
-}
diff --git libgcc/config/nvptx/gomp-atomic.asm libgcc/config/nvptx/gomp-atomic.asm
deleted file mode 100644
index ae9d925..0000000
--- libgcc/config/nvptx/gomp-atomic.asm
+++ /dev/null
@@ -1,37 +0,0 @@
-
-// BEGIN PREAMBLE
-        .version 3.1
-        .target sm_30
-        .address_size 64
-        .extern .shared .u8 sdata[];
-// END PREAMBLE
-
-// BEGIN VAR DEF: libgomp_ptx_lock
-.global .align 4 .u32 libgomp_ptx_lock;
-
-// BEGIN GLOBAL FUNCTION DECL: GOMP_atomic_start
-.visible .func GOMP_atomic_start;
-// BEGIN GLOBAL FUNCTION DEF: GOMP_atomic_start
-.visible .func GOMP_atomic_start
-{
-        .reg .pred %p<2>;
-        .reg .s32 %r<2>;
-        .reg .s64 %rd<2>;
-BB5_1:
-        mov.u64 %rd1, libgomp_ptx_lock;
-        atom.global.cas.b32 %r1, [%rd1], 0, 1;
-        setp.ne.s32 %p1, %r1, 0;
-        @%p1 bra BB5_1;
-        ret;
-        }
-// BEGIN GLOBAL FUNCTION DECL: GOMP_atomic_end
-.visible .func GOMP_atomic_end;
-// BEGIN GLOBAL FUNCTION DEF: GOMP_atomic_end
-.visible .func GOMP_atomic_end
-{
-        .reg .s32 %r<2>;
-        .reg .s64 %rd<2>;
-        mov.u64 %rd1, libgomp_ptx_lock;
-        atom.global.exch.b32 %r1, [%rd1], 0;
-        ret;
-        }
diff --git libgcc/config/nvptx/t-nvptx libgcc/config/nvptx/t-nvptx
index c8741c4..0c2cea0 100644
--- libgcc/config/nvptx/t-nvptx
+++ libgcc/config/nvptx/t-nvptx
@@ -13,14 +13,3 @@ crt0.o: $(srcdir)/config/nvptx/crt0.s
 # support it, and it may cause the build to fail, because of alloca usage, for
 # example.
 INHIBIT_LIBC_CFLAGS = -Dinhibit_libc
-
-gomp-acc_on_device.o: $(srcdir)/config/nvptx/gomp-acc_on_device.c
-	$(gcc_compile) -c -fno-builtin-acc_on_device $<
-gomp-atomic.o: $(srcdir)/config/nvptx/gomp-atomic.asm
-	cp $< $@
-
-OBJS_libgomp= gomp-acc_on_device.o gomp-atomic.o
-libgomp.a: $(OBJS_libgomp)
-	$(AR_CREATE_FOR_TARGET) $@ $(OBJS_libgomp)
-libgomp.spec:
-	echo "*link_gomp: -lgomp" >$@
diff --git libgomp/config/nvptx/critical.c libgomp/config/nvptx/critical.c
index e69de29..1f55aad 100644
--- libgomp/config/nvptx/critical.c
+++ libgomp/config/nvptx/critical.c
@@ -0,0 +1,57 @@
+/* GOMP atomic routines
+
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+   Contributed by Mentor Embedded.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+__asm__ ("// BEGIN VAR DEF: libgomp_ptx_lock\n"
+         ".global .align 4 .u32 libgomp_ptx_lock;\n"
+         "\n"
+         "// BEGIN GLOBAL FUNCTION DECL: GOMP_atomic_start\n"
+         ".visible .func GOMP_atomic_start;\n"
+         "// BEGIN GLOBAL FUNCTION DEF: GOMP_atomic_start\n"
+         ".visible .func GOMP_atomic_start\n"
+         "{\n"
+         "  .reg .pred %p<2>;\n"
+         "  .reg .s32 %r<2>;\n"
+         "  .reg .s64 %rd<2>;\n"
+         "BB5_1:\n"
+         "  mov.u64 %rd1, libgomp_ptx_lock;\n"
+         "  atom.global.cas.b32 %r1, [%rd1], 0, 1;\n"
+         "  setp.ne.s32 %p1, %r1, 0;\n"
+         "  @%p1 bra BB5_1;\n"
+         "  ret;\n"
+         "  }\n"
+         "// BEGIN GLOBAL FUNCTION DECL: GOMP_atomic_end\n"
+         ".visible .func GOMP_atomic_end;\n"
+         "// BEGIN GLOBAL FUNCTION DEF: GOMP_atomic_end\n"
+         ".visible .func GOMP_atomic_end\n"
+         "{\n"
+         "  .reg .s32 %r<2>;\n"
+         "  .reg .s64 %rd<2>;\n"
+         "  mov.u64 %rd1, libgomp_ptx_lock;\n"
+         "  atom.global.exch.b32 %r1, [%rd1], 0;\n"
+         "  ret;\n"
+         "  }");
diff --git libgomp/config/nvptx/oacc-init.c libgomp/config/nvptx/oacc-init.c
index e69de29..e2c54c9 100644
--- libgomp/config/nvptx/oacc-init.c
+++ libgomp/config/nvptx/oacc-init.c
@@ -0,0 +1,35 @@
+/* OpenACC Runtime initialization routines
+
+   Copyright (C) 2014-2015 Free Software Foundation, Inc.
+
+   Contributed by Mentor Embedded.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#include "openacc.h"
+
+int
+acc_on_device (acc_device_t d)
+{
+  return __builtin_acc_on_device (d);
+}

This is obviously still very bare-bones, but it works and can be
extended later.

> we build libgcc.a, libc.a, libm.a, libgfortran.a (and in the future hopefully libgomp.a),
> nothing attempts to link those in :(.

Together with the changes outlined above, I'd then work on merging the
nvptx linking code present on gomp-4_0-branch into trunk.  OK?


For reference, your second patch (U2):

> --- libgomp/configure.tgt.jj  2015-04-21 08:38:00.000000000 +0200
> +++ libgomp/configure.tgt     2015-04-21 10:59:30.857197475 +0200
> @@ -151,6 +151,10 @@ case "${target}" in
>          XLDFLAGS="${XLDFLAGS} -lpthread"
>          ;;
>  
> +  nvptx*-*-*)
> +        config_path="nvptx"
> +        ;;
> +
>    *)
>          ;;
>  
> --- libgomp/libgomp.h.jj      2015-04-21 08:38:00.000000000 +0200
> +++ libgomp/libgomp.h 2015-04-21 11:15:35.952217394 +0200
> @@ -40,7 +40,9 @@
>  #include "gstdint.h"
>  #include "libgomp-plugin.h"
>  
> +#ifdef HAVE_PTHREAD_H
>  #include
> +#endif
>  #include
>  #include
>  #include
> --- libgomp/configure.ac.jj   2015-04-21 08:38:00.000000000 +0200
> +++ libgomp/configure.ac      2015-04-21 11:06:38.418117846 +0200
> @@ -179,6 +179,9 @@ case "$host" in
>    *-*-rtems*)
>      # RTEMS supports Pthreads, but the library is not available at GCC build time.
>      ;;
> +  nvptx*-*-*)
> +    # NVPTX does not support Pthreads, has its own code replacement.
> +    ;;
>    *)
>      # Check to see if -pthread or -lpthread is needed.  Prefer the former.
>      # In case the pthread.h system header is not found, this test will fail.
> --- configure.ac.jj   2015-04-21 08:38:09.000000000 +0200
> +++ configure.ac      2015-04-21 09:14:50.107827544 +0200
> @@ -539,6 +539,9 @@ if test x$enable_libgomp = x ; then
>          ;;
>      *-*-darwin* | *-*-aix*)
>          ;;
> +    # And on NVPTX as an offloading target.
> +    nvptx*-*-*)
> +        ;;
>      *)
>          noconfigdirs="$noconfigdirs target-libgomp"
>          ;;


Grüße,
 Thomas
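
A note for readers who don't read PTX: the GOMP_atomic_start and
GOMP_atomic_end routines in the critical.c patch amount to a single global
compare-and-swap spin lock.  What follows is only a rough host-side sketch
of that logic using C11 atomics, not code from the patch; the names
sketch_lock, sketch_atomic_start and sketch_atomic_end are hypothetical
and exist only for illustration.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for libgomp_ptx_lock: 0 = free, 1 = held.
   Static storage duration, so it starts out zero-initialized.  */
static atomic_uint sketch_lock;

/* Mirrors the atom.global.cas.b32 loop: keep trying to swap 0 -> 1
   until the swap succeeds, i.e. until the lock was observed free.  */
static void
sketch_atomic_start (void)
{
  unsigned int expected;
  do
    expected = 0;  /* A failed CAS overwrites expected, so reset it.  */
  while (!atomic_compare_exchange_weak (&sketch_lock, &expected, 1));
}

/* Mirrors atom.global.exch.b32: unconditionally store 0 to release.  */
static void
sketch_atomic_end (void)
{
  atomic_exchange (&sketch_lock, 0);
}
```

A caller would bracket the protected region with sketch_atomic_start ()
and sketch_atomic_end ().  Note that this sketch says nothing about how
such a busy-wait lock behaves among the threads of a GPU warp; that is
outside what the patch above discusses.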