Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
From: "Harwath, Frederik"
Subject: Re: [PATCH] Add OpenACC 2.6 `acc_get_property' support
CC: Andrew Stubbs, Julian Brown, Tobias Burnus, Jakub Jelinek
References: <20191113153215.17750-1-frederik@codesourcery.com> <87imp01jr3.fsf@euler.schwinge.homeip.net> <20191114153531.7493-1-frederik@codesourcery.com> <87v9qfyiyz.fsf@euler.schwinge.homeip.net>
To: GCC Patches, Thomas Schwinge
Message-ID:
<1c2e2d57-31ce-8e4d-c8b9-c2fbc7091e86@codesourcery.com>
Date: Fri, 20 Dec 2019 17:11:00 -0000
In-Reply-To: <87v9qfyiyz.fsf@euler.schwinge.homeip.net>
Content-Type: multipart/mixed; boundary="------------71E4B669165A735C64136B22"

--------------71E4B669165A735C64136B22
Content-Type: text/plain; charset="utf-8"

Hi Thomas,

thanks for the review! I have attached a revised patch.

> > There is no AMD GCN support yet. This will be added later on.
>
> ACK, just to note that there now is a 'libgomp/plugin/plugin-gcn.c' that
> at least needs to get a stub implementation (can mostly copy from
> 'libgomp/plugin/plugin-hsa.c'?) as otherwise the build will fail.

Yes, I have added a stub. A full implementation will follow soon.
The implementation in the OG9 branch that Andrew mentioned will need a bit of polishing.

> Tobias has generally reviewed the Fortran bits, correct?

Yes, he has done that internally.

> | Before Frederik starts working on integrating this into GCC trunk, do you
> | (Jakub) agree with the libgomp plugin interface changes as implemented by
> | Maciej? For example, top-level 'GOMP_OFFLOAD_get_property' function in
> | 'struct gomp_device_descr' instead of stuffing this into its
> | 'acc_dispatch_t openacc'. (I never understood why the OpenACC functions
> | need to be segregated like they are.)
>
> Jakub didn't answer, but I now myself decided that we should group this
> with the other OpenACC libgomp-plugin functions, as this interface is
> defined in terms of OpenACC-specific stuff such as 'acc_device_t'.
> Frederik, please work on that, also try to move function definitions etc.
> into appropriate places in case they aren't; ask if you need help.
> That needs to be updated.
Is it ok to do this in a separate follow-up patch?

> >  .../acc-get-property-2.c                      |  68 +++++++++
> >  .../acc-get-property-3.c                      |  19 +++
> >  .../acc-get-property-aux.c                    |  60 ++++++++
> >  .../acc-get-property.c                        |  75 ++++++++++
> >  .../libgomp.oacc-fortran/acc-get-property.f90 |  80 ++++++++++
>
> Please name all these 'acc_get_property*', which is the name of the
> interface tested.

Ok.

> > --- a/include/gomp-constants.h
> > +++ b/include/gomp-constants.h
> > @@ -178,6 +178,20 @@ enum gomp_map_kind
> >
> >  #define GOMP_DEVICE_ICV -1
> >  #define GOMP_DEVICE_HOST_FALLBACK -2
> > +#define GOMP_DEVICE_CURRENT -3
> [...]
>
> Not sure if this should be grouped with 'GOMP_DEVICE_ICV',
> 'GOMP_DEVICE_HOST_FALLBACK', for it is not related to there.
>
> [...]
>
> Should this actually get value '-1' instead of '-3'? Or, is the OpenACC
> 'acc_device_t' code already paying special attention to negative values
> '-1', '-2'? (I don't think so.)
> | Also, 'acc_device_current' is a libgomp-internal thing (doesn't interface
> | with the compiler proper), so strictly speaking 'GOMP_DEVICE_CURRENT'
> | isn't needed in 'include/gomp-constants.h'. But probably still a good
> | idea to list it there, in this canonical place, to keep the several lists
> | of device types coherent.
> still wonder about that... ;-)

I have removed GOMP_DEVICE_CURRENT from gomp-constants.h.
Changing the value of GOMP_DEVICE_ICV violates the following static asserts in oacc-parallel.c:

  /* In the ABI, the GOACC_FLAGs are encoded as an inverted bitmask, so that we
     continue to support the following two legacy values. */
  _Static_assert (GOACC_FLAGS_UNMARSHAL (GOMP_DEVICE_ICV) == 0,
                  "legacy GOMP_DEVICE_ICV broken");
  _Static_assert (GOACC_FLAGS_UNMARSHAL (GOMP_DEVICE_HOST_FALLBACK)
                  == GOACC_FLAG_HOST_FALLBACK,
                  "legacy GOMP_DEVICE_HOST_FALLBACK broken");

> > +/* Device property codes.
Keep in sync with
> > +   libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_device_property_t
>
> | Same thing, libgomp-internal, not sure whether to list these here?
>
> > +   as well as libgomp/libgomp-plugin.h. */
>
> (Not sure why 'libgomp/libgomp-plugin.h' is relevant here?)

It does not seem to be relevant. Right now, openacc_lib.h is also not relevant.
I have removed both file names from the comment.

> > +#define GOMP_DEVICE_PROPERTY_MEMORY 1
> > +#define GOMP_DEVICE_PROPERTY_FREE_MEMORY 2
> > +#define GOMP_DEVICE_PROPERTY_NAME 0x10001
> > +#define GOMP_DEVICE_PROPERTY_VENDOR 0x10002
> > +#define GOMP_DEVICE_PROPERTY_DRIVER 0x10003
> > +
> > +/* Internal property mask to tell numeric and string values apart. */
> > +#define GOMP_DEVICE_PROPERTY_STRING_MASK 0x10000
>
> (Maybe should use an 'enum'?)

I have changed this to an enum. However, this does not improve the code much,
since we cannot use the enum for the function arguments in the plugins
because gomp-constants.h is not included from there.

> Maybe this stuff should move from 'include/gomp-constants.h' to
> 'libgomp/oacc-int.h'. I'll think about that again, when I'm awake again
> tomorrow. ;-)

Have you made up your mind yet? :-)

> > --- a/libgomp/libgomp-plugin.h
> > +++ b/libgomp/libgomp-plugin.h
> > @@ -54,6 +54,13 @@ enum offload_target_type
> >    OFFLOAD_TARGET_TYPE_GCN = 8
> >  };
> >
> > +/* Container type for passing device properties. */
> > +union gomp_device_property_value
> > +{
> > +  void *ptr;
> > +  uintmax_t val;
> > +};
>
> Why wouldn't that be 'size_t', 'const char *', as the actual data types
> used? (Maybe I'm missing something.)

I do not see a reason for this either. Changed.
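For illustration only, here is a minimal self-contained sketch of how the revised union ('const char *' / 'size_t') works together with the string mask. The enum values, the mask and the union layout follow the revised patch; 'demo_get_property', 'demo_property', 'demo_property_string' and 'demo_usage' are hypothetical stand-ins (not libgomp functions), and the values they return are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Property codes as in the revised patch; string-valued properties
   have the 0x10000 bit set. */
enum gomp_device_property
{
  GOMP_DEVICE_PROPERTY_MEMORY = 1,
  GOMP_DEVICE_PROPERTY_FREE_MEMORY = 2,
  GOMP_DEVICE_PROPERTY_NAME = 0x10001,
  GOMP_DEVICE_PROPERTY_VENDOR = 0x10002,
  GOMP_DEVICE_PROPERTY_DRIVER = 0x10003
};

#define GOMP_DEVICE_PROPERTY_STRING_MASK 0x10000

/* Container type using the actual data types. */
union gomp_device_property_value
{
  const char *ptr;
  size_t val;
};

/* Hypothetical plugin hook; the returned values are invented. */
static union gomp_device_property_value
demo_get_property (int prop)
{
  switch (prop)
    {
    case GOMP_DEVICE_PROPERTY_MEMORY:
      return (union gomp_device_property_value) { .val = 1024 };
    case GOMP_DEVICE_PROPERTY_VENDOR:
      return (union gomp_device_property_value) { .ptr = "DEMO" };
    default:
      return (union gomp_device_property_value) { .val = 0 };
    }
}

/* Hypothetical numeric accessor: string-valued codes yield 0. */
static size_t
demo_property (int prop)
{
  if (prop & GOMP_DEVICE_PROPERTY_STRING_MASK)
    return 0;
  return demo_get_property (prop).val;
}

/* Hypothetical string accessor: numeric codes yield NULL. */
static const char *
demo_property_string (int prop)
{
  if (prop & GOMP_DEVICE_PROPERTY_STRING_MASK)
    return demo_get_property (prop).ptr;
  return NULL;
}

/* Usage: each accessor only reads the union member that matches
   the kind of property requested. */
static void
demo_usage (void)
{
  assert (demo_property (GOMP_DEVICE_PROPERTY_MEMORY) == 1024);
  assert (demo_property_string (GOMP_DEVICE_PROPERTY_MEMORY) == NULL);
  assert (strcmp (demo_property_string (GOMP_DEVICE_PROPERTY_VENDOR),
                  "DEMO") == 0);
}
```

With 'const char *' in the union, string literals can be stored without a cast, and the mask check keeps each accessor from reading the member that was not written.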
> > --- a/libgomp/libgomp.map
> > +++ b/libgomp/libgomp.map
> > @@ -502,6 +502,14 @@ GOACC_2.0.1 {
> >          GOACC_parallel_keyed;
> >  } GOACC_2.0;
> >
> > +OACC_2.6 {
> > +  global:
> > +        acc_get_property;
> > +        acc_get_property_h_;
> > +        acc_get_property_string;
> > +        acc_get_property_string_h_;
> > +} OACC_2.5;
> > +
> >  GOMP_PLUGIN_1.0 {
> >    global:
> >          GOMP_PLUGIN_malloc;
>
> That's not correct: 'OACC_2.6' should come after 'OACC_2.5.1', and
> also inherit from that one.

Fixed.

> > --- a/libgomp/oacc-init.c
> > +++ b/libgomp/oacc-init.c
>
> > +static union gomp_device_property_value
> > +get_property_any (int ord, acc_device_t d, acc_device_property_t prop)
> > +{
> > +  if (!acc_known_device_type (d))
> > +    unknown_device_type_error(d);
>
> Checking isn't needed here for this is an internal interface?

Right.

> > +
> > +  union gomp_device_property_value propval;
> > +  struct gomp_device_descr *dev;
> > +  struct goacc_thread *thr;
>
> Generally, in new code, we try to place these next to their first use.

Very reasonable. Adapted.

> > +
> > +  if (d == acc_device_none)
> > +    return (union gomp_device_property_value) { .val = 0 };
> > +
> > +  goacc_lazy_initialize ();
> > +  thr = goacc_thread ();
> > +
> > +  if (d == acc_device_current && (!thr || !thr->dev))
> > +    return (union gomp_device_property_value) { .val = 0 };
>
> Should we use a 'nullval' here, as used elsewhere?

We could, but we can also remove those checks completely.

> Also, this checking seems a bit convoluted; shouldn't this be integrated
> into the following? It's certainly not necessary to special-case
> 'acc_device_none' before 'goacc_lazy_initialize' etc.?

Yes, I suppose that the original implementer wanted to handle those boundary cases as efficiently as possible.
But it is not strictly necessary to do this and I agree that the code becomes more readable without those checks.
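For illustration only, a rough self-contained sketch of the lookup order without the early special cases: use the thread's current device if one is requested and available, otherwise range-check the ordinal and dispatch through the device's function pointer. Everything here ('struct demo_device', 'demo_devices', 'demo_backend_property', 'demo_get_property_any', 'demo_usage') is hypothetical and heavily simplified. There is no locking and no lazy initialization, an out-of-range ordinal returns -1 instead of reporting an error, and a missing current device simply falls back to the ordinal lookup, which is not what libgomp does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down device descriptor. */
struct demo_device
{
  int target_id;
  size_t (*get_property_func) (int target_id, int prop);
};

/* Made-up backend: reports target_id * 100 + prop. */
static size_t
demo_backend_property (int target_id, int prop)
{
  return (size_t) (target_id * 100 + prop);
}

static struct demo_device demo_devices[] = {
  { 0, demo_backend_property },
  { 1, demo_backend_property },
};

#define DEMO_NUM_DEVICES \
  ((int) (sizeof demo_devices / sizeof demo_devices[0]))

/* Query PROP of device ORD. If USE_CURRENT is set and CURRENT is
   non-NULL, the current device is queried instead. Stores the value
   in *OUT and returns 0, or returns -1 if ORD is out of range. */
static int
demo_get_property_any (int ord, int use_current,
                       struct demo_device *current, int prop, size_t *out)
{
  if (use_current && current)
    {
      *out = current->get_property_func (current->target_id, prop);
      return 0;
    }

  if (ord < 0 || ord >= DEMO_NUM_DEVICES)
    return -1;

  struct demo_device *dev = &demo_devices[ord];
  assert (dev->get_property_func);
  *out = dev->get_property_func (dev->target_id, prop);
  return 0;
}

/* Usage: one lookup by ordinal, one via the current device, and one
   rejected ordinal. */
static void
demo_usage (void)
{
  size_t v = 0;
  assert (demo_get_property_any (1, 0, NULL, 2, &v) == 0 && v == 102);
  assert (demo_get_property_any (5, 1, &demo_devices[0], 1, &v) == 0
          && v == 1);
  assert (demo_get_property_any (5, 0, NULL, 2, &v) == -1);
}
```

The point of the sketch is only the shape of the control flow: a single fast path for the current device, followed by one generic path that performs all the validation.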
> > +
> > +  if (d == acc_device_current)
> > +    {
> > +      dev = thr->dev;
> > +    }
> > +  else
> > +    {
> > +      int num_devices;
> > +
> > +      gomp_mutex_lock (&acc_device_lock);
> > +
> > +      dev = resolve_device (d, false);
>
> Why call this without 'fail_is_error' flag here?

Good question. I have set it to true to ensure that we get the device type checking that
resolve_device performs.

> > --- a/libgomp/openacc.f90
> > +++ b/libgomp/openacc.f90
> > @@ -28,7 +28,7 @@
> >  ! .
> >
> >  module openacc_kinds
> > -  use iso_fortran_env, only: int32
> > +  use iso_fortran_env, only: int32, int64
> >    implicit none
> >
> >    private :: int32
> > @@ -47,6 +47,21 @@ module openacc_kinds
> >    integer (acc_device_kind), parameter :: acc_device_not_host = 4
> >    integer (acc_device_kind), parameter :: acc_device_nvidia = 5
> >    integer (acc_device_kind), parameter :: acc_device_gcn = 8
> > +  integer (acc_device_kind), parameter :: acc_device_current = -3
> > +
> > +  public :: acc_device_property
> > +
> > +  integer, parameter :: acc_device_property = int64
>
> Why 'int64'? I changed this to 'int32', but please tell if there's a
> reason for 'int64'.

int32 is too narrow as - conforming to the OpenACC spec - acc_device_property
is also used for the return type of acc_get_property (a bit strange, isn't it?).
int64 also did not seem quite right. I have changed the type of acc_device_property
to c_size_t to match the type that is used internally and as the return type of the
corresponding C function.

> Is it a conscious decision that we're not supporting the new
> 'acc_get_property' interface via 'openacc_lib.h', which is (or, used to
> be) an alternative to the Fortran 'openacc' module?

It was not my decision to leave it out. I have to admit that I did not notice the omission.

> As of OpenACC 2.5, 'openacc_lib.h' has been deprecated ("no longer
> supported"), but so far, we continued to support it, and it's (maybe?)
> strange when that one now works for everything but the 'acc_get_property'
> interface? Or, is that a statement that users really should move to the
> Fortran 'openacc' module?

Should they? You are probably best qualified to answer this :-).

> > +typedef enum acc_device_property_t {
> > +  /* Keep in sync with include/gomp-constants.h. */
> > +  /* Start from 1 to catch uninitialized use. */
> > +  acc_property_memory = 1,
> > +  acc_property_free_memory = 2,
> > +  acc_property_name = 0x10001,
> > +  acc_property_vendor = 0x10002,
> > +  acc_property_driver = 0x10003
> > +} acc_device_property_t;
>
> Do we also need the magic here so that "Ensure enumeration is layout
> compatible with int"? But I see that is not done for the 'typedef enum
> acc_async_t' either. I don't remember the history behind that.

I understand the "Ensure enumeration is layout compatible with int" comment for
acc_device_t, but I fail to see how those values achieve this.
If you do not see a good reason to keep those magic values, I would change them
to 3, 4, 5 and if we need the "layout compatibility", I would rather do this as
it is done for acc_device_t.

> > --- a/libgomp/plugin/plugin-hsa.c
> > +++ b/libgomp/plugin/plugin-hsa.c
> > @@ -699,6 +699,32 @@ GOMP_OFFLOAD_get_num_devices (void)
> >    return hsa_context.agent_count;
> >  }
> >
> > +/* Part of the libgomp plugin interface. Return the value of property
> > +   PROP of agent number N. */
> > +
> > +union gomp_device_property_value
> > +GOMP_OFFLOAD_get_property (int n, int prop)
> > +{
> > +  union gomp_device_property_value nullval = { .val = 0 };
> > +
> > +  if (!init_hsa_context ())
> > +    return nullval;
>
> I'm not familiar with that code, but similar to other plugins,
> 'init_hsa_context' already is called via 'GOMP_OFFLOAD_get_num_devices'
> (and 'GOMP_OFFLOAD_init_device', hmm...), so probably don't need to call
> it here?
Since Martin Jambor wrote that the call does no harm, I would just keep it.

> > +
> > +  switch (prop)
> > +    {
> > +    case GOMP_DEVICE_PROPERTY_VENDOR:
> > +      return (union gomp_device_property_value) { .ptr = "AMD" };
> > +    default:
> > +      return nullval;
> > +    }
> > +}
>
> Not sure if "AMD" is actually correct here -- isn't HSA a
> vendor-independent standard?

I have changed "AMD" to "HSA".

> > +union gomp_device_property_value
> > +GOMP_OFFLOAD_get_property (int n, int prop)
> > +{
> > +  union gomp_device_property_value propval = { .val = 0 };
> > +
> > +  pthread_mutex_lock (&ptx_dev_lock);
>
> Everything (?) else seems to be accessing 'ptx_devices' without locking?
> (I don't quickly understand the locking protocol used there... Will look
> again tomorrow.)

GOMP_OFFLOAD_init_device, GOMP_OFFLOAD_fini_device take the lock before accessing
ptx_devices. I kept it.

> > +      CUDA_CALL_ERET (propval, cuCtxGetDevice, &ctxdev);
> > +      if (ptx_dev->dev == ctxdev)
> > +        CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
> > +      else if (ptx_dev->ctx)
> > +        {
> > +          CUcontext old_ctx;
> > +
> > +          CUDA_CALL_ERET (propval, cuCtxPushCurrent, ptx_dev->ctx);
> > +          CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
> > +          CUDA_CALL_ASSERT (cuCtxPopCurrent, &old_ctx);
> > +        }
> > +      else
> > +        {
> > +          CUcontext new_ctx;
> > +
> > +          CUDA_CALL_ERET (propval, cuCtxCreate, &new_ctx, CU_CTX_SCHED_AUTO,
> > +                          ptx_dev->dev);
> > +          CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem);
> > +          CUDA_CALL_ASSERT (cuCtxDestroy, new_ctx);
> > +        }
>
> (Have not yet reviewed that CUDA magic. Do you understand that?)

Yes, sort of. If the current thread's CUDA context is a context for the device,
perform the query (cuMemGetInfo) right away.
Otherwise, we have to set the context first to ensure that we query the right device.
Note that this is not necessary for the functions that are used to retrieve the values
of the other properties because they use CUDA functions that take a device argument.
If the ptx_dev already contains a context for the device, set it temporarily for the
duration of the query.
Otherwise, do the same with a newly created context for the device and dispose of the new
context afterwards.
The last case might arise only if the device has not been initialized, as
GOMP_OFFLOAD_init_device calls nvptx_open_device which creates the context ptx_dev->ctx
for ptx_dev->dev.

> > +    case GOMP_DEVICE_PROPERTY_DRIVER:
> > +      propval.ptr = cuda_driver_version;
> > +      break;
> > +    default:
> > +      GOMP_PLUGIN_error("Unknown OpenACC device-property");
> > +    }
>
> [...]
>
> I see 'libgomp/oacc-host.c:host_get_property',
> 'libgomp/plugin/plugin-hsa.c:GOMP_OFFLOAD_get_property',
> 'liboffloadmic/plugin/libgomp-plugin-intelmic.cpp:GOMP_OFFLOAD_get_property'
> do have a 'default: return nullval'; that's probably what we need to do
> here, too?

Yes, I have changed this and added test cases that check the return value for invalid
properties.

> > --- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
> [...]
> > +    case GOMP_DEVICE_PROPERTY_VENDOR:
> > +      /* TODO: "error: invalid conversion from 'const void*' to 'void*' [-fpermissive]" */
> > +      return (union gomp_device_property_value) { .ptr = (char *) "Intel" };
>
> Type cast maybe unnecessary per my 'libgomp/libgomp-plugin.h' comment above?

Correct.

Is it ok to commit the patch to trunk?
Best regards,
Frederik

--------------71E4B669165A735C64136B22
Content-Type: text/x-patch; charset="UTF-8"; name="0001-Add-OpenACC-2.6-acc_get_property-support.patch"
Content-Disposition: attachment; filename="0001-Add-OpenACC-2.6-acc_get_property-support.patch"

From c0849361fb847d5ffac47e48203e4377488305de Mon Sep 17 00:00:00 2001
From: Frederik Harwath
Date: Fri, 20 Dec 2019 17:24:36 +0100
Subject: [PATCH] Add OpenACC 2.6 `acc_get_property' support

Add generic support for the OpenACC 2.6 `acc_get_property' and
`acc_get_property_string' routines, as well as full handlers for the
host and the NVPTX offload targets and minimal handlers for the HSA,
Intel MIC, and AMD GCN offload targets.

Included are C/C++ and Fortran tests that, in particular, print
the property values for acc_property_vendor, acc_property_memory,
acc_property_free_memory, acc_property_name, and acc_property_driver.
The output looks as follows:

Vendor: GNU
Name: GOMP
Total memory: 0
Free memory: 0
Driver: 1.0

with the host driver (where the memory related properties are not
supported for the host device and yield 0, conforming to the standard)
and output like:

Vendor: Nvidia
Total memory: 12651462656
Free memory: 12202737664
Name: TITAN V
Driver: CUDA Driver 9.1

with the NVPTX driver.

2019-12-20  Maciej W. Rozycki
            Frederik Harwath
            Thomas Schwinge

        include/
        * gomp-constants.h (gomp_device_property): New enum.

        libgomp/
        * libgomp.h (gomp_device_descr): Add `get_property_func' member.
        * libgomp-plugin.h (gomp_device_property_value): New union.
        (gomp_device_property_value): New prototype.
        * openacc.h (acc_device_t): Add `acc_device_current' enumeration
        constant.
        (acc_device_property_t): New enum.
        (acc_get_property, acc_get_property_string): New prototypes.
        * oacc-init.c (acc_get_device_type): Also assert that result
        is not `acc_device_current'.
        (get_property_any, acc_get_property, acc_get_property_string):
        New functions.
        * openacc.f90 (openacc_kinds): Add `acc_device_current' and
        `acc_property_memory', `acc_property_free_memory',
        `acc_property_name', `acc_property_vendor' and
        `acc_property_driver' constants. Add `acc_device_property' data
        type.
        (openacc_internal): Add `acc_get_property' and
        `acc_get_property_string' interfaces. Add `acc_get_property_h',
        `acc_get_property_string_h', `acc_get_property_l' and
        `acc_get_property_string_l'.
        * oacc-host.c (host_get_property): New function.
        (host_dispatch): Wire it.
        * target.c (gomp_load_plugin_for_device): Handle `get_property'.
        * libgomp.map (OACC_2.6): Add `acc_get_property',
        `acc_get_property_h_', `acc_get_property_string' and
        `acc_get_property_string_h_' symbols.
        * libgomp.texi (OpenACC Runtime Library Routines): Add
        `acc_get_property'.
        (acc_get_property): New node.
        * plugin/plugin-gcn.c (GOMP_OFFLOAD_get_property): New
        function (stub).
        * plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): New function.
        * plugin/plugin-nvptx.c (CUDA_CALLS): Add `cuDeviceGetName',
        `cuDeviceTotalMem', `cuDriverGetVersion' and `cuMemGetInfo' calls.
        (GOMP_OFFLOAD_get_property): New function.
        (struct ptx_device): Add new field "name".
        (cuda_driver_version_s): Add new static variable ...
        (nvptx_init): ... and init from here.
        * testsuite/libgomp.oacc-c-c++-common/acc_get_property.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c: New
        file with test helper functions.
        * testsuite/libgomp.oacc-fortran/acc_get_property.f90: New test.

        liboffloadmic/
        * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
        New function.
Reviewed-by: Thomas Schwinge
---
 include/gomp-constants.h                      |  15 ++
 libgomp/libgomp-plugin.h                      |   8 ++
 libgomp/libgomp.h                             |   1 +
 libgomp/libgomp.map                           |   4 +
 libgomp/libgomp.texi                          |  39 ++++++
 libgomp/oacc-host.c                           |  22 +++
 libgomp/oacc-init.c                           |  63 ++++++++-
 libgomp/openacc.f90                           | 129 +++++++++++++++++-
 libgomp/openacc.h                             |  15 ++
 libgomp/plugin/cuda-lib.def                   |   4 +
 libgomp/plugin/plugin-gcn.c                   |  11 ++
 libgomp/plugin/plugin-hsa.c                   |  26 ++++
 libgomp/plugin/plugin-nvptx.c                 |  87 +++++++++++-
 libgomp/target.c                              |   1 +
 .../acc_get_property-2.c                      |  68 +++++++++
 .../acc_get_property-3.c                      |  19 +++
 .../acc_get_property-aux.c                    |  80 +++++++++++
 .../acc_get_property.c                        |  75 ++++++++++
 .../libgomp.oacc-fortran/acc_get_property.f90 |  92 +++++++++++++
 .../plugin/libgomp-plugin-intelmic.cpp        |  21 +++
 20 files changed, 774 insertions(+), 6 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/acc_get_property.f90

diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index fdae6ccc870..d14e8b0394a 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -195,6 +195,21 @@ enum gomp_map_kind
 #define GOMP_DEVICE_ICV -1
 #define GOMP_DEVICE_HOST_FALLBACK -2
 
+/* Device property codes. Keep in sync with
+   libgomp/{openacc.h,openacc.f90}:acc_device_property_t */
+/* Start from 1 to catch uninitialized use. */
+enum gomp_device_property
+  {
+    GOMP_DEVICE_PROPERTY_MEMORY = 1,
+    GOMP_DEVICE_PROPERTY_FREE_MEMORY = 2,
+    GOMP_DEVICE_PROPERTY_NAME = 0x10001,
+    GOMP_DEVICE_PROPERTY_VENDOR = 0x10002,
+    GOMP_DEVICE_PROPERTY_DRIVER = 0x10003
+  };
+
+/* Internal property mask to tell numeric and string values apart.
*/
+#define GOMP_DEVICE_PROPERTY_STRING_MASK 0x10000
+
 /* GOMP_task/GOMP_taskloop* flags argument. */
 #define GOMP_TASK_FLAG_UNTIED (1 << 0)
 #define GOMP_TASK_FLAG_FINAL (1 << 1)
diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index 037558c43f5..d3c6dc36276 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -54,6 +54,13 @@ enum offload_target_type
   OFFLOAD_TARGET_TYPE_GCN = 8
 };
 
+/* Container type for passing device properties. */
+union gomp_device_property_value
+{
+  const char *ptr;
+  size_t val;
+};
+
 /* Opaque type to represent plugin-dependent implementation of an
    OpenACC asynchronous queue. */
 struct goacc_asyncqueue;
@@ -94,6 +101,7 @@ extern const char *GOMP_OFFLOAD_get_name (void);
 extern unsigned int GOMP_OFFLOAD_get_caps (void);
 extern int GOMP_OFFLOAD_get_type (void);
 extern int GOMP_OFFLOAD_get_num_devices (void);
+extern union gomp_device_property_value GOMP_OFFLOAD_get_property (int, int);
 extern bool GOMP_OFFLOAD_init_device (int);
 extern bool GOMP_OFFLOAD_fini_device (int);
 extern unsigned GOMP_OFFLOAD_version (void);
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index c9653575208..24c76698c4e 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1113,6 +1113,7 @@ struct gomp_device_descr
   __typeof (GOMP_OFFLOAD_get_caps) *get_caps_func;
   __typeof (GOMP_OFFLOAD_get_type) *get_type_func;
   __typeof (GOMP_OFFLOAD_get_num_devices) *get_num_devices_func;
+  __typeof (GOMP_OFFLOAD_get_property) *get_property_func;
   __typeof (GOMP_OFFLOAD_init_device) *init_device_func;
   __typeof (GOMP_OFFLOAD_fini_device) *fini_device_func;
   __typeof (GOMP_OFFLOAD_version) *version_func;
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 63276f7d29b..c7268bfc8e7 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -492,6 +492,10 @@ OACC_2.6 {
         acc_detach_async;
         acc_detach_finalize;
         acc_detach_finalize_async;
+        acc_get_property;
+        acc_get_property_h_;
+        acc_get_property_string;
+
acc_get_property_string_h_;
 } OACC_2.5.1;
 
 GOACC_2.0 {
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index ac9d38e01d7..5f8f1beedaf 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -1849,6 +1849,7 @@ acceleration device.
 * acc_get_device_type::         Get type of device accelerator to be used.
 * acc_set_device_num::          Set device number to use.
 * acc_get_device_num::          Get device number to be used.
+* acc_get_property::            Get device property.
 * acc_async_test::              Tests for completion of a specific asynchronous
                                 operation.
 * acc_async_test_all::          Tests for completion of all asychronous
@@ -2038,6 +2039,44 @@ region.
 
 
 
+@node acc_get_property
+@section @code{acc_get_property} -- Get device property.
+@cindex acc_get_property
+@cindex acc_get_property_string
+@table @asis
+@item @emph{Description}
+These routines return the value of the specified @var{property} for the
+device being queried according to @var{devicenum} and @var{devicetype}.
+Integer-valued and string-valued properties are returned by
+@code{acc_get_property} and @code{acc_get_property_string} respectively.
+The Fortran @code{acc_get_property_string} subroutine returns the string
+retrieved in its fourth argument while the remaining entry points are
+functions, which pass the return value as their result.
+
+@item @emph{C/C++}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
+@item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
+@end multitable
+
+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
+@item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
+@item                   @tab @code{integer devicenum}
+@item                   @tab @code{integer(kind=acc_device_kind) devicetype}
+@item                   @tab @code{integer(kind=acc_device_property) property}
+@item                   @tab @code{integer(kind=acc_device_property) acc_get_property}
+@item                   @tab @code{character(*) string}
+@end multitable
+
+@item @emph{Reference}:
+@uref{https://www.openacc.org, OpenACC specification v2.6}, section
+3.2.6.
+@end table
+
+
+
 @node acc_async_test
 @section @code{acc_async_test} -- Test for completion of a specific asynchronous operation.
 @table @asis
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index 845140f04f5..ec9e3247a1a 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -59,6 +59,27 @@ host_get_num_devices (void)
   return 1;
 }
 
+static union gomp_device_property_value
+host_get_property (int n, int prop)
+{
+  union gomp_device_property_value nullval = { .val = 0 };
+
+  if (n >= host_get_num_devices ())
+    return nullval;
+
+  switch (prop)
+    {
+    case GOMP_DEVICE_PROPERTY_NAME:
+      return (union gomp_device_property_value) { .ptr = "GOMP" };
+    case GOMP_DEVICE_PROPERTY_VENDOR:
+      return (union gomp_device_property_value) { .ptr = "GNU" };
+    case GOMP_DEVICE_PROPERTY_DRIVER:
+      return (union gomp_device_property_value) { .ptr = VERSION };
+    default:
+      return nullval;
+    }
+}
+
 static bool
 host_init_device (int n __attribute__ ((unused)))
 {
@@ -248,6 +269,7 @@ static struct gomp_device_descr host_dispatch =
     .get_caps_func = host_get_caps,
     .get_type_func = host_get_type,
     .get_num_devices_func = host_get_num_devices,
+    .get_property_func = host_get_property,
     .init_device_func = host_init_device,
     .fini_device_func = host_fini_device,
     .version_func = host_version,
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index dd88b58a379..487a2cca61f 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -670,7 +670,8 @@ acc_get_device_type (void)
     }
 
   assert (res != acc_device_default
-          && res != acc_device_not_host);
+          && res != acc_device_not_host
+          && res != acc_device_current);
 
   return res;
 }
@@ -759,6 +760,66 @@ acc_set_device_num (int ord, acc_device_t d)
 
 ialias (acc_set_device_num)
 
+static union gomp_device_property_value
+get_property_any (int ord, acc_device_t d, acc_device_property_t prop)
+{
+  goacc_lazy_initialize ();
+  struct goacc_thread *thr = goacc_thread ();
+
+  if (d == acc_device_current && thr && thr->dev)
+    return thr->dev->get_property_func (thr->dev->target_id, prop);
+
+  gomp_mutex_lock
(&acc_device_lock);
+
+  struct gomp_device_descr *dev = resolve_device (d, true);
+
+  int num_devices = dev->get_num_devices_func ();
+
+  if (num_devices <= 0 || ord >= num_devices)
+    acc_dev_num_out_of_range (d, ord, num_devices);
+
+  dev += ord;
+
+  gomp_mutex_lock (&dev->lock);
+  if (dev->state == GOMP_DEVICE_UNINITIALIZED)
+    gomp_init_device (dev);
+  gomp_mutex_unlock (&dev->lock);
+
+  gomp_mutex_unlock (&acc_device_lock);
+
+  assert (dev);
+
+  return dev->get_property_func (dev->target_id, prop);
+}
+
+size_t
+acc_get_property (int ord, acc_device_t d, acc_device_property_t prop)
+{
+  if (!known_device_type_p (d))
+    unknown_device_type_error(d);
+
+  if (prop & GOMP_DEVICE_PROPERTY_STRING_MASK)
+    return 0;
+  else
+    return get_property_any (ord, d, prop).val;
+}
+
+ialias (acc_get_property)
+
+const char *
+acc_get_property_string (int ord, acc_device_t d, acc_device_property_t prop)
+{
+  if (!known_device_type_p (d))
+    unknown_device_type_error(d);
+
+  if (prop & GOMP_DEVICE_PROPERTY_STRING_MASK)
+    return get_property_any (ord, d, prop).ptr;
+  else
+    return NULL;
+}
+
+ialias (acc_get_property_string)
+
 /* For -O and higher, the compiler always attempts to expand acc_on_device, but
    if the user disables the builtin, or calls it via a pointer, we'll need this
    version.
diff --git a/libgomp/openacc.f90 b/libgomp/openacc.f90
index fb7fc6e6d77..e5b4b40c3cc 100644
--- a/libgomp/openacc.f90
+++ b/libgomp/openacc.f90
@@ -31,16 +31,18 @@
 
 module openacc_kinds
   use iso_fortran_env, only: int32
+  use iso_c_binding, only: c_size_t
   implicit none
 
   public
-  private :: int32
+  private :: int32, c_size_t
 
   ! When adding items, also update 'public' setting in 'module openacc' below.
 
   integer, parameter :: acc_device_kind = int32
 
   ! Keep in sync with include/gomp-constants.h.
+  integer (acc_device_kind), parameter :: acc_device_current = -3
   integer (acc_device_kind), parameter :: acc_device_none = 0
   integer (acc_device_kind), parameter :: acc_device_default = 1
   integer (acc_device_kind), parameter :: acc_device_host = 2
@@ -49,6 +51,15 @@ module openacc_kinds
   integer (acc_device_kind), parameter :: acc_device_nvidia = 5
   integer (acc_device_kind), parameter :: acc_device_gcn = 8
 
+  integer, parameter :: acc_device_property = c_size_t
+
+  ! Keep in sync with include/gomp-constants.h.
+  integer (acc_device_property), parameter :: acc_property_memory = 1
+  integer (acc_device_property), parameter :: acc_property_free_memory = 2
+  integer (acc_device_property), parameter :: acc_property_name = int(Z'10001')
+  integer (acc_device_property), parameter :: acc_property_vendor = int(Z'10002')
+  integer (acc_device_property), parameter :: acc_property_driver = int(Z'10003')
+
   integer, parameter :: acc_handle_kind = int32
 
   ! Keep in sync with include/gomp-constants.h.
@@ -89,6 +100,24 @@ module openacc_internal
       integer (acc_device_kind) d
     end function
 
+    function acc_get_property_h (n, d, p)
+      import
+      implicit none (type, external)
+      integer (acc_device_property) :: acc_get_property_h
+      integer, value :: n
+      integer (acc_device_kind), value :: d
+      integer (acc_device_property), value :: p
+    end function
+
+    subroutine acc_get_property_string_h (n, d, p, s)
+      import
+      implicit none (type, external)
+      integer, value :: n
+      integer (acc_device_kind), value :: d
+      integer (acc_device_property), value :: p
+      character (*) :: s
+    end subroutine
+
     function acc_async_test_h (a)
       logical acc_async_test_h
       integer a
@@ -508,6 +537,26 @@ module openacc_internal
       integer (c_int), value :: d
     end function
 
+    function acc_get_property_l (n, d, p) &
+        bind (C, name = "acc_get_property")
+      use iso_c_binding, only: c_int, c_size_t
+      implicit none (type, external)
+      integer (c_size_t) :: acc_get_property_l
+      integer (c_int), value :: n
+      integer (c_int), value :: d
+      integer (c_int), value :: p
+    end function
+
+    function acc_get_property_string_l (n, d, p) &
+        bind (C, name = "acc_get_property_string")
+      use iso_c_binding, only: c_int, c_ptr
+      implicit none (type, external)
+      type (c_ptr) :: acc_get_property_string_l
+      integer (c_int), value :: n
+      integer (c_int), value :: d
+      integer (c_int), value :: p
+    end function
+
     function acc_async_test_l (a) &
         bind (C, name = "acc_async_test")
       use iso_c_binding, only: c_int
@@ -716,16 +765,23 @@ module openacc
   private
 
   !
From openacc_kinds - public :: acc_device_kind, acc_handle_kind + public :: acc_device_kind public :: acc_device_none, acc_device_default, acc_device_host public :: acc_device_not_host, acc_device_nvidia, acc_device_gcn + + public :: acc_device_property + public :: acc_property_memory, acc_property_free_memory + public :: acc_property_name, acc_property_vendor, acc_property_driver + + public :: acc_handle_kind public :: acc_async_noval, acc_async_sync =20 public :: openacc_version =20 public :: acc_get_num_devices, acc_set_device_type, acc_get_device_type - public :: acc_set_device_num, acc_get_device_num, acc_async_test - public :: acc_async_test_all + public :: acc_set_device_num, acc_get_device_num + public :: acc_get_property, acc_get_property_string + public :: acc_async_test, acc_async_test_all public :: acc_wait, acc_async_wait, acc_wait_async public :: acc_wait_all, acc_async_wait_all, acc_wait_all_async public :: acc_init, acc_shutdown, acc_on_device @@ -758,6 +814,14 @@ module openacc procedure :: acc_get_device_num_h end interface =20 + interface acc_get_property + procedure :: acc_get_property_h + end interface + + interface acc_get_property_string + procedure :: acc_get_property_string_h + end interface + interface acc_async_test procedure :: acc_async_test_h end interface @@ -976,6 +1040,63 @@ function acc_get_device_num_h (d) acc_get_device_num_h =3D acc_get_device_num_l (d) end function =20 +function acc_get_property_h (n, d, p) + use iso_c_binding, only: c_int, c_size_t + use openacc_internal, only: acc_get_property_l + use openacc_kinds + implicit none (type, external) + integer (acc_device_property) :: acc_get_property_h + integer, value :: n + integer (acc_device_kind), value :: d + integer (acc_device_property), value :: p + + integer (c_int) :: pint + + pint =3D int (p, c_int) + acc_get_property_h =3D acc_get_property_l (n, d, pint) +end function + +subroutine acc_get_property_string_h (n, d, p, s) + use iso_c_binding, only: c_char, c_int, 
c_ptr, c_f_pointer, c_associated + use openacc_internal, only: acc_get_property_string_l + use openacc_kinds + implicit none (type, external) + integer, value :: n + integer (acc_device_kind), value :: d + integer (acc_device_property), value :: p + character (*) :: s + + integer (c_int) :: pint + type (c_ptr) :: cptr + integer :: clen + character (kind=3Dc_char, len=3D1), pointer, contiguous :: sptr (:) + integer :: slen + integer :: i + + interface + function strlen (s) bind (C, name =3D "strlen") + use iso_c_binding, only: c_ptr, c_size_t + type (c_ptr), intent(in), value :: s + integer (c_size_t) :: strlen + end function strlen + end interface + + pint =3D int (p, c_int) + cptr =3D acc_get_property_string_l (n, d, pint) + s =3D "" + if (.not. c_associated (cptr)) then + return + end if + + clen =3D int (strlen (cptr)) + call c_f_pointer (cptr, sptr, [clen]) + + slen =3D min (clen, len (s)) + do i =3D 1, slen + s (i:i) =3D sptr (i) + end do +end subroutine + function acc_async_test_h (a) use openacc_internal, only: acc_async_test_l logical acc_async_test_h diff --git a/libgomp/openacc.h b/libgomp/openacc.h index d2e5c101f7f..9b143064b7d 100644 --- a/libgomp/openacc.h +++ b/libgomp/openacc.h @@ -49,6 +49,7 @@ extern "C" { /* Types */ typedef enum acc_device_t { /* Keep in sync with include/gomp-constants.h. */ + acc_device_current =3D -3, acc_device_none =3D 0, acc_device_default =3D 1, acc_device_host =3D 2, @@ -62,6 +63,16 @@ typedef enum acc_device_t { _ACC_neg =3D -1 } acc_device_t; =20 +typedef enum acc_device_property_t { + /* Keep in sync with include/gomp-constants.h. */ + /* Start from 1 to catch uninitialized use. */ + acc_property_memory =3D 1, + acc_property_free_memory =3D 2, + acc_property_name =3D 0x10001, + acc_property_vendor =3D 0x10002, + acc_property_driver =3D 0x10003 +} acc_device_property_t; + typedef enum acc_async_t { /* Keep in sync with include/gomp-constants.h. 
*/ acc_async_noval =3D -1, @@ -73,6 +84,10 @@ void acc_set_device_type (acc_device_t) __GOACC_NOTHROW; acc_device_t acc_get_device_type (void) __GOACC_NOTHROW; void acc_set_device_num (int, acc_device_t) __GOACC_NOTHROW; int acc_get_device_num (acc_device_t) __GOACC_NOTHROW; +size_t acc_get_property + (int, acc_device_t, acc_device_property_t) __GOACC_NOTHROW; +const char *acc_get_property_string + (int, acc_device_t, acc_device_property_t) __GOACC_NOTHROW; int acc_async_test (int) __GOACC_NOTHROW; int acc_async_test_all (void) __GOACC_NOTHROW; void acc_wait (int) __GOACC_NOTHROW; diff --git a/libgomp/plugin/cuda-lib.def b/libgomp/plugin/cuda-lib.def index a16badcfa9d..cd91b39b1d2 100644 --- a/libgomp/plugin/cuda-lib.def +++ b/libgomp/plugin/cuda-lib.def @@ -8,6 +8,9 @@ CUDA_ONE_CALL (cuCtxSynchronize) CUDA_ONE_CALL (cuDeviceGet) CUDA_ONE_CALL (cuDeviceGetAttribute) CUDA_ONE_CALL (cuDeviceGetCount) +CUDA_ONE_CALL (cuDeviceGetName) +CUDA_ONE_CALL (cuDeviceTotalMem) +CUDA_ONE_CALL (cuDriverGetVersion) CUDA_ONE_CALL (cuEventCreate) CUDA_ONE_CALL (cuEventDestroy) CUDA_ONE_CALL (cuEventElapsedTime) @@ -35,6 +38,7 @@ CUDA_ONE_CALL (cuMemcpyHtoDAsync) CUDA_ONE_CALL (cuMemFree) CUDA_ONE_CALL (cuMemFreeHost) CUDA_ONE_CALL (cuMemGetAddressRange) +CUDA_ONE_CALL (cuMemGetInfo) CUDA_ONE_CALL (cuMemHostGetDevicePointer) CUDA_ONE_CALL (cuModuleGetFunction) CUDA_ONE_CALL (cuModuleGetGlobal) diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index 04fe472a70d..32239c71c61 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -3236,6 +3236,17 @@ GOMP_OFFLOAD_get_num_devices (void) return hsa_context.agent_count; } =20 +union gomp_device_property_value +GOMP_OFFLOAD_get_property (int device, int prop) +{ + /* Stub. Check device and return default value for unsupported propertie= s. */ + /* TODO: Implement this function. 
*/ + get_agent_info (device); + + union gomp_device_property_value nullval =3D { .val =3D 0 }; + return nullval; +} + /* Initialize device (agent) number N so that it can be used for computati= on. Return TRUE on success. */ =20 diff --git a/libgomp/plugin/plugin-hsa.c b/libgomp/plugin/plugin-hsa.c index 409e138aaca..259f704b2e9 100644 --- a/libgomp/plugin/plugin-hsa.c +++ b/libgomp/plugin/plugin-hsa.c @@ -699,6 +699,32 @@ GOMP_OFFLOAD_get_num_devices (void) return hsa_context.agent_count; } =20 +/* Part of the libgomp plugin interface. Return the value of property + PROP of agent number N. */ + +union gomp_device_property_value +GOMP_OFFLOAD_get_property (int n, int prop) +{ + union gomp_device_property_value nullval =3D { .val =3D 0 }; + + if (!init_hsa_context ()) + return nullval; + if (n >=3D hsa_context.agent_count) + { + GOMP_PLUGIN_error + ("Request for a property of a non-existing HSA device %i", n); + return nullval; + } + + switch (prop) + { + case GOMP_DEVICE_PROPERTY_VENDOR: + return (union gomp_device_property_value) { .ptr =3D "HSA" }; + default: + return nullval; + } +} + /* Part of the libgomp plugin interface. Initialize agent number N so tha= t it can be used for computation. Return TRUE on success. */ =20 diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index 911d0f66a6e..80e547541e3 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -189,6 +189,10 @@ cuda_error (CUresult r) return fallback; } =20 +/* Version of the CUDA Toolkit in the same MAJOR.MINOR format that is used= by + Nvidia, such as in the 'deviceQuery' program (Nvidia's CUDA samples). 
*/ +static char cuda_driver_version_s[30]; + static unsigned int instantiated_devices =3D 0; static pthread_mutex_t ptx_dev_lock =3D PTHREAD_MUTEX_INITIALIZER; =20 @@ -284,7 +288,7 @@ struct ptx_device bool map; bool concur; bool mkern; - int mode; + int mode; int clock_khz; int num_sms; int regs_per_block; @@ -294,6 +298,9 @@ struct ptx_device int max_threads_per_multiprocessor; int default_dims[GOMP_DIM_MAX]; =20 + /* Length as used by the CUDA Runtime API ('struct cudaDeviceProp'). */ + char name[256]; + struct ptx_image_data *images; /* Images loaded on device. */ pthread_mutex_t image_lock; /* Lock for above list. */ =20 @@ -327,9 +334,16 @@ nvptx_init (void) =20 CUDA_CALL (cuInit, 0); =20 + int cuda_driver_version; + CUDA_CALL_ERET (NULL, cuDriverGetVersion, &cuda_driver_version); + snprintf (cuda_driver_version_s, sizeof cuda_driver_version_s, + "CUDA Driver %u.%u", + cuda_driver_version / 1000, cuda_driver_version % 1000 / 10); + CUDA_CALL (cuDeviceGetCount, &ndevs); ptx_devices =3D GOMP_PLUGIN_malloc_cleared (sizeof (struct ptx_device *) * ndevs); + return true; } =20 @@ -491,6 +505,9 @@ nvptx_open_device (int n) for (int i =3D 0; i !=3D GOMP_DIM_MAX; i++) ptx_dev->default_dims[i] =3D 0; =20 + CUDA_CALL_ERET (NULL, cuDeviceGetName, ptx_dev->name, sizeof ptx_dev->na= me, + dev); + ptx_dev->images =3D NULL; pthread_mutex_init (&ptx_dev->image_lock, NULL); =20 @@ -1104,6 +1121,74 @@ GOMP_OFFLOAD_get_num_devices (void) return nvptx_get_num_devices (); } =20 +union gomp_device_property_value +GOMP_OFFLOAD_get_property (int n, int prop) +{ + union gomp_device_property_value propval =3D { .val =3D 0 }; + + pthread_mutex_lock (&ptx_dev_lock); + + if (n >=3D nvptx_get_num_devices () || n < 0 || ptx_devices[n] =3D=3D NU= LL) + { + pthread_mutex_unlock (&ptx_dev_lock); + return propval; + } + + struct ptx_device *ptx_dev =3D ptx_devices[n]; + switch (prop) + { + case GOMP_DEVICE_PROPERTY_MEMORY: + { + size_t total_mem; + + CUDA_CALL_ERET (propval, cuDeviceTotalMem, 
&total_mem, ptx_dev->dev); + propval.val =3D total_mem; + } + break; + case GOMP_DEVICE_PROPERTY_FREE_MEMORY: + { + size_t total_mem; + size_t free_mem; + CUdevice ctxdev; + + CUDA_CALL_ERET (propval, cuCtxGetDevice, &ctxdev); + if (ptx_dev->dev =3D=3D ctxdev) + CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem); + else if (ptx_dev->ctx) + { + CUcontext old_ctx; + + CUDA_CALL_ERET (propval, cuCtxPushCurrent, ptx_dev->ctx); + CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem); + CUDA_CALL_ASSERT (cuCtxPopCurrent, &old_ctx); + } + else + { + CUcontext new_ctx; + + CUDA_CALL_ERET (propval, cuCtxCreate, &new_ctx, CU_CTX_SCHED_AUTO, + ptx_dev->dev); + CUDA_CALL_ERET (propval, cuMemGetInfo, &free_mem, &total_mem); + CUDA_CALL_ASSERT (cuCtxDestroy, new_ctx); + } + propval.val =3D free_mem; + } + break; + case GOMP_DEVICE_PROPERTY_NAME: + propval.ptr =3D ptx_dev->name; + break; + case GOMP_DEVICE_PROPERTY_VENDOR: + propval.ptr =3D "Nvidia"; + break; + case GOMP_DEVICE_PROPERTY_DRIVER: + propval.ptr =3D cuda_driver_version_s; + break; + } + + pthread_mutex_unlock (&ptx_dev_lock); + return propval; +} + bool GOMP_OFFLOAD_init_device (int n) { diff --git a/libgomp/target.c b/libgomp/target.c index 50a9c2b1df3..a1f80169f49 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -3002,6 +3002,7 @@ gomp_load_plugin_for_device (struct gomp_device_descr= *device, DLSYM (get_caps); DLSYM (get_type); DLSYM (get_num_devices); + DLSYM (get_property); DLSYM (init_device); DLSYM (fini_device); DLSYM (load_image); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-2= .c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c new file mode 100644 index 00000000000..4dd13c401d3 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c @@ -0,0 +1,68 @@ +/* Test the `acc_get_property' and '`acc_get_property_string' library + functions on Nvidia devices by comparing property values with + those obtained through 
the CUDA API. */ +/* { dg-additional-sources acc_get_property-aux.c } */ +/* { dg-additional-options "-lcuda -lcudart" } */ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ + +#include +#include +#include +#include +#include + +void expect_device_properties +(acc_device_t dev_type, int dev_num, + int expected_total_mem, int expected_free_mem, + const char* expected_vendor, const char* expected_name, + const char* expected_driver); + +int main () +{ + int dev_count; + cudaGetDeviceCount (&dev_count); + + for (int dev_num =3D 0; dev_num < dev_count; ++dev_num) + { + if (cudaSetDevice (dev_num) !=3D cudaSuccess) + { + fprintf (stderr, "cudaSetDevice failed.\n"); + abort (); + } + + printf("Checking device %d\n", dev_num); + + const char *vendor =3D "Nvidia"; + size_t free_mem; + size_t total_mem; + if (cudaMemGetInfo(&free_mem, &total_mem) !=3D cudaSuccess) + { + fprintf (stderr, "cudaMemGetInfo failed.\n"); + abort (); + } + + struct cudaDeviceProp p; + if (cudaGetDeviceProperties(&p, dev_num) !=3D cudaSuccess) + { + fprintf (stderr, "cudaGetDeviceProperties failed.\n"); + abort (); + } + + int driver_version; + if (cudaDriverGetVersion(&driver_version) !=3D cudaSuccess) + { + fprintf (stderr, "cudaDriverGetVersion failed.\n"); + abort (); + } + /* The version string should contain the version of the CUDA Toolkit + in the same MAJOR.MINOR format that is used by Nvidia. + The format string below is the same that is used by the deviceQuery + program, which belongs to Nvidia's CUDA samples, to print the version. 
*/ + char driver[30]; + snprintf (driver, sizeof driver, "CUDA Driver %u.%u", + driver_version / 1000, driver_version % 1000 / 10); + + expect_device_properties(acc_device_nvidia, dev_num, + total_mem, free_mem, vendor, p.name, driver); + } +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-3= .c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c new file mode 100644 index 00000000000..92565000e49 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c @@ -0,0 +1,19 @@ +/* Test the `acc_get_property' and '`acc_get_property_string' library + functions for the host device. */ +/* { dg-additional-sources acc_get_property-aux.c } */ +/* { dg-do run } */ + +#include +#include + +void expect_device_properties +(acc_device_t dev_type, int dev_num, + int expected_total_mem, int expected_free_mem, + const char* expected_vendor, const char* expected_name, + const char* expected_driver); + +int main() +{ + printf ("Checking acc_device_host device properties\n"); + expect_device_properties (acc_device_host, 0, 0, 0, "GNU", "GOMP", "1.0"= ); +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-a= ux.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c new file mode 100644 index 00000000000..952bdbf6aea --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c @@ -0,0 +1,80 @@ +/* Auxiliary functions for acc_get_property tests */ +/* { dg-do compile { target skip-all-targets } } */ + +#include +#include +#include +#include + +void expect_device_properties +(acc_device_t dev_type, int dev_num, + int expected_total_mem, int expected_free_mem, + const char* expected_vendor, const char* expected_name, + const char* expected_driver) +{ + const char *vendor =3D acc_get_property_string (dev_num, dev_type, + acc_property_vendor); + if (strcmp (vendor, expected_vendor)) + { + fprintf (stderr, "Expected acc_property_vendor to equal \"%s\", " + "but 
was \"%s\".\n", expected_vendor, vendor); + abort (); + } + + int total_mem = acc_get_property (dev_num, dev_type, + acc_property_memory); + if (total_mem != expected_total_mem) + { + fprintf (stderr, "Expected acc_property_memory to equal %d, " + "but was %d.\n", expected_total_mem, total_mem); + abort (); + + } + + int free_mem = acc_get_property (dev_num, dev_type, + acc_property_free_memory); + if (free_mem != expected_free_mem) + { + fprintf (stderr, "Expected acc_property_free_memory to equal %d, " + "but was %d.\n", expected_free_mem, free_mem); + abort (); + } + + const char *name = acc_get_property_string (dev_num, dev_type, + acc_property_name); + if (strcmp (name, expected_name)) + { + fprintf(stderr, "Expected acc_property_name to equal \"%s\", " + "but was \"%s\".\n", expected_name, name); + abort (); + } + + const char *driver = acc_get_property_string (dev_num, dev_type, + acc_property_driver); + if (strcmp (expected_driver, driver)) + { + fprintf (stderr, "Expected acc_property_driver to equal %s, " + "but was %s.\n", expected_driver, driver); + abort (); + } + + int unknown_property = 16058; + int v = acc_get_property (dev_num, dev_type, (acc_device_property_t)unknown_property); + if (v != 0) + { + fprintf (stderr, "Expected value of unknown numeric property to equal 0, " + "but was %d.\n", v); + abort (); + } + + int unknown_property2 = -16058; + const char *s = acc_get_property_string (dev_num, dev_type, (acc_device_property_t)unknown_property2); + if (s != NULL) + { + fprintf (stderr, "Expected value of unknown string property to be NULL, " + "but was \"%s\".\n", s); + abort (); + } + + +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property.c new file mode 100644 index 00000000000..ac523898c60 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property.c @@ -0,0 +1,75 @@ +/* Test the `acc_get_property' and 
`acc_get_property_string' library + functions by printing the results of those functions for all devices + of all device types mentioned in the OpenACC standard. + + See also acc_get_property.f90. */ +/* { dg-do run } */ + +#include +#include +#include +#include + +/* Print the values of the properties of all devices of the given type + and do basic device independent validation. */ + +void +print_device_properties(acc_device_t type) +{ + const char *s; + size_t v; + + int dev_count = acc_get_num_devices(type); + + for (int i = 0; i < dev_count; ++i) + { + printf(" Device %d:\n", i+1); + + s = acc_get_property_string (i, type, acc_property_vendor); + printf (" Vendor: %s\n", s); + if (s == NULL || *s == 0) + { + fprintf (stderr, "acc_property_vendor should not be null or empty.\n"); + abort (); + } + + v = acc_get_property (i, type, acc_property_memory); + printf (" Total memory: %zu\n", v); + + v = acc_get_property (i, type, acc_property_free_memory); + printf (" Free memory: %zu\n", v); + + s = acc_get_property_string (i, type, acc_property_name); + printf (" Name: %s\n", s); + if (s == NULL || *s == 0) + { + fprintf (stderr, "acc_property_name should not be null or empty.\n"); + abort (); + } + + s = acc_get_property_string (i, type, acc_property_driver); + printf (" Driver: %s\n", s); + if (s == NULL || *s == 0) + { + fprintf (stderr, "acc_property_driver should not be null or empty.\n"); + abort (); + } + } +} + +int main () +{ + printf("acc_device_none:\n"); + /* For completeness; not expected to print anything since there + should be no devices of this type. 
*/ + print_device_properties(acc_device_none); + + printf("acc_device_default:\n"); + print_device_properties(acc_device_default); + + printf("acc_device_host:\n"); + print_device_properties(acc_device_host); + + printf("acc_device_not_host:\n"); + print_device_properties(acc_device_not_host); +} diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_get_property.f90 b/= libgomp/testsuite/libgomp.oacc-fortran/acc_get_property.f90 new file mode 100644 index 00000000000..cb68d386c8d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_get_property.f90 @@ -0,0 +1,92 @@ +! Test the `acc_get_property' and '`acc_get_property_string' library +! functions by printing the results of those functions for all devices +! of all device types mentioned in the OpenACC standard. +! +! See also acc_get_property.c +! { dg-do run } + +program test + use openacc + implicit none + + print *, "acc_device_none:" + ! For completeness; not expected to print anything + call print_device_properties (acc_device_none) + + print *, "acc_device_default:" + call print_device_properties (acc_device_default) + + print *, "acc_device_host:" + call print_device_properties (acc_device_host) + + print *, "acc_device_not_host:" + call print_device_properties (acc_device_not_host) +end program test + +! Print the values of the properties of all devices of the given type +! and do basic device independent validation. +subroutine print_device_properties (device_type) + use openacc + implicit none + + integer, intent(in) :: device_type + + integer :: device_count + integer :: device + integer(acc_device_property) :: v + character*256 :: s + + device_count =3D acc_get_num_devices(device_type) + + do device =3D 0, device_count - 1 + print "(a, i0)", " Device ", device + + call acc_get_property_string (device, device_type, acc_property_vendo= r, s) + print "(a, a)", " Vendor: ", trim (s) + if (s =3D=3D "") then + print *, "acc_property_vendor should not be empty." 
+ stop 1 + end if + + v = acc_get_property (device, device_type, acc_property_memory) + print "(a, i0)", " Total memory: ", v + if (v < 0) then + print *, "acc_property_memory should not be negative." + stop 1 + end if + + v = acc_get_property (device, device_type, acc_property_free_memory) + print "(a, i0)", " Free memory: ", v + if (v < 0) then + print *, "acc_property_free_memory should not be negative." + stop 1 + end if + + v = acc_get_property (device, device_type, int(2360, kind = acc_device_property)) + if (v /= 0) then + print *, "Value of unknown numeric property should be 0." + stop 1 + end if + + call acc_get_property_string (device, device_type, acc_property_name, s) + print "(a, a)", " Name: ", trim (s) + if (s == "") then + print *, "acc_property_name should not be empty." + stop 1 + end if + + call acc_get_property_string (device, device_type, acc_property_driver, s) + print "(a, a)", " Driver: ", trim (s) + if (s == "") then + print *, "acc_property_driver should not be empty." + stop 1 + end if + + call acc_get_property_string (device, device_type, int(4060, kind = acc_device_property), s) + if (s /= "") then + print *, "Value of unknown string property should be empty string." 
+ stop 1 + end if + + end do +end subroutine print_device_properties diff --git a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp b/liboffloadm= ic/plugin/libgomp-plugin-intelmic.cpp index d1678d0514e..40d97702b87 100644 --- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp +++ b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp @@ -174,6 +174,27 @@ GOMP_OFFLOAD_get_num_devices (void) return num_devices; } =20 +extern "C" union gomp_device_property_value +GOMP_OFFLOAD_get_property (int n, int prop) +{ + union gomp_device_property_value nullval =3D { .val =3D 0 }; + + if (n >=3D num_devices) + { + GOMP_PLUGIN_error + ("Request for a property of a non-existing Intel MIC device %i", n); + return nullval; + } + + switch (prop) + { + case GOMP_DEVICE_PROPERTY_VENDOR: + return (union gomp_device_property_value) { .ptr =3D "Intel" }; + default: + return nullval; + } +} + static bool offload (const char *file, uint64_t line, int device, const char *name, int num_vars, VarDesc *vars, const void **async_data) --=20 2.17.1 --------------71E4B669165A735C64136B22--
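A note on the driver string: both nvptx_init and acc_get_property-2.c derive "CUDA Driver MAJOR.MINOR" from the integer returned by the CUDA version query, which encodes the version as 1000 * MAJOR + 10 * MINOR. The sketch below isolates that arithmetic so it can be checked without a GPU. The helper names format_cuda_driver_version and driver_string_matches are hypothetical and not part of the patch or of libgomp; the sketch also uses %d rather than the patch's %u because the arguments are of type int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in libgomp): write "CUDA Driver MAJOR.MINOR"
   into BUF, using the same arithmetic as the patch.  VERSION is assumed
   to be encoded as 1000 * MAJOR + 10 * MINOR, e.g. 10020 for 10.2.
   Returns the value of snprintf.  */
int
format_cuda_driver_version (int version, char *buf, size_t size)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "CUDA Driver %d.%d",
                   version / 1000, version % 1000 / 10);
}

/* Hypothetical helper: return true if REPORTED (for instance the result
   of acc_get_property_string with acc_property_driver) equals the string
   expected for VERSION.  A NULL string never matches.  */
bool
driver_string_matches (int version, const char *reported)
{
  char expected[30];  /* Same buffer size as in the patch.  */

  format_cuda_driver_version (version, expected, sizeof expected);
  return reported != NULL && strcmp (expected, reported) == 0;
}
```

With these helpers, format_cuda_driver_version (10020, buf, sizeof buf) leaves "CUDA Driver 10.2" in buf, which is the form the nvptx plugin stores in cuda_driver_version_s.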