From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 93124 invoked by alias); 20 Oct 2015 10:02:54 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 93101 invoked by uid 89); 20 Oct 2015 10:02:53 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.5 required=5.0 tests=AWL,BAYES_50,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Tue, 20 Oct 2015 10:02:51 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by mx1.redhat.com (Postfix) with ESMTPS id 43AD88E6E1; Tue, 20 Oct 2015 10:02:50 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-116-53.ams2.redhat.com [10.36.116.53]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t9KA2mea019299 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 20 Oct 2015 06:02:49 -0400 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id t9KA2kBZ023134; Tue, 20 Oct 2015 12:02:47 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id t9KA2jU2023133; Tue, 20 Oct 2015 12:02:45 +0200 Date: Tue, 20 Oct 2015 10:03:00 -0000 From: Jakub Jelinek To: Thomas Schwinge Cc: gcc-patches@gcc.gnu.org, Bernd Schmidt , Nathan Sidwell , Joseph Myers Subject: Re: Forwarding -foffload=[...] from the driver (compile-time) to libgomp (run-time) Message-ID: <20151020100245.GW478@tucnak.redhat.com> Reply-To: Jakub Jelinek References: <55F2E935.7010300@redhat.com> <87egjopgh0.fsf@kepler.schwinge.homeip.net> <20150911154349.GH1847@tucnak.redhat.com> <87a8s6vq69.fsf@kepler.schwinge.homeip.net> <20150929081814.GD1847@tucnak.redhat.com> <5624F236.9020308@mentor.com> <87lhbnsy1s.fsf@kepler.schwinge.homeip.net> <87mvve95af.fsf@schwinge.name> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87mvve95af.fsf@schwinge.name> User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes X-SW-Source: 2015-10/txt/msg01843.txt.bz2 On Mon, Oct 19, 2015 at 06:44:40PM +0200, Thomas Schwinge wrote: > > How's the following (complete patch instead of incremental patch; the > > driver changes are still the same as before)? The changes are: > > > > * libgomp/target.c:gomp_target_init again loads all the plugins. > > * libgomp/target.c:resolve_device and > > libgomp/oacc-init.c:resolve_device verify that a default device > > (OpenMP device-var ICV, and acc_device_default, respectively) is > > actually enabled, or resort to host fallback if not. > > * GOMP_set_offload_targets renamed to GOMP_enable_offload_targets; used > > to enable devices specified by -foffload. Can be called multiple > > times (executable, any shared libraries); the set of enabled devices > > is the union of all those ever requested. > > * GOMP_offload_register (but not the new GOMP_offload_register_ver) > > changed to enable all devices. This is to maintain compatibility > > with old executables and shared libraries built without the -foffload > > constructor support. Any reason not to pass the bitmask of the enabled targets to GOMP_offload_register_ver instead, to decrease the amount of ctors and the times you lock the various locks during initialization, or just enable automatically the devices you load data for during GOMP_offload_register_ver? I mean, GOMP_offload_register would enable for compatibility all devices, GOMP_offload_register_ver would enable the device it is registered for. For -foffload=disable on all shared libraries/binaries, naturally you would not register anything, thus would not enable any devices (only host fallback would work). Or are you worried about the case where one shared library is compiled with say -foffload=intelmic,ptx but doesn't actually contain any #pragma omp target/#pragma omp declare target (or OpenACC similar #directives), but only contains #pragma omp target data and/or the device query/copying routines, then dlopens some other shared library that actually has the offloading device code? That could be solved by adding the call you are talking about, but if we really should care about that unlikely case, it would be better to only arrange for it if really needed by the shared library (i.e. if it calls one of the OpenMP or OpenACC library routines that talk to the devices, or has #pragma omp target data or similar constructs; I'd strongly prefer not to have constructors in code that just got compiled with -fopenmp, even in configuration where some offloading is configured by default, when nothing in the code really cares about offloading. > --- a/gcc/gcc.c > +++ b/gcc/gcc.c > @@ -401,6 +401,8 @@ static const char *compare_debug_auxbase_opt_spec_function (int, const char **); > static const char *pass_through_libs_spec_func (int, const char **); > static const char *replace_extension_spec_func (int, const char **); > static const char *greater_than_spec_func (int, const char **); > +static const char *add_omp_infile_spec_func (int, const char **); > + > static char *convert_white_space (char *); > > /* The Specs Language I'd like to defer review of the driver bits, can Joseph or Bernd please have a look at those? > diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h > index 24fbb94..5da4fa7 100644 > --- a/libgomp/libgomp-plugin.h > +++ b/libgomp/libgomp-plugin.h > @@ -48,7 +48,8 @@ enum offload_target_type > OFFLOAD_TARGET_TYPE_HOST = 2, > /* OFFLOAD_TARGET_TYPE_HOST_NONSHM = 3 removed. */ > OFFLOAD_TARGET_TYPE_NVIDIA_PTX = 5, > - OFFLOAD_TARGET_TYPE_INTEL_MIC = 6 > + OFFLOAD_TARGET_TYPE_INTEL_MIC = 6, > + OFFLOAD_TARGET_TYPE_HWM What is HWM? Is that OFFLOAD_TARGET_TYPE_LAST what you mean? > diff --git a/libgomp/target.c b/libgomp/target.c > index b767410..df51bfb 100644 > --- a/libgomp/target.c > +++ b/libgomp/target.c > @@ -72,6 +72,9 @@ static int num_offload_images; > /* Array of descriptors for all available devices. */ > static struct gomp_device_descr *devices; > > +/* Set of enabled devices. */ > +static bool devices_enabled[OFFLOAD_TARGET_TYPE_HWM]; I must say I don't like the locking for this. If all you ever change on this is that you change it from 0 to 1, then supposedly just storing it with __atomic_store, perhaps with rel semantics, and reading it as __atomic_load, with acquire semantics, would be good enough? And perhaps change it into int array, so that it is actually atomic even on the old Alphas (if there are any around). Jakub