Date: Wed, 05 Aug 2015 15:09:00 -0000
From: Ilya Verbin
To: Richard Biener, Thomas Schwinge
Cc: Jakub Jelinek, Richard Biener, Jan Hubicka, GCC Patches, Kirill Yukhin
Subject: Re: [PATCH 2/n] OpenMP 4.0 offloading infrastructure: LTO streaming
Message-ID: <20150805150904.GA3211@msticlxl57.ims.intel.com>
References: <20141020111935.GA9362@msticlxl57.ims.intel.com>
 <20141024141601.GA62562@msticlxl57.ims.intel.com>
 <20141024142028.GD10376@tucnak.redhat.com>
 <20141028193047.GA17865@msticlxl57.ims.intel.com>
 <20141103092447.GO5026@tucnak.redhat.com>
 <20141105124655.GA42356@msticlxl57.ims.intel.com>
 <87egjopgh0.fsf@kepler.schwinge.homeip.net>
 <20150731142007.GA64740@msticlxl57.ims.intel.com>

On Wed, Aug 05, 2015 at 10:40:44 +0200, Richard Biener wrote:
> On Fri, Jul 31, 2015 at 4:20 PM, Ilya Verbin wrote:
> > On Fri, Jul 31, 2015 at 16:08:27 +0200, Thomas Schwinge wrote:
> >> We had established the use of a boolean flag have_offload in gcc::context
> >> to indicate whether during compilation, we've actually seen any code to
> >> be offloaded (see cited below the relevant parts of the patch by Ilya et
> >> al.).  This means that currently, the whole offload machinery will not be
> >> run unless we actually have any offloaded data.  This means that the
> >> configured mkoffload programs (-foffload=[...], defaulting to
> >> configure-time --enable-offload-targets=[...]) will not be invoked unless
> >> we actually have any offloaded data.  This means that we will not
> >> actually generate constructor code to call libgomp's
> >> GOMP_offload_register unless we actually have any offloaded data.
> >
> > Yes, that was the plan.
> >
> >> runtime, in libgomp, we then cannot reliably tell which -foffload=[...]
> >> targets have been specified during compilation.
> >>
> >> But: at runtime, I'd like to know which -foffload=[...] targets have been
> >> specified during compilation, so that we can, for example, reliably
> >> resort to host fallback execution for -foffload=disable instead of
> >> getting an error message that an offloaded function is missing.
> >
> > It's easy to fix:
> >
> > diff --git a/libgomp/target.c b/libgomp/target.c
> > index a5fb164..f81d570 100644
> > --- a/libgomp/target.c
> > +++ b/libgomp/target.c
> > @@ -1066,9 +1066,6 @@ gomp_get_target_fn_addr (struct gomp_device_descr *devicep,
> >        k.host_end = k.host_start + 1;
> >        splay_tree_key tgt_fn = splay_tree_lookup (&devicep->mem_map, &k);
> >        gomp_mutex_unlock (&devicep->lock);
> > -      if (tgt_fn == NULL)
> > -        gomp_fatal ("Target function wasn't mapped");
> > -
> >        return (void *) tgt_fn->tgt_offset;
> >      }
> >  }
> > @@ -1095,6 +1092,8 @@ GOMP_target (int device, void (*fn) (void *), const void *unused,
> >      return gomp_target_fallback (fn, hostaddrs);
> >
> >    void *fn_addr = gomp_get_target_fn_addr (devicep, fn);
> > +  if (fn_addr == NULL)
> > +    return gomp_target_fallback (fn, hostaddrs);
> >
> >    struct target_mem_desc *tgt_vars
> >      = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, false,
> > @@ -1155,6 +1154,8 @@ GOMP_target_41 (int device, void (*fn) (void *), size_t mapnum,
> >      }
> >
> >    void *fn_addr = gomp_get_target_fn_addr (devicep, fn);
> > +  if (fn_addr == NULL)
> > +    return gomp_target_fallback (fn, hostaddrs);
> >
> >    struct target_mem_desc *tgt_vars
> >      = gomp_map_vars (devicep, mapnum, hostaddrs, NULL, sizes, kinds, true,
> >
> >
> >> other hand, for example, for -foffload=nvptx-none, even if user program
> >> code doesn't contain any offloaded data (and thus the offload machinery
> >> has not been run), the user program might still contain any executable
> >> directives or OpenACC runtime library calls, so we'd still like to use
> >> the libgomp nvptx plugin.  However, we currently cannot detect this
> >> situation.
> >>
> >> I see two ways to resolve this: a) embed the compile-time -foffload=[...]
> >> configuration in the executable (as a string, for example) for libgomp to
> >> look that up, or b) make it a requirement that (if configured via
> >> -foffload=[...]), the offload machinery is run even if there is not
> >> actually any data to be offloaded, so we then reliably get the respective
> >> constructor call to libgomp's GOMP_offload_register.  I once began to
> >> implement a), but this got a bit ugly, so then looked into b) instead.
> >> Compared to the status quo, always running the whole offloading machinery
> >> for the configured -foffload=[...] targets whenever -fopenacc/-fopenmp
> >> are active, certainly does introduce some overhead when there isn't
> >> actually any code to be offloaded, so I'm not sure whether that is
> >> acceptable?
> >
> > I vote for (a).
>
> What happens for conflicting -foffload=[...] options in different TUs?

If you're asking about what happens now: only the list of offload targets from
the link-time -foffload=tgt1,tgt2 option matters.

I don't like plan (b) because it calls ipa_write_summaries unconditionally for
all OpenMP programs.  That creates IR sections, which increases file size and
may cause other problems, e.g. .  Compile time also increases because of the
LTO machinery, mkoffloads, etc.

If OpenACC requires some registration in libgomp even without offload, maybe
you can run this machinery only under flag_openacc?

  -- Ilya