From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 3783 invoked by alias); 30 Mar 2015 16:42:20 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 3763 invoked by uid 89); 30 Mar 2015 16:42:19 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Mon, 30 Mar 2015 16:42:17 +0000 Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t2UGgExw008931 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Mon, 30 Mar 2015 12:42:14 -0400 Received: from tucnak.zalov.cz (ovpn-116-58.ams2.redhat.com [10.36.116.58]) by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t2UGgBHT029490 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 30 Mar 2015 12:42:13 -0400 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.14.9/8.14.9) with ESMTP id t2UGg4Mt017257; Mon, 30 Mar 2015 18:42:05 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.14.9/8.14.9/Submit) id t2UGg28Y016227; Mon, 30 Mar 2015 18:42:02 +0200 Date: Mon, 30 Mar 2015 16:42:00 -0000 From: Jakub Jelinek To: Ilya Verbin Cc: Thomas Schwinge , Julian Brown , gcc-patches@gcc.gnu.org, Kirill Yukhin Subject: Re: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks (was: Merge current set of OpenACC changes from gomp-4_0-branch) Message-ID: <20150330164202.GH1746@tucnak.redhat.com> Reply-To: Jakub Jelinek References: <20150203200104.GB54416@msticlxl57.ims.intel.com> <20150204150545.2628ad35@octopus> <20150224112951.363b71fd@octopus> <87a902ib93.fsf@schwinge.name> <20150226172511.GA49293@msticlxl57.ims.intel.com> <20150306140113.GB26588@msticlxl57.ims.intel.com> <20150309144555.3a078f48@octopus> <20150323194439.GA12972@msticlxl57.ims.intel.com> <20150326120919.GZ1746@tucnak.redhat.com> <20150326204130.GA65474@msticlxl57.ims.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150326204130.GA65474@msticlxl57.ims.intel.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes X-SW-Source: 2015-03/txt/msg01562.txt.bz2 On Thu, Mar 26, 2015 at 11:41:30PM +0300, Ilya Verbin wrote: > Here is the latest patch for libgomp and mic plugin. > make check-target-libgomp using intelmic emul passed. > Also I used a testcase from the attachment. This applies cleanly. > Latest ptx part is here, I guess: > https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01407.html But the one Julian posted doesn't apply on top of your patch. If there is any interdiff needed on top of your patch, can it be posted against trunk + your patch? > +/* Insert mapping of host -> target address pairs to splay tree. */ > + > +static void > +gomp_splay_tree_insert_mapping (struct gomp_device_descr *devicep, > + struct addr_pair *host_addr, > + struct addr_pair *tgt_addr) > +{ > + struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt)); > + tgt->refcount = 1; > + tgt->array = gomp_malloc (sizeof (*tgt->array)); > + tgt->tgt_start = tgt_addr->start; > + tgt->tgt_end = tgt_addr->end; > + tgt->to_free = NULL; > + tgt->list_count = 0; > + tgt->device_descr = devicep; > + splay_tree_node node = tgt->array; > + splay_tree_key k = &node->key; > + k->host_start = host_addr->start; > + k->host_end = host_addr->end; > + k->tgt_offset = 0; > + k->refcount = 1; > + k->copy_from = false; > + k->tgt = tgt; > + node->left = NULL; > + node->right = NULL; > + splay_tree_insert (&devicep->mem_map, node); > +} What is the reason to register and allocate these one at a time, rather than using one struct target_mem_desc with one tgt->array for all splay tree nodes registered from one image? Perhaps you would just use tgt_start of 0 and tgt_end of 0 too (to make it clear it is special) and just use tgt_offset relative to that (i.e. absolute), but having to malloc each node individually and having to malloc a target_mem_desc for each one sounds expensive. Everything is freed just once anyway, isn't it? > @@ -654,6 +727,18 @@ void > GOMP_offload_register (void *host_table, enum offload_target_type target_type, > void *target_data) > { > + int i; > + gomp_mutex_lock (®ister_lock); > + > + /* Load image to all initialized devices. */ > + for (i = 0; i < num_devices; i++) > + { > + struct gomp_device_descr *devicep = &devices[i]; > + if (devicep->type == target_type && devicep->is_initialized) > + gomp_offload_image_to_device (devicep, host_table, target_data); Shouldn't either this function, or gomp_offload_image_to_device lock also devicep->lock mutex and unlock at the end? Where exactly I guess depends on if the devicep->* hook calls should be guarded with the mutex or not. If yes, it should be this function and gomp_init_device. > + if (devicep->type != target_type || !devicep->is_initialized) > + continue; > + Similarly. > + devicep->unload_image_func (devicep->target_id, target_data); > + > + /* Remove mapping from splay tree. */ > + for (j = 0; j < num_funcs; j++) > + { > + struct splay_tree_key_s k; > + k.host_start = (uintptr_t) host_func_table[j]; > + k.host_end = k.host_start + 1; > + splay_tree_remove (&devicep->mem_map, &k); > + } > + > + for (j = 0; j < num_vars; j++) > + { > + struct splay_tree_key_s k; > + k.host_start = (uintptr_t) host_var_table[j*2]; > + k.host_end = k.host_start + (uintptr_t) host_var_table[j*2+1]; > + splay_tree_remove (&devicep->mem_map, &k); > + } > + } Aren't you leaking here all the tgt->array and tgt allocations here? Though, if you change it to just two allocations (one tgt, one array), you'd need to free just once. Jakub