From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 117639 invoked by alias); 23 Apr 2015 16:42:09 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 117623 invoked by uid 89); 23 Apr 2015 16:42:08 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=ham version=3.3.2 X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 23 Apr 2015 16:42:07 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-02.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1YlKCR-0001Ya-Ai from Thomas_Schwinge@mentor.com ; Thu, 23 Apr 2015 09:42:03 -0700 Received: from feldtkeller.schwinge.homeip.net (137.202.0.76) by SVR-IES-FEM-02.mgc.mentorg.com (137.202.0.106) with Microsoft SMTP Server id 14.3.224.2; Thu, 23 Apr 2015 17:41:44 +0100 From: Thomas Schwinge To: Julian Brown CC: , Jakub Jelinek Subject: Re: [PATCH] Tidy up locking for libgomp OpenACC entry points In-Reply-To: <20150422194243.115f26fa@octopus> References: <20150422194243.115f26fa@octopus> User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/24.3.1 (x86_64-pc-linux-gnu) Date: Thu, 23 Apr 2015 16:42:00 -0000 Message-ID: <87egnavltt.fsf@schwinge.name> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature" X-SW-Source: 2015-04/txt/msg01433.txt.bz2 --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Content-length: 4891 Hi! On Wed, 22 Apr 2015 19:42:43 +0100, Julian Brown = wrote: > This patch is an attempt to fix some potential race conditions with > accesses to shared data structures from multiple concurrent threads in > libgomp's OpenACC entry points. The main change is to move locking out > of lookup_host and lookup_dev in oacc-mem.c and into their callers > (which can then hold the locks for the whole operation that they are > performing). Yeah, that makes sense I guess. > Also missing locking has been added for gomp_acc_insert_pointer. >=20 > Tests look OK (with offloading to NVidia PTX). How did you test to get some confidence in the locking being sufficient? I just did a very cursory review, but it looks good with a few review comments below. Going further (separate patch?), a few more comments: Is it OK that oacc-init.c:cached_base_dev is accessed without locking? Generally, we have to keep in mind that the same device may be accessed in parallel through both OpenACC and OpenMP interfaces. For this, for example, in oacc-init.c, even though acc_device_lock is held, is it OK to call gomp_init_device(D) without D->lock being locked? (Compare to target.c code.) Please document what exactly oacc-init.c:acc_device_lock is to guard. I'm not sure I'm understanding this correctly. Should oacc-init.c:acc_shutdown_1 release goacc_thread_lock before any gomp_fatal calls? (That seems to be the general policy in libgomp.) > --- a/libgomp/oacc-mem.c > +++ b/libgomp/oacc-mem.c > @@ -120,25 +116,32 @@ acc_free (void *d) > { > splay_tree_key k; > struct goacc_thread *thr =3D goacc_thread (); > + struct gomp_device_descr *acc_dev =3D thr->dev; >=20=20 > if (!d) > return; >=20=20 > assert (thr && thr->dev); >=20=20 > + gomp_mutex_lock (&acc_dev->lock); > + > /* We don't have to call lazy open here, as the ptr value must have > been returned by acc_malloc. It's not permitted to pass NULL in > (unless you got that null from acc_malloc). */ > - if ((k =3D lookup_dev (thr->dev->openacc.data_environ, d, 1))) > - { > - void *offset; > + if ((k =3D lookup_dev (acc_dev->openacc.data_environ, d, 1))) > + { > + void *offset; > + > + offset =3D d - k->tgt->tgt_start + k->tgt_offset; >=20=20 > - offset =3D d - k->tgt->tgt_start + k->tgt_offset; > + gomp_mutex_unlock (&acc_dev->lock); >=20=20 > - acc_unmap_data ((void *)(k->host_start + offset)); > - } > + acc_unmap_data ((void *)(k->host_start + offset)); > + } > + else > + gomp_mutex_unlock (&acc_dev->lock); Does it make sense to make the unlock unconditional, and move the acc_unmap_data after it, guarded by =C2=BBif (k)=C2=AB? > - thr->dev->free_func (thr->dev->target_id, d); > + acc_dev->free_func (acc_dev->target_id, d); > } >=20=20 > void > @@ -178,16 +181,24 @@ acc_deviceptr (void *h) > goacc_lazy_initialize (); >=20=20 > struct goacc_thread *thr =3D goacc_thread (); > + struct gomp_device_descr *dev =3D thr->dev; > + > + gomp_mutex_lock (&dev->lock); >=20=20 > - n =3D lookup_host (thr->dev, h, 1); > + n =3D lookup_host (dev, h, 1); >=20=20 > if (!n) > - return NULL; > + { > + gomp_mutex_unlock (&dev->lock); > + return NULL; > + } >=20=20 > offset =3D h - n->host_start; >=20=20 > d =3D n->tgt->tgt_start + n->tgt_offset + offset; >=20=20 > + gomp_mutex_unlock (&dev->lock); > + > return d; > } Do we need to retain the lock while working with n? If not, the unlock could be placed right after the lookup_host, unconditionally. I'm confused -- it's commonly being done (retained) in target.c code, but not in the tgt_fn lookup in target.c:GOMP_target. > @@ -204,16 +215,24 @@ acc_hostptr (void *d) > goacc_lazy_initialize (); >=20=20 > struct goacc_thread *thr =3D goacc_thread (); > + struct gomp_device_descr *acc_dev =3D thr->dev; >=20=20 > - n =3D lookup_dev (thr->dev->openacc.data_environ, d, 1); > + gomp_mutex_lock (&acc_dev->lock); > + > + n =3D lookup_dev (acc_dev->openacc.data_environ, d, 1); >=20=20 > if (!n) > - return NULL; > + { > + gomp_mutex_unlock (&acc_dev->lock); > + return NULL; > + } >=20=20 > offset =3D d - n->tgt->tgt_start + n->tgt_offset; >=20=20 > h =3D n->host_start + offset; >=20=20 > + gomp_mutex_unlock (&acc_dev->lock); > + > return h; > } Similar, and also for other code later on. > @@ -439,25 +494,35 @@ delete_copyout (unsigned f, void *h, size_t s) > if (f & FLAG_COPYOUT) > acc_dev->dev2host_func (acc_dev->target_id, h, d, s); >=20=20 > + gomp_mutex_unlock (&acc_dev->lock); > + > acc_unmap_data (h); >=20=20 > acc_dev->free_func (acc_dev->target_id, d); Should external functions (dev2host_func) be called without any locks held (to avoid issues with any call-backs); so move the unlocking before that? Gr=C3=BC=C3=9Fe, Thomas --=-=-= Content-Type: application/pgp-signature Content-length: 472 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJVOSC+AAoJEPoxNhtoi6CO2H0IAJLa5errwbSx6EiIysZ0Jc3L xWxCeGt4a1E+eHN4RaVL6BIjXkMyNxGXYsX7EMkbo9au39YFvOuUNxWLfGhgZYyY F3x3PCqkFDBZ9I07dgKquzfjq4dKB0MH3USsq+IoQ4ctFh+NXOkPzlRrlOkFf6M3 ADbLZrfc8p+sSyjVtUzxvWjj8eS1i7v+6e4zkKt5oIfCGjjr1vecMh/jCtKv4jov CCT5EnO7qb7GJ53gTpn+YKeCZLZl9Ih2F8TTfaO4I0sDLgXLPeDsY+hAOJfFbANX HKHq1QeLnJWXlfGuZUANx2TU4vxPKyGzaDz1lBKpc0RFLARqjpfVsduUtKfCdkU= =NiXT -----END PGP SIGNATURE----- --=-=-=--