Hi! On Wed, 22 Apr 2015 19:42:43 +0100, Julian Brown wrote: > This patch is an attempt to fix some potential race conditions with > accesses to shared data structures from multiple concurrent threads in > libgomp's OpenACC entry points. The main change is to move locking out > of lookup_host and lookup_dev in oacc-mem.c and into their callers > (which can then hold the locks for the whole operation that they are > performing). Yeah, that makes sense I guess. > Also missing locking has been added for gomp_acc_insert_pointer. > > Tests look OK (with offloading to NVidia PTX). How did you test to get some confidence in the locking being sufficient? I just did a very cursory review, but it looks good with a few review comments below. Going further (separate patch?), a few more comments: Is it OK that oacc-init.c:cached_base_dev is accessed without locking? Generally, we have to keep in mind that the same device may be accessed in parallel through both OpenACC and OpenMP interfaces. For this, for example, in oacc-init.c, even though acc_device_lock is held, is it OK to call gomp_init_device(D) without D->lock being locked? (Compare to target.c code.) Please document what exactly oacc-init.c:acc_device_lock is to guard. I'm not sure I'm understanding this correctly. Should oacc-init.c:acc_shutdown_1 release goacc_thread_lock before any gomp_fatal calls? (That seems to be the general policy in libgomp.) > --- a/libgomp/oacc-mem.c > +++ b/libgomp/oacc-mem.c > @@ -120,25 +116,32 @@ acc_free (void *d) > { > splay_tree_key k; > struct goacc_thread *thr = goacc_thread (); > + struct gomp_device_descr *acc_dev = thr->dev; > > if (!d) > return; > > assert (thr && thr->dev); > > + gomp_mutex_lock (&acc_dev->lock); > + > /* We don't have to call lazy open here, as the ptr value must have > been returned by acc_malloc. It's not permitted to pass NULL in > (unless you got that null from acc_malloc). */ > - if ((k = lookup_dev (thr->dev->openacc.data_environ, d, 1))) > - { > - void *offset; > + if ((k = lookup_dev (acc_dev->openacc.data_environ, d, 1))) > + { > + void *offset; > + > + offset = d - k->tgt->tgt_start + k->tgt_offset; > > - offset = d - k->tgt->tgt_start + k->tgt_offset; > + gomp_mutex_unlock (&acc_dev->lock); > > - acc_unmap_data ((void *)(k->host_start + offset)); > - } > + acc_unmap_data ((void *)(k->host_start + offset)); > + } > + else > + gomp_mutex_unlock (&acc_dev->lock); Does it make sense to make the unlock unconditional, and move the acc_unmap_data after it, guarded by »if (k)«? > - thr->dev->free_func (thr->dev->target_id, d); > + acc_dev->free_func (acc_dev->target_id, d); > } > > void > @@ -178,16 +181,24 @@ acc_deviceptr (void *h) > goacc_lazy_initialize (); > > struct goacc_thread *thr = goacc_thread (); > + struct gomp_device_descr *dev = thr->dev; > + > + gomp_mutex_lock (&dev->lock); > > - n = lookup_host (thr->dev, h, 1); > + n = lookup_host (dev, h, 1); > > if (!n) > - return NULL; > + { > + gomp_mutex_unlock (&dev->lock); > + return NULL; > + } > > offset = h - n->host_start; > > d = n->tgt->tgt_start + n->tgt_offset + offset; > > + gomp_mutex_unlock (&dev->lock); > + > return d; > } Do we need to retain the lock while working with n? If not, the unlock could be placed right after the lookup_host, unconditionally. I'm confused -- it's commonly being done (retained) in target.c code, but not in the tgt_fn lookup in target.c:GOMP_target. > @@ -204,16 +215,24 @@ acc_hostptr (void *d) > goacc_lazy_initialize (); > > struct goacc_thread *thr = goacc_thread (); > + struct gomp_device_descr *acc_dev = thr->dev; > > - n = lookup_dev (thr->dev->openacc.data_environ, d, 1); > + gomp_mutex_lock (&acc_dev->lock); > + > + n = lookup_dev (acc_dev->openacc.data_environ, d, 1); > > if (!n) > - return NULL; > + { > + gomp_mutex_unlock (&acc_dev->lock); > + return NULL; > + } > > offset = d - n->tgt->tgt_start + n->tgt_offset; > > h = n->host_start + offset; > > + gomp_mutex_unlock (&acc_dev->lock); > + > return h; > } Similar, and also for other code later on. > @@ -439,25 +494,35 @@ delete_copyout (unsigned f, void *h, size_t s) > if (f & FLAG_COPYOUT) > acc_dev->dev2host_func (acc_dev->target_id, h, d, s); > > + gomp_mutex_unlock (&acc_dev->lock); > + > acc_unmap_data (h); > > acc_dev->free_func (acc_dev->target_id, d); Should external functions (dev2host_func) be called without any locks held (to avoid issues with any call-backs); so move the unlocking before that? Grüße, Thomas