On Thu, 23 Apr 2015 18:41:34 +0200 Thomas Schwinge wrote: > Hi! > > On Wed, 22 Apr 2015 19:42:43 +0100, Julian Brown > wrote: > > This patch is an attempt to fix some potential race conditions with > > accesses to shared data structures from multiple concurrent threads > > in libgomp's OpenACC entry points. The main change is to move > > locking out of lookup_host and lookup_dev in oacc-mem.c and into > > their callers (which can then hold the locks for the whole > > operation that they are performing). > > Yeah, that makes sense I guess. > > > Also missing locking has been added for gomp_acc_insert_pointer. > > > > Tests look OK (with offloading to NVidia PTX). > > How did you test to get some confidence in the locking being > sufficient? Merely by running the existing tests and via inspection, sadly. I'm not sure how much value we'd get from implementing an exhaustive threading testsuite at this stage: I guess testcases will be easier to come by in the future if/when people start to use e.g. OpenMP and OpenACC together. > Going further (separate patch?), a few more comments: > > Is it OK that oacc-init.c:cached_base_dev is accessed without locking? > > Generally, we have to keep in mind that the same device may be > accessed in parallel through both OpenACC and OpenMP interfaces. For > this, for example, in oacc-init.c, even though acc_device_lock is > held, is it OK to call gomp_init_device(D) without D->lock being > locked? (Compare to target.c code.) > > Please document what exactly oacc-init.c:acc_device_lock is to guard. > I'm not sure I'm understanding this correctly. I've attached a follow-on patch that documents the purpose of acc_device_lock -- and also fixes some places that should have been holding the lock, but were not. I've also added locking (with dev->lock) when calling gomp_init_device and gomp_fini_device from the OpenACC initialisation/finalisation code. > Should oacc-init.c:acc_shutdown_1 release goacc_thread_lock before any > gomp_fatal calls? (That seems to be the general policy in libgomp.) I added this to the first patch. > > --- a/libgomp/oacc-mem.c > > +++ b/libgomp/oacc-mem.c > > > @@ -120,25 +116,32 @@ acc_free (void *d) > > { > > splay_tree_key k; > > struct goacc_thread *thr = goacc_thread (); > > + struct gomp_device_descr *acc_dev = thr->dev; > > > > if (!d) > > return; > > > > assert (thr && thr->dev); > > > > + gomp_mutex_lock (&acc_dev->lock); > > + > > /* We don't have to call lazy open here, as the ptr value must > > have been returned by acc_malloc. It's not permitted to pass NULL > > in (unless you got that null from acc_malloc). */ > > - if ((k = lookup_dev (thr->dev->openacc.data_environ, d, 1))) > > - { > > - void *offset; > > + if ((k = lookup_dev (acc_dev->openacc.data_environ, d, 1))) > > + { > > + void *offset; > > + > > + offset = d - k->tgt->tgt_start + k->tgt_offset; > > > > - offset = d - k->tgt->tgt_start + k->tgt_offset; > > + gomp_mutex_unlock (&acc_dev->lock); > > > > - acc_unmap_data ((void *)(k->host_start + offset)); > > - } > > + acc_unmap_data ((void *)(k->host_start + offset)); > > + } > > + else > > + gomp_mutex_unlock (&acc_dev->lock); > > Does it make sense to make the unlock unconditional, and move the > acc_unmap_data after it, guarded by »if (k)«? I've left this one -- just a stylistic tweak, but I think it's fine as-is. > > - thr->dev->free_func (thr->dev->target_id, d); > > + acc_dev->free_func (acc_dev->target_id, d); > > } > > > > void > > @@ -178,16 +181,24 @@ acc_deviceptr (void *h) > > goacc_lazy_initialize (); > > > > struct goacc_thread *thr = goacc_thread (); > > + struct gomp_device_descr *dev = thr->dev; > > + > > + gomp_mutex_lock (&dev->lock); > > > > - n = lookup_host (thr->dev, h, 1); > > + n = lookup_host (dev, h, 1); > > > > if (!n) > > - return NULL; > > + { > > + gomp_mutex_unlock (&dev->lock); > > + return NULL; > > + } > > > > offset = h - n->host_start; > > > > d = n->tgt->tgt_start + n->tgt_offset + offset; > > > > + gomp_mutex_unlock (&dev->lock); > > + > > return d; > > } > > Do we need to retain the lock while working with n? If not, the > unlock could be placed right after the lookup_host, unconditionally. > I'm confused -- it's commonly being done (retained) in target.c code, > but not in the tgt_fn lookup in target.c:GOMP_target. I think the difference can be explained as follows: a given mapping (splay_key_tree_s) is essentially immutable after it is created (apart from the refcounts). Thus it can be safely accessed *so long as we know it will not be deallocated*. Now, in some parts of target.c, we have an "active" target_mem_desc, corresponding to a set of host-target mappings. So long as we are holding that target_mem_desc (e.g. as we are in GOMP_target_data), we know that none of the associated mappings' refcounts will fall to zero: so, we can access them (read only) safely without explicitly holding the lock. But, that's *not* the case for e.g. acc_deviceptr: that can be called at any point, in particular partway through deallocation of a set of mappings. So we need the lock for that. I didn't write the code of course, so E&OE :-). > Should external functions (dev2host_func) be called without any locks > held (to avoid issues with any call-backs); so move the unlocking > before that? I fixed this in the first patch. Re-tested (libgomp), results look OK. OK to apply now? Julian ChangeLog libgomp/ * oacc-init.c (acc_shutdown_1): Call gomp_mutex_unlock for goacc_thread_lock on error paths. * oacc-mem.c (lookup_host): Remove locking from function. Note locking requirement for caller in function comment. (lookup_dev): Likewise. (acc_free, acc_deviceptr, acc_hostptr, acc_is_present) (acc_map_data, acc_unmap_data, present_create_copy, delete_copyout) (update_dev_host, gomp_acc_insert_pointer, gomp_acc_remove_pointer): Add locking. ChangeLog libgomp/ * oacc-init.c (acc_device_lock): Add explanatory comment. (resolve_device): Add comment about locking requirement. (acc_init_1, acc_shutdown_1): Likewise. Add locking around gomp_init_device and gomp_fini_device calls. (acc_get_num_devices, acc_set_device_type, acc_get_device_type) (acc_get_device_num, acc_set_device_num): Add locking around resolve_device and gomp_init_device calls.