Since it wasn't too complicated to do, I implemented a dwarf object lock (per struct dwarf lock) instead of having the static global lock. As per your suggestion, I placed the lock over the whole dwarf_getalt function, so now find_debug_altlink is called with the lock already acquired. In addition, I placed locking around dwarf_setalt. I am currently in the process of testing these changes along with others to make sure everything is ironed out. On Tue, Oct 10, 2023 at 5:04 PM Mark Wielaard wrote: > Hi Heather, > > On Tue, Oct 10, 2023 at 03:42:54PM +0200, Mark Wielaard wrote: > > * libdw/dwarf_getalt.c: Add lock for > > dbg->alt_dwarf/main->alt_dwarf. > > This takes care of dwarf_getalt. Shouldn't dwarf_setalt also use > locking? > > > Signed-off-by: Heather S. McIntyre > > Signed-off-by: Mark Wielaard > > --- > > libdw/dwarf_getalt.c | 27 ++++++++++++++++++++++----- > > 1 file changed, 22 insertions(+), 5 deletions(-) > > > > diff --git a/libdw/dwarf_getalt.c b/libdw/dwarf_getalt.c > > index 0a12dfae..e3894c8c 100644 > > --- a/libdw/dwarf_getalt.c > > +++ b/libdw/dwarf_getalt.c > > @@ -44,6 +44,10 @@ > > #include > > #include > > > > +/* find_debug_altlink() modifies "dbg->alt_dwarf". > > + dwarf_getalt() reads "main->alt_dwarf". > > + Mutual exclusion is enforced to prevent a race. */ > > +rwlock_define(static, alt_dwarf_lock); > > Probably overkill, but should we consider a Dwarf lock object instead > of having a static global lock? > > > char * > > internal_function > > @@ -152,7 +156,9 @@ find_debug_altlink (Dwarf *dbg) > > Dwarf *alt = dwarf_begin (fd, O_RDONLY); > > if (alt != NULL) > > { > > + rwlock_wrlock(alt_dwarf_lock); > > dbg->alt_dwarf = alt; > > + rwlock_unlock(alt_dwarf_lock); > > dbg->alt_fd = fd; > > } > > else > > Is this lock wide enough? See also below. It looks like multiple > threads could arrive at this point at the same time. so alt_dwarf and > alt_fd can be (re)set multiple times, causing leaks of the Dwarf and > fd. > > > @@ -163,22 +169,33 @@ find_debug_altlink (Dwarf *dbg) > > Dwarf * > > dwarf_getalt (Dwarf *main) > > { > > + rwlock_rdlock(alt_dwarf_lock); > > + Dwarf *alt_dwarf_local = main->alt_dwarf; > > + rwlock_unlock(alt_dwarf_lock); > > + > > /* Only try once. */ > > - if (main == NULL || main->alt_dwarf == (void *) -1) > > + if (main == NULL || alt_dwarf_local == (void *) -1) > > return NULL; > > > > - if (main->alt_dwarf != NULL) > > - return main->alt_dwarf; > > + if (alt_dwarf_local != NULL) > > + return alt_dwarf_local; > > > > So at this point it looks like we can have multiple threads running > (because the lock has been dropped above) all with alt_dwarf_local > (and main->alt_dwarf) being NULL. > > > find_debug_altlink (main); > > > > + rwlock_rdlock(alt_dwarf_lock); > > + alt_dwarf_local = main->alt_dwarf; > > + rwlock_unlock(alt_dwarf_lock); > > + > > /* If we found nothing, make sure we don't try again. */ > > - if (main->alt_dwarf == NULL) > > + if (alt_dwarf_local == NULL) > > { > > + rwlock_wrlock(alt_dwarf_lock); > > main->alt_dwarf = (void *) -1; > > + rwlock_unlock(alt_dwarf_lock); > > + > > return NULL; > > } > > > > - return main->alt_dwarf; > > + return alt_dwarf_local; > > } > > INTDEF (dwarf_getalt) > > -- > > 2.41.0 > > > > It might be better to take the lock over the whole function (and only > call find_debug_altlink with the lock held). > > Cheers, > > Mark >