public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org, Szabolcs Nagy <szabolcs.nagy@arm.com>
Subject: Re: [PATCH v2 05/14] elf: Fix data races in pthread_create and TLS access [BZ #19329]
Date: Thu, 15 Apr 2021 14:44:15 -0300	[thread overview]
Message-ID: <fc80af73-659c-50ec-2aa1-ee0080aeb63b@linaro.org> (raw)
In-Reply-To: <ae990875879c5c4c14bb837c0971e88e2369193c.1618301209.git.szabolcs.nagy@arm.com>



On 13/04/2021 05:19, Szabolcs Nagy via Libc-alpha wrote:
> DTV setup at thread creation (_dl_allocate_tls_init) is changed
> to take the dlopen lock, GL(dl_load_lock).  Avoiding data races
> here without locks would require design changes: the map that is
> accessed for static TLS initialization here may be concurrently
> freed by dlclose.  That use after free may be solved by only
> locking around static TLS setup or by ensuring dlclose does not
> free modules with static TLS, however currently every link map
> with TLS has to be accessed at least to see if it needs static
> TLS.  And even if that's solved, still a lot of atomics would be
> needed to synchronize DTV related globals without a lock. So fix
> both bug 19329 and bug 27111 with a lock that prevents DTV setup
> running concurrently with dlopen or dlclose.

It sounds reasonable and I think the extra locks should be ok
performance-wise (concurrent dlopen/dlclose should not be an
hotspot operation in majority of the cases).

> 
> _dl_update_slotinfo at TLS access still does not use any locks
> so CONCURRENCY NOTES are added to explain the synchronization.
> The early exit from the slotinfo walk when max_modid is reached
> is not strictly necessary, but does not hurt either.
> 
> An incorrect acquire load was removed from _dl_resize_dtv: it
> did not synchronize with any release store or fence and
> synchronization is now handled separately at thread creation
> and TLS access time.

Reading the fix that add the acquire load (d8dd00805b8) it seems
the idea was to use a sequentially-consistent ordering, since 
previously GL(dl_tls_max_dtv_idx) was not accessed using atomic
operations.

> 
> There are still a number of racy read accesses to globals that
> will be changed to relaxed MO atomics in a followup patch. This
> should not introduce regressions compared to existing behaviour
> and avoid cluttering the main part of the fix.
> 
> Not all TLS access related data races got fixed here: there are
> additional races at lazy tlsdesc relocations see bug 27137.

LGTM, thanks.  There is only a misspelled word and in minor style issue
in the comments below.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>


> ---
>  elf/dl-tls.c | 63 +++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 47 insertions(+), 16 deletions(-)
> 
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index f8b32b3ecb..33c06782b1 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -471,14 +471,11 @@ extern dtv_t _dl_static_dtv[];
>  #endif
>  
>  static dtv_t *
> -_dl_resize_dtv (dtv_t *dtv)
> +_dl_resize_dtv (dtv_t *dtv, size_t max_modid)
>  {
>    /* Resize the dtv.  */
>    dtv_t *newp;
> -  /* Load GL(dl_tls_max_dtv_idx) atomically since it may be written to by
> -     other threads concurrently.  */
> -  size_t newsize
> -    = atomic_load_acquire (&GL(dl_tls_max_dtv_idx)) + DTV_SURPLUS;
> +  size_t newsize = max_modid + DTV_SURPLUS;
>    size_t oldsize = dtv[-1].counter;
>  
>    if (dtv == GL(dl_initial_dtv))

Ok.

> @@ -524,11 +521,14 @@ _dl_allocate_tls_init (void *result)
>    size_t total = 0;
>    size_t maxgen = 0;
>  
> +  /* Protects global dynamic TLS related state.  */
> +  __rtld_lock_lock_recursive (GL(dl_load_lock));
> +
>    /* Check if the current dtv is big enough.   */
>    if (dtv[-1].counter < GL(dl_tls_max_dtv_idx))
>      {
>        /* Resize the dtv.  */
> -      dtv = _dl_resize_dtv (dtv);
> +      dtv = _dl_resize_dtv (dtv, GL(dl_tls_max_dtv_idx));
>  
>        /* Install this new dtv in the thread data structures.  */
>        INSTALL_DTV (result, &dtv[-1]);

Ok.

> @@ -596,6 +596,7 @@ _dl_allocate_tls_init (void *result)
>        listp = listp->next;
>        assert (listp != NULL);
>      }
> +  __rtld_lock_unlock_recursive (GL(dl_load_lock));
>  
>    /* The DTV version is up-to-date now.  */
>    dtv[0].counter = maxgen;

Ok.

> @@ -730,12 +731,29 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  
>    if (dtv[0].counter < listp->slotinfo[idx].gen)
>      {
> -      /* The generation counter for the slot is higher than what the
> -	 current dtv implements.  We have to update the whole dtv but
> -	 only those entries with a generation counter <= the one for
> -	 the entry we need.  */
> +      /* CONCURRENCY NOTES:
> +
> +	 Here the dtv needs to be updated to new_gen generation count.
> +
> +	 This code may be called during TLS access when GL(dl_load_lock)
> +	 is not held.  In that case the user code has to synchrnize with

s/synchrnize/synchronize

> +	 dlopen and dlclose calls of relevant modules.  A module m is
> +	 relevant if the generation of m <= new_gen and dlclose of m is
> +	 synchronized: a memory access here happens after the dlopen and
> +	 before the dlclose of relevant modules.  The dtv entries for
> +	 relevant modules need to be updated, other entries can be
> +	 arbitrary.
> +
> +	 This e.g. means that the first part of the slotinfo list can be
> +	 accessed race free, but the tail may be concurrently extended.
> +	 Similarly relevant slotinfo entries can be read race free, but
> +	 other entries are racy.  However updating a non-relevant dtv
> +	 entry does not affect correctness.  For a relevant module m,
> +	 max_modid >= modid of m.  */
>        size_t new_gen = listp->slotinfo[idx].gen;
>        size_t total = 0;
> +      size_t max_modid  = atomic_load_relaxed (&GL(dl_tls_max_dtv_idx));
> +      assert (max_modid >= req_modid);
>  
>        /* We have to look through the entire dtv slotinfo list.  */
>        listp =  GL(dl_tls_dtv_slotinfo_list);

Ok.

> @@ -745,12 +763,14 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  	    {
>  	      size_t modid = total + cnt;
>  
> +	      /* Later entries are not relevant.  */
> +	      if (modid > max_modid)
> +		break;
> +
>  	      size_t gen = listp->slotinfo[cnt].gen;
>  
>  	      if (gen > new_gen)
> -		/* This is a slot for a generation younger than the
> -		   one we are handling now.  It might be incompletely
> -		   set up so ignore it.  */
> +		/* Not relevant.  */
>  		continue;
>  
>  	      /* If the entry is older than the current dtv layout we

Ok.

> @@ -767,7 +787,7 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  		    continue;
>  
>  		  /* Resize the dtv.  */
> -		  dtv = _dl_resize_dtv (dtv);
> +		  dtv = _dl_resize_dtv (dtv, max_modid);
>  
>  		  assert (modid <= dtv[-1].counter);
>  

Ok.

> @@ -789,8 +809,17 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  	    }
>  
>  	  total += listp->len;
> +	  if (total > max_modid)
> +	    break;
> +
> +	  /* Synchronize with _dl_add_to_slotinfo. Ideally this would

Double space after period.

> +	     be consume MO since we only need to order the accesses to
> +	     the next node after the read of the address and on most
> +	     hardware (other than alpha) a normal load would do that
> +	     because of the address dependency.  */
> +	  listp = atomic_load_acquire (&listp->next);
>  	}
> -      while ((listp = listp->next) != NULL);
> +      while (listp != NULL);
>  
>        /* This will be the new maximum generation counter.  */
>        dtv[0].counter = new_gen;

Ok.

> @@ -982,7 +1011,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
>  	 the first slot.  */
>        assert (idx == 0);
>  
> -      listp = prevp->next = (struct dtv_slotinfo_list *)
> +      listp = (struct dtv_slotinfo_list *)
>  	malloc (sizeof (struct dtv_slotinfo_list)
>  		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
>        if (listp == NULL)
> @@ -996,6 +1025,8 @@ cannot create TLS data structures"));
>        listp->next = NULL;
>        memset (listp->slotinfo, '\0',
>  	      TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
> +      /* Synchronize with _dl_update_slotinfo.  */
> +      atomic_store_release (&prevp->next, listp);
>      }
>  
>    /* Add the information into the slotinfo data structure.  */
> 
 Ok.

  reply	other threads:[~2021-04-15 17:44 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-13  8:17 [PATCH v2 00/14] Dynamic TLS related data race fixes Szabolcs Nagy
2021-04-13  8:18 ` [PATCH v2 01/14] elf: Fix a DTV setup issue [BZ #27136] Szabolcs Nagy
2021-04-13  8:36   ` Andreas Schwab
2021-04-13  9:35     ` Szabolcs Nagy
2021-04-13 10:22       ` Andreas Schwab
2021-04-13 10:34         ` Szabolcs Nagy
2021-04-13 10:51           ` Andreas Schwab
2021-04-13  8:18 ` [PATCH v2 02/14] elf: Add a DTV setup test " Szabolcs Nagy
2021-04-14 18:06   ` Adhemerval Zanella
2021-04-15  9:53     ` Szabolcs Nagy
2021-04-13  8:18 ` [PATCH v2 03/14] elf: Fix comments and logic in _dl_add_to_slotinfo Szabolcs Nagy
2021-04-14 18:12   ` Adhemerval Zanella
2021-04-13  8:18 ` [PATCH v2 04/14] elf: Refactor _dl_update_slotinfo to avoid use after free Szabolcs Nagy
2021-04-14 18:20   ` Adhemerval Zanella
2021-04-13  8:19 ` [PATCH v2 05/14] elf: Fix data races in pthread_create and TLS access [BZ #19329] Szabolcs Nagy
2021-04-15 17:44   ` Adhemerval Zanella [this message]
2021-04-13  8:19 ` [PATCH v2 06/14] elf: Use relaxed atomics for racy accesses " Szabolcs Nagy
2021-04-15 18:21   ` Adhemerval Zanella
2021-04-16  9:12     ` Szabolcs Nagy
2021-05-11  2:56       ` Carlos O'Donell
2021-05-11  9:31         ` Szabolcs Nagy
2021-05-11 16:19           ` Szabolcs Nagy
2021-05-12 20:33           ` Carlos O'Donell
2021-04-13  8:19 ` [PATCH v2 07/14] elf: Add test case for " Szabolcs Nagy
2021-04-15 19:21   ` Adhemerval Zanella
2021-04-13  8:20 ` [PATCH v2 08/14] elf: Fix DTV gap reuse logic [BZ #27135] Szabolcs Nagy
2021-04-15 19:45   ` Adhemerval Zanella
2021-06-24  9:48   ` Florian Weimer
2021-06-24 12:27     ` Florian Weimer
2021-06-24 12:57       ` Adhemerval Zanella
2021-06-24 14:20         ` Florian Weimer
2021-06-24 18:58       ` Szabolcs Nagy
2021-04-13  8:20 ` [PATCH v2 09/14] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137] Szabolcs Nagy
2021-04-13 14:02   ` H.J. Lu
2021-04-13  8:20 ` [PATCH v2 10/14] i386: " Szabolcs Nagy
2021-04-13 14:02   ` H.J. Lu
2021-04-13  8:21 ` [PATCH v2 11/14] x86_64: Remove lazy tlsdesc relocation related code Szabolcs Nagy
2021-04-13 14:03   ` H.J. Lu
2021-04-13  8:21 ` [PATCH v2 12/14] i386: " Szabolcs Nagy
2021-04-13 14:04   ` H.J. Lu
2021-04-13  8:21 ` [PATCH v2 13/14] elf: " Szabolcs Nagy
2021-04-15 19:52   ` Adhemerval Zanella
2021-04-13  8:21 ` [PATCH v2 14/14] RFC elf: Fix slow tls access after dlopen [BZ #19924] Szabolcs Nagy
2022-09-16  9:54   ` Carlos O'Donell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fc80af73-659c-50ec-2aa1-ee0080aeb63b@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=libc-alpha@sourceware.org \
    --cc=szabolcs.nagy@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).