From: Carlos O'Donell <carlos@redhat.com>
To: Arjun Shankar <arjun.is@lostca.se>, libc-alpha@sourceware.org
Cc: Florian Weimer <fweimer@redhat.com>
Subject: Re: [PATCH v2] Fix double-checked locking in __gconv_get_path and __gconv_read_conf [BZ #22062]
Date: Thu, 05 Oct 2017 20:23:00 -0000 [thread overview]
Message-ID: <940c9be0-37ae-ebf7-1013-e6e6a4991028@redhat.com> (raw)
In-Reply-To: <20171002143148.GA41930@aloka.lostca.se>
On 10/02/2017 07:31 AM, Arjun Shankar wrote:
> These two functions running in different threads could encounter a data race
> if the CPU or compiler reordered the store of __gconv_path_elem so as to be
> earlier than writes to the content that it points to. This has now been
> corrected by using atomics every time the variable is accessed.
My reviews are structured into 3 pats: high level, design, and
implementation.
(a) At a high level I see a problem with this change.
All of the callers of __gconv_read_conf do so using __libc_once.
Why can't we just delete all the double checked locking assuming that
__libc_once works correctly?
The use of __libc_once ensures that __gconv_read_conf is called only
once, and that takes into account all threads.
The only scenario it doesn't take into account is if an in-progress
call to __libc_once was interrupted by a dlopen of libpthread that
then started threads. Those threads might attempt a parallel
initialization. Though this problem has existed *before* the changes
you are making.
Either way, because of the use of __libc_once, the function need not
do any locking or atomics.
Was the __libc_once usage by the callers considered when the patch
was created?
(b) At the design level, the changes look good.
What you've done is a correct fix of DCLP (thought at a high level
it might not be needed and we might just delete the code and update
the comment to say it must be called with __libc_once).
I see perhaps one case where an atomic operation is not needed, and
I've commented on that below.
(c) The details of the patch are great, and the comments are good.
> ChangeLog:
>
> 2017-10-02 Arjun Shankar <arjun@redhat.com>
>
> [BZ #22062]
> * iconv/gconv_conf.c: Include <atomic.h>.
> (__gconv_get_path): Access __gconv_path_elem using atomics.
> (__gconv_read_conf): Likewise.
> (libc_freeres_fn): Likewise.
> ---
> Discussion on PATCH v1:
> https://sourceware.org/ml/libc-alpha/2017-09/msg00779.html
>
> iconv/gconv_conf.c | 42 ++++++++++++++++++++++++++++++++----------
> 1 file changed, 32 insertions(+), 10 deletions(-)
>
> diff --git a/iconv/gconv_conf.c b/iconv/gconv_conf.c
> index f1c28ce..58fd28e 100644
> --- a/iconv/gconv_conf.c
> +++ b/iconv/gconv_conf.c
> @@ -30,6 +30,7 @@
> #include <string.h>
> #include <unistd.h>
> #include <sys/param.h>
> +#include <atomic.h>
OK.
>
> #include <libc-lock.h>
> #include <gconv_int.h>
> @@ -428,8 +429,13 @@ __gconv_get_path (void)
>
> __libc_lock_lock (lock);
>
All of the caller of __gconv_read_conf do so using __libc_once, which
already does locking using pthread_once to avoid __gconv_read_conf
ever being called by a another thread?
e.g.
iconv/gconv_db.c: __libc_once (once, __gconv_read_conf);
iconv/gconv_db.c: __libc_once (once, __gconv_read_conf);
See my comments above.
I will continue reviewing the DCLP usage below because it has value.
> - /* Make sure there wasn't a second thread doing it already. */
> - result = (struct path_elem *) __gconv_path_elem;
> + /* __gconv_read_conf uses double-checked locking and therefore can make
> + a redundant call to this function while another thread is already
> + executing it. So first we make sure another thread has not already done
> + the work we want to do.
> +
> + A relaxed MO load is sufficient since we already have the lock. */
> + result = atomic_load_relaxed (&__gconv_path_elem);
OK. Good, this removes the data race between the unordered read by a thread
in __gconv_read_conf and a write by another thread in __gonv_get_path.
> if (result == NULL)
> {
> /* Determine the complete path first. */
> @@ -519,7 +525,10 @@ __gconv_get_path (void)
> result[n].len = 0;
> }
>
> - __gconv_path_elem = result ?: (struct path_elem *) &empty_path_elem;
> + /* This store synchronizes with the acquire MO load in
> + __gconv_read_conf. */
> + atomic_store_release (&__gconv_path_elem,
> + result ?: (struct path_elem *) &empty_path_elem);
OK. Good comment.
>
> free (cwd);
> }
> @@ -538,6 +547,7 @@ __gconv_read_conf (void)
> size_t nmodules = 0;
> int save_errno = errno;
> size_t cnt;
> + struct path_elem *gconv_path_elem_local;
OK.
>
> /* First see whether we should use the cache. */
> if (__gconv_load_cache () == 0)
> @@ -549,13 +559,20 @@ __gconv_read_conf (void)
>
> #ifndef STATIC_GCONV
> /* Find out where we have to look. */
> - if (__gconv_path_elem == NULL)
> - __gconv_get_path ();
>
> - for (cnt = 0; __gconv_path_elem[cnt].name != NULL; ++cnt)
> + /* This load is synchronized with the release MO store done during the
> + initialization of __gconv_path_elem in __gconv_get_path. */
OK. Perfect. If the acquire sees NULL it knows that the other thread might
not have completed the work, and so proceeds to acquire the lock and check
again (double checked locking). If it sees non-NULL it knows the other function
completed all of the work and the element data should be visible.
> + gconv_path_elem_local = atomic_load_acquire (&__gconv_path_elem);
> + if (gconv_path_elem_local == NULL)
> + {
> + __gconv_get_path ();
Multiple reads by threads are safe. In this case we can have multiple threads
here, but all write to __gconv_path_elem are complete. All we are doing is
having multiple readers, which is not a data race.
Is there any reason we need this atomic load?
> + gconv_path_elem_local = atomic_load_relaxed (&__gconv_path_elem);
> + }
> +
> + for (cnt = 0; gconv_path_elem_local[cnt].name != NULL; ++cnt)
> {
> - const char *elem = __gconv_path_elem[cnt].name;
> - size_t elem_len = __gconv_path_elem[cnt].len;
> + const char *elem = gconv_path_elem_local[cnt].name;
> + size_t elem_len = gconv_path_elem_local[cnt].len;
OK.
> char *filename;
>
> /* No slash needs to be inserted between elem and gconv_conf_filename;
> @@ -606,6 +623,11 @@ __gconv_read_conf (void)
> /* Free all resources if necessary. */
> libc_freeres_fn (free_mem)
> {
> - if (__gconv_path_elem != NULL && __gconv_path_elem != &empty_path_elem)
> - free ((void *) __gconv_path_elem);
> + /* __gconv_path_elem is always accessed atomically because it is used in
> + double-checked locking. Since this function is called when the process
> + has become single-threaded, it is enough to used a relaxed MO load. */
OK. We use an atomic load to see the update from a thread that initialized
__gconv_path_elem.
> + struct path_elem *gconv_path_elem_local
> + = atomic_load_relaxed (&__gconv_path_elem);
> + if (gconv_path_elem_local != NULL && gconv_path_elem_local != &empty_path_elem)
> + free ((void *) gconv_path_elem_local);
> }
>
--
Cheers,
Carlos.
next prev parent reply other threads:[~2017-10-05 20:23 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-02 14:31 Arjun Shankar
2017-10-05 15:18 ` Florian Weimer
2017-10-05 20:23 ` Carlos O'Donell [this message]
2017-10-05 22:20 ` Florian Weimer
2017-10-05 23:05 ` Carlos O'Donell
2017-10-06 6:38 ` Florian Weimer
2017-10-10 19:08 ` Carlos O'Donell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=940c9be0-37ae-ebf7-1013-e6e6a4991028@redhat.com \
--to=carlos@redhat.com \
--cc=arjun.is@lostca.se \
--cc=fweimer@redhat.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).