From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-78306-listarch-libc-alpha=sources.redhat.com@sourceware.org>
Received: (qmail 20519 invoked by alias); 29 Mar 2017 08:58:58 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 20499 invoked by uid 89); 29 Mar 2017 08:58:57 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=relationships, Space, Hx-languages-length:3809
X-HELO: mx1.redhat.com
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 2273561B8E
Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx10.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=triegel@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 2273561B8E
Message-ID: <1490777931.5724.15.camel@redhat.com>
Subject: Re: [PATCH v2] dl-load: add memory barrier before updating the next
From: Torvald Riegel <triegel@redhat.com>
To: Maninder Singh <maninder1.s@samsung.com>
Cc: libc-alpha@sourceware.org, szabolcs.nagy@arm.com, pankaj.m@samsung.com, 
	ajeet.y@samsung.com, a.sahrawat@samsung.com, lalit.mohan@samsung.com, 
	akhilesh.k@samsung.com, hakbong5.lee@samsung.com, Vaneet Narang
	 <v.narang@samsung.com>
Date: Wed, 29 Mar 2017 08:58:00 -0000
In-Reply-To: <1490770778-11050-1-git-send-email-maninder1.s@samsung.com>
References: 
	<CGME20170329070025epcas5p4644c528bf730c87d3c69e2ac3557d643@epcas5p4.samsung.com>
	 <1490770778-11050-1-git-send-email-maninder1.s@samsung.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
X-SW-Source: 2017-03/txt/msg00667.txt.bz2

On Wed, 2017-03-29 at 12:29 +0530, Maninder Singh wrote:
> This patch adds C11 memory barrier before updating the liblist next.
> 
> Issue Fix: race condition between add_name_to_object  & _dl_name_match_p.
> One threads calling dlopen which further calls add_name_to_object &
> other thread trying to resolve RTLD_LAZY symbols through
> _dl_runtime_resolve which further calls.
> 
> _dl_name_match_p checks if libname->next is valid, then it assumes
> libname->next->name to be valid. Also add_name_to_object initialized
> name first and then sets valid next pointer.

This patch is better, but it is still incomplete:

We want to document concurrent code in sufficient detail so that other
developers can understand the synchronization scheme without having to
review all potentially interacting code first.  (I'm aware that this
isn't the case in several parts of glibc code, but we're trying to
improve that.)  This means documenting, in comments in the code, the
synchronization scheme that is intended.  In this particular case, this
is this libname_list:
* What concurrent operations can run on it?  Apparently, search in
_dl_name_match_p can be concurrent with insertion.  Can insertion be
concurrent with other insertion or not (if so, why)?
* What are the important happens-before relationships we need to ensure?
(For example, in this case, we need to ensure that if we see a list
element in search, then all stores to the fields of list element happen
before the search operation looks at the fields.)
* What release operations and acquire operations are supposed to
synchronize with each other.  In your patch, it's these two.  If we
don't document this, it will be much harder for other developers to
figure out what the necessary pairings are.  You analyzed that already,
so tell others about it.

Documenting the code in this way is a good test to see if you have
really analyzed the synchronization problem completely.  (If one can't
clearly explain how something is supposed to work, perhaps one doesn't
yet fully understand it.)

As noted on the Concurrency wiki page, we don't use atomic types, but
use the atomic_* functions consistently as if they'd be required for
atomic types.  That is, the next field would be an atomic type,
conceptually, and you should use atomic_load_relaxed for accesses to the
next field in the list search of the insertion operation, for example.
The only exception we make is for initializing stores.

> This patch avoids any reorder of instruction when next is set before
> name to avoid any race.

> Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
> Signed-off-by: Vaneet Narang <v.narang@samsung.com>
> ---
> v1 -> v2 use C11 atomics rather than direct memory barriers
> 
>  elf/dl-load.c |    2 +-
>  elf/dl-misc.c |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index a5318f9..52df931 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -418,7 +418,7 @@ add_name_to_object (struct link_map *l, const char *name)
>    newname->name = memcpy (newname + 1, name, name_len);
>    newname->next = NULL;
>    newname->dont_free = 0;
> -  lastp->next = newname;
> +  atomic_store_release (&(lastp->next), newname);

You can drop the inner parentheses.
Space between name of the function that you call and its arguments (see
the memcpy above).

>  }
>  
>  /* Standard search directories.  */
> diff --git a/elf/dl-misc.c b/elf/dl-misc.c
> index 1e9a6ee..8f48751 100644
> --- a/elf/dl-misc.c
> +++ b/elf/dl-misc.c
> @@ -295,7 +295,7 @@ _dl_name_match_p (const char *name, const struct link_map *map)
>      if (strcmp (name, runp->name) == 0)
>        return 1;
>      else
> -      runp = runp->next;
> +      runp = atomic_load_acquire(&(runp->next));

Likewise.

>  
>    return 0;
>  }