public inbox for libc-alpha@sourceware.org
* Fwd: local equivalent for pthread_once() in glibc?
       [not found] <9EBFE06E-AF1D-48E9-85AB-B74C048438B1@oracle.com>
@ 2017-04-25 21:19 ` Chris Aoki
  2017-04-25 22:55   ` Adhemerval Zanella
  2017-04-26  8:35   ` Florian Weimer
  0 siblings, 2 replies; 11+ messages in thread
From: Chris Aoki @ 2017-04-25 21:19 UTC (permalink / raw)
  To: libc-alpha; +Cc: Chris Aoki

One of my colleagues suggested that I forward the question
below to the libc-alpha alias which would increase my chances
of reaching a glibc malloc expert.   Actually my main question
(in the original message, below) is a general one, since situations
calling for pthread_once() can presumably occur in other contexts.

Chris Aoki

p.s.  The secondary question, which is specific to glibc malloc, is
whether ptmalloc_init() can be called concurrently by multiple threads.

> Begin forwarded message:
> 
> From: Chris Aoki <christopher.aoki@oracle.com>
> Subject: local equivalent for pthread_once() in glibc?
> Date: April 25, 2017 at 10:50:41 AM PDT
> To: libc-help@sourceware.org
> Cc: Chris Aoki <christopher.aoki@oracle.com>
> 
> I have a question about glibc internals.
> 
> Is there a private glibc function equivalent to pthread_once()?
> 
> I have a structure that is frequently accessed after initialization
> so putting a lock around initialization and a check for initialization
> would add considerable overhead.   Normally one would use pthread_once()
> in this situation, but my colleagues tell me that adding a reference to an
> external function is prohibited within libc.so.   I see a macro __libc_once()
> but it does not appear to synchronize:
> 
> /* Define once control variable.  */
> #define __libc_once_define(CLASS, NAME) CLASS int NAME = 0
> 
> /* Call handler iff the first call.  */
> #define __libc_once(ONCE_CONTROL, INIT_FUNCTION) \
>  do {                                                                        \
>    if ((ONCE_CONTROL) == 0) {                                                \
>      INIT_FUNCTION ();                                                       \
>      (ONCE_CONTROL) = 1;                                                     \
>    }                                                                         \
>  } while (0)
> 
> Any clues appreciated.  Thanks
> 
> Chris Aoki


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-04-25 21:19 ` Fwd: local equivalent for pthread_once() in glibc? Chris Aoki
@ 2017-04-25 22:55   ` Adhemerval Zanella
  2017-04-26  8:35   ` Florian Weimer
  1 sibling, 0 replies; 11+ messages in thread
From: Adhemerval Zanella @ 2017-04-25 22:55 UTC (permalink / raw)
  To: libc-alpha



On 25/04/2017 18:19, Chris Aoki wrote:
> One of my colleagues suggested that I forward the question
> below to the libc-alpha alias which would increase my chances
> of reaching a glibc malloc expert.   Actually my main question
> (in the original message, below) is a general one, since situations
> calling for pthread_once() can presumably occur in other contexts.
> 
> Chris Aoki
> 
> p.s.  The secondary question, which is specific to glibc malloc, is
> whether ptmalloc_init() can be called concurrently by multiple threads.

It shouldn't; that is why the '__malloc_initialized' variable exists to control
its initialization.  However, the current code uses no atomics, so it does not
guarantee very strong semantics (I think it should use something similar to
pthread_once, which is in fact what __libc_once provides; see below).

> 
>> Begin forwarded message:
>>
>> From: Chris Aoki <christopher.aoki@oracle.com>
>> Subject: local equivalent for pthread_once() in glibc?
>> Date: April 25, 2017 at 10:50:41 AM PDT
>> To: libc-help@sourceware.org
>> Cc: Chris Aoki <christopher.aoki@oracle.com>
>>
>> I have a question about glibc internals.
>>
>> Is there a private glibc function equivalent to pthread_once()?
>>
>> I have a structure that is frequently accessed after initialization
>> so putting a lock around initialization and a check for initialization
>> would add considerable overhead.   Normally one would use pthread_once()
>> in this situation, but my colleagues tell me that adding a reference to an
>> external function is prohibited within libc.so.   I see a macro __libc_once()
>> but it does not appear to synchronize:
>>
>> /* Define once control variable.  */
>> #define __libc_once_define(CLASS, NAME) CLASS int NAME = 0
>>
>> /* Call handler iff the first call.  */
>> #define __libc_once(ONCE_CONTROL, INIT_FUNCTION) \
>>  do {                                                                        \
>>    if ((ONCE_CONTROL) == 0) {                                                \
>>      INIT_FUNCTION ();                                                       \
>>      (ONCE_CONTROL) = 1;                                                     \
>>    }                                                                         \
>>  } while (0)
>>
>> Any clues appreciated.  Thanks
>>
>> Chris Aoki
> 

This is the default implementation, which is not used on nptl targets (basically
all currently supported ones).  Those use the definition in
sysdeps/nptl/libc-lockP.h instead:

/* Call handler iff the first call.  */
#define __libc_once(ONCE_CONTROL, INIT_FUNCTION) \
  do {                                                                        \
    if (PTFAVAIL (__pthread_once))                                            \
      __libc_ptf_call_always (__pthread_once, (&(ONCE_CONTROL),               \
                                               INIT_FUNCTION));               \
    else if ((ONCE_CONTROL) == PTHREAD_ONCE_INIT) {                           \
      INIT_FUNCTION ();                                                       \
      (ONCE_CONTROL) |= 2;                                                    \
    }                                                                         \
  } while (0)


This calls pthread_once in multithreaded programs (since PTFAVAIL returns
true there).
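
For readers outside libc, the behaviour that the nptl __libc_once relies on can
be tried with the public pthread_once() interface.  The following is only a
minimal sketch: init_routine, worker and run_once_demo are hypothetical names
invented for illustration, not glibc code.  It starts several threads that all
race to initialize and then reports how often the initializer ran.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static int init_calls;          /* modified only inside init_routine */

static void init_routine(void)
{
    init_calls++;
}

static void *worker(void *arg)
{
    (void) arg;
    /* Every thread asks for initialization; pthread_once runs the routine
       in exactly one of them and makes the others wait until it is done.  */
    pthread_once(&once_control, init_routine);
    return NULL;
}

/* Hypothetical helper: start NTHREADS workers (at most 64) and return how
   many times the initializer has run.  */
static int run_once_demo(int nthreads)
{
    pthread_t threads[64];

    assert(nthreads > 0 && nthreads <= 64);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void) rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);

    /* Safe to read without atomics: pthread_join orders the workers'
       writes before this read.  */
    return init_calls;
}
```

Calling run_once_demo(8) from main should report a single initializer call,
however the threads happen to be scheduled.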


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-04-25 21:19 ` Fwd: local equivalent for pthread_once() in glibc? Chris Aoki
  2017-04-25 22:55   ` Adhemerval Zanella
@ 2017-04-26  8:35   ` Florian Weimer
  2017-04-26 12:40     ` Adhemerval Zanella
  1 sibling, 1 reply; 11+ messages in thread
From: Florian Weimer @ 2017-04-26  8:35 UTC (permalink / raw)
  To: Chris Aoki, libc-alpha

On 04/25/2017 11:19 PM, Chris Aoki wrote:

> p.s.  The secondary question, which is specific to glibc malloc, is
> whether ptmalloc_init() can be called concurrently by multiple threads.

No, but the reason is really subtle: pthread_create calls calloc before 
creating the new thread. :)

#0  ptmalloc_init () at arena.c:255
#1  0x00007ffff787a7ad in ptmalloc_init () at malloc.c:2935
#2  malloc_hook_ini (sz=272, caller=<optimized out>) at hooks.c:31
#3  0x00007ffff7879d8a in __libc_calloc (n=<optimized out>,
     elem_size=<optimized out>) at malloc.c:3234
#4  0x00007ffff7deac2b in allocate_dtv (result=0x7ffff77f2700) at dl-tls.c:322
#5  __GI__dl_allocate_tls (mem=mem@entry=0x7ffff77f2700) at dl-tls.c:539
#6  0x00007ffff7bc128a in allocate_stack (stack=<synthetic pointer>,
     pdp=<synthetic pointer>, attr=0x7fffffffe520) at allocatestack.c:586
#7  __pthread_create_2_1 (newthread=0x7fffffffe5a8, attr=0x0,
     start_routine=0x4005c6 <f>, arg=0x0) at pthread_create.c:539

>> I have a structure that is frequently accessed after initialization
>> so putting a lock around initialization and a check for initialization
>> would add considerable overhead.   Normally one would use pthread_once()
>> in this situation, but my colleagues tell me that adding a reference to an
>> external function is prohibited within libc.so.   I see a macro __libc_once()
>> but it does not appear to synchronize:
>>
>> /* Define once control variable.  */
>> #define __libc_once_define(CLASS, NAME) CLASS int NAME = 0
>>
>> /* Call handler iff the first call.  */
>> #define __libc_once(ONCE_CONTROL, INIT_FUNCTION) \
>>   do {                                                                        \
>>     if ((ONCE_CONTROL) == 0) {                                                \
>>       INIT_FUNCTION ();                                                       \
>>       (ONCE_CONTROL) = 1;                                                     \
>>     }                                                                         \
>>   } while (0)
>>
>> Any clues appreciated.  Thanks

There is an override for the generic definition in 
sysdeps/nptl/libc-lockP.h, which is used on Linux.  In general, you have 
to examine the source tree carefully to discover such overrides.

Thanks,
Florian


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-04-26  8:35   ` Florian Weimer
@ 2017-04-26 12:40     ` Adhemerval Zanella
  2017-05-17  9:57       ` Florian Weimer
  0 siblings, 1 reply; 11+ messages in thread
From: Adhemerval Zanella @ 2017-04-26 12:40 UTC (permalink / raw)
  To: libc-alpha



On 26/04/2017 05:35, Florian Weimer wrote:
> On 04/25/2017 11:19 PM, Chris Aoki wrote:
> 
>> p.s.  The secondary question, which is specific to glibc malloc, is
>> whether ptmalloc_init() can be called concurrently by multiple threads.
> 
> No, but the reason is really subtle: pthread_create calls calloc before creating the new thread. :)
> 
> #0  ptmalloc_init () at arena.c:255
> #1  0x00007ffff787a7ad in ptmalloc_init () at malloc.c:2935
> #2  malloc_hook_ini (sz=272, caller=<optimized out>) at hooks.c:31
> #3  0x00007ffff7879d8a in __libc_calloc (n=<optimized out>,
>     elem_size=<optimized out>) at malloc.c:3234
> #4  0x00007ffff7deac2b in allocate_dtv (result=0x7ffff77f2700) at dl-tls.c:322
> #5  __GI__dl_allocate_tls (mem=mem@entry=0x7ffff77f2700) at dl-tls.c:539
> #6  0x00007ffff7bc128a in allocate_stack (stack=<synthetic pointer>,
>     pdp=<synthetic pointer>, attr=0x7fffffffe520) at allocatestack.c:586
> #7  __pthread_create_2_1 (newthread=0x7fffffffe5a8, attr=0x0,
>     start_routine=0x4005c6 <f>, arg=0x0) at pthread_create.c:539
> 
>>> I have a structure that is frequently accessed after initialization
>>> so putting a lock around initialization and a check for initialization
>>> would add considerable overhead.   Normally one would use pthread_once()
>>> in this situation, but my colleagues tell me that adding a reference to an
>>> external function is prohibited within libc.so.   I see a macro __libc_once()
>>> but it does not appear to synchronize:
>>>
>>> /* Define once control variable.  */
>>> #define __libc_once_define(CLASS, NAME) CLASS int NAME = 0
>>>
>>> /* Call handler iff the first call.  */
>>> #define __libc_once(ONCE_CONTROL, INIT_FUNCTION) \
>>>   do {                                                                        \
>>>     if ((ONCE_CONTROL) == 0) {                                                \
>>>       INIT_FUNCTION ();                                                       \
>>>       (ONCE_CONTROL) = 1;                                                     \
>>>     }                                                                         \
>>>   } while (0)
>>>
>>> Any clues appreciated.  Thanks
> 
> There is an override for the generic definition in sysdeps/nptl/libc-lockP.h, which is used on Linux.  In general, you have to examine the source tree carefully to discover such overrides.
> 
> Thanks,
> Florian

Now that we are on the subject, shouldn't we use __libc_once on __malloc_initialized?


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-04-26 12:40     ` Adhemerval Zanella
@ 2017-05-17  9:57       ` Florian Weimer
  2017-05-17 14:51         ` Adhemerval Zanella
  0 siblings, 1 reply; 11+ messages in thread
From: Florian Weimer @ 2017-05-17  9:57 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha

On 04/26/2017 02:40 PM, Adhemerval Zanella wrote:
> Now that we are on the subject, shouldn't we use __libc_once on
> __malloc_initialized?

I think that would be misleading.  The arena selection code assumes that
the main arena has been fully initialized, not just that ptmalloc_init
has run.  ptmalloc_init only performs a partial initialization, the rest
is done through malloc_consolidate (which is again triggered by the
allocation which is part of pthread_create).  If we fix ptmalloc_init to
allow concurrent invocation (although that never would happen), the lack
of full initialization would still be an issue (in the impossible case
that ptmalloc_init ran concurrently).

Thanks,
Florian


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-05-17  9:57       ` Florian Weimer
@ 2017-05-17 14:51         ` Adhemerval Zanella
  2017-06-25 15:47           ` Florian Weimer
  0 siblings, 1 reply; 11+ messages in thread
From: Adhemerval Zanella @ 2017-05-17 14:51 UTC (permalink / raw)
  To: Florian Weimer, libc-alpha



On 17/05/2017 06:57, Florian Weimer wrote:
> On 04/26/2017 02:40 PM, Adhemerval Zanella wrote:
>> Now that we are on the subject, shouldn't we use __libc_once on
>> __malloc_initialized?
> 
> I think that would be misleading.  The arena selection code assumes that
> the main arena has been fully initialized, not just that ptmalloc_init
> has run.  ptmalloc_init only performs a partial initialization, the rest
> is done through malloc_consolidate (which is again triggered by the
> allocation which is part of pthread_create).  If we fix ptmalloc_init to
> allow concurrent invocation (although that never would happen), the lack
> of full initialization would still be an issue (in the impossible case
> that ptmalloc_init ran concurrently).
> 
> Thanks,
> Florian
> 

Right, but this does not seem to be the case with tunables, where
malloc_consolidate is called from ptmalloc_init (at least for main_arena).  In
any case, I still think __malloc_initialized should be accessed using C11
atomics, since it is still accessed concurrently (that is why I asked whether
using __libc_once would be simpler).


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-05-17 14:51         ` Adhemerval Zanella
@ 2017-06-25 15:47           ` Florian Weimer
  2017-06-26 12:13             ` Adhemerval Zanella
  0 siblings, 1 reply; 11+ messages in thread
From: Florian Weimer @ 2017-06-25 15:47 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha

On 05/17/2017 04:51 PM, Adhemerval Zanella wrote:
> Right, but this does not seem to be the case with tunables, where
> malloc_consolidate is called from ptmalloc_init (at least for main_arena).  In
> any case, I still think __malloc_initialized should be accessed using C11
> atomics, since it is still accessed concurrently (that is why I asked whether
> using __libc_once would be simpler).

I don't understand.  The concurrent access solely consists of reads.  We
do not use atomics in that case.

Florian


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-06-25 15:47           ` Florian Weimer
@ 2017-06-26 12:13             ` Adhemerval Zanella
  2017-06-26 12:51               ` Florian Weimer
  0 siblings, 1 reply; 11+ messages in thread
From: Adhemerval Zanella @ 2017-06-26 12:13 UTC (permalink / raw)
  To: Florian Weimer, libc-alpha



On 25/06/2017 12:46, Florian Weimer wrote:
> On 05/17/2017 04:51 PM, Adhemerval Zanella wrote:
>> Right, but this does not seem to be the case with tunables, where
>> malloc_consolidate is called from ptmalloc_init (at least for main_arena).  In
>> any case, I still think __malloc_initialized should be accessed using C11
>> atomics, since it is still accessed concurrently (that is why I asked whether
>> using __libc_once would be simpler).
> 
> I don't understand.  The concurrent access solely consists of reads.  We
> do not use atomics in that case.

My understanding and my point is even for these cases we should aim for 
C11 atomic accesses, even for relaxed loads which on most architectures
will map to normal loads.
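
As an illustration of what such an access could look like, here is a minimal
sketch using the standard <stdatomic.h> interface rather than glibc's internal
atomic wrappers.  The flag and the helper names (initialized, mark_initialized,
is_initialized, flag_demo) are hypothetical stand-ins, not the actual
__malloc_initialized code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a "has initialization run?" flag.  Objects with
   static storage duration start out zero, i.e. false.  */
static atomic_bool initialized;

static void mark_initialized(void)
{
    /* Relaxed store: atomic, but imposes no ordering on other memory.  */
    atomic_store_explicit(&initialized, true, memory_order_relaxed);
}

static bool is_initialized(void)
{
    /* Relaxed load: on most architectures this compiles to a plain load,
       while documenting in the source that the flag is read concurrently.  */
    return atomic_load_explicit(&initialized, memory_order_relaxed);
}

/* Hypothetical self-check exercising both helpers once.  */
static void flag_demo(void)
{
    assert(!is_initialized());
    mark_initialized();
    assert(is_initialized());
}
```

Note that relaxed ordering only makes the flag access itself well defined; it
does not publish the data the flag guards.  That still has to come from some
other happens-before relation.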


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-06-26 12:13             ` Adhemerval Zanella
@ 2017-06-26 12:51               ` Florian Weimer
  2017-06-26 13:12                 ` Adhemerval Zanella
  0 siblings, 1 reply; 11+ messages in thread
From: Florian Weimer @ 2017-06-26 12:51 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha

On 06/26/2017 02:13 PM, Adhemerval Zanella wrote:
> 
> 
> On 25/06/2017 12:46, Florian Weimer wrote:
>> On 05/17/2017 04:51 PM, Adhemerval Zanella wrote:
>>> Right, but this does not seem to be the case with tunables, where
>>> malloc_consolidate is called from ptmalloc_init (at least for main_arena).  In
>>> any case, I still think __malloc_initialized should be accessed using C11
>>> atomics, since it is still accessed concurrently (that is why I asked whether
>>> using __libc_once would be simpler).
>>
>> I don't understand.  The concurrent access solely consists of reads.  We
>> do not use atomics in that case.
> 
> My understanding and my point is even for these cases we should aim for 
> C11 atomic accesses, even for relaxed loads which on most architectures
> will map to normal loads.

I don't think this is true.  If the last write happens before all the
concurrent read accesses, we don't need atomics.  To me, this is quite
clear because this is what happens with locks, where we usually don't
use atomics within the critical section, either.
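
The situation described above can be sketched as follows, assuming POSIX
threads, where pthread_create and pthread_join are among the functions that
synchronize memory.  The names config_value, reader and run_readers are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>

static int config_value;        /* plain int, deliberately not atomic */

static void *reader(void *arg)
{
    int *seen = arg;
    /* Read-only access.  The write in run_readers happened before
       pthread_create, so this read is ordered after it.  */
    *seen = config_value;
    return NULL;
}

/* Hypothetical helper: write VALUE once, then let NTHREADS readers (at most
   16) observe it.  Returns 1 if every reader saw VALUE, 0 otherwise.  */
static int run_readers(int nthreads, int value)
{
    pthread_t threads[16];
    int seen[16];
    int all_match = 1;

    assert(nthreads > 0 && nthreads <= 16);
    config_value = value;       /* last write, before any reader exists */

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, reader, &seen[i]);
        assert(rc == 0);
        (void) rc;
    }
    for (int i = 0; i < nthreads; i++) {
        pthread_join(threads[i], NULL);
        if (seen[i] != value)
            all_match = 0;
    }
    return all_match;
}
```

The sketch is only race-free because nothing writes config_value after the
first pthread_create; a later write from any thread would need a lock or
atomics.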

Thanks,
Florian


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-06-26 12:51               ` Florian Weimer
@ 2017-06-26 13:12                 ` Adhemerval Zanella
  2017-06-26 23:59                   ` Carlos O'Donell
  0 siblings, 1 reply; 11+ messages in thread
From: Adhemerval Zanella @ 2017-06-26 13:12 UTC (permalink / raw)
  To: Florian Weimer, libc-alpha



On 26/06/2017 09:51, Florian Weimer wrote:
> On 06/26/2017 02:13 PM, Adhemerval Zanella wrote:
>>
>>
>> On 25/06/2017 12:46, Florian Weimer wrote:
>>> On 05/17/2017 04:51 PM, Adhemerval Zanella wrote:
>>>> Right, but this does not seem to be the case with tunables, where
>>>> malloc_consolidate is called from ptmalloc_init (at least for main_arena).  In
>>>> any case, I still think __malloc_initialized should be accessed using C11
>>>> atomics, since it is still accessed concurrently (that is why I asked whether
>>>> using __libc_once would be simpler).
>>>
>>> I don't understand.  The concurrent access solely consists of reads.  We
>>> do not use atomics in that case.
>>
>> My understanding and my point is even for these cases we should aim for 
>> C11 atomic accesses, even for relaxed loads which on most architectures
>> will map to normal loads.
> 
> I don't think this is true.  If the last write happens before all the
> concurrent read accesses, we don't need atomics.  To me, this is quite
> clear because this is what happens with locks, where we usually don't
> use atomics within the critical section, either.

I do agree with your rationale, but from Torvald's comment on the BZ #20822
fix [1], my understanding is that we should still use relaxed-MO atomics in
such cases simply for consistency (and, I would add, for readability: it states
that the variable is indeed read concurrently and that relaxed MO suffices).

[1] https://sourceware.org/ml/libc-alpha/2016-12/msg00820.html


* Re: Fwd: local equivalent for pthread_once() in glibc?
  2017-06-26 13:12                 ` Adhemerval Zanella
@ 2017-06-26 23:59                   ` Carlos O'Donell
  0 siblings, 0 replies; 11+ messages in thread
From: Carlos O'Donell @ 2017-06-26 23:59 UTC (permalink / raw)
  To: Adhemerval Zanella, Florian Weimer, libc-alpha

On 06/26/2017 09:12 AM, Adhemerval Zanella wrote:
> 
> 
> On 26/06/2017 09:51, Florian Weimer wrote:
>> On 06/26/2017 02:13 PM, Adhemerval Zanella wrote:
>>>
>>>
>>> On 25/06/2017 12:46, Florian Weimer wrote:
>>>> On 05/17/2017 04:51 PM, Adhemerval Zanella wrote:
>>>>> Right, but this does not seem to be the case with tunables, where
>>>>> malloc_consolidate is called from ptmalloc_init (at least for main_arena).  In
>>>>> any case, I still think __malloc_initialized should be accessed using C11
>>>>> atomics, since it is still accessed concurrently (that is why I asked whether
>>>>> using __libc_once would be simpler).
>>>>
>>>> I don't understand.  The concurrent access solely consists of reads.  We
>>>> do not use atomics in that case.
>>>
>>> My understanding and my point is even for these cases we should aim for 
>>> C11 atomic accesses, even for relaxed loads which on most architectures
>>> will map to normal loads.
>>
>> I don't think this is true.  If the last write happens before all the
>> concurrent read accesses, we don't need atomics.  To me, this is quite
>> clear because this is what happens with locks, where we usually don't
>> use atomics within the critical section, either.
> 
> I do agree with your rationale, but from Torvald's comment on the BZ #20822
> fix [1], my understanding is that we should still use relaxed-MO atomics in
> such cases simply for consistency (and, I would add, for readability: it states
> that the variable is indeed read concurrently and that relaxed MO suffices).
> 
> [1] https://sourceware.org/ml/libc-alpha/2016-12/msg00820.html
https://sourceware.org/glibc/wiki/Consensus#Standards_we_use
https://sourceware.org/glibc/wiki/Concurrency

If there is no data race then we need to document why we have a happens-before.

-- 
Cheers,
Carlos.
