public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: Libc-alpha <libc-alpha@sourceware.org>, hpa@zytor.com
Subject: Re: RFC: pthread pid accessor (BZ# 27880)
Date: Tue, 1 Jun 2021 11:09:51 -0300	[thread overview]
Message-ID: <a043b868-1a74-98ba-3eac-5b159e706d7f@linaro.org> (raw)
In-Reply-To: <87zgwa979l.fsf@oldenburg.str.redhat.com>



On 31/05/2021 13:51, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> It seems that this is trickier than it seems, some issues we might consider 
>> first:
>>
>>   1. What should we do with detached threads? As for pthread_kill, issuing 
>>      a pthread_gettid_np might use an invalid handler (since the pthread_t 
>>      identifier might be reused).  This only solution I have is to define 
>>      it as undefined behavior, this is not great but to proper support it
>>      would incur to keep tracking or all possible pthread_t identifiers 
>>      (we already keep the glibc provided stacks, dl_stack_cache, so it 
>>       would be a matter to include the user provided one in the list as 
>>      special entries).
> 
> Detached threads are fine as long as the thread is still running.  This
> is something the application can ensure using synchronization.
> 
> There are other interfaces with this property, including pthread_kill.

Afaik pthread_kill detaches created threads or thread that call pthread_detach
are not really defined (the thread ID lifetime ends when detached is issued).
We even have a bug report for this, BZ #19193.

But currently calling pthread_kill is already undefined: it accesses 
the internal tid file without any extra check.  Even using the
INVALID_NOT_TERMINATED_TD_P/INVALID_TD_P won't really improve thing, since
might still access invalid memory if the thread cache was empty and the
resulted 'struct thread' was deallocated.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=19193

> 
>>   2. I think that once we provide this API, developers will start to use to
>>      query if a thread is alive and I am not sure if this is really the 
>>      proper API for this. This is the same issue as 1.
> 
> They probably use pthread_kill with a zero signal for that today.
> Here's an example for httpd:
> 
> |         /* deal with a rare timing window which affects waking up the
> |          * listener thread...  if the signal sent to the listener thread
> |          * is delivered between the time it verifies that the
> |          * listener_may_exit flag is clear and the time it enters a
> |          * blocking syscall, the signal didn't do any good...  work around
> |          * that by sleeping briefly and sending it again
> |          */
> | 
> |         iter = 0;
> |         while (iter < 10 &&
> | #ifdef HAVE_PTHREAD_KILL
> |                pthread_kill(*listener_os_thread, 0)
> | #else
> |                kill(ap_my_pid, 0)
> | #endif
> |                == 0) {
> |             /* listener not dead yet */

Right, I thing the newer interface might work for non detached or
threads that are not yet joined.

> 
>>   3. How do we handle the concurrent access between pthread_join and
>>   pthread_gettid_np? Once a pthread_join is issued, the pthread_t
>>   identifier might be reused and accessing it should be
>>   invalid. pthread_join first synchronizes using 'joinid' to avoid
>>   concurrent pthread_join and then wait the kernel signal on 'tid'
>>   that the thread has finished.  The straightforward
>>   'pthread_gettid_np' implementation would do a atomic load on tid,
>>   however it might read a transient value between pthread_join
>>   'joinid' setup and the futex wait.  I am not sure how to handle it
>>   correctly.
> 
> The application must ensure through synchronization that the lifetime of
> the thread handle has not ended yet.  Concurrent calls with pthread_join
> is fine as long as the thread has not exited yet (same as for
> pthread_kill).
> 
> The question is what we should do after thread exit, but with a joinable
> thread.  I think for that we should return the original TID the kernel
> assigned (even though it could not be reused).  That would strongly
> discourage the unsafe probing behavior because the function cannot be
> used to probe if the thread is still running.

Do you mean between the thread cancel/exit and kernel reset the struct
thread 'tid' field?  The main problem is the thread might be detached
between, that's why pthread_join synchronizes first using the 'joinid'
field.

But I think there is no much we can do it besides a simple atomic
load on struct thread 'tid'.  Trying to synchronize with 'joinid'
won't really help, since we 'pthread_detach' can't fail (not with
an intermittent error).  We might try to use either a busy wait or
a lock on pthread_deatch and pthread_join over 'joinid', but I don't
think this really solves much without introducing potential other
latency issues.

Peter has suggested to return zero or -1 with ESRCH if the pthread
is detached from its underlying kernel thread, but I think 
INVALID_NOT_TERMINATED_TD_P is not valid for detached threads since
the struct thread ownership might be invalid at the time of the
call. So I think we should just make it undefined behavior and not
making any assumptions.

> 
>> Also, MacOSX signature is:
>>
>>   int pthread_gettid_np (pthread_t thread, uint64_t *thread_id)
>>
>> And it returns the current thread identification if THREAD is NULL, returns
>> ESRCH for invalid handle (the 1. and 2. issue below), and also consults 
>> the kernel if the identifier can no be obtained.
> 
> Macos calls the interface pthread_threadid_np, actually.  It looks as if
> it returns a truly unique number that isn't reused within the process or
> system.  A Linux TID wouldn't be like that, so I think we should call
> the interface something else.

Fair enough, bionic has 

  pid_t pthread_gettid_np(pthread_t t)

So I think we might be an option. It basically returns the underlying
kernel process identifier, no extra guarantee as done by MacOSX 
implementation.

  reply	other threads:[~2021-06-01 14:09 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-27 19:36 Adhemerval Zanella
2021-05-31 16:51 ` Florian Weimer
2021-06-01 14:09   ` Adhemerval Zanella [this message]
2021-06-01 14:18     ` Florian Weimer
2021-06-01 15:16       ` Adhemerval Zanella
2021-06-01 17:36         ` Florian Weimer
2021-06-01 17:51           ` Adhemerval Zanella
2021-06-01 17:54             ` Florian Weimer
2021-06-01 18:01               ` Adhemerval Zanella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a043b868-1a74-98ba-3eac-5b159e706d7f@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=fweimer@redhat.com \
    --cc=hpa@zytor.com \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).