From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: Libc-alpha <libc-alpha@sourceware.org>, hpa@zytor.com
Subject: Re: RFC: pthread pid accessor (BZ# 27880)
Date: Tue, 1 Jun 2021 11:09:51 -0300 [thread overview]
Message-ID: <a043b868-1a74-98ba-3eac-5b159e706d7f@linaro.org> (raw)
In-Reply-To: <87zgwa979l.fsf@oldenburg.str.redhat.com>
On 31/05/2021 13:51, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> It seems that this is trickier than it seems, some issues we might consider
>> first:
>>
>> 1. What should we do with detached threads? As for pthread_kill, issuing
>> a pthread_gettid_np might use an invalid handler (since the pthread_t
>> identifier might be reused). This only solution I have is to define
>> it as undefined behavior, this is not great but to proper support it
>> would incur to keep tracking or all possible pthread_t identifiers
>> (we already keep the glibc provided stacks, dl_stack_cache, so it
>> would be a matter to include the user provided one in the list as
>> special entries).
>
> Detached threads are fine as long as the thread is still running. This
> is something the application can ensure using synchronization.
>
> There are other interfaces with this property, including pthread_kill.
Afaik pthread_kill detaches created threads or thread that call pthread_detach
are not really defined (the thread ID lifetime ends when detached is issued).
We even have a bug report for this, BZ #19193.
But currently calling pthread_kill is already undefined: it accesses
the internal tid file without any extra check. Even using the
INVALID_NOT_TERMINATED_TD_P/INVALID_TD_P won't really improve thing, since
might still access invalid memory if the thread cache was empty and the
resulted 'struct thread' was deallocated.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=19193
>
>> 2. I think that once we provide this API, developers will start to use to
>> query if a thread is alive and I am not sure if this is really the
>> proper API for this. This is the same issue as 1.
>
> They probably use pthread_kill with a zero signal for that today.
> Here's an example for httpd:
>
> | /* deal with a rare timing window which affects waking up the
> | * listener thread... if the signal sent to the listener thread
> | * is delivered between the time it verifies that the
> | * listener_may_exit flag is clear and the time it enters a
> | * blocking syscall, the signal didn't do any good... work around
> | * that by sleeping briefly and sending it again
> | */
> |
> | iter = 0;
> | while (iter < 10 &&
> | #ifdef HAVE_PTHREAD_KILL
> | pthread_kill(*listener_os_thread, 0)
> | #else
> | kill(ap_my_pid, 0)
> | #endif
> | == 0) {
> | /* listener not dead yet */
Right, I thing the newer interface might work for non detached or
threads that are not yet joined.
>
>> 3. How do we handle the concurrent access between pthread_join and
>> pthread_gettid_np? Once a pthread_join is issued, the pthread_t
>> identifier might be reused and accessing it should be
>> invalid. pthread_join first synchronizes using 'joinid' to avoid
>> concurrent pthread_join and then wait the kernel signal on 'tid'
>> that the thread has finished. The straightforward
>> 'pthread_gettid_np' implementation would do a atomic load on tid,
>> however it might read a transient value between pthread_join
>> 'joinid' setup and the futex wait. I am not sure how to handle it
>> correctly.
>
> The application must ensure through synchronization that the lifetime of
> the thread handle has not ended yet. Concurrent calls with pthread_join
> is fine as long as the thread has not exited yet (same as for
> pthread_kill).
>
> The question is what we should do after thread exit, but with a joinable
> thread. I think for that we should return the original TID the kernel
> assigned (even though it could not be reused). That would strongly
> discourage the unsafe probing behavior because the function cannot be
> used to probe if the thread is still running.
Do you mean between the thread cancel/exit and kernel reset the struct
thread 'tid' field? The main problem is the thread might be detached
between, that's why pthread_join synchronizes first using the 'joinid'
field.
But I think there is no much we can do it besides a simple atomic
load on struct thread 'tid'. Trying to synchronize with 'joinid'
won't really help, since we 'pthread_detach' can't fail (not with
an intermittent error). We might try to use either a busy wait or
a lock on pthread_deatch and pthread_join over 'joinid', but I don't
think this really solves much without introducing potential other
latency issues.
Peter has suggested to return zero or -1 with ESRCH if the pthread
is detached from its underlying kernel thread, but I think
INVALID_NOT_TERMINATED_TD_P is not valid for detached threads since
the struct thread ownership might be invalid at the time of the
call. So I think we should just make it undefined behavior and not
making any assumptions.
>
>> Also, MacOSX signature is:
>>
>> int pthread_gettid_np (pthread_t thread, uint64_t *thread_id)
>>
>> And it returns the current thread identification if THREAD is NULL, returns
>> ESRCH for invalid handle (the 1. and 2. issue below), and also consults
>> the kernel if the identifier can no be obtained.
>
> Macos calls the interface pthread_threadid_np, actually. It looks as if
> it returns a truly unique number that isn't reused within the process or
> system. A Linux TID wouldn't be like that, so I think we should call
> the interface something else.
Fair enough, bionic has
pid_t pthread_gettid_np(pthread_t t)
So I think we might be an option. It basically returns the underlying
kernel process identifier, no extra guarantee as done by MacOSX
implementation.
next prev parent reply other threads:[~2021-06-01 14:09 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-27 19:36 Adhemerval Zanella
2021-05-31 16:51 ` Florian Weimer
2021-06-01 14:09 ` Adhemerval Zanella [this message]
2021-06-01 14:18 ` Florian Weimer
2021-06-01 15:16 ` Adhemerval Zanella
2021-06-01 17:36 ` Florian Weimer
2021-06-01 17:51 ` Adhemerval Zanella
2021-06-01 17:54 ` Florian Weimer
2021-06-01 18:01 ` Adhemerval Zanella
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a043b868-1a74-98ba-3eac-5b159e706d7f@linaro.org \
--to=adhemerval.zanella@linaro.org \
--cc=fweimer@redhat.com \
--cc=hpa@zytor.com \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).