Hi Florian, thanks for the review.

On 07/11/2016 15:21, Florian Weimer wrote:
> On 10/13/2016 09:45 PM, Adhemerval Zanella wrote:
>> This patch remove the PID cache and usage in current GLIBC code.  Current
>> usage is mainly used for performance optimization to avoid the syscall,
>> however it adds some issues:
>>
>>   - The exposed clone syscall will try to set pid/tid to make the new
>>     thread somewhat compatible with current GLIBC assumptions.  This cause
>>     a set of issue with new workloads and usercases (such as BZ#17214 and
> 
> usecases

Ack, I changed it on this new version.

> 
>>     [1]) as well for new internal usage of clone to optimize other algorithms
>>     (such as clone plus CLONE_VM for posix_spawn, BZ#19957).
>>
>>   - The caching complexity also added some bugs in the past [2] [3] and
>>     requires more effort of each port to handle such requirements (for
>>     both clone and vfork implementation).
>>
>>   - Caching performance gain in mainly or getpid and some specific
>>     code paths. The getpid performance leverage is questionable [4],
>>     either by the idea of getpid being a hotspot as for the getpid
>>     implementation itself (if it is indeed a justifiable hotspot a
>>     vDSO symbol could let to a much more simpler solution).
> 
> It's a hotspot for incorrect/broken fork detection.

If you mean the assert on fork.c, I review the code and it seems
unnecessary to remove the assert on child creation:

146   if (pid == 0)
147     {
148       struct pthread *self = THREAD_SELF;
149 
150       assert (THREAD_GETMEM (self, tid) != ppid);
151 

I added it back. 

> 
>>     Other usage is mainly for non usual code paths, such as pthread
>>     cancellation signal and handling.
>>
>> For thread creation (on atack allocation) the code simplification in fact
> 
> stack allocation

Ack.

> 
>> adds some performance gain due the no need of transverse the stack
>> cache and invalidate each element pid.
>>
>> Other thread usages will require a direct getpid syscall, such as
>> cancellation/setxid signal, thread cancellation, thread fail path
>> (at create_thread), and thread signal (pthread_kill and
>> pthread_sigqueue).  However these are hardly usual hotspots and I
>> think adding a syscall is justifiable.
>>
>> It also simplifies both the clone and vfork arch-specific implementation.
>> And by review each fork implementation there are some discrepancies that
>> this patch also solves:
>>
>>   - microblaze clone/vfork does not set/reset the pid/tid field
>>   - hppa uses the default vfork implementation that fallback to fork.
>>     Since vfork is deprecated I do not think we should bother with it.
>>
>> The patch also removes the TID caching in clone. My understanding for
>> such semantic is try provide some pthread usage after a user program
>> issue clone directly (as done by thread creation with CLONE_PARENT_SETTID
>> and pthread tid member). However, as stated before in multiple threads,
>> GLIBC provides clone syscalls without futher supporting all this
> 
> further

Ack.

> 
>> semantics. It means that, although GLIBC currently tries a better effort,
>> since it does not make any more guarantees, specially for newer and newer
>> clone flags.
> 
> So the question is whether this is used internally.  Why do you think this is safe?  Because we set it again with SET_TID_ADDRESS?
> 

The tid fields is basically used internally on pthread implementations
(including getpid) and since correct usage means thread *must* be
created using pthread_create we are sure the tid field will be
correctly set due 'set_tid_address' from __pthread_initialize_pids.

>> diff --git a/nptl/descr.h b/nptl/descr.h
>> index 8e4938d..17a2c9f 100644
>> --- a/nptl/descr.h
>> +++ b/nptl/descr.h
>> @@ -167,7 +167,7 @@ struct pthread
>>       therefore stack) used' flag.  */
>>    pid_t tid;
>>
>> -  /* Process ID - thread group ID in kernel speak.  */
>> +  /* Ununsed.  */
>>    pid_t pid;
> 
> Please rename to pid_unused or something like that, to make sure it's no longer referenced.

I renamed it on my local branch and I also updated the change spot
that it incur:

diff --git a/nptl_db/structs.def b/nptl_db/structs.def
index a9b621b..1cb6a46 100644
--- a/nptl_db/structs.def
+++ b/nptl_db/structs.def
@@ -48,7 +48,6 @@ DB_STRUCT (pthread)
 DB_STRUCT_FIELD (pthread, list)
 DB_STRUCT_FIELD (pthread, report_events)
 DB_STRUCT_FIELD (pthread, tid)
-DB_STRUCT_FIELD (pthread, pid)
 DB_STRUCT_FIELD (pthread, start_routine)
 DB_STRUCT_FIELD (pthread, cancelhandling)
 DB_STRUCT_FIELD (pthread, schedpolicy)


> 
>> diff --git a/sysdeps/unix/sysv/linux/getpid.c b/sysdeps/unix/sysv/linux/getpid.c
>> index 1124549..2bfafed 100644
>> --- a/sysdeps/unix/sysv/linux/getpid.c
>> +++ b/sysdeps/unix/sysv/linux/getpid.c
> 
> Can you drop this file completely, so that the default implementation is used?

I do not have a preference here, but I think now we can use syscalls.list
instead.  I change it on this version.

> 
>> diff --git a/sysdeps/unix/sysv/linux/pthread_kill.c b/sysdeps/unix/sysv/linux/pthread_kill.c
> 
>> @@ -49,14 +50,15 @@ __pthread_kill (pthread_t threadid, int signo)
>>    /* We have a special syscall to do the work.  */
>>    INTERNAL_SYSCALL_DECL (err);
>>
>> +  pid_t pid = getpid ();
> 
> Use __getpid for consistency?

Alright, I change it.

> 
>>    /* One comment: The PID field in the TCB can temporarily be changed
>>       (in fork).  But this must not affect this code here.  Since this
>>       function would have to be called while the thread is executing
>>       fork, it would have to happen in a signal handler.  But this is
>>       no allowed, pthread_kill is not guaranteed to be async-safe.  */
> 
> Comment is outdated.

Ack, I removed this implementation.

> 
>> diff --git a/sysdeps/unix/sysv/linux/pthread_sigqueue.c b/sysdeps/unix/sysv/linux/pthread_sigqueue.c
>> index 7694d54..642366b 100644
>> --- a/sysdeps/unix/sysv/linux/pthread_sigqueue.c
>> +++ b/sysdeps/unix/sysv/linux/pthread_sigqueue.c
>> @@ -49,12 +49,14 @@ pthread_sigqueue (pthread_t threadid, int signo, const union sigval value)
>>    if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
>>      return EINVAL;
>>
>> +  pid_t pid = getpid ();
> 
> Use __getpid for consistency?

Ack.

> 
>>      function would have to be called while the thread is executing
>>      fork, it would have to happen in a signal handler.  But this is
> 
> Comment is outdated.

Ack, I removed it.
> 
>> diff --git a/sysdeps/unix/sysv/linux/tst-clone2.c b/sysdeps/unix/sysv/linux/tst-clone2.c
>> index 68a7e6d..b20332a 100644
>> --- a/sysdeps/unix/sysv/linux/tst-clone2.c
> 
> It may make sense to update the file comment.
> 
>> -  pid_t pid = THREAD_GETMEM (THREAD_SELF, pid);
>> -  pid_t tid = THREAD_GETMEM (THREAD_SELF, tid);
>> +  /* Clone without flags do not cache the pid and tid is only set in thread
> 
> does not cache.  But the comment seems outdated?

Maybe, my initial idea was to make sure that of this caching
not being used anymore.  I removed this comment as well.

> 
>> +     creation by using CLONE_PARENT_SETTID plus pthread tid field address.
>> +     So to actually get all parent's pid and own pid/tid it requires to use
>> +     the syscalls.  */
>> +  pid_t ppid = getppid ();
>> +  pid_t pid = getpid ();
>> +  pid_t tid = syscall (__NR_gettid);
>>
>> +  while (write (pipefd[1], &ppid, sizeof ppid) < 0)
>> +    continue;
>>    while (write (pipefd[1], &pid, sizeof pid) < 0)
>>      continue;
>>    while (write (pipefd[1], &tid, sizeof tid) < 0)
> 
> These while loops look incorrect.  Perhaps just fail the test if the result is not equal to sizeof of the value being written?

I changed it.

> 
>> +  /* Some sanity checks for clone syscall: returned ppid should be currernt
> 
> current
> 

Ack.

> Thanks,
> Florian

In attachment I am sending a revised patch.