public inbox for glibc-bugs@sourceware.org
From: "agarwalshashank12 at gmail dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug nptl/26570] New: One of the thread is getting hanged when two threads are trying yo join each other.
Date: Thu, 03 Sep 2020 08:30:22 +0000
Message-ID: <bug-26570-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=26570

            Bug ID: 26570
           Summary: One of the thread is getting hanged when two threads
                    are trying yo join each other.
           Product: glibc
           Version: 2.31
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: agarwalshashank12 at gmail dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 12814
  --> https://sourceware.org/bugzilla/attachment.cgi?id=12814&action=edit
Test Case describing the scenario

Hello,

Description -

I have observed a behaviour in pthread_join(3) where two threads try to join
each other and one of them hangs indefinitely, stuck in the futex(2) system
call waiting for the other thread to wake it.

Test Case Description -

1. Create multiple threads using pthread_create in the joinable state.
2. thread1 joins thread2 using pthread_join.
3. thread2 joins thread1 using pthread_join.
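
The steps above can be sketched roughly as follows. This is a minimal
illustration and not the attached pthread_join_sample.c: all names
(joiner, run_mutual_join, join_result) are hypothetical, the 100 ms delay
is only an assumption to let the first thread enter pthread_join first, and
the main thread uses a timed semaphore wait so that the sketch itself does
not hang if a joiner gets stuck.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical names; globals so a stuck thread never touches a dead stack. */
static pthread_t tid[2];
static int join_result[2] = { -1, -1 };  /* -1: pthread_join has not returned */
static sem_t ready;  /* posted by the creator once both IDs are stored */
static sem_t done;   /* posted by each thread after pthread_join returns */

static void *joiner(void *arg)
{
    int me = (int)(intptr_t)arg;

    sem_wait(&ready);
    if (me == 1) {
        /* Assumption: give thread 0 time to enter pthread_join first. */
        struct timespec delay = { 0, 100 * 1000 * 1000 };
        nanosleep(&delay, NULL);
    }
    join_result[me] = pthread_join(tid[1 - me], NULL);
    sem_post(&done);
    return NULL;
}

/* Starts two joinable threads that join each other.  Returns how many of
   them came back from pthread_join within timeout_sec seconds. */
static int run_mutual_join(int timeout_sec)
{
    struct timespec deadline;
    int finished = 0;
    int rc;

    rc = sem_init(&ready, 0, 0);
    assert(rc == 0);
    rc = sem_init(&done, 0, 0);
    assert(rc == 0);

    for (intptr_t i = 0; i < 2; i++) {
        rc = pthread_create(&tid[i], NULL, joiner, (void *)i);
        assert(rc == 0);
    }
    sem_post(&ready);
    sem_post(&ready);

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;
    while (finished < 2) {
        if (sem_timedwait(&done, &deadline) == 0)
            finished++;
        else if (errno != EINTR)
            break;  /* timed out: at least one thread is still waiting */
    }
    /* Cleanup of the two threads is deliberately omitted in this sketch. */
    return finished;
}
```

Calling run_mutual_join(2) returns 2 when neither thread is stuck; the
entries of join_result then hold what each pthread_join call returned
(for example 0 and EDEADLK).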

Expected Result -
1. thread1 joins thread2.
2. When thread2 tries to join thread1, pthread_join detects the deadlock.

Actual Result -
thread2 hangs.

Glibc Version -
2.31, 2.28

Sample Code -
pthread_join_sample.c attached

Environment -
$ uname -a
Linux a1sb2-010 4.18.0-147.el8.x86_64 #1 SMP Thu Sep 26 15:52:44 UTC 2019
x86_64 x86_64 x86_64 GNU/Linux

Root Cause -

When a thread calls pthread_join(3) on another thread, glibc stores the ID of
the calling thread in the joinid member of the struct pthread of the target
thread.

Hence, in the expected scenario, the pthread_join(3) code flow works as
follows -

1. When thread t1 calls pthread_join(3) on thread t2, glibc performs the
checks below before invoking the futex wait -

Check 1 -
If the joinid of t1 contains the ID of t2, it returns EDEADLK; otherwise it
moves on to check 2.

Check 2 -
glibc checks whether any other thread has already called pthread_join on t2.
This is done with the joinid member of t2. If this member is NULL, no other
thread has called pthread_join on t2, so t1 stores its ID in the joinid member
of t2's struct pthread (in nptl/pthread_join_common.c) and performs the futex
wait. If the joinid member of t2 already holds a non-NULL value, some other
thread has already called pthread_join(3) on t2, and the library returns
EINVAL to t1.

2. When thread t2 calls pthread_join(3) on t1, the call fails at check 1,
i.e. the joinid of t2 contains the ID of t1, and it returns EDEADLK.
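
The two checks described above can be modelled with a short sketch. This is
a simplified stand-in written with C11 atomics, not glibc's source: struct
model_thread, model_join_checks and model_demo are hypothetical names, only
the joinid member is modelled, and the futex wait is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-in for glibc's struct pthread; only joinid is modelled. */
struct model_thread {
    _Atomic(struct model_thread *) joinid;  /* thread currently joining us */
};

/* Returns 0 if self may go on to wait for pd, else EDEADLK or EINVAL. */
static int model_join_checks(struct model_thread *self,
                             struct model_thread *pd)
{
    /* Check 1: pd is already joining self, so waiting would deadlock. */
    if (atomic_load(&self->joinid) == pd)
        return EDEADLK;

    /* Check 2: store self in pd->joinid only if it is still NULL. */
    struct model_thread *expected = NULL;
    if (!atomic_compare_exchange_strong(&pd->joinid, &expected, self))
        return EINVAL;  /* some other thread already joined pd */

    return 0;  /* the real code would now block in the futex wait */
}

/* Replays the expected scenario, plus a third thread for the EINVAL case. */
static void model_demo(void)
{
    struct model_thread t1 = { NULL }, t2 = { NULL }, t3 = { NULL };
    int rc;

    rc = model_join_checks(&t1, &t2);  /* t1 joins t2: slot is free */
    assert(rc == 0);
    rc = model_join_checks(&t2, &t1);  /* t2 joins t1: caught by check 1 */
    assert(rc == EDEADLK);
    rc = model_join_checks(&t3, &t2);  /* t3 joins t2: caught by check 2 */
    assert(rc == EINVAL);
}
```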

As per our analysis, in glibc 2.31 a thread calling pthread_join(3) is not
able to store its ID in the joinid member of the target thread. Hence both
threads pass check 1 and check 2 and enter the futex wait, and one thread
hangs. glibc 2.31 uses the following code to update the ID of the thread -

atomic_compare_exchange_weak_acquire (&pd->joinid,&self,NULL)
This macro internally calls the GCC built-in function
__atomic_compare_exchange_n ((mem), (expected), (desired), 1,
__ATOMIC_ACQUIRE, __ATOMIC_RELAXED), which implements an atomic compare and
exchange operation. It compares the contents of *mem with the contents of
*expected. If they are equal, the operation is a read-modify-write that writes
desired into *mem. If they are not equal, the operation is a read, and the
current contents of *mem are written into *expected.
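
The compare-and-exchange semantics described above can be checked in
isolation with a small sketch. It calls the GCC built-in directly on a plain
int; cas_int and cas_demo are hypothetical names, and the strong variant
(weak = 0) is used here instead of the weak one quoted above, so that a
failure always means the values differed rather than a spurious miss.

```c
#include <assert.h>
#include <stdbool.h>

/* Thin wrapper over the GCC built-in, strong variant (weak = 0). */
static bool cas_int(int *mem, int *expected, int desired)
{
    return __atomic_compare_exchange_n(mem, expected, desired, 0,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

static void cas_demo(void)
{
    int mem = 0;
    int expected = 7;
    bool ok;

    /* *mem != *expected: nothing is stored, and *expected is overwritten
       with the current contents of *mem. */
    ok = cas_int(&mem, &expected, 42);
    assert(!ok);
    assert(mem == 0);
    assert(expected == 0);

    /* *expected now matches *mem, so desired is written into *mem. */
    ok = cas_int(&mem, &expected, 42);
    assert(ok);
    assert(mem == 42);
}
```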

The expected value passed to this built-in function is &self, i.e. the ID of
the thread calling pthread_join(3). As per our analysis, the joinid of the
target thread can never be equal to the ID of the calling thread: its initial
value is NULL, and it is only ever updated from pthread_join(3). Hence the
comparison in __atomic_compare_exchange_n always fails and joinid is never
updated, so both threads enter the futex wait.

Fix -

To fix the issue, the expected value passed to __atomic_compare_exchange_n
should be NULL and the desired value should be the ID of the thread calling
pthread_join(3), as below -

atomic_compare_exchange_weak_acquire (&pd->joinid,&null_ptr,self)
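
The difference between the two argument orders can be illustrated by calling
the GCC built-in directly on a pointer slot. This sketch only reproduces the
reading of the arguments given in this report; it does not establish how
glibc's internal macro maps its arguments. struct waiter and the function
names are hypothetical, and the strong variant (weak = 0) is assumed so the
result is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct waiter { int id; };  /* hypothetical stand-in for a thread descriptor */

/* Order as read in the report: expected = self, desired = NULL. */
static bool claim_expected_self(struct waiter **joinid, struct waiter *self)
{
    struct waiter *expected = self;
    return __atomic_compare_exchange_n(joinid, &expected,
                                       (struct waiter *)NULL, 0,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Order proposed in the report: expected = NULL, desired = self. */
static bool claim_expected_null(struct waiter **joinid, struct waiter *self)
{
    struct waiter *null_ptr = NULL;
    return __atomic_compare_exchange_n(joinid, &null_ptr, self, 0,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

static void claim_demo(void)
{
    struct waiter t1 = { 1 };
    struct waiter *joinid = NULL;  /* initial value: nobody is joining */
    bool ok;

    /* NULL != &t1, so the comparison fails and joinid stays NULL. */
    ok = claim_expected_self(&joinid, &t1);
    assert(!ok);
    assert(joinid == NULL);

    /* NULL == NULL, so the ID of the caller is stored in joinid. */
    ok = claim_expected_null(&joinid, &t1);
    assert(ok);
    assert(joinid == &t1);
}
```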

Regards
Shashank Agarwal
HCL Technologies Limited
shashank_agarwal@hcl.com

-- 
You are receiving this mail because:
You are on the CC list for the bug.

Thread overview: 4+ messages
2020-09-03  8:30 agarwalshashank12 at gmail dot com [this message]
2020-09-03 13:08 ` [Bug nptl/26570] " adhemerval.zanella at linaro dot org
2020-09-04  5:23 ` agarwalshashank12 at gmail dot com
2020-09-04  8:01 ` schwab@linux-m68k.org
