public inbox for glibc-bugs@sourceware.org
* [Bug libc/18493] New: Infinite loop/deadlock? in __libc_recv (fd=fd@entry=300, buf=buf@entry=0x7f6042880600, n=n@entry=5, flags=-1, flags@entry=258) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:33
@ 2015-06-05 11:12 dan at censornet dot com
  2015-06-05 11:13 ` [Bug libc/18493] " dan at censornet dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: dan at censornet dot com @ 2015-06-05 11:12 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18493

            Bug ID: 18493
           Summary: Infinite loop/deadlock? in __libc_recv
                    (fd=fd@entry=300, buf=buf@entry=0x7f6042880600,
                    n=n@entry=5, flags=-1, flags@entry=258) at
                    ../sysdeps/unix/sysv/linux/x86_64/recv.c:33
           Product: glibc
           Version: 2.19
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: dan at censornet dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

In a multi-threaded pthreads process running on Ubuntu 14.04 AMD64 (with over
1000 threads) which uses real-time FIFO scheduling, we occasionally see calls
to recv() with flags (MSG_PEEK | MSG_WAITALL) get stuck in an infinite loop or
deadlock. The affected threads lock up, consuming as much CPU as they can (due
to FIFO scheduling), while stuck inside recv().

Here's an example gdb back trace:

[Switching to thread 4 (Thread 0x7f6040546700 (LWP 27251))]
#0  0x00007f6231d2f7eb in __libc_recv (fd=fd@entry=146,
buf=buf@entry=0x7f6040543600, n=n@entry=5, flags=-1, flags@entry=258) at
../sysdeps/unix/sysv/linux/x86_64/recv.c:33
33      ../sysdeps/unix/sysv/linux/x86_64/recv.c: No such file or directory.
(gdb) bt
#0  0x00007f6231d2f7eb in __libc_recv (fd=fd@entry=146,
buf=buf@entry=0x7f6040543600, n=n@entry=5, flags=-1, flags@entry=258) at
../sysdeps/unix/sysv/linux/x86_64/recv.c:33
#1  0x0000000000421945 in recv (__flags=258, __n=5, __buf=0x7f6040543600,
__fd=146) at /usr/include/x86_64-linux-gnu/bits/socket2.h:44
[snip]
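
For readers decoding the backtrace: on Linux, flags=258 is 0x102, which is
MSG_PEEK (0x2) combined with MSG_WAITALL (0x100). A minimal sketch that checks
this, assuming the Linux values of these constants;
flags_are_peek_waitall() is a hypothetical helper, not part of the report:

```c
#include <assert.h>
#include <sys/socket.h>

/* Assumption: Linux constant values (MSG_PEEK == 0x2, MSG_WAITALL == 0x100).
 * On a platform with different values this fails at compile time. */
static_assert((MSG_PEEK | MSG_WAITALL) == 258,
              "flags@entry=258 should decode to MSG_PEEK | MSG_WAITALL");

/* Hypothetical helper: returns 1 if flags is exactly MSG_PEEK | MSG_WAITALL. */
int flags_are_peek_waitall(int flags)
{
    return flags == (MSG_PEEK | MSG_WAITALL);
}
```

The flags=-1 shown alongside flags@entry=258 is the value gdb reads at the
current program counter, whereas the @entry value is what the caller passed.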

The socket is a TCP socket in blocking mode. The recv() call is inside an outer
loop with a counter; I've checked the counter with gdb and it's always at 1,
so I'm sure that the outer loop isn't the problem: the thread is indeed
deadlocked inside the recv() internals.
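
The calling pattern described above might look roughly like the following
sketch. The application code is not shown in this report, so peek_exact(), its
parameters, and the retry-on-EINTR policy are all assumptions made for
illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: peek at exactly n bytes without consuming them.
 * Returns 0 when all n bytes were peeked, -1 on error, EOF, a short
 * result, or when max_attempts is exhausted. */
int peek_exact(int fd, void *buf, size_t n, int max_attempts)
{
    assert(buf != NULL && n > 0);

    /* Outer loop with a counter, as in the report. */
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        ssize_t got = recv(fd, buf, n, MSG_PEEK | MSG_WAITALL);
        if (got == (ssize_t)n)
            return 0;
        if (got < 0 && errno == EINTR)
            continue;   /* interrupted by a signal: try again */
        return -1;      /* error, EOF, or short peek */
    }
    return -1;
}
```

If the counter stays at 1 while the thread spins, as observed with gdb, then
the first recv() call in such a loop has never returned.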

Other notes:
* There always seem to be 2 or more threads deadlocked in the same place (same
recv() call but with distinct FDs)
* The threads calling recv() have cancellation disabled by previously
executing: pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, NULL);
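
For reference, disabling cancellation as in the second note can be sketched as
below. disable_cancellation() is a hypothetical wrapper; unlike the call in
the report, which passes NULL, it records the previous state so that a caller
could restore it later:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: disable cancellation for the calling thread and
 * store the previous cancel state in *old_state.
 * Returns 0 on success or an error number, as pthread_setcancelstate() does. */
int disable_cancellation(int *old_state)
{
    assert(old_state != NULL);
    return pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, old_state);
}
```

With cancellation disabled, recv() is no longer acted on as a cancellation
point in that thread, which rules out a pending cancel request as the reason
the call does not return.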

I've even tried adding a poll() call for POLLRDNORM on the socket before
calling recv() with MSG_PEEK | MSG_WAITALL flags to try to make sure there's
data available on the socket before calling recv(), but it makes no difference.
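
A rough sketch of such a poll() guard follows. wait_readable() and its timeout
parameter are assumptions for illustration, not the reporter's code:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: wait up to timeout_ms for fd to report POLLRDNORM.
 * Returns 1 if normal data is readable, 0 otherwise (timeout or only
 * error/hangup events), -1 if poll() itself fails. */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);

    struct pollfd pfd = { .fd = fd, .events = POLLRDNORM, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLRDNORM)) ? 1 : 0;
}
```

Note that poll() only reports that some data can be read. It does not
guarantee that all n bytes requested with MSG_WAITALL have arrived, so a
subsequent recv() with MSG_PEEK | MSG_WAITALL may still have to wait.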

So, I don't know what is wrong here. I've read all the recv() documentation and
believe that recv() is being used correctly. The only conclusion I can come to
is that there is a bug in libc recv() when using flags MSG_PEEK | MSG_WAITALL
with thousands of pthreads running.

-- 
You are receiving this mail because:
You are on the CC list for the bug.



end of thread, other threads:[~2015-06-11  8:15 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-05 11:12 [Bug libc/18493] New: Infinite loop/deadlock? in __libc_recv (fd=fd@entry=300, buf=buf@entry=0x7f6042880600, n=n@entry=5, flags=-1, flags@entry=258) at ../sysdeps/unix/sysv/linux/x86_64/recv.c:33 dan at censornet dot com
2015-06-05 11:13 ` [Bug libc/18493] " dan at censornet dot com
2015-06-05 11:17 ` dan at censornet dot com
2015-06-05 11:44 ` schwab@linux-m68k.org
2015-06-05 11:52 ` dan at censornet dot com
2015-06-10 14:00 ` fweimer at redhat dot com
2015-06-11  8:00 ` dan at censornet dot com
2015-06-11  8:07 ` fweimer at redhat dot com
2015-06-11  8:08 ` fweimer at redhat dot com
2015-06-11  8:15 ` dan at censornet dot com
