public inbox for glibc-bugs@sourceware.org
* [Bug nptl/4294] New: rwlock hangs under stress load
@ 2007-03-28 18:13 Matthew dot L dot Dunkle at nasa dot gov
  2007-03-30 11:36 ` [Bug nptl/4294] " jakub at redhat dot com
From: Matthew dot L dot Dunkle at nasa dot gov @ 2007-03-28 18:13 UTC
  To: glibc-bugs

fedora core 6 x86_64 smp installation, updated to a kernel.org 2.6.20.3 kernel
configured as fully preemptible, including the preemptible big kernel lock.
running on a system with 4 dual-core AMD 880 processors and 8 GB of ram.

real-time process with 4 reader threads and 1 writer thread of higher real-time
SCHED_FIFO priority than reader threads.  all 5 threads use "cpu affinity"
setting to each obtain a processor to themselves.  attempted to set rwlock
attributes to "writer preferred", although i'm not certain it worked ("flags"
still appear to be zero in pthread_rwlock_t structure, but maybe i'm looking at
the wrong thing).  approximately 15000 write locks and 4x15000 read locks per
second under full load.
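
A minimal sketch of requesting a writer-preferring rwlock and reading the
setting back through the attribute API instead of inspecting the
pthread_rwlock_t fields directly. The helper name
init_writer_preferred_rwlock is hypothetical. The Linux man page for
pthread_rwlockattr_setkind_np notes that glibc treats
PTHREAD_RWLOCK_PREFER_WRITER_NP like the reader-preferring default and that
only PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP avoids writer starvation;
whether that accounts for the zero "flags" seen in this report is an
assumption, not something the report confirms.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for the *_np attribute calls */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: initialise *lock as writer-preferring.
   Returns 0 on success or a pthread error number.  On success, *kind_out
   holds the kind read back from the attribute object. */
int init_writer_preferred_rwlock(pthread_rwlock_t *lock, int *kind_out)
{
    pthread_rwlockattr_t attr;
    int rc;

    assert(lock != NULL && kind_out != NULL);

    rc = pthread_rwlockattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for the kind that actually gives writers priority in glibc. */
    rc = pthread_rwlockattr_setkind_np(
        &attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);

    /* Read the setting back so the caller can verify it took effect. */
    if (rc == 0)
        rc = pthread_rwlockattr_getkind_np(&attr, kind_out);

    if (rc == 0)
        rc = pthread_rwlock_init(lock, &attr);

    pthread_rwlockattr_destroy(&attr);
    return rc;
}
```

A caller would check that the return value is 0 and that the reported kind
equals PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP before starting the reader
and writer threads.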

process will eventually get "stuck" with reader threads returning "RESOURCE
TEMPORARILY UNAVAILABLE" (EAGAIN) status forever after.  amount of time it takes
to get "stuck" is highly variable (can be minutes or hours).

the EAGAIN return status would appear to indicate a counter overflow condition,
but i believe it's actually the opposite: the counter underflows.  i don't know
much about assembly language code, so i took the "C" code equivalents (instead
of the x86_64 assembly functions) for the pthread_rwlock_rdlock,
pthread_rwlock_wrlock, and pthread_rwlock_unlock functions and built them into
my process, hoping to duplicate the symptoms and to be able to insert some
printf statements that might shed some light on the problem.

i was able to duplicate the situation, and what appears to be happening is that
two of the reader threads are simultaneously incrementing the __nr_readers
counter in the pthread_rwlock_t structure, so essentially one of the increments
is "missed".  for example, the __nr_readers counter starts at zero let's say,
both threads increment the counter simultaneously, and it ends up at one, where
it should have ended up at two.  then when the "unlock" call decrements the
counter, it goes to -1 (or 4294967295 as an unsigned 32-bit integer).  the next
time a rdlock is issued, it thinks the counter is about to roll over, and
returns the EAGAIN status.
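
The arithmetic described above can be replayed deterministically in a single
thread. This is only a sketch: struct reader_count, model_rdlock, model_unlock
and replay_lost_increment are hypothetical names, and the overflow check is
modelled on the behaviour described in this report rather than copied from
glibc.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the reader counter only;
   this is not glibc's pthread_rwlock_t. */
struct reader_count {
    unsigned int nr_readers;
};

/* Reader acquire with an overflow check: if the increment wraps to 0,
   undo it and report EAGAIN. */
int model_rdlock(struct reader_count *rw)
{
    if (++rw->nr_readers == 0) {
        --rw->nr_readers;
        return EAGAIN;
    }
    return 0;
}

/* Reader release: plain decrement, wraps below zero for unsigned. */
void model_unlock(struct reader_count *rw)
{
    --rw->nr_readers;
}

/* Replays the suspected interleaving and returns the status of the
   next read lock attempt. */
int replay_lost_increment(struct reader_count *rw)
{
    unsigned int seen_a, seen_b;

    rw->nr_readers = 0;
    seen_a = rw->nr_readers;        /* reader A loads 0 */
    seen_b = rw->nr_readers;        /* reader B loads 0 as well */
    rw->nr_readers = seen_a + 1;    /* A stores 1 */
    rw->nr_readers = seen_b + 1;    /* B stores 1: one increment is lost */

    model_unlock(rw);               /* 1 -> 0 */
    model_unlock(rw);               /* 0 -> UINT_MAX (4294967295 if 32-bit) */

    return model_rdlock(rw);        /* increment wraps to 0 -> EAGAIN */
}
```

Because the failed attempt restores the counter to UINT_MAX, every later call
to model_rdlock fails the same way, which matches the "forever after" symptom.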

i thought the low level lock should prevent two threads from simultaneously
incrementing or decrementing those counters, but for some reason that doesn't
seem to be the case.  so perhaps the problem is really with lll_mutex_lock
rather than the rwlock itself; i'm not really sure.
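
One way to check whether a lock really serialises counter updates is a small
stress harness in which several threads increment a shared counter under the
lock and the final total is compared with the expected one. The sketch below
uses an ordinary pthread_mutex_t as a stand-in, since lll_mutex_lock is
internal to glibc and cannot be called from application code; the names
struct shared, worker and count_with_mutex are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct shared {
    pthread_mutex_t lock;
    unsigned long counter;
    int iters;
};

/* Each worker increments the shared counter iters times under the lock. */
static void *worker(void *arg)
{
    struct shared *s = arg;
    int i;

    for (i = 0; i < s->iters; i++) {
        pthread_mutex_lock(&s->lock);
        s->counter++;               /* protected read-modify-write */
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter value.
   With a working lock the result is nthreads * iters. */
unsigned long count_with_mutex(int nthreads, int iters)
{
    enum { MAX_THREADS = 16 };
    pthread_t tid[MAX_THREADS];
    struct shared s = { PTHREAD_MUTEX_INITIALIZER, 0, iters };
    int started = 0;
    int i;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    for (; started < nthreads; started++)
        if (pthread_create(&tid[started], NULL, worker, &s) != 0)
            break;                  /* join only the threads that started */

    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    pthread_mutex_destroy(&s.lock);
    return s.counter;
}
```

A total lower than nthreads * iters would mean increments were lost while the
lock was supposedly held, which is the behaviour suspected here.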

sorry, this is my first bug report, and i didn't know what to fill in for the
host, target, and build triplets, but hopefully i've provided enough
information otherwise.  if not, feel free to e-mail me at
Matthew.L.Dunkle@nasa.gov if you need additional information.

i know this might not be easy to reproduce, especially considering the equipment
i am working with and so forth, but i appreciate whatever efforts you can make.
in the meantime, i am going to attempt to use something else, maybe a plain
vanilla mutex, to see if i can get it working in a different manner.  thank you.

-- 
           Summary: rwlock hangs under stress load
           Product: glibc
           Version: 2.4
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
        AssignedTo: drepper at redhat dot com
        ReportedBy: Matthew dot L dot Dunkle at nasa dot gov
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=4294



Thread overview: 6+ messages
2007-03-28 18:13 [Bug nptl/4294] New: rwlock hangs under stress load Matthew dot L dot Dunkle at nasa dot gov
2007-03-30 11:36 ` [Bug nptl/4294] " jakub at redhat dot com
2007-03-30 14:22 ` Matthew dot L dot Dunkle at nasa dot gov
2007-04-09 22:28 ` Matthew dot L dot Dunkle at nasa dot gov
2007-05-01  6:01 ` drepper at redhat dot com
2008-06-17  0:19 ` twong at gear6 dot com
