public inbox for glibc-bugs@sourceware.org
* [Bug nptl/31477] New: Priority Inversion and Unlimited Spin of RWlock
@ 2024-03-12 11:33 pengzheng at apache dot org
  2024-04-28 20:17 ` [Bug nptl/31477] " github at kalvdans dot no-ip.org
  0 siblings, 1 reply; 2+ messages in thread
From: pengzheng at apache dot org @ 2024-03-12 11:33 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=31477

            Bug ID: 31477
           Summary: Priority Inversion and Unlimited Spin of RWlock
           Product: glibc
           Version: 2.29
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: pengzheng at apache dot org
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

I found several unbounded spin loops in the current pthread_rwlock
implementation.
It therefore suffers from the same problem as user-space spinlocks, as
described in this LWN article ([0]):

> One thread might be spinning on a lock while the holder has been preempted 
> and isn't running at all. In such cases, the lock will not be released soon,
> and the spinning just wastes CPU time. In the worst case, the thread that is 
> spinning may be the one that is keeping the lock holder from running, meaning 
> that the spinning thread is actively preventing the lock it needs from being 
> released. In such situations, the code should simply stop spinning and go to 
> sleep until the lock is released.

I just encountered one such issue on an embedded Linux system: there were
several readers running under the SCHED_RR policy and one writer running under
SCHED_OTHER.

Two of the high-priority readers were spinning (consuming 100% CPU)
in the loop near the end of `__pthread_rwlock_rdlock_full`:

  for (;;)
    {
      while (((wpf = atomic_load_relaxed (&rwlock->__data.__wrphase_futex))
              | PTHREAD_RWLOCK_FUTEX_USED) == (1 | PTHREAD_RWLOCK_FUTEX_USED))
        {/*omitted*/}
      if (ready)
        /* See below.  */
        break;
      if ((atomic_load_acquire (&rwlock->__data.__readers)
           & PTHREAD_RWLOCK_WRPHASE) == 0)
        ready = true;
    }
  return 0;

The SCHED_OTHER writer was about to set `__wrphase_futex` in
`__pthread_rwlock_wrlock_full` (it was just one ARM instruction away)
but was never able to do so, because the two readers occupied nearly all
available CPUs:

  while ((r & PTHREAD_RWLOCK_WRPHASE) == 0
         && (r >> PTHREAD_RWLOCK_READER_SHIFT) == 0)
    {
      if (atomic_compare_exchange_weak_acquire (&rwlock->__data.__readers,
                                                &r, r | PTHREAD_RWLOCK_WRPHASE))
        {
          /* The writer was stuck HERE, just before this store!  */
          atomic_store_relaxed (&rwlock->__data.__wrphase_futex, 1);
          goto done;
        }
      /* TODO Back-off.  */
    }

In ARM assembly:

mov r3, #1          ; the writer is stuck HERE!
str r3, [r12, #8]   ; r12 holds the address of rwlock->__data, and 8 is the
                    ; offset of __wrphase_futex in __data


Unbounded user-space spinning is too dangerous to use. How about limiting the
total number of spins before suspending on the futex? Or using rseq, as
mentioned in the LWN article?
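
To make the first option concrete, here is a minimal sketch of "spin a bounded
number of times, then sleep on the futex". It is not glibc code and not a
proposed patch: `MAX_SPINS`, `futex_call`, `bounded_spin_wait` and
`release_and_wake` are hypothetical names, the limit of 100 is arbitrary, and
the waker always issues FUTEX_WAKE instead of tracking waiters with a flag
such as PTHREAD_RWLOCK_FUTEX_USED.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical spin limit; not a glibc constant or tunable.  */
#define MAX_SPINS 100

/* Thin wrapper around the raw futex system call (hypothetical helper).  */
static long
futex_call (_Atomic uint32_t *uaddr, int op, uint32_t val)
{
  return syscall (SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Wait until *word differs from BUSY.  Spin at most MAX_SPINS times, then
   block in the kernel so that a lower-priority lock holder can run.
   Returns the number of spin iterations performed.  */
static int
bounded_spin_wait (_Atomic uint32_t *word, uint32_t busy)
{
  int spins = 0;

  assert (word != NULL);
  while (atomic_load_explicit (word, memory_order_acquire) == busy)
    {
      if (spins < MAX_SPINS)
        {
          spins++;
          continue;
        }
      /* FUTEX_WAIT returns immediately if *word no longer equals BUSY;
         otherwise the thread sleeps until a FUTEX_WAKE or a signal.
         The loop condition rechecks the word after every wake-up.  */
      futex_call (word, FUTEX_WAIT_PRIVATE, busy);
    }
  return spins;
}

/* Publish VALUE and wake every thread sleeping on WORD.  */
static void
release_and_wake (_Atomic uint32_t *word, uint32_t value)
{
  atomic_store_explicit (word, value, memory_order_release);
  futex_call (word, FUTEX_WAKE_PRIVATE, INT_MAX);
}
```

In the scenario above, a SCHED_RR reader that runs out of spins would sleep in
FUTEX_WAIT, which frees the CPU for the SCHED_OTHER writer to finish its store
and call the wake side.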

Any ideas?

[0] https://lwn.net/Articles/944895/

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug nptl/31477] Priority Inversion and Unlimited Spin of RWlock
  2024-03-12 11:33 [Bug nptl/31477] New: Priority Inversion and Unlimited Spin of RWlock pengzheng at apache dot org
@ 2024-04-28 20:17 ` github at kalvdans dot no-ip.org
  0 siblings, 0 replies; 2+ messages in thread
From: github at kalvdans dot no-ip.org @ 2024-04-28 20:17 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=31477

github at kalvdans dot no-ip.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |github at kalvdans dot no-ip.org

