public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "carlos at redhat dot com" <sourceware-bugzilla@sourceware.org>
To: glibc-bugs@sourceware.org
Subject: [Bug nptl/25847] pthread_cond_signal failed to wake up pthread_cond_wait due to a bug in undoing stealing
Date: Thu, 04 May 2023 12:24:33 +0000	[thread overview]
Message-ID: <bug-25847-131-f2943ZARQY@http.sourceware.org/bugzilla/> (raw)
In-Reply-To: <bug-25847-131@http.sourceware.org/bugzilla/>

https://sourceware.org/bugzilla/show_bug.cgi?id=25847

--- Comment #58 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Malte Skarupke from comment #56)
> Thanks for the review. Two points:
> 
> 1. I agree that this one is wrong. I'll fix it.
> 
> 2. It did strike me as odd that all the stores to g_signals are with relaxed
> MO. So if there is no group switching, pthread_cond_wait doesn't actually
> synchronize with pthread_cond_signal. I understand that the reason is that
> condition variables are always used together with a mutex, and the mutex
> forces synchronization, but even knowing that, it's kinda weird.

The intent is always to improve performance by avoiding the synchronize-with /
happens-before requirement since such a requirement could mean flushing writes
from that CPU and invalidating the caches on other CPUs (forcing them to read
the updated values).

If the mutex already provides that happens-before relationship then we need
only have relaxed MO, but in the case of the read of __g_signals we are
explicitly attempting to establish an order and make visible to other in-flight
threads the relaxed MO load of __g1_start, and so *something* has to create the
happens-before relationship and that is normally a release-acquire pair.

The last thing I will do when going through the changes is review the
release-acquire pairing, and if they make logical sense. This is something you
didn't cover in the TLA+ analysis, and in fact is *very* difficult to do in
TLA+, but there are existing prior-art of having done this in TLA+ but you have
to model it and that creates additional modelling states:

An interesting discussion of the topic is here:
Verification of Concurrent Data Structures with TLA (Cai 2018):
https://openresearch-repository.anu.edu.au/bitstream/1885/151830/1/20181109-SCNC2102-Zixian.pdf

Formally we would need to model weak memory models in TLA+/PlusCal in order to
prove that the implementation we have, with the specific acquire/release works
correctly and there isn't a total global order that observes a logical race
condition.

> I'm not sure about the subtleties about how CPUs behave without
> acquire-release semantics. What happens if someone signals without a mutex?
> What state does the woken thread see? This sounds like you could get really
> weird bugs if it sees partial state.

If you signal without a mutex then there is no ordering guarantee because
without the mutex release to synchronize-with the mutex acquire, the state of
memory on that CPU could be any previous old state of memory.

You can only see partial states if you use relaxed MO and expect to see
consistent states, but once you use stronger than relaxed you can start to
reason about which updates you would or would not have seen.

As a primer I recommend Preshing's writeup on the topic:
https://preshing.com/20130823/the-synchronizes-with-relation/

> I know it's wrong to signal without locking/unlocking the mutex, but I
> wouldn't be 100% confident that there are no valid cases where you'd do
> that. Like some kind of job system where you signal a bunch and have a
> broadcast at the end to make sure everyone gets woken.

POSIX allows calling broadcast or signal without locking/unlock the mutex, but
without the enforced ordering you may not wake any threads i.e. "predictable
scheduling"

https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_signal.html
~~~
The pthread_cond_broadcast() or pthread_cond_signal() functions may be called
by a thread whether or not it currently owns the mutex that threads calling
pthread_cond_wait() or pthread_cond_timedwait() have associated with the
condition variable during their waits; however, if predictable scheduling
behavior is required, then that mutex shall be locked by the thread calling
pthread_cond_broadcast() or pthread_cond_signal().
~~~

> Since I had a hard time reasoning about this, I had a half-finished tenth
> patch in the series where I changed this and just made all stores to
> g_signals use release MO, just to remove the hard-to-reason-about case.
> Would you be interested in that patch?

The algorithm must use acquire/release in order to ensure happens-before for
certain internal states to become visible. Part of the review of the series is
to review the acquire/release interactions and their impact. It is not correct
to change everything to release.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

  parent reply	other threads:[~2023-05-04 12:24 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-18  8:04 [Bug nptl/25847] New: " qin.li at thetradedesk dot com
2020-04-18  8:07 ` [Bug nptl/25847] " qin.li at thetradedesk dot com
2020-04-20 17:52 ` qin.li at thetradedesk dot com
2020-05-05 11:50 ` arun11299 at gmail dot com
2020-05-05 11:52 ` arun11299 at gmail dot com
2020-05-14  8:52 ` dingxiangfei2009 at protonmail dot ch
2020-05-19 21:15 ` carlos at redhat dot com
2020-10-08 19:07 ` michael.bacarella at gmail dot com
2020-10-10  8:11 ` flo at geekplace dot eu
2020-10-12 20:53 ` anssi.hannula at iki dot fi
2020-10-13 16:50 ` guillaume at morinfr dot org
2020-10-16 20:23 ` kirichenkoga at gmail dot com
2020-10-18 22:55 ` michael.bacarella at gmail dot com
2020-10-20 20:12 ` kirichenkoga at gmail dot com
2020-10-20 20:17 ` kirichenkoga at gmail dot com
2020-11-01 13:58 ` malteskarupke at fastmail dot fm
2020-11-01 15:40 ` michael.bacarella at gmail dot com
2020-11-15  2:45 ` malteskarupke at fastmail dot fm
2020-11-23 10:37 ` ydroneaud at opteya dot com
2020-11-23 16:36 ` mattst88 at gmail dot com
2020-11-23 16:46 ` adhemerval.zanella at linaro dot org
2020-12-07 15:41 ` balint at balintreczey dot hu
2020-12-09  3:22 ` malteskarupke at fastmail dot fm
2020-12-24 20:02 ` triegel at redhat dot com
2020-12-25 16:19 ` triegel at redhat dot com
2021-01-07  1:09 ` manojgupta at google dot com
2021-01-07  7:31 ` balint at balintreczey dot hu
2021-01-07 13:54 ` malteskarupke at fastmail dot fm
2021-01-07 20:43 ` triegel at redhat dot com
2021-01-07 23:31 ` triegel at redhat dot com
2021-01-08  3:45 ` malteskarupke at fastmail dot fm
2021-01-16  0:21 ` triegel at redhat dot com
2021-01-16  0:23 ` triegel at redhat dot com
2021-01-30  0:59 ` michael.bacarella at gmail dot com
2021-02-07 17:38 ` slav.isv at gmail dot com
2021-03-09 15:18 ` bugdal at aerifal dot cx
2021-04-06  0:37 ` frankbarrus_sw at shaggy dot cc
2021-04-06  1:17 ` frankbarrus_sw at shaggy dot cc
2021-04-06 14:32 ` frankbarrus_sw at shaggy dot cc
2021-04-06 16:49 ` frankbarrus_sw at shaggy dot cc
2021-04-11 12:12 ` carlos at redhat dot com
2021-04-11 12:13 ` carlos at redhat dot com
2021-04-13 12:21 ` frankbarrus_sw at shaggy dot cc
2021-04-14 16:57 ` qin.li at thetradedesk dot com
2021-04-15 14:13 ` frankbarrus_sw at shaggy dot cc
2021-04-15 14:34 ` frankbarrus_sw at shaggy dot cc
2021-04-30 17:41 ` venkiram_be at yahoo dot co.in
2021-05-04 22:48 ` frankbarrus_sw at shaggy dot cc
2021-05-04 22:51 ` frankbarrus_sw at shaggy dot cc
2021-05-04 22:58 ` frankbarrus_sw at shaggy dot cc
2021-05-13 13:25 ` tuliom at ascii dot art.br
2021-06-14 13:31 ` willireamangel at gmail dot com
2021-07-09  1:41 ` uwydoc at gmail dot com
2021-07-15  0:55 ` benh at kernel dot crashing.org
2021-08-16  9:41 ` evgeny+sourceware at loop54 dot com
2021-09-13 16:50 ` dushistov at mail dot ru
2021-09-22 19:03 ` evgeny+sourceware at loop54 dot com
2021-09-22 19:07 ` balint at balintreczey dot hu
2021-09-24  0:18 ` tuliom at ascii dot art.br
2021-09-24  0:58 ` michael.bacarella at gmail dot com
2021-09-29 11:50 ` fweimer at redhat dot com
2021-10-21 15:42 ` fweimer at redhat dot com
2021-10-30 22:17 ` sam at gentoo dot org
2021-11-25 14:49 ` arekm at maven dot pl
2022-09-18  5:38 ` malteskarupke at fastmail dot fm
2022-09-18 20:06 ` carlos at redhat dot com
2022-09-19  3:38 ` malteskarupke at fastmail dot fm
2022-09-24  0:03 ` bugzilla at dimebar dot com
2022-09-24 10:15 ` ismail at i10z dot com
2022-09-26 14:28 ` ehagberg at janestreet dot com
2022-09-26 14:32 ` ehagberg at janestreet dot com
2022-10-06 21:58 ` malteskarupke at fastmail dot fm
2022-10-07 12:01 ` crrodriguez at opensuse dot org
2022-10-15 19:57 ` malteskarupke at fastmail dot fm
2022-11-07 18:23 ` sourceware-bugzilla at djr61 dot uk
2023-01-28 14:57 ` malteskarupke at fastmail dot fm
2023-05-01 12:52 ` carlos at redhat dot com
2023-05-02 12:57 ` carlos at redhat dot com
2023-05-03  3:04 ` malteskarupke at fastmail dot fm
2023-05-04  4:57 ` malteskarupke at fastmail dot fm
2023-05-04 12:24 ` carlos at redhat dot com [this message]
2023-05-05 23:44 ` carlos at redhat dot com
2023-05-10 21:29 ` frankbarrus_sw at shaggy dot cc
2023-05-10 21:39 ` frankbarrus_sw at shaggy dot cc
2023-05-11  0:22 ` frankbarrus_sw at shaggy dot cc
2023-05-11 12:01 ` carlos at redhat dot com
2023-05-11 12:05 ` carlos at redhat dot com
2023-05-13  4:10 ` malteskarupke at fastmail dot fm
2023-08-24 20:24 ` jwakely.gcc at gmail dot com
2023-09-26 12:33 ` malteskarupke at fastmail dot fm
2023-09-26 12:38 ` fweimer at redhat dot com
2024-01-05  7:31 ` malteskarupke at fastmail dot fm
2024-02-17  9:44 ` github at kalvdans dot no-ip.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-25847-131-f2943ZARQY@http.sourceware.org/bugzilla/ \
    --to=sourceware-bugzilla@sourceware.org \
    --cc=glibc-bugs@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).