From: "valera.mironow at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/106772] atomic<T>::wait shouldn't touch waiter pool if used platform wait
Date: Wed, 28 Sep 2022 23:09:48 +0000	[thread overview]
Message-ID: <bug-106772-4-zKlz0T3i00@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-106772-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #23 from Mkkt Bkkt <valera.mironow at gmail dot com> ---
(In reply to Thomas Rodgers from comment #21)
> I wasn't sarcastic in what I posted. As I noted, this question has come up
> before in different contexts, Bugzilla is a useful historical archive, so
> updating this issue with my reasoning and a bit of data was primarily a
> capture task.

OK, sorry.

> In this test, it is the example Jonathan cited in #14; incrementing an
> atomic int and calling notify.

Incrementing the atomic int here doesn't touch the waiter-pool atomic.
I think that in practice you can get contention between the load in notify
and the fetch_add/fetch_sub in some wait (possibly on unrelated atomics that
hash to the same pool entry).
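For illustration, a simplified model of the pool entry I mean (names are made
up; the real libstdc++ table lives in <bits/atomic_wait.h> and hashes the
atomic's address into a small fixed-size array of entries like this):

  #include <atomic>

  struct pool_entry {                    // one slot of a small global table
      std::atomic<unsigned> waiters{0};  // shared by every atomic hashing here

      void enter_wait() { waiters.fetch_add(1); }  // wait(), before blocking
      void leave_wait() { waiters.fetch_sub(1); }  // wait(), after waking
      bool any_waiters() const { return waiters.load() != 0; }  // notify()
  };

So wait() and notify() on two unrelated atomics can still bounce the same
pool_entry cache line whenever their addresses hash to the same slot.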

> This isn't about semaphore or any other synchronization primitive. Those
> types are free to make different choices that may be more appropriate to the
> constrained usage of the type just like Lewis' lightweight manual reset
> event does (I will also note that Lewis has reviewed this implementation,
> and has written a paper to be discussed at the Fall meeting, p2616). 

OK, but that just means that, at least until C++26, developers have to roll
their own kludge.

> There are, as Jonathan has pointed out, use cases where notify can and will
> be called without a notifier having any way to determine it will wake a
> waiter. One example I, as the person that is going to have to implement
> C++26 executors care about is a wait free work-stealing deque, it certainly
> doesn't require anything more than spinning for work on an empty queue to be
> algorithmically correct, but after spinning on an empty queue, making the
> rounds trying to steal work from other deques, maybe spinning a bit more,
> just to be sure, the de-queuing thread which hasn't been able to acquire
> more work is probably going to want to enter a wait until such time as it
> knows it can do productive work. Another thread pushing work into that queue
> won't be able to determine if the dequeue-ing thread is spinning for new
> work, work stealing, or has entered a wait, but atomic<int>::notify() does
> know that, and can avoid penalizing the thread submitting work with a
> syscall, if there is no thread in a wait on the other end of the deque,
> which is the expected case for this algorithm.

It's a good example, but I think it would be better to have your own "waiter
pool" in the executor for that (see the sketch below). Let me try to explain
why:
1) In that case you can control the size of the waiter pool, for example by
creating a separate atomic per thread-pool thread.
2) You avoid contention and false wakeups caused by unrelated atomics, because
the pool isn't global.
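A minimal sketch of the idea (hypothetical names; for brevity it reuses
std::atomic's own wait/notify, while a real executor would probably call the
platform futex directly to bypass the global pool completely):

  #include <atomic>

  struct worker_slot {
      alignas(64) std::atomic<int> signalled{0};  // own cache line, own "pool"

      void park() {                // called by an idle worker
          signalled.wait(0);       // blocks on this worker's atomic only
          signalled.exchange(0, std::memory_order_acquire);  // consume signal
      }
      void unpark() {              // called by a producer targeting this worker
          signalled.store(1, std::memory_order_release);
          signalled.notify_one();
      }
  };

  // one slot per thread-pool thread, e.g.:
  // std::vector<worker_slot> slots(thread_count);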

> p1135 was the paper that added atomic wait/notify. One of the co-authors of
> that paper wrote the libc++ implementation. That implementation, as with
> libstdc++, is not simply a wrapper over the underlying platform's wait
> primitive. It has two 'backends', an exponentially timed backoff and ulock
> wait/wake. libstdc++ currently has a Futex and condvar backend. Both
> implementations make the choice of having a short-term spinning strategy and
> long term waiting strategy (spinning, futex/ulock, condvar).

I know (one point where I was wrong: libc++ also uses the waiter pool in the
platform-wait case), and I don't think that's good.
From my point of view it would be better to have something like:

  std::wait_spin waiter;
  bool std::wait_spin::wait(Condition condition_func);
  void std::wait_spin::reset();
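Purely as a hypothetical sketch of that interface (nothing like this exists
in the standard, and the backoff numbers are arbitrary):

  #include <thread>

  class wait_spin {
      int attempts_ = 0;
  public:
      // Poll condition_func with backoff; returns true once it holds, false
      // when the caller should stop spinning and do a real blocking wait.
      template <typename Condition>
      bool wait(Condition condition_func) {
          for (; attempts_ < 64; ++attempts_) {
              if (condition_func())
                  return true;
              std::this_thread::yield();  // real code: pause, yield, backoff
          }
          return false;  // the user decides what comes next: futex, condvar...
      }
      void reset() { attempts_ = 0; }
  };

That way the spinning policy is reusable, and the decision to make a syscall
stays with the user.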

Also, the libc++ implementation does a timed wait on the futex (but not on
ulock); I don't know why. Do you?
https://discourse.llvm.org/t/why-do-atomic-int-wait-use-timed-futex/65341

But in general that's a side topic for this thread.

> I have confirmed with the libc++ implementation's author, (who also chairs
> the C++ committee's Concurrency and Parallelism study group), that it was
> never the intention of p1135 or the subsequently standardized language in
> C++20 to imply that wait/notify were direct portable analogs to the
> platform's waiting primitives. There are users, such as yourself that want
> exactly that, there are other users (like in my prior industry) where busy
> waiting is the desired strategy, and in between those two choices are people
> who want it to work as advertised in the standard, and to do so
> 'efficiently'. Both libc++ and libstdc++ take a balanced approach somewhere
> between always going to OS and always spinning.

It's a good point, though in my opinion atomic wait/notify is too low-level
for that to make sense.
As a result, it seems to me that in this form they aren't very suitable for
developers who know exactly what they want (the syscall).

And at the same time, developers who write business code don't particularly
need them (they generally use some library for coroutines/callbacks, for
example something like p2300).


> There is an open question here that your original issue raises -
> 
> * At what point do collisions on the waiter pool, with the cache
> invalidation traffic and spurious wakeups that result, swamp the gain of
> doing this empty waiter check on notify?

Just start the thread pool with the deques from your example.
No tasks.
All threads spin a bit and then go to wait; some of them do fetch_add on the
same pool atomic.
Then we push a lot of work: all threads wake up, and some of them do fetch_sub
on the same atomic.
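As a sketch (shape only; N and the workload are made up), even though each
worker waits on its own atomic, the pool table is small, so several of these
wait/notify pairs land on the same entry:

  #include <atomic>
  #include <thread>
  #include <vector>

  int main() {
      constexpr int N = 16;
      std::vector<std::atomic<int>> ready(N);  // one atomic per worker
      std::vector<std::thread> workers;
      for (int i = 0; i < N; ++i)
          workers.emplace_back([&ready, i] {
              // imagine spinning for work here first, then:
              ready[i].wait(0);  // fetch_add/fetch_sub on shared pool entries
          });
      for (auto& r : ready) {    // "push a lot of work"
          r.store(1);
          r.notify_one();        // load (and wake) on those same entries
      }
      for (auto& t : workers) t.join();
  }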

> I also, made a comment about the 'size of the waiter pool not withstanding'.
> I chose a smaller size than libc++ chose, in part because Jonathan and I did
> not want to make any sort of ABI commitment until this had been in a few GCC
> releases. This implementation is header only at present, and still
> considered experimental. libc++ committed to an ABI early.
> 
> In the sizing of the libc++ waiter pool there is the comment that 'there is
> no magic in this value'. 
> 
> Not only is there no magic, there is no test of any sort that I have done,
> or that has been done on libc++ to determine what effect different size
> pools have under different load scenarios. So, all of it is a guess at this
> point. I will likely match libc++ when I do move this into the .so.

We just have different visions of how it will be used.
I think a user who reaches for atomic::wait already knows about spinning
before the syscall, knows the syscall can be skipped when there are no
waiters, etc.

You and the libc++ devs think the user cannot be expected to know all this,
and that it's better to have a more universal implementation.

They are just different approaches, and I now understand that arguing about it
was a bit pointless.

Thread overview: 26+ messages
2022-08-29 12:38 [Bug libstdc++/106772] New: " valera.mironow at gmail dot com
2022-08-29 12:40 ` [Bug libstdc++/106772] " valera.mironow at gmail dot com
2022-08-29 15:12 ` rodgertq at gcc dot gnu.org
2022-09-20  3:38 ` rodgertq at gcc dot gnu.org
2022-09-20  7:39 ` valera.mironow at gmail dot com
2022-09-20  7:47 ` valera.mironow at gmail dot com
2022-09-20 14:14 ` rodgertq at gcc dot gnu.org
2022-09-20 14:20 ` valera.mironow at gmail dot com
2022-09-20 14:23 ` valera.mironow at gmail dot com
2022-09-20 14:27 ` valera.mironow at gmail dot com
2022-09-20 14:36 ` redi at gcc dot gnu.org
2022-09-20 14:40 ` rodgertq at gcc dot gnu.org
2022-09-20 14:54 ` valera.mironow at gmail dot com
2022-09-20 15:45 ` valera.mironow at gmail dot com
2022-09-20 15:55 ` redi at gcc dot gnu.org
2022-09-20 16:00 ` valera.mironow at gmail dot com
2022-09-20 16:05 ` valera.mironow at gmail dot com
2022-09-28 21:54 ` rodgertq at gcc dot gnu.org
2022-09-28 22:00 ` valera.mironow at gmail dot com
2022-09-28 22:04 ` valera.mironow at gmail dot com
2022-09-28 22:18 ` valera.mironow at gmail dot com
2022-09-28 22:34 ` rodgertq at gcc dot gnu.org
2022-09-28 22:50 ` rodgertq at gcc dot gnu.org
2022-09-28 23:09 ` valera.mironow at gmail dot com [this message]
2022-09-28 23:25 ` valera.mironow at gmail dot com
2022-09-28 23:40 ` rodgertq at gcc dot gnu.org
