From: Mike Crowe <mac@mcrowe.com>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
Date: Fri, 13 Nov 2020 20:29:34 +0000 [thread overview]
Message-ID: <20201113202934.GA23583@mcrowe.com> (raw)
In-Reply-To: <20201113172522.GI503596@redhat.com>
On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
> On 13/11/20 11:02 +0000, Jonathan Wakely wrote:
> > On 12/11/20 23:49 +0000, Jonathan Wakely wrote:
> > > To poll a std::future to see if it's ready you have to call one of the
> > > timed waiting functions. The most obvious way is wait_for(0s) but this
> > > was previously very inefficient because it would turn the relative
> > > timeout to an absolute one by calling system_clock::now(). When the
> > > relative timeout is zero (or less) we're obviously going to get a time
> > > that has already passed, but the overhead of obtaining the current time
> > > can be dozens of microseconds. The alternative is to call wait_until
> > > with an absolute timeout that is in the past. If you know the clock's
> > > epoch is in the past you can use a default constructed time_point.
> > > Alternatively, using some_clock::time_point::min() gives the earliest
> > > time point supported by the clock, which should be safe to assume is in
> > > the past. However, using a futex wait with an absolute timeout before
> > > the UNIX epoch fails and sets errno=EINVAL. The new code using futex
> > > waits with absolute timeouts was not checking for this case, which could
> > > result in hangs (or killing the process if the libray is built with
> > > assertions enabled).
> > >
> > > This patch checks for times before the epoch before attempting to wait
> > > on a futex with an absolute timeout, which fixes the hangs or crashes.
> > > It also makes it very fast to poll using an absolute timeout before the
> > > epoch (because we skip the futex syscall).
> > >
> > > It also makes future::wait_for avoid waiting at all when the relative
> > > timeout is zero or less, to avoid the unnecessary overhead of getting
> > > the current time. This makes polling with wait_for(0s) take only a few
> > > cycles instead of dozens of milliseconds.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > > * include/std/future (future::wait_for): Do not wait for
> > > durations less than or equal to zero.
> > > * src/c++11/futex.cc (_M_futex_wait_until)
> > > (_M_futex_wait_until_steady): Do not wait for timeouts before
> > > the epoch.
> > > * testsuite/30_threads/future/members/poll.cc: New test.
> > >
> > > Tested powerpc64le-linux. Committed to trunk.
> > >
> > > I think the shortcut in future::wait_for is worth backporting. The
> > > changes in src/c++11/futex.cc are not needed because the code using
> > > absolute timeouts with futex waits is not present on any release
> > > branch.
> >
> > I've committed this fix for the new test.
>
> Backporting the change to gcc-10 revealed an overflow bug in the
> existing code, resulting in blocking for years when given an absolute
> timeout in the distant past. There's still a similar bug in the new
> code (using futexes with absolute timeouts against clocks) where a
> large chrono::seconds value can overflow and produce an incorrect
> tv_sec value. Apart from the overflow itself being UB, the result in
> that case is just a spurious wakeup (the call says it timed out when
> it didn't reach the specified time). That should still be fixed, but
> I'll do it separately.
>
> Tested x86_64-linux. Committed to trunk.
>
>
> commit e7e0eeeb6e6707be2a6c6da49d4b6be3199e2af8
> Author: Jonathan Wakely <jwakely@redhat.com>
> Date: Fri Nov 13 15:19:04 2020
>
> libstdc++: Avoid 32-bit time_t overflows in futex calls
>
> The existing code doesn't check whether the chrono::seconds value is out
> of range of time_t. When using a timeout before the epoch (with a
> negative value) subtracting the current time (as time_t) and then
> assigning it to a time_t can overflow to a large positive value. This
> means that we end up waiting several years even though the specific
> timeout was in the distant past.
>
> We do have a check for negative timeouts, but that happens after the
> conversion to time_t so happens after the overflow.
>
> The conversion to a relative timeout is done in two places, so this
> factors it into a new function and adds the overflow checks there.
>
> libstdc++-v3/ChangeLog:
>
> * src/c++11/futex.cc (relative_timespec): New function to
> create relative time from two absolute times.
> (__atomic_futex_unsigned_base::_M_futex_wait_until)
> (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
> Use relative_timespec.
>
> diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
> index 57f7dfe87e9e..c2b2d32e8c43 100644
> --- a/libstdc++-v3/src/c++11/futex.cc
> +++ b/libstdc++-v3/src/c++11/futex.cc
> @@ -31,6 +31,7 @@
> #include <unistd.h>
> #include <sys/time.h>
> #include <errno.h>
> +#include <ext/numeric_traits.h>
> #include <debug/debug.h>
>
> #ifdef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL
> @@ -46,20 +47,55 @@ const unsigned futex_clock_realtime_flag = 256;
> const unsigned futex_bitset_match_any = ~0;
> const unsigned futex_wake_op = 1;
>
> -namespace
> -{
> - std::atomic<bool> futex_clock_realtime_unavailable;
> - std::atomic<bool> futex_clock_monotonic_unavailable;
> -}
> -
> namespace std _GLIBCXX_VISIBILITY(default)
> {
> _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
> +namespace
> +{
> + std::atomic<bool> futex_clock_realtime_unavailable;
> + std::atomic<bool> futex_clock_monotonic_unavailable;
> +
> + // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
> + // as a timespec.
> + struct timespec
> + relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
> + time_t now_s, long now_ns)
> + {
> + struct timespec rt;
> +
> + // Did we already time out?
> + if (now_s > abs_s.count())
> + {
> + rt.tv_sec = -1;
> + return rt;
> + }
> +
> + auto rel_s = abs_s.count() - now_s;
> +
> + // Avoid overflows
> + if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
> + rel_s = __gnu_cxx::__int_traits<time_t>::__max;
> + else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
> + rel_s = __gnu_cxx::__int_traits<time_t>::__min;
I may be missing something, but if the line above executes...
> +
> + // Convert the absolute timeout value to a relative timeout
> + rt.tv_sec = rel_s;
> + rt.tv_nsec = abs_ns.count() - now_ns;
> + if (rt.tv_nsec < 0)
> + {
> + rt.tv_nsec += 1000000000;
> + --rt.tv_sec;
...and so does this line above, then I think that we'll end up
underflowing. (Presumably rt.tv_sec will wrap round to being some time in
2038 on most 32-bit targets.)
I'm currently trying to persuade myself that this can actually happen and
if so work out how to come up with a test case for it.
I don't believe that this part changed in your later patch.
> + }
> +
> + return rt;
> + }
> +} // namespace
> +
> bool
> - __atomic_futex_unsigned_base::_M_futex_wait_until(unsigned *__addr,
> - unsigned __val,
> - bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
> + __atomic_futex_unsigned_base::
> + _M_futex_wait_until(unsigned *__addr, unsigned __val, bool __has_timeout,
> + chrono::seconds __s, chrono::nanoseconds __ns)
> {
> if (!__has_timeout)
> {
> @@ -109,15 +145,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> // true or has just been set to true.
> struct timeval tv;
> gettimeofday (&tv, NULL);
> +
> // Convert the absolute timeout value to a relative timeout
> - struct timespec rt;
> - rt.tv_sec = __s.count() - tv.tv_sec;
> - rt.tv_nsec = __ns.count() - tv.tv_usec * 1000;
> - if (rt.tv_nsec < 0)
> - {
> - rt.tv_nsec += 1000000000;
> - --rt.tv_sec;
> - }
> + auto rt = relative_timespec(__s, __ns, tv.tv_sec, tv.tv_usec * 1000);
> +
> // Did we already time out?
> if (rt.tv_sec < 0)
> return false;
> @@ -134,9 +165,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> }
>
> bool
> - __atomic_futex_unsigned_base::_M_futex_wait_until_steady(unsigned *__addr,
> - unsigned __val,
> - bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
> + __atomic_futex_unsigned_base::
> + _M_futex_wait_until_steady(unsigned *__addr, unsigned __val,
> + bool __has_timeout,
> + chrono::seconds __s, chrono::nanoseconds __ns)
Oh how I wish all these functions didn't expect already cracked seconds and
nanoseconds. :(
Mike.
next prev parent reply other threads:[~2020-11-13 20:29 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-12 23:49 Jonathan Wakely
2020-11-13 11:02 ` Jonathan Wakely
2020-11-13 17:25 ` Jonathan Wakely
2020-11-13 19:16 ` Jonathan Wakely
2020-11-13 20:29 ` Mike Crowe [this message]
2020-11-13 21:12 ` Jonathan Wakely
2020-11-13 21:16 ` Jonathan Wakely
2020-11-13 22:45 ` Jonathan Wakely
2020-11-14 0:17 ` Jonathan Wakely
2020-11-14 13:22 ` Mike Crowe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201113202934.GA23583@mcrowe.com \
--to=mac@mcrowe.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jwakely@redhat.com \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).