* [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
@ 2020-11-12 23:49 Jonathan Wakely
2020-11-13 11:02 ` Jonathan Wakely
0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-12 23:49 UTC (permalink / raw)
To: libstdc++, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2138 bytes --]
To poll a std::future to see if it's ready you have to call one of the
timed waiting functions. The most obvious way is wait_for(0s) but this
was previously very inefficient because it would turn the relative
timeout to an absolute one by calling system_clock::now(). When the
relative timeout is zero (or less) we're obviously going to get a time
that has already passed, but the overhead of obtaining the current time
can be dozens of microseconds. The alternative is to call wait_until
with an absolute timeout that is in the past. If you know the clock's
epoch is in the past you can use a default constructed time_point.
Alternatively, using some_clock::time_point::min() gives the earliest
time point supported by the clock, which should be safe to assume is in
the past. However, using a futex wait with an absolute timeout before
the UNIX epoch fails and sets errno=EINVAL. The new code using futex
waits with absolute timeouts was not checking for this case, which could
result in hangs (or killing the process if the library is built with
assertions enabled).
This patch checks for times before the epoch before attempting to wait
on a futex with an absolute timeout, which fixes the hangs or crashes.
It also makes it very fast to poll using an absolute timeout before the
epoch (because we skip the futex syscall).
It also makes future::wait_for avoid waiting at all when the relative
timeout is zero or less, to avoid the unnecessary overhead of getting
the current time. This makes polling with wait_for(0s) take only a few
cycles instead of dozens of microseconds.
libstdc++-v3/ChangeLog:
* include/std/future (future::wait_for): Do not wait for
durations less than or equal to zero.
* src/c++11/futex.cc (_M_futex_wait_until)
(_M_futex_wait_until_steady): Do not wait for timeouts before
the epoch.
* testsuite/30_threads/future/members/poll.cc: New test.
Tested powerpc64le-linux. Committed to trunk.
I think the shortcut in future::wait_for is worth backporting. The
changes in src/c++11/futex.cc are not needed because the code using
absolute timeouts with futex waits is not present on any release
branch.
[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 8811 bytes --]
commit 93fc47746815ea9dac413322fcade2931f757e7f
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Thu Nov 12 21:25:14 2020
libstdc++: Optimise std::future::wait_for and fix futex polling
To poll a std::future to see if it's ready you have to call one of the
timed waiting functions. The most obvious way is wait_for(0s) but this
was previously very inefficient because it would turn the relative
timeout to an absolute one by calling system_clock::now(). When the
relative timeout is zero (or less) we're obviously going to get a time
that has already passed, but the overhead of obtaining the current time
can be dozens of microseconds. The alternative is to call wait_until
with an absolute timeout that is in the past. If you know the clock's
epoch is in the past you can use a default constructed time_point.
Alternatively, using some_clock::time_point::min() gives the earliest
time point supported by the clock, which should be safe to assume is in
the past. However, using a futex wait with an absolute timeout before
the UNIX epoch fails and sets errno=EINVAL. The new code using futex
waits with absolute timeouts was not checking for this case, which could
result in hangs (or killing the process if the library is built with
assertions enabled).
This patch checks for times before the epoch before attempting to wait
on a futex with an absolute timeout, which fixes the hangs or crashes.
It also makes it very fast to poll using an absolute timeout before the
epoch (because we skip the futex syscall).
It also makes future::wait_for avoid waiting at all when the relative
timeout is zero or less, to avoid the unnecessary overhead of getting
the current time. This makes polling with wait_for(0s) take only a few
cycles instead of dozens of microseconds.
libstdc++-v3/ChangeLog:
* include/std/future (future::wait_for): Do not wait for
durations less than or equal to zero.
* src/c++11/futex.cc (_M_futex_wait_until)
(_M_futex_wait_until_steady): Do not wait for timeouts before
the epoch.
* testsuite/30_threads/future/members/poll.cc: New test.
diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index 5d948018c75c..f7617cac8e93 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -345,10 +345,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// to synchronize with the thread that made it ready.
if (_M_status._M_load(memory_order_acquire) == _Status::__ready)
return future_status::ready;
+
if (_M_is_deferred_future())
return future_status::deferred;
- if (_M_status._M_load_when_equal_for(_Status::__ready,
- memory_order_acquire, __rel))
+
+ // Don't wait unless the relative time is greater than zero.
+ if (__rel > __rel.zero()
+ && _M_status._M_load_when_equal_for(_Status::__ready,
+ memory_order_acquire,
+ __rel))
{
// _GLIBCXX_RESOLVE_LIB_DEFECTS
// 2100. timed waiting functions must also join
@@ -377,10 +382,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// to synchronize with the thread that made it ready.
if (_M_status._M_load(memory_order_acquire) == _Status::__ready)
return future_status::ready;
+
if (_M_is_deferred_future())
return future_status::deferred;
+
if (_M_status._M_load_when_equal_until(_Status::__ready,
- memory_order_acquire, __abs))
+ memory_order_acquire,
+ __abs))
{
// _GLIBCXX_RESOLVE_LIB_DEFECTS
// 2100. timed waiting functions must also join
diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index 0331bd6df64f..57f7dfe87e9e 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -78,6 +78,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
struct timespec rt;
rt.tv_sec = __s.count();
rt.tv_nsec = __ns.count();
+
+ // futex sets errno=EINVAL for absolute timeouts before the epoch.
+ if (__builtin_expect(rt.tv_sec < 0, false))
+ return false;
+
if (syscall (SYS_futex, __addr,
futex_wait_bitset_op | futex_clock_realtime_flag,
__val, &rt, nullptr, futex_bitset_match_any) == -1)
@@ -151,6 +156,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
rt.tv_sec = __s.count();
rt.tv_nsec = __ns.count();
+ // futex sets errno=EINVAL for absolute timeouts before the epoch.
+ if (__builtin_expect(rt.tv_sec < 0, false))
+ return false;
+
if (syscall (SYS_futex, __addr,
futex_wait_bitset_op | futex_clock_monotonic_flag,
__val, &rt, nullptr, futex_bitset_match_any) == -1)
diff --git a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
new file mode 100644
index 000000000000..54580579d3a1
--- /dev/null
+++ b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
@@ -0,0 +1,103 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-options "-O3" }
+// { dg-do run { target c++11 } }
+
+#include <future>
+#include <chrono>
+#include <iostream>
+#include <testsuite_hooks.h>
+
+const int iterations = 200;
+
+using namespace std;
+
+template<typename Duration>
+double
+print(const char* desc, Duration dur)
+{
+ auto ns = chrono::duration_cast<chrono::nanoseconds>(dur).count();
+ double d = double(ns) / iterations;
+ cout << desc << ": " << ns << "ns for " << iterations
+ << " calls, avg " << d << "ns per call\n";
+ return d;
+}
+
+int main()
+{
+ promise<int> p;
+ future<int> f = p.get_future();
+
+ auto start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_for(chrono::seconds(0));
+ auto stop = chrono::high_resolution_clock::now();
+ double wait_for_0 = print("wait_for(0s)", stop - start);
+
+ start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_until(chrono::system_clock::time_point());
+ stop = chrono::high_resolution_clock::now();
+ double wait_until_sys_epoch __attribute__((unused))
+ = print("wait_until(system_clock epoch)", stop - start);
+
+ start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_until(chrono::steady_clock::time_point());
+ stop = chrono::high_resolution_clock::now();
+ double wait_until_steady_epoch __attribute__((unused))
+ = print("wait_until(steady_clock epoch", stop - start);
+
+ start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_until(chrono::system_clock::time_point::min());
+ stop = chrono::high_resolution_clock::now();
+ double wait_until_sys_min __attribute__((unused))
+ = print("wait_until(system_clock minimum)", stop - start);
+
+ start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_until(chrono::steady_clock::time_point::min());
+ stop = chrono::high_resolution_clock::now();
+ double wait_until_steady_min __attribute__((unused))
+ = print("wait_until(steady_clock minimum)", stop - start);
+
+ p.set_value(1);
+
+ start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_for(chrono::seconds(0));
+ stop = chrono::high_resolution_clock::now();
+ double ready = print("wait_for when ready", stop - start);
+
+ // polling before ready with wait_for(0s) should be almost as fast as
+ // after the result is ready.
+ VERIFY( wait_for_0 < (ready * 10) );
+
+ // The following two tests fail with GCC 11, see
+ // https://gcc.gnu.org/pipermail/libstdc++/2020-November/051422.html
+#if 0
+ // polling before ready using wait_until(epoch) should not be terribly slow.
+ VERIFY( wait_until_sys_epoch < (ready * 100) );
+ VERIFY( wait_until_steady_epoch < (ready * 100) );
+#endif
+
+ // polling before ready using wait_until(min) should not be terribly slow.
+ VERIFY( wait_until_sys_min < (ready * 100) );
+ VERIFY( wait_until_steady_min < (ready * 100) );
+}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-12 23:49 [committed] libstdc++: Optimise std::future::wait_for and fix futex polling Jonathan Wakely
@ 2020-11-13 11:02 ` Jonathan Wakely
2020-11-13 17:25 ` Jonathan Wakely
0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-13 11:02 UTC (permalink / raw)
To: libstdc++, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2330 bytes --]
On 12/11/20 23:49 +0000, Jonathan Wakely wrote:
>To poll a std::future to see if it's ready you have to call one of the
>timed waiting functions. The most obvious way is wait_for(0s) but this
>was previously very inefficient because it would turn the relative
>timeout to an absolute one by calling system_clock::now(). When the
>relative timeout is zero (or less) we're obviously going to get a time
>that has already passed, but the overhead of obtaining the current time
>can be dozens of microseconds. The alternative is to call wait_until
>with an absolute timeout that is in the past. If you know the clock's
>epoch is in the past you can use a default constructed time_point.
>Alternatively, using some_clock::time_point::min() gives the earliest
>time point supported by the clock, which should be safe to assume is in
>the past. However, using a futex wait with an absolute timeout before
>the UNIX epoch fails and sets errno=EINVAL. The new code using futex
>waits with absolute timeouts was not checking for this case, which could
>result in hangs (or killing the process if the library is built with
>assertions enabled).
>
>This patch checks for times before the epoch before attempting to wait
>on a futex with an absolute timeout, which fixes the hangs or crashes.
>It also makes it very fast to poll using an absolute timeout before the
>epoch (because we skip the futex syscall).
>
>It also makes future::wait_for avoid waiting at all when the relative
>timeout is zero or less, to avoid the unnecessary overhead of getting
>the current time. This makes polling with wait_for(0s) take only a few
>cycles instead of dozens of microseconds.
>
>libstdc++-v3/ChangeLog:
>
> * include/std/future (future::wait_for): Do not wait for
> durations less than or equal to zero.
> * src/c++11/futex.cc (_M_futex_wait_until)
> (_M_futex_wait_until_steady): Do not wait for timeouts before
> the epoch.
> * testsuite/30_threads/future/members/poll.cc: New test.
>
>Tested powerpc64le-linux. Committed to trunk.
>
>I think the shortcut in future::wait_for is worth backporting. The
>changes in src/c++11/futex.cc are not needed because the code using
>absolute timeouts with futex waits is not present on any release
>branch.
I've committed this fix for the new test.
Tested x86_64-linux, sparc-solaris-2.11 and powerpc-aix.
[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 4450 bytes --]
commit 8c4e33d2032ab150748ea2fe1df2b1c00652a338
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 13 10:04:33 2020
libstdc++: Add -pthread options to std::future polling test
For Linux targets this test doesn't need -lpthread because it only uses
atomics, but for all other targets std::call_once still needs pthreads.
Add the necessary test directives to make that work.
The timings in this test might be too fragile or too target-specific, so
it might need to be adjusted in future, or restricted to only run on
specific targets. For now I've increased the allowed ratio between
wait_for calls before and after the future is made ready, because it was
failing with -O3 -march=native sometimes.
libstdc++-v3/ChangeLog:
* testsuite/30_threads/future/members/poll.cc: Require gthreads
and add -pthread for targets that require it. Relax required
ratio of wait_for calls before/after the future is ready.
diff --git a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
index 54580579d3a1..fff9bea899c9 100644
--- a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
+++ b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
@@ -17,6 +17,8 @@
// { dg-options "-O3" }
// { dg-do run { target c++11 } }
+// { dg-additional-options "-pthread" { target pthread } }
+// { dg-require-gthreads "" }
#include <future>
#include <chrono>
@@ -49,20 +51,6 @@ int main()
auto stop = chrono::high_resolution_clock::now();
double wait_for_0 = print("wait_for(0s)", stop - start);
- start = chrono::high_resolution_clock::now();
- for(int i = 0; i < iterations; i++)
- f.wait_until(chrono::system_clock::time_point());
- stop = chrono::high_resolution_clock::now();
- double wait_until_sys_epoch __attribute__((unused))
- = print("wait_until(system_clock epoch)", stop - start);
-
- start = chrono::high_resolution_clock::now();
- for(int i = 0; i < iterations; i++)
- f.wait_until(chrono::steady_clock::time_point());
- stop = chrono::high_resolution_clock::now();
- double wait_until_steady_epoch __attribute__((unused))
- = print("wait_until(steady_clock epoch", stop - start);
-
start = chrono::high_resolution_clock::now();
for(int i = 0; i < iterations; i++)
f.wait_until(chrono::system_clock::time_point::min());
@@ -77,6 +65,20 @@ int main()
double wait_until_steady_min __attribute__((unused))
= print("wait_until(steady_clock minimum)", stop - start);
+ start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_until(chrono::system_clock::time_point());
+ stop = chrono::high_resolution_clock::now();
+ double wait_until_sys_epoch __attribute__((unused))
+ = print("wait_until(system_clock epoch)", stop - start);
+
+ start = chrono::high_resolution_clock::now();
+ for(int i = 0; i < iterations; i++)
+ f.wait_until(chrono::steady_clock::time_point());
+ stop = chrono::high_resolution_clock::now();
+ double wait_until_steady_epoch __attribute__((unused))
+ = print("wait_until(steady_clock epoch", stop - start);
+
p.set_value(1);
start = chrono::high_resolution_clock::now();
@@ -85,19 +87,19 @@ int main()
stop = chrono::high_resolution_clock::now();
double ready = print("wait_for when ready", stop - start);
- // polling before ready with wait_for(0s) should be almost as fast as
+ // Polling before ready with wait_for(0s) should be almost as fast as
// after the result is ready.
- VERIFY( wait_for_0 < (ready * 10) );
+ VERIFY( wait_for_0 < (ready * 30) );
+
+ // Polling before ready using wait_until(min) should not be terribly slow.
+ VERIFY( wait_until_sys_min < (ready * 100) );
+ VERIFY( wait_until_steady_min < (ready * 100) );
// The following two tests fail with GCC 11, see
// https://gcc.gnu.org/pipermail/libstdc++/2020-November/051422.html
#if 0
- // polling before ready using wait_until(epoch) should not be terribly slow.
+ // Polling before ready using wait_until(epoch) should not be terribly slow.
VERIFY( wait_until_sys_epoch < (ready * 100) );
VERIFY( wait_until_steady_epoch < (ready * 100) );
#endif
-
- // polling before ready using wait_until(min) should not be terribly slow.
- VERIFY( wait_until_sys_min < (ready * 100) );
- VERIFY( wait_until_steady_min < (ready * 100) );
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-13 11:02 ` Jonathan Wakely
@ 2020-11-13 17:25 ` Jonathan Wakely
2020-11-13 19:16 ` Jonathan Wakely
2020-11-13 20:29 ` Mike Crowe
0 siblings, 2 replies; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-13 17:25 UTC (permalink / raw)
To: libstdc++, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2974 bytes --]
On 13/11/20 11:02 +0000, Jonathan Wakely wrote:
>On 12/11/20 23:49 +0000, Jonathan Wakely wrote:
>>To poll a std::future to see if it's ready you have to call one of the
>>timed waiting functions. The most obvious way is wait_for(0s) but this
>>was previously very inefficient because it would turn the relative
>>timeout to an absolute one by calling system_clock::now(). When the
>>relative timeout is zero (or less) we're obviously going to get a time
>>that has already passed, but the overhead of obtaining the current time
>>can be dozens of microseconds. The alternative is to call wait_until
>>with an absolute timeout that is in the past. If you know the clock's
>>epoch is in the past you can use a default constructed time_point.
>>Alternatively, using some_clock::time_point::min() gives the earliest
>>time point supported by the clock, which should be safe to assume is in
>>the past. However, using a futex wait with an absolute timeout before
>>the UNIX epoch fails and sets errno=EINVAL. The new code using futex
>>waits with absolute timeouts was not checking for this case, which could
>>result in hangs (or killing the process if the library is built with
>>assertions enabled).
>>
>>This patch checks for times before the epoch before attempting to wait
>>on a futex with an absolute timeout, which fixes the hangs or crashes.
>>It also makes it very fast to poll using an absolute timeout before the
>>epoch (because we skip the futex syscall).
>>
>>It also makes future::wait_for avoid waiting at all when the relative
>>timeout is zero or less, to avoid the unnecessary overhead of getting
>>the current time. This makes polling with wait_for(0s) take only a few
>>cycles instead of dozens of microseconds.
>>
>>libstdc++-v3/ChangeLog:
>>
>> * include/std/future (future::wait_for): Do not wait for
>> durations less than or equal to zero.
>> * src/c++11/futex.cc (_M_futex_wait_until)
>> (_M_futex_wait_until_steady): Do not wait for timeouts before
>> the epoch.
>> * testsuite/30_threads/future/members/poll.cc: New test.
>>
>>Tested powerpc64le-linux. Committed to trunk.
>>
>>I think the shortcut in future::wait_for is worth backporting. The
>>changes in src/c++11/futex.cc are not needed because the code using
>>absolute timeouts with futex waits is not present on any release
>>branch.
>
>I've committed this fix for the new test.
Backporting the change to gcc-10 revealed an overflow bug in the
existing code, resulting in blocking for years when given an absolute
timeout in the distant past. There's still a similar bug in the new
code (using futexes with absolute timeouts against clocks) where a
large chrono::seconds value can overflow and produce an incorrect
tv_sec value. Apart from the overflow itself being UB, the result in
that case is just a spurious wakeup (the call says it timed out when
it didn't reach the specified time). That should still be fixed, but
I'll do it separately.
Tested x86_64-linux. Committed to trunk.
[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 4902 bytes --]
commit e7e0eeeb6e6707be2a6c6da49d4b6be3199e2af8
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 13 15:19:04 2020
libstdc++: Avoid 32-bit time_t overflows in futex calls
The existing code doesn't check whether the chrono::seconds value is out
of range of time_t. When using a timeout before the epoch (with a
negative value) subtracting the current time (as time_t) and then
assigning it to a time_t can overflow to a large positive value. This
means that we end up waiting several years even though the specific
timeout was in the distant past.
We do have a check for negative timeouts, but that happens after the
conversion to time_t so happens after the overflow.
The conversion to a relative timeout is done in two places, so this
factors it into a new function and adds the overflow checks there.
libstdc++-v3/ChangeLog:
* src/c++11/futex.cc (relative_timespec): New function to
create relative time from two absolute times.
(__atomic_futex_unsigned_base::_M_futex_wait_until)
(__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
Use relative_timespec.
diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index 57f7dfe87e9e..c2b2d32e8c43 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -31,6 +31,7 @@
#include <unistd.h>
#include <sys/time.h>
#include <errno.h>
+#include <ext/numeric_traits.h>
#include <debug/debug.h>
#ifdef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL
@@ -46,20 +47,55 @@ const unsigned futex_clock_realtime_flag = 256;
const unsigned futex_bitset_match_any = ~0;
const unsigned futex_wake_op = 1;
-namespace
-{
- std::atomic<bool> futex_clock_realtime_unavailable;
- std::atomic<bool> futex_clock_monotonic_unavailable;
-}
-
namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
+namespace
+{
+ std::atomic<bool> futex_clock_realtime_unavailable;
+ std::atomic<bool> futex_clock_monotonic_unavailable;
+
+ // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
+ // as a timespec.
+ struct timespec
+ relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
+ time_t now_s, long now_ns)
+ {
+ struct timespec rt;
+
+ // Did we already time out?
+ if (now_s > abs_s.count())
+ {
+ rt.tv_sec = -1;
+ return rt;
+ }
+
+ auto rel_s = abs_s.count() - now_s;
+
+ // Avoid overflows
+ if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
+ rel_s = __gnu_cxx::__int_traits<time_t>::__max;
+ else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
+ rel_s = __gnu_cxx::__int_traits<time_t>::__min;
+
+ // Convert the absolute timeout value to a relative timeout
+ rt.tv_sec = rel_s;
+ rt.tv_nsec = abs_ns.count() - now_ns;
+ if (rt.tv_nsec < 0)
+ {
+ rt.tv_nsec += 1000000000;
+ --rt.tv_sec;
+ }
+
+ return rt;
+ }
+} // namespace
+
bool
- __atomic_futex_unsigned_base::_M_futex_wait_until(unsigned *__addr,
- unsigned __val,
- bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
+ __atomic_futex_unsigned_base::
+ _M_futex_wait_until(unsigned *__addr, unsigned __val, bool __has_timeout,
+ chrono::seconds __s, chrono::nanoseconds __ns)
{
if (!__has_timeout)
{
@@ -109,15 +145,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
// true or has just been set to true.
struct timeval tv;
gettimeofday (&tv, NULL);
+
// Convert the absolute timeout value to a relative timeout
- struct timespec rt;
- rt.tv_sec = __s.count() - tv.tv_sec;
- rt.tv_nsec = __ns.count() - tv.tv_usec * 1000;
- if (rt.tv_nsec < 0)
- {
- rt.tv_nsec += 1000000000;
- --rt.tv_sec;
- }
+ auto rt = relative_timespec(__s, __ns, tv.tv_sec, tv.tv_usec * 1000);
+
// Did we already time out?
if (rt.tv_sec < 0)
return false;
@@ -134,9 +165,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
bool
- __atomic_futex_unsigned_base::_M_futex_wait_until_steady(unsigned *__addr,
- unsigned __val,
- bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
+ __atomic_futex_unsigned_base::
+ _M_futex_wait_until_steady(unsigned *__addr, unsigned __val,
+ bool __has_timeout,
+ chrono::seconds __s, chrono::nanoseconds __ns)
{
if (!__has_timeout)
{
@@ -188,15 +220,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
#else
clock_gettime(CLOCK_MONOTONIC, &ts);
#endif
+
// Convert the absolute timeout value to a relative timeout
- struct timespec rt;
- rt.tv_sec = __s.count() - ts.tv_sec;
- rt.tv_nsec = __ns.count() - ts.tv_nsec;
- if (rt.tv_nsec < 0)
- {
- rt.tv_nsec += 1000000000;
- --rt.tv_sec;
- }
+ auto rt = relative_timespec(__s, __ns, ts.tv_sec, ts.tv_nsec);
+
// Did we already time out?
if (rt.tv_sec < 0)
return false;
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-13 17:25 ` Jonathan Wakely
@ 2020-11-13 19:16 ` Jonathan Wakely
2020-11-13 20:29 ` Mike Crowe
1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-13 19:16 UTC (permalink / raw)
To: libstdc++, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 888 bytes --]
>Backporting the change to gcc-10 revealed an overflow bug in the
>existing code, resulting in blocking for years when given an absolute
>timeout in the distant past. There's still a similar bug in the new
>code (using futexes with absolute timeouts against clocks) where a
>large chrono::seconds value can overflow and produce an incorrect
>tv_sec value. Apart from the overflow itself being UB, the result in
>that case is just a spurious wakeup (the call says it timed out when
>it didn't reach the specified time). That should still be fixed, but
>I'll do it separately.
And here's the separate fix for that other overflow. In fact it
doesn't just produce spurious wakeups. It either spins forever
(instead of sleeping gently) or incorrectly reports a timeout when it
should keep blocking. Either way, it's fixed now.
Tested x86_64-linux && powerpc64le-linux. Committed to trunk.
[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 5726 bytes --]
commit 91004436daaf8d54daa467908d1b634a1a352707
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 13 19:11:02 2020
libstdc++: Avoid more 32-bit time_t overflows in futex calls
This fixes another overflow in code converting a std::chrono::seconds
duration to a time_t. This time in the new code using a futex wait with
an absolute timeout (so this one doesn't need to be backported to the
release branches).
A timeout after the epochalypse would overflow the tv_sec field,
producing an incorrect value. If that incorrect value happened to be
negative, the syscall would return with EINVAL and then the caller would
keep retrying, spinning until the timeout was reached. If the value
happened to be positive, we would wake up too soon and incorrectly
report a timeout.
libstdc++-v3/ChangeLog:
* src/c++11/futex.cc (relative_timespec): Add [[unlikely]]
attributes.
(__atomic_futex_unsigned_base::_M_futex_wait_until)
(__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
Check for overflow.
* testsuite/30_threads/future/members/wait_until_overflow.cc:
New test.
diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index c2b2d32e8c43..15959cebee57 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -51,6 +51,8 @@ namespace std _GLIBCXX_VISIBILITY(default)
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
+ using __gnu_cxx::__int_traits;
+
namespace
{
std::atomic<bool> futex_clock_realtime_unavailable;
@@ -74,10 +76,10 @@ namespace
auto rel_s = abs_s.count() - now_s;
// Avoid overflows
- if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
- rel_s = __gnu_cxx::__int_traits<time_t>::__max;
- else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
- rel_s = __gnu_cxx::__int_traits<time_t>::__min;
+ if (rel_s > __int_traits<time_t>::__max) [[unlikely]]
+ rel_s = __int_traits<time_t>::__max;
+ else if (rel_s < __int_traits<time_t>::__min) [[unlikely]]
+ rel_s = __int_traits<time_t>::__min;
// Convert the absolute timeout value to a relative timeout
rt.tv_sec = rel_s;
@@ -111,14 +113,17 @@ namespace
{
if (!futex_clock_realtime_unavailable.load(std::memory_order_relaxed))
{
- struct timespec rt;
- rt.tv_sec = __s.count();
- rt.tv_nsec = __ns.count();
-
// futex sets errno=EINVAL for absolute timeouts before the epoch.
- if (__builtin_expect(rt.tv_sec < 0, false))
+ if (__s.count() < 0)
return false;
+ struct timespec rt;
+ if (__s.count() > __int_traits<time_t>::__max) [[unlikely]]
+ rt.tv_sec = __int_traits<time_t>::__max;
+ else
+ rt.tv_sec = __s.count();
+ rt.tv_nsec = __ns.count();
+
if (syscall (SYS_futex, __addr,
futex_wait_bitset_op | futex_clock_realtime_flag,
__val, &rt, nullptr, futex_bitset_match_any) == -1)
@@ -184,14 +189,17 @@ namespace
{
if (!futex_clock_monotonic_unavailable.load(std::memory_order_relaxed))
{
- struct timespec rt;
- rt.tv_sec = __s.count();
- rt.tv_nsec = __ns.count();
-
// futex sets errno=EINVAL for absolute timeouts before the epoch.
- if (__builtin_expect(rt.tv_sec < 0, false))
+ if (__s.count() < 0) [[unlikely]]
return false;
+ struct timespec rt;
+ if (__s.count() > __int_traits<time_t>::__max) [[unlikely]]
+ rt.tv_sec = __int_traits<time_t>::__max;
+ else
+ rt.tv_sec = __s.count();
+ rt.tv_nsec = __ns.count();
+
if (syscall (SYS_futex, __addr,
futex_wait_bitset_op | futex_clock_monotonic_flag,
__val, &rt, nullptr, futex_bitset_match_any) == -1)
diff --git a/libstdc++-v3/testsuite/30_threads/future/members/wait_until_overflow.cc b/libstdc++-v3/testsuite/30_threads/future/members/wait_until_overflow.cc
new file mode 100644
index 000000000000..8d6a5148ce3c
--- /dev/null
+++ b/libstdc++-v3/testsuite/30_threads/future/members/wait_until_overflow.cc
@@ -0,0 +1,48 @@
+// { dg-do run }
+// { dg-additional-options "-pthread" { target pthread } }
+// { dg-require-effective-target c++11 }
+// { dg-require-gthreads "" }
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+
+#include <future>
+#include <chrono>
+#include <climits>
+#include <testsuite_hooks.h>
+
+namespace chrono = std::chrono;
+
+void test01()
+{
+ std::future<void> fut = std::async(std::launch::async, [] {
+ std::this_thread::sleep_for(chrono::seconds(4));
+ });
+
+ // A time in the distant future, but which overflows 32-bit time_t:
+ auto then = chrono::system_clock::now() + chrono::seconds(UINT_MAX + 2LL);
+ auto status = fut.wait_until(then);
+ // The wait_until call should have waited for the result to be ready.
+ // If converting the time_point to time_t overflows, it will timeout.
+ VERIFY(status == std::future_status::ready);
+}
+
+int main()
+{
+ test01();
+}
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-13 17:25 ` Jonathan Wakely
2020-11-13 19:16 ` Jonathan Wakely
@ 2020-11-13 20:29 ` Mike Crowe
2020-11-13 21:12 ` Jonathan Wakely
1 sibling, 1 reply; 10+ messages in thread
From: Mike Crowe @ 2020-11-13 20:29 UTC (permalink / raw)
To: Jonathan Wakely; +Cc: libstdc++, gcc-patches
On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
> On 13/11/20 11:02 +0000, Jonathan Wakely wrote:
> > On 12/11/20 23:49 +0000, Jonathan Wakely wrote:
> > > To poll a std::future to see if it's ready you have to call one of the
> > > timed waiting functions. The most obvious way is wait_for(0s) but this
> > > was previously very inefficient because it would turn the relative
> > > timeout to an absolute one by calling system_clock::now(). When the
> > > relative timeout is zero (or less) we're obviously going to get a time
> > > that has already passed, but the overhead of obtaining the current time
> > > can be dozens of microseconds. The alternative is to call wait_until
> > > with an absolute timeout that is in the past. If you know the clock's
> > > epoch is in the past you can use a default constructed time_point.
> > > Alternatively, using some_clock::time_point::min() gives the earliest
> > > time point supported by the clock, which should be safe to assume is in
> > > the past. However, using a futex wait with an absolute timeout before
> > > the UNIX epoch fails and sets errno=EINVAL. The new code using futex
> > > waits with absolute timeouts was not checking for this case, which could
> > > result in hangs (or killing the process if the library is built with
> > > assertions enabled).
> > >
> > > This patch checks for times before the epoch before attempting to wait
> > > on a futex with an absolute timeout, which fixes the hangs or crashes.
> > > It also makes it very fast to poll using an absolute timeout before the
> > > epoch (because we skip the futex syscall).
> > >
> > > It also makes future::wait_for avoid waiting at all when the relative
> > > timeout is zero or less, to avoid the unnecessary overhead of getting
> > > the current time. This makes polling with wait_for(0s) take only a few
> > > cycles instead of dozens of microseconds.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > > * include/std/future (future::wait_for): Do not wait for
> > > durations less than or equal to zero.
> > > * src/c++11/futex.cc (_M_futex_wait_until)
> > > (_M_futex_wait_until_steady): Do not wait for timeouts before
> > > the epoch.
> > > * testsuite/30_threads/future/members/poll.cc: New test.
> > >
> > > Tested powerpc64le-linux. Committed to trunk.
> > >
> > > I think the shortcut in future::wait_for is worth backporting. The
> > > changes in src/c++11/futex.cc are not needed because the code using
> > > absolute timeouts with futex waits is not present on any release
> > > branch.
> >
> > I've committed this fix for the new test.
>
> Backporting the change to gcc-10 revealed an overflow bug in the
> existing code, resulting in blocking for years when given an absolute
> timeout in the distant past. There's still a similar bug in the new
> code (using futexes with absolute timeouts against clocks) where a
> large chrono::seconds value can overflow and produce an incorrect
> tv_sec value. Apart from the overflow itself being UB, the result in
> that case is just a spurious wakeup (the call says it timed out when
> it didn't reach the specified time). That should still be fixed, but
> I'll do it separately.
>
> Tested x86_64-linux. Committed to trunk.
>
>
> commit e7e0eeeb6e6707be2a6c6da49d4b6be3199e2af8
> Author: Jonathan Wakely <jwakely@redhat.com>
> Date: Fri Nov 13 15:19:04 2020
>
> libstdc++: Avoid 32-bit time_t overflows in futex calls
>
> The existing code doesn't check whether the chrono::seconds value is out
> of range of time_t. When using a timeout before the epoch (with a
> negative value) subtracting the current time (as time_t) and then
> assigning it to a time_t can overflow to a large positive value. This
> means that we end up waiting several years even though the specific
> timeout was in the distant past.
>
> We do have a check for negative timeouts, but that happens after the
> conversion to time_t so happens after the overflow.
>
> The conversion to a relative timeout is done in two places, so this
> factors it into a new function and adds the overflow checks there.
>
> libstdc++-v3/ChangeLog:
>
> * src/c++11/futex.cc (relative_timespec): New function to
> create relative time from two absolute times.
> (__atomic_futex_unsigned_base::_M_futex_wait_until)
> (__atomic_futex_unsigned_base::_M_futex_wait_until_steady):
> Use relative_timespec.
>
> diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
> index 57f7dfe87e9e..c2b2d32e8c43 100644
> --- a/libstdc++-v3/src/c++11/futex.cc
> +++ b/libstdc++-v3/src/c++11/futex.cc
> @@ -31,6 +31,7 @@
> #include <unistd.h>
> #include <sys/time.h>
> #include <errno.h>
> +#include <ext/numeric_traits.h>
> #include <debug/debug.h>
>
> #ifdef _GLIBCXX_USE_CLOCK_GETTIME_SYSCALL
> @@ -46,20 +47,55 @@ const unsigned futex_clock_realtime_flag = 256;
> const unsigned futex_bitset_match_any = ~0;
> const unsigned futex_wake_op = 1;
>
> -namespace
> -{
> - std::atomic<bool> futex_clock_realtime_unavailable;
> - std::atomic<bool> futex_clock_monotonic_unavailable;
> -}
> -
> namespace std _GLIBCXX_VISIBILITY(default)
> {
> _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
> +namespace
> +{
> + std::atomic<bool> futex_clock_realtime_unavailable;
> + std::atomic<bool> futex_clock_monotonic_unavailable;
> +
> + // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
> + // as a timespec.
> + struct timespec
> + relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
> + time_t now_s, long now_ns)
> + {
> + struct timespec rt;
> +
> + // Did we already time out?
> + if (now_s > abs_s.count())
> + {
> + rt.tv_sec = -1;
> + return rt;
> + }
> +
> + auto rel_s = abs_s.count() - now_s;
> +
> + // Avoid overflows
> + if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
> + rel_s = __gnu_cxx::__int_traits<time_t>::__max;
> + else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
> + rel_s = __gnu_cxx::__int_traits<time_t>::__min;
I may be missing something, but if the line above executes...
> +
> + // Convert the absolute timeout value to a relative timeout
> + rt.tv_sec = rel_s;
> + rt.tv_nsec = abs_ns.count() - now_ns;
> + if (rt.tv_nsec < 0)
> + {
> + rt.tv_nsec += 1000000000;
> + --rt.tv_sec;
...and so does this line above, then I think that we'll end up
underflowing. (Presumably rt.tv_sec will wrap round to being some time in
2038 on most 32-bit targets.)
I'm currently trying to persuade myself that this can actually happen and
if so work out how to come up with a test case for it.
I don't believe that this part changed in your later patch.
> + }
> +
> + return rt;
> + }
> +} // namespace
> +
> bool
> - __atomic_futex_unsigned_base::_M_futex_wait_until(unsigned *__addr,
> - unsigned __val,
> - bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
> + __atomic_futex_unsigned_base::
> + _M_futex_wait_until(unsigned *__addr, unsigned __val, bool __has_timeout,
> + chrono::seconds __s, chrono::nanoseconds __ns)
> {
> if (!__has_timeout)
> {
> @@ -109,15 +145,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> // true or has just been set to true.
> struct timeval tv;
> gettimeofday (&tv, NULL);
> +
> // Convert the absolute timeout value to a relative timeout
> - struct timespec rt;
> - rt.tv_sec = __s.count() - tv.tv_sec;
> - rt.tv_nsec = __ns.count() - tv.tv_usec * 1000;
> - if (rt.tv_nsec < 0)
> - {
> - rt.tv_nsec += 1000000000;
> - --rt.tv_sec;
> - }
> + auto rt = relative_timespec(__s, __ns, tv.tv_sec, tv.tv_usec * 1000);
> +
> // Did we already time out?
> if (rt.tv_sec < 0)
> return false;
> @@ -134,9 +165,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> }
>
> bool
> - __atomic_futex_unsigned_base::_M_futex_wait_until_steady(unsigned *__addr,
> - unsigned __val,
> - bool __has_timeout, chrono::seconds __s, chrono::nanoseconds __ns)
> + __atomic_futex_unsigned_base::
> + _M_futex_wait_until_steady(unsigned *__addr, unsigned __val,
> + bool __has_timeout,
> + chrono::seconds __s, chrono::nanoseconds __ns)
Oh how I wish all these functions didn't expect already cracked seconds and
nanoseconds. :(
Mike.
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-13 20:29 ` Mike Crowe
@ 2020-11-13 21:12 ` Jonathan Wakely
2020-11-13 21:16 ` Jonathan Wakely
2020-11-13 22:45 ` Jonathan Wakely
0 siblings, 2 replies; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-13 21:12 UTC (permalink / raw)
To: Mike Crowe; +Cc: libstdc++, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 3047 bytes --]
On 13/11/20 20:29 +0000, Mike Crowe via Libstdc++ wrote:
>On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
>> + // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
>> + // as a timespec.
>> + struct timespec
>> + relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
>> + time_t now_s, long now_ns)
>> + {
>> + struct timespec rt;
>> +
>> + // Did we already time out?
>> + if (now_s > abs_s.count())
>> + {
>> + rt.tv_sec = -1;
>> + return rt;
>> + }
>> +
>> + auto rel_s = abs_s.count() - now_s;
>> +
>> + // Avoid overflows
>> + if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
>> + rel_s = __gnu_cxx::__int_traits<time_t>::__max;
>> + else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
>> + rel_s = __gnu_cxx::__int_traits<time_t>::__min;
>
>I may be missing something, but if the line above executes...
>
>> +
>> + // Convert the absolute timeout value to a relative timeout
>> + rt.tv_sec = rel_s;
>> + rt.tv_nsec = abs_ns.count() - now_ns;
>> + if (rt.tv_nsec < 0)
>> + {
>> + rt.tv_nsec += 1000000000;
>> + --rt.tv_sec;
>
>...and so does this line above, then I think that we'll end up
>underflowing. (Presumably rt.tv_sec will wrap round to being some time in
>2038 on most 32-bit targets.)
Ugh.
>I'm currently trying to persuade myself that this can actually happen and
>if so work out how to come up with a test case for it.
Maybe something like:
auto d = chrono::floor<chrono::seconds>(system_clock::now().time_since_epoch() - seconds(INT_MAX + 2LL));
fut.wait_until(system_clock::time_point(d));
This will create a sys_time with a value that is slightly more than
INT_MAX seconds before the current time, with a zero nanoseconds
component. The difference between the gettimeofday result and this
time will be slightly more negative than INT_MIN and so will overflow
a 32-bit time_t, causing the code to use __int_traits<time_t>::__min.
As long as the gettimeofday call doesn't happen to also have a zero
nanoseconds component, the difference of the nanoseconds values will
be negative, and we will decrement the time_t value.
>I don't believe that this part changed in your later patch.
Right. Thanks for catching this.
The attached patch should fix it. There's no point clamping very
negative values to the minimum time_t value, since any relative
timeout less than zero has already passed. So we can just use -1
there (and not bother with the tv_nsec field at all):
else if (rel_s <= 0) [[unlikely]]
{
rt.tv_sec = -1;
}
But please check my working.
>Oh how I wish all these functions didn't expect already cracked seconds and
>nanoseconds. :(
Indeed.
At some point (probably not for GCC 11) I'm going to add a proper
wrapper class for all these futex calls, and make it robust, and then
replace the various hand-crafted uses of syscall(SYS_futex, ...) with
that type so we only need to fix all this in one place. And it will
just take a duration or time_point.
[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 1575 bytes --]
commit a77f131249202c7342cc385f2d0473ee032ae0eb
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 13 20:57:15 2020
libstdc++: Fix another 32-bit time_t overflow in futex timeouts
libstdc++-v3/ChangeLog:
* src/c++11/futex.cc (relative_timespec): Avoid overflow if the
difference of seconds is the minimum time_t value and the
difference of nanoseconds is also negative.
diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index 15959cebee57..50768403cc63 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -75,19 +75,25 @@ namespace
auto rel_s = abs_s.count() - now_s;
- // Avoid overflows
+ // Convert the absolute timeout to a relative timeout, without overflow.
if (rel_s > __int_traits<time_t>::__max) [[unlikely]]
- rel_s = __int_traits<time_t>::__max;
- else if (rel_s < __int_traits<time_t>::__min) [[unlikely]]
- rel_s = __int_traits<time_t>::__min;
-
- // Convert the absolute timeout value to a relative timeout
- rt.tv_sec = rel_s;
- rt.tv_nsec = abs_ns.count() - now_ns;
- if (rt.tv_nsec < 0)
{
- rt.tv_nsec += 1000000000;
- --rt.tv_sec;
+ rt.tv_sec = __int_traits<time_t>::__max;
+ rt.tv_nsec = 999999999;
+ }
+ else if (rel_s <= 0) [[unlikely]]
+ {
+ rt.tv_sec = -1;
+ }
+ else
+ {
+ rt.tv_sec = rel_s;
+ rt.tv_nsec = abs_ns.count() - now_ns;
+ if (rt.tv_nsec < 0)
+ {
+ rt.tv_nsec += 1000000000;
+ --rt.tv_sec;
+ }
}
return rt;
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-13 21:12 ` Jonathan Wakely
@ 2020-11-13 21:16 ` Jonathan Wakely
2020-11-13 22:45 ` Jonathan Wakely
1 sibling, 0 replies; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-13 21:16 UTC (permalink / raw)
To: Mike Crowe; +Cc: libstdc++, gcc-patches
On 13/11/20 21:12 +0000, Jonathan Wakely wrote:
>On 13/11/20 20:29 +0000, Mike Crowe via Libstdc++ wrote:
>>On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
>>>+ // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
>>>+ // as a timespec.
>>>+ struct timespec
>>>+ relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
>>>+ time_t now_s, long now_ns)
>>>+ {
>>>+ struct timespec rt;
>>>+
>>>+ // Did we already time out?
>>>+ if (now_s > abs_s.count())
>>>+ {
>>>+ rt.tv_sec = -1;
>>>+ return rt;
>>>+ }
>>>+
>>>+ auto rel_s = abs_s.count() - now_s;
>>>+
>>>+ // Avoid overflows
>>>+ if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
>>>+ rel_s = __gnu_cxx::__int_traits<time_t>::__max;
>>>+ else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
>>>+ rel_s = __gnu_cxx::__int_traits<time_t>::__min;
>>
>>I may be missing something, but if the line above executes...
>>
>>>+
>>>+ // Convert the absolute timeout value to a relative timeout
>>>+ rt.tv_sec = rel_s;
>>>+ rt.tv_nsec = abs_ns.count() - now_ns;
>>>+ if (rt.tv_nsec < 0)
>>>+ {
>>>+ rt.tv_nsec += 1000000000;
>>>+ --rt.tv_sec;
>>
>>...and so does this line above, then I think that we'll end up
>>underflowing. (Presumably rt.tv_sec will wrap round to being some time in
>>2038 on most 32-bit targets.)
>
>Ugh.
>
>>I'm currently trying to persuade myself that this can actually happen and
>>if so work out how to come up with a test case for it.
>
>Maybe something like:
>
>auto d = chrono::floor<chrono::seconds>(system_clock::now().time_since_epoch() - seconds(INT_MAX + 2LL));
>fut.wait_until(system_clock::time_point(d));
>
>This will create a sys_time with a value that is slightly more than
>INT_MAX seconds before the current time, with a zero nanoseconds
>component. The difference between the gettimeofday result and this
>time will be slightly more negative than INT_MIN and so will overflow
>a 32-bit time_t, causing the code to use __int_traits<time_t>::__min.
>As long as the gettimeofday call doesn't happen to also have a zero
>nanoseconds component, the difference of the nanoseconds values will
>be negative, and we will decrement the time_t value.
>
>>I don't believe that this part changed in your later patch.
>
>Right. Thanks for catching this.
>
>The attached patch should fix it. There's no point clamping very
>negative values to the minimum time_t value, since any relative
>timeout less than zero has already passed. So we can just use -1
>there (and not bother with the tv_nsec field at all):
>
> else if (rel_s <= 0) [[unlikely]]
> {
> rt.tv_sec = -1;
> }
Gah, no, this has to be < 0 not <= 0 otherwise we treat 0.1s as -1.1
And we should probably lose the [[unlikely]] there, since it's quite
feasible that an absolute time in the recent past gets used.
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-13 21:12 ` Jonathan Wakely
2020-11-13 21:16 ` Jonathan Wakely
@ 2020-11-13 22:45 ` Jonathan Wakely
2020-11-14 0:17 ` Jonathan Wakely
1 sibling, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-13 22:45 UTC (permalink / raw)
To: Mike Crowe; +Cc: libstdc++, gcc-patches
On 13/11/20 21:12 +0000, Jonathan Wakely wrote:
>On 13/11/20 20:29 +0000, Mike Crowe via Libstdc++ wrote:
>>On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
>>>+ // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
>>>+ // as a timespec.
>>>+ struct timespec
>>>+ relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
>>>+ time_t now_s, long now_ns)
>>>+ {
>>>+ struct timespec rt;
>>>+
>>>+ // Did we already time out?
>>>+ if (now_s > abs_s.count())
>>>+ {
>>>+ rt.tv_sec = -1;
>>>+ return rt;
>>>+ }
>>>+
>>>+ auto rel_s = abs_s.count() - now_s;
>>>+
>>>+ // Avoid overflows
>>>+ if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
>>>+ rel_s = __gnu_cxx::__int_traits<time_t>::__max;
>>>+ else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
>>>+ rel_s = __gnu_cxx::__int_traits<time_t>::__min;
>>
>>I may be missing something, but if the line above executes...
>>
>>>+
>>>+ // Convert the absolute timeout value to a relative timeout
>>>+ rt.tv_sec = rel_s;
>>>+ rt.tv_nsec = abs_ns.count() - now_ns;
>>>+ if (rt.tv_nsec < 0)
>>>+ {
>>>+ rt.tv_nsec += 1000000000;
>>>+ --rt.tv_sec;
>>
>>...and so does this line above, then I think that we'll end up
>>underflowing. (Presumably rt.tv_sec will wrap round to being some time in
>>2038 on most 32-bit targets.)
>
>Ugh.
>
>>I'm currently trying to persuade myself that this can actually happen and
>>if so work out how to come up with a test case for it.
>
>Maybe something like:
>
>auto d = chrono::floor<chrono::seconds>(system_clock::now().time_since_epoch() - seconds(INT_MAX + 2LL));
>fut.wait_until(system_clock::time_point(d));
>
>This will create a sys_time with a value that is slightly more than
>INT_MAX seconds before the current time, with a zero nanoseconds
Ah, but such a time will never reach the overflow because the first
thing that the new relative_timespec function does is:
if (now_s > abs_s.count())
{
rt.tv_sec = -1;
return rt;
}
So in fact we can never have a negative rel_s anyway.
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-13 22:45 ` Jonathan Wakely
@ 2020-11-14 0:17 ` Jonathan Wakely
2020-11-14 13:22 ` Mike Crowe
0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wakely @ 2020-11-14 0:17 UTC (permalink / raw)
To: Mike Crowe; +Cc: libstdc++, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 2301 bytes --]
On 13/11/20 22:45 +0000, Jonathan Wakely wrote:
>On 13/11/20 21:12 +0000, Jonathan Wakely wrote:
>>On 13/11/20 20:29 +0000, Mike Crowe via Libstdc++ wrote:
>>>On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
>>>>+ // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
>>>>+ // as a timespec.
>>>>+ struct timespec
>>>>+ relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
>>>>+ time_t now_s, long now_ns)
>>>>+ {
>>>>+ struct timespec rt;
>>>>+
>>>>+ // Did we already time out?
>>>>+ if (now_s > abs_s.count())
>>>>+ {
>>>>+ rt.tv_sec = -1;
>>>>+ return rt;
>>>>+ }
>>>>+
>>>>+ auto rel_s = abs_s.count() - now_s;
>>>>+
>>>>+ // Avoid overflows
>>>>+ if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
>>>>+ rel_s = __gnu_cxx::__int_traits<time_t>::__max;
>>>>+ else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
>>>>+ rel_s = __gnu_cxx::__int_traits<time_t>::__min;
>>>
>>>I may be missing something, but if the line above executes...
>>>
>>>>+
>>>>+ // Convert the absolute timeout value to a relative timeout
>>>>+ rt.tv_sec = rel_s;
>>>>+ rt.tv_nsec = abs_ns.count() - now_ns;
>>>>+ if (rt.tv_nsec < 0)
>>>>+ {
>>>>+ rt.tv_nsec += 1000000000;
>>>>+ --rt.tv_sec;
>>>
>>>...and so does this line above, then I think that we'll end up
>>>underflowing. (Presumably rt.tv_sec will wrap round to being some time in
>>>2038 on most 32-bit targets.)
>>
>>Ugh.
>>
>>>I'm currently trying to persuade myself that this can actually happen and
>>>if so work out how to come up with a test case for it.
>>
>>Maybe something like:
>>
>>auto d = chrono::floor<chrono::seconds>(system_clock::now().time_since_epoch() - seconds(INT_MAX + 2LL));
>>fut.wait_until(system_clock::time_point(d));
>>
>>This will create a sys_time with a value that is slightly more than
>>INT_MAX seconds before the current time, with a zero nanoseconds
>
>Ah, but such a time will never reach the overflow because the first
>thing that the new relative_timespec function does is:
>
> if (now_s > abs_s.count())
> {
> rt.tv_sec = -1;
> return rt;
> }
>
>So in fact we can never have a negative rel_s anyway.
Here's what I've pushed now, after testing on x86_64-linux.
[-- Attachment #2: patch.txt --]
[-- Type: text/x-patch, Size: 1918 bytes --]
commit b8d36dcc917e8a06d8c20b9f5ecc920ed2b9e947
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 13 20:57:15 2020
libstdc++: Remove redundant overflow check for futex timeout [PR 93456]
The relative_timespec function already checks for the case where the
specified timeout is in the past, so the difference can never be
negative. That means we don't need to check if it's more negative than
the minimum time_t value.
libstdc++-v3/ChangeLog:
PR libstdc++/93456
* src/c++11/futex.cc (relative_timespec): Remove redundant check for
negative values.
* testsuite/30_threads/future/members/wait_until_overflow.cc: Moved to...
* testsuite/30_threads/future/members/93456.cc: ...here.
diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
index 15959cebee57..c008798318c9 100644
--- a/libstdc++-v3/src/c++11/futex.cc
+++ b/libstdc++-v3/src/c++11/futex.cc
@@ -73,21 +73,23 @@ namespace
return rt;
}
- auto rel_s = abs_s.count() - now_s;
+ const auto rel_s = abs_s.count() - now_s;
- // Avoid overflows
+ // Convert the absolute timeout to a relative timeout, without overflow.
if (rel_s > __int_traits<time_t>::__max) [[unlikely]]
- rel_s = __int_traits<time_t>::__max;
- else if (rel_s < __int_traits<time_t>::__min) [[unlikely]]
- rel_s = __int_traits<time_t>::__min;
-
- // Convert the absolute timeout value to a relative timeout
- rt.tv_sec = rel_s;
- rt.tv_nsec = abs_ns.count() - now_ns;
- if (rt.tv_nsec < 0)
{
- rt.tv_nsec += 1000000000;
- --rt.tv_sec;
+ rt.tv_sec = __int_traits<time_t>::__max;
+ rt.tv_nsec = 999999999;
+ }
+ else
+ {
+ rt.tv_sec = rel_s;
+ rt.tv_nsec = abs_ns.count() - now_ns;
+ if (rt.tv_nsec < 0)
+ {
+ rt.tv_nsec += 1000000000;
+ --rt.tv_sec;
+ }
}
return rt;
* Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
2020-11-14 0:17 ` Jonathan Wakely
@ 2020-11-14 13:22 ` Mike Crowe
0 siblings, 0 replies; 10+ messages in thread
From: Mike Crowe @ 2020-11-14 13:22 UTC (permalink / raw)
To: Jonathan Wakely; +Cc: libstdc++, gcc-patches
On Saturday 14 November 2020 at 00:17:22 +0000, Jonathan Wakely wrote:
> On 13/11/20 22:45 +0000, Jonathan Wakely wrote:
> > On 13/11/20 21:12 +0000, Jonathan Wakely wrote:
> > > On 13/11/20 20:29 +0000, Mike Crowe via Libstdc++ wrote:
> > > > On Friday 13 November 2020 at 17:25:22 +0000, Jonathan Wakely wrote:
> > > > > + // Return the relative duration from (now_s + now_ns) to (abs_s + abs_ns)
> > > > > + // as a timespec.
> > > > > + struct timespec
> > > > > + relative_timespec(chrono::seconds abs_s, chrono::nanoseconds abs_ns,
> > > > > + time_t now_s, long now_ns)
> > > > > + {
> > > > > + struct timespec rt;
> > > > > +
> > > > > + // Did we already time out?
> > > > > + if (now_s > abs_s.count())
> > > > > + {
> > > > > + rt.tv_sec = -1;
> > > > > + return rt;
> > > > > + }
> > > > > +
> > > > > + auto rel_s = abs_s.count() - now_s;
> > > > > +
> > > > > + // Avoid overflows
> > > > > + if (rel_s > __gnu_cxx::__int_traits<time_t>::__max)
> > > > > + rel_s = __gnu_cxx::__int_traits<time_t>::__max;
> > > > > + else if (rel_s < __gnu_cxx::__int_traits<time_t>::__min)
> > > > > + rel_s = __gnu_cxx::__int_traits<time_t>::__min;
> > > >
> > > > I may be missing something, but if the line above executes...
> > > >
> > > > > +
> > > > > + // Convert the absolute timeout value to a relative timeout
> > > > > + rt.tv_sec = rel_s;
> > > > > + rt.tv_nsec = abs_ns.count() - now_ns;
> > > > > + if (rt.tv_nsec < 0)
> > > > > + {
> > > > > + rt.tv_nsec += 1000000000;
> > > > > + --rt.tv_sec;
> > > >
> > > > ...and so does this line above, then I think that we'll end up
> > > > underflowing. (Presumably rt.tv_sec will wrap round to being some time in
> > > > 2038 on most 32-bit targets.)
> > >
> > > Ugh.
> > >
> > > > I'm currently trying to persuade myself that this can actually happen and
> > > > if so work out how to come up with a test case for it.
> > >
> > > Maybe something like:
> > >
> > > auto d = chrono::floor<chrono::seconds>(system_clock::now().time_since_epoch() - seconds(INT_MAX + 2LL));
> > > fut.wait_until(system_clock::time_point(d));
> > >
> > > This will create a sys_time with a value that is slightly more than
> > > INT_MAX seconds before the current time, with a zero nanoseconds
> >
> > Ah, but such a time will never reach the overflow because the first
> > thing that the new relative_timespec function does is:
> >
> > if (now_s > abs_s.count())
> > {
> > rt.tv_sec = -1;
> > return rt;
> > }
> >
> > So in fact we can never have a negative rel_s anyway.
>
> Here's what I've pushed now, after testing on x86_64-linux.
>
>
> commit b8d36dcc917e8a06d8c20b9f5ecc920ed2b9e947
> Author: Jonathan Wakely <jwakely@redhat.com>
> Date: Fri Nov 13 20:57:15 2020
>
> libstdc++: Remove redundant overflow check for futex timeout [PR 93456]
>
> The relative_timespec function already checks for the case where the
> specified timeout is in the past, so the difference can never be
> negative. That means we don't need to check if it's more negative than
> the minimum time_t value.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/93456
> * src/c++11/futex.cc (relative_timespec): Remove redundant check for
> negative values.
> * testsuite/30_threads/future/members/wait_until_overflow.cc: Moved to...
> * testsuite/30_threads/future/members/93456.cc: ...here.
>
> diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc
> index 15959cebee57..c008798318c9 100644
> --- a/libstdc++-v3/src/c++11/futex.cc
> +++ b/libstdc++-v3/src/c++11/futex.cc
> @@ -73,21 +73,23 @@ namespace
> return rt;
> }
>
> - auto rel_s = abs_s.count() - now_s;
> + const auto rel_s = abs_s.count() - now_s;
>
> - // Avoid overflows
> + // Convert the absolute timeout to a relative timeout, without overflow.
> if (rel_s > __int_traits<time_t>::__max) [[unlikely]]
> - rel_s = __int_traits<time_t>::__max;
> - else if (rel_s < __int_traits<time_t>::__min) [[unlikely]]
> - rel_s = __int_traits<time_t>::__min;
> -
> - // Convert the absolute timeout value to a relative timeout
> - rt.tv_sec = rel_s;
> - rt.tv_nsec = abs_ns.count() - now_ns;
> - if (rt.tv_nsec < 0)
> {
> - rt.tv_nsec += 1000000000;
> - --rt.tv_sec;
> + rt.tv_sec = __int_traits<time_t>::__max;
> + rt.tv_nsec = 999999999;
> + }
> + else
> + {
> + rt.tv_sec = rel_s;
> + rt.tv_nsec = abs_ns.count() - now_ns;
> + if (rt.tv_nsec < 0)
> + {
> + rt.tv_nsec += 1000000000;
> + --rt.tv_sec;
> + }
> }
>
> return rt;
LGTM. Thanks.
Mike.