From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 13 Nov 2020 11:02:18 +0000
From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [committed] libstdc++: Optimise std::future::wait_for and fix futex polling
Message-ID: <20201113110218.GG503596@redhat.com>
References: <20201112234940.GA475138@redhat.com>
In-Reply-To: <20201112234940.GA475138@redhat.com>
On 12/11/20 23:49 +0000, Jonathan Wakely wrote:
>To poll a std::future to see if it's ready you have to call one of the
>timed waiting functions. The most obvious way is wait_for(0s) but this
>was previously very inefficient because it would turn the relative
>timeout into an absolute one by calling system_clock::now(). When the
>relative timeout is zero (or less) we're obviously going to get a time
>that has already passed, but the overhead of obtaining the current time
>can be dozens of microseconds. The alternative is to call wait_until
>with an absolute timeout that is in the past. If you know the clock's
>epoch is in the past you can use a default constructed time_point.
>Alternatively, using some_clock::time_point::min() gives the earliest
>time point supported by the clock, which should be safe to assume is in
>the past. However, using a futex wait with an absolute timeout before
>the UNIX epoch fails and sets errno=EINVAL. The new code using futex
>waits with absolute timeouts was not checking for this case, which could
>result in hangs (or killing the process if the library is built with
>assertions enabled).
>
>This patch checks for times before the epoch before attempting to wait
>on a futex with an absolute timeout, which fixes the hangs or crashes.
>It also makes it very fast to poll using an absolute timeout before the
>epoch (because we skip the futex syscall).
>
>It also makes future::wait_for avoid waiting at all when the relative
>timeout is zero or less, to avoid the unnecessary overhead of getting
>the current time. This makes polling with wait_for(0s) take only a few
>cycles instead of dozens of microseconds.
>
>libstdc++-v3/ChangeLog:
>
>	* include/std/future (future::wait_for): Do not wait for
>	durations less than or equal to zero.
>	* src/c++11/futex.cc (_M_futex_wait_until)
>	(_M_futex_wait_until_steady): Do not wait for timeouts before
>	the epoch.
>	* testsuite/30_threads/future/members/poll.cc: New test.
>
>Tested powerpc64le-linux. Committed to trunk.
>
>I think the shortcut in future::wait_for is worth backporting. The
>changes in src/c++11/futex.cc are not needed because the code using
>absolute timeouts with futex waits is not present on any release
>branch.

I've committed this fix for the new test. Tested x86_64-linux,
sparc-solaris-2.11 and powerpc-aix.

commit 8c4e33d2032ab150748ea2fe1df2b1c00652a338
Author: Jonathan Wakely
Date:   Fri Nov 13 10:04:33 2020

    libstdc++: Add -pthread options to std::future polling test

    For linux targets this test doesn't need -lpthread because it only
    uses atomics, but for all other targets std::call_once still needs
    pthreads. Add the necessary test directives to make that work.

    The timings in this test might be too fragile or too target-specific,
    so it might need to be adjusted in future, or restricted to only run
    on specific targets. For now I've increased the allowed ratio between
    wait_for calls before and after the future is made ready, because it
    was failing with -O3 -march=native sometimes.

    libstdc++-v3/ChangeLog:

        * testsuite/30_threads/future/members/poll.cc: Require gthreads
        and add -pthread for targets that require it. Relax required
        ratio of wait_for calls before/after the future is ready.
diff --git a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
index 54580579d3a1..fff9bea899c9 100644
--- a/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
+++ b/libstdc++-v3/testsuite/30_threads/future/members/poll.cc
@@ -17,6 +17,8 @@
 
 // { dg-options "-O3" }
 // { dg-do run { target c++11 } }
+// { dg-additional-options "-pthread" { target pthread } }
+// { dg-require-gthreads "" }
 
 #include <future>
 #include <chrono>
@@ -49,20 +51,6 @@ int main()
   auto stop = chrono::high_resolution_clock::now();
   double wait_for_0 = print("wait_for(0s)", stop - start);
 
-  start = chrono::high_resolution_clock::now();
-  for(int i = 0; i < iterations; i++)
-    f.wait_until(chrono::system_clock::time_point());
-  stop = chrono::high_resolution_clock::now();
-  double wait_until_sys_epoch __attribute__((unused))
-    = print("wait_until(system_clock epoch)", stop - start);
-
-  start = chrono::high_resolution_clock::now();
-  for(int i = 0; i < iterations; i++)
-    f.wait_until(chrono::steady_clock::time_point());
-  stop = chrono::high_resolution_clock::now();
-  double wait_until_steady_epoch __attribute__((unused))
-    = print("wait_until(steady_clock epoch", stop - start);
-
   start = chrono::high_resolution_clock::now();
   for(int i = 0; i < iterations; i++)
     f.wait_until(chrono::system_clock::time_point::min());
@@ -77,6 +65,20 @@ int main()
   double wait_until_steady_min __attribute__((unused))
     = print("wait_until(steady_clock minimum)", stop - start);
 
+  start = chrono::high_resolution_clock::now();
+  for(int i = 0; i < iterations; i++)
+    f.wait_until(chrono::system_clock::time_point());
+  stop = chrono::high_resolution_clock::now();
+  double wait_until_sys_epoch __attribute__((unused))
+    = print("wait_until(system_clock epoch)", stop - start);
+
+  start = chrono::high_resolution_clock::now();
+  for(int i = 0; i < iterations; i++)
+    f.wait_until(chrono::steady_clock::time_point());
+  stop = chrono::high_resolution_clock::now();
+  double wait_until_steady_epoch __attribute__((unused))
+    = print("wait_until(steady_clock epoch", stop - start);
+
   p.set_value(1);
 
   start = chrono::high_resolution_clock::now();
@@ -85,19 +87,19 @@ int main()
   stop = chrono::high_resolution_clock::now();
   double ready = print("wait_for when ready", stop - start);
 
-  // polling before ready with wait_for(0s) should be almost as fast as
+  // Polling before ready with wait_for(0s) should be almost as fast as
   // after the result is ready.
-  VERIFY( wait_for_0 < (ready * 10) );
+  VERIFY( wait_for_0 < (ready * 30) );
+
+  // Polling before ready using wait_until(min) should not be terribly slow.
+  VERIFY( wait_until_sys_min < (ready * 100) );
+  VERIFY( wait_until_steady_min < (ready * 100) );
 
   // The following two tests fail with GCC 11, see
   // https://gcc.gnu.org/pipermail/libstdc++/2020-November/051422.html
 #if 0
-  // polling before ready using wait_until(epoch) should not be terribly slow.
+  // Polling before ready using wait_until(epoch) should not be terribly slow.
   VERIFY( wait_until_sys_epoch < (ready * 100) );
   VERIFY( wait_until_steady_epoch < (ready * 100) );
 #endif
-
-  // polling before ready using wait_until(min) should not be terribly slow.
-  VERIFY( wait_until_sys_min < (ready * 100) );
-  VERIFY( wait_until_steady_min < (ready * 100) );
 }