From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 10088 invoked by alias); 14 Oct 2016 19:42:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 10063 invoked by uid 89); 14 Oct 2016 19:42:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=shave, yay, der X-Spam-User: qpsmtpd, 2 recipients X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 14 Oct 2016 19:41:53 +0000 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 23B95C05678D; Fri, 14 Oct 2016 19:41:52 +0000 (UTC) Received: from localhost (ovpn-116-70.ams2.redhat.com [10.36.116.70]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u9EJfpIY006237; Fri, 14 Oct 2016 15:41:51 -0400 Date: Fri, 14 Oct 2016 19:42:00 -0000 From: Jonathan Wakely To: Eelis Cc: libstdc++ , gcc-patches Subject: Re: [patch, libstdc++] std::shuffle: Generate two swap positions at a time if possible Message-ID: <20161014194150.GG2922@redhat.com> References: <5728B8CC.7030405@eelis.net> <20160831124502.GH3342@redhat.com> <57C9C304.90607@eelis.net> <57C9CAAE.50202@eelis.net> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="ofZMSlrAVk9bLeVm" Content-Disposition: inline In-Reply-To: <57C9CAAE.50202@eelis.net> X-Clacks-Overhead: GNU Terry Pratchett User-Agent: Mutt/1.7.0 (2016-08-17) X-SW-Source: 2016-10/txt/msg01204.txt.bz2 --ofZMSlrAVk9bLeVm Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline Content-length: 1256 On 02/09/16 20:53 +0200, Eelis wrote: >On 2016-09-02 20:20, Eelis van der Weegen wrote: >>On 2016-08-31 14:45, Jonathan Wakely wrote: >>>Is this significantly faster than just using >>>uniform_int_distribution<_IntType>{0, __bound - 1}(__g) so we don't >>>need to duplicate the logic? (And people maintaining the code won't >>>reconvince themselves it's correct every time they look at it :-) >>> >>>[..] >>> >>>Could we hoist this test out of the loop somehow? >>> >>>If we change the loop condition to be __i+1 < __last we don't need to >>>test it on every iteration, and then after the loop we can just do >>>the final swap if (__urange % 2). >> >>Reusing std::uniform_int_distribution seems just as fast, so I've removed __generate_random_index_below. >> >>I've hoisted the (__i + 1 == __last) check out of the loop (in a slightly different way), and it seems to shave off a couple more cycles, yay! >> >>Updated patch attached. >> > >Please ignore that patch, I used __g()&1 but that's invalid (the new "UniformRandomBitGenerator" name is misleading). > >Updated patch (which uses a proper distribution even for the [0,1] case) attached. I've finally got round to committing this patch to trunk. Thanks for your patience, and sorry for the delay! --ofZMSlrAVk9bLeVm Content-Type: text/x-patch; charset=us-ascii Content-Disposition: attachment; filename="patch.txt" Content-length: 2384 commit 1ddf9566764a85da4826628098b352bd30ba2bbc Author: Jonathan Wakely Date: Fri Oct 14 20:27:13 2016 +0100 Optimize std::shuffle by using generator to get two values at once 2016-10-14 Eelis van der Weegen * include/bits/stl_algo.h (shuffle): Extract two random numbers from each generator invocation when its range is large enough. diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h index 0538a79..db99cb8 100644 --- a/libstdc++-v3/include/bits/stl_algo.h +++ b/libstdc++-v3/include/bits/stl_algo.h @@ -3772,6 +3772,47 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION typedef typename std::make_unsigned<_DistanceType>::type __ud_type; typedef typename std::uniform_int_distribution<__ud_type> __distr_type; typedef typename __distr_type::param_type __p_type; + + typedef typename std::remove_reference<_UniformRandomNumberGenerator>::type _Gen; + typedef typename std::common_type::type __uc_type; + + const __uc_type __urngrange = __g.max() - __g.min(); + const __uc_type __urange = __uc_type(__last - __first); + + if (__urngrange / __urange >= __urange) + // I.e. (__urngrange >= __urange * __urange) but without wrap issues. + { + _RandomAccessIterator __i = __first + 1; + + // Since we know the range isn't empty, an even number of elements + // means an uneven number of elements /to swap/, in which case we + // do the first one up front: + + if ((__urange % 2) == 0) + { + __distr_type __d{0, 1}; + std::iter_swap(__i++, __first + __d(__g)); + } + + // Now we know that __last - __i is even, so we do the rest in pairs, + // using a single distribution invocation to produce swap positions + // for two successive elements at a time: + + while (__i != __last) + { + const __uc_type __swap_range = __uc_type(__i - __first) + 1; + const __uc_type __comp_range = __swap_range * (__swap_range + 1); + + std::uniform_int_distribution<__uc_type> __d{0, __comp_range - 1}; + const __uc_type __pospos = __d(__g); + + std::iter_swap(__i++, __first + (__pospos % __swap_range)); + std::iter_swap(__i++, __first + (__pospos / __swap_range)); + } + + return; + } + __distr_type __d; for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i) --ofZMSlrAVk9bLeVm--