From: "François Dumont" <frs.dumont@gmail.com>
To: David de Kloet <dskloet@gmail.com>
Cc: libstdc++@gcc.gnu.org
Subject: Re: __inplace_stable_sort improvement
Date: Thu, 25 Nov 2021 06:35:05 +0100 [thread overview]
Message-ID: <54dbf38f-6409-9971-bdec-457ac6d2473a@gmail.com> (raw)
In-Reply-To: <CAKxRKif8eweyvf5ugZ-qVFwtLNqHCOBZRwZg87gwXt9ivytc_w@mail.gmail.com>
On 24/11/21 7:48 pm, David de Kloet wrote:
> On Wed, Nov 24, 2021 at 7:33 PM François Dumont <frs.dumont@gmail.com> wrote:
>>
>> On 18/11/21 1:58 pm, David de Kloet via Libstdc++ wrote:
>>> Hello libstdc++,
>>>
>>> Apologies if this is the wrong list for my message. I believe I have an
>>> improvement for __inplace_stable_sort and I'm wondering if this is worth
>>> pursuing and if so how to proceed.
>> Of course, anyone interested in enhancing the library is welcome.
>>
>> The best way to do so would be to provide a patch on the current library
>> code. I'm the last one to have touch to this algo and it would be easier
>> for me to understand if you try to implement it directly on the library.
> _my_merge_without_buffer below preprocesses exactly to
> __merge_without_buffer from the library if you undefine NEW_ALGO so
> you can see exactly what my changes are.
>
> I currently don't have much time, but when I do I'd be happy to
> provide it as a patch. If you find time before me, you're also more
> than welcome to use the code.
>
>> The good news is that you are trying to enhance a piece of code rarely
>> used as it is only when the buffer allocation fails. So it is less
>> critical and so perhaps more open to modifications.
>>
>> During my last work on it I've added a performance test:
>> performance/25_algorithms/inplace_merge.cc. It is also supposed to check
>> performance when we fail to get a buffer and so use
>> __merge_without_buffer. Use 'make check-performance' to run it, see
>> library manual guide. Do not hesitate to change it if you think that
>> your benches are better.
> I don't currently have the environment to `make check-performance`. I
> just copied the code from the web and adjusted it. But (again) when I
> have time, I'll look into that.
>
> If you know of any "contributing to stdlibc++" guides, that would be great too.
https://gcc.gnu.org/onlinedocs/libstdc++/manual/appendix_contributing.html
>
>>> __inplace_stable_sort does a merge sort where in each recursive step the 2
>>> sublists are merged in-place. The merge step is itself also recursive. In order
>>> to merge [A, B) and [B, C), a tail section S1 of [A, B) and a head section S2
>>> of [B, C) are found such that all elements in S1 are greater than all elements
>>> in S2 and then those sections are swapped/rotated and the shorter sections are
>>> merged recursively. This is done by the function __merge_without_buffer.
>>>
>>> In the current implementation, __merge_without_buffer takes the half point of
>>> the longer section and does a binary search (via lower_bound or upper_bound) to
>>> find the corresponding cut point on the other section. Then it rotates the
>>> sections.
>>>
>>> My proposal is to find the cut points such that both rotated sections have the
>>> same length such that they can always simply be swapped.
>> What do you mean by swapped ?
> I mean the first element of section 1 gets swapped with the first
> element of section 2. The second element of section 1 gets swapped
> with the second element of section 2. Etc. for each index.
>
>> I see in the code below that you are just
>> changing how the cut positions are computed and then call the same
>> std::rotate algo. I cannot see any reason why changing the position
>> would make the rotate faster.
> The implementation of rotate does a series of such section swaps. But
> if both section have the same length, it's only a single swap of
> sections which makes rotate about 1.5-2 times faster.
Ok, I see now, you try to use as much as possible the std::swap_ranges
in __rotate for random access iterator. Maybe it means that this
optimization should be limited to this type of iterator. We'll see
depending on benches.
I'll have a try then when I have time.
Thanks
>
> The purpose of the code I added is to choose the cut positions such
> that the rotated sections are of equal length, which makes rotate
> faster by doing fewer element swaps, resulting in fewer calls to the
> copy/assignment constructor.
>
>>> I have implemented
>>> this and it seems to perform better in general and in the special cases that
>>> I've tried.
>>>
>>> Unfortunately, since we are no longer guaranteed to halve the longer section in
>>> each step, there is a possibility for a stack overflow. To avoid this, the new
>>> algorithm falls back to the old behavior whenever the sections have
>>> significantly different lengths.
>>>
>>> Also, where the current algorithm performs the binary search implicitly via
>>> lower_bound/upper_bound, the new algorithm has to implement the binary search
>>> explicitly because it has to look at both sections to determine the next step
>>> in the search.
>>>
>>> I have created wrappers for my value objects as well as my iterators to count
>>> the number of operations performed by the current and my algorithm and run
>>> benchmarks on various distributions. My code is at the bottom but below is the
>>> output as run on my machine. For each operation it outputs the ratio between
>>> the number of operations performed by my algorithm over the number performed by
>>> the current algorithm.
>>>
>>> Thanks for your attention,
>>> David
>>>
>>>
>>> Distribution: random_unique(100000)
>>> Example: [19, 13, 5, 0, 10, 6, 15, 11, 16, 9, 14, 18, 4, 17, 1, 7, 8, 12, 3, 2]
>>>
>>> move: 0.588026 = 3331848 / 5666154
>>> moveAssign: 0.594789 = 6852819 / 11521431
>>> compare: 0.914997 = 2028977 / 2217469
>>> iteratorDeref: 0.718466 = 11001863 / 15312994
>>> iteratorCopy: 0.791332 = 33977189 / 42936712
>>> iteratorEquals: 2.27841 = 5174606 / 2271143
>>> iteratorInc: 1.2143 = 7697971 / 6339449
>>> iteratorDec: 0.0833529 = 473556 / 5681340
>>> iteratorDiff: 0.444865 = 1362598 / 3062950
>>> iteratorSub: 0.0200683 = 6272 / 312533
>>> iteratorAdd: 0.0695467 = 57149 / 821736
>>> iteratorAddAssign: 1.36491 = 3141949 / 2301950
>>> iteratorArithmetic: 0.861623 = 17914101 / 20791101
>>> micros: 0.663704 = 650825 / 980596
>>>
>>> Distribution: random_dupes(100000)
>>> Example: [4, 4, 1, 3, 0, 3, 2, 2, 4, 4, 0, 3, 1, 0, 3, 2, 0, 2, 1, 1]
>>>
>>> move: 0.570041 = 2949623 / 5174401
>>> moveAssign: 0.577736 = 6087819 / 10537375
>>> compare: 0.845419 = 1530877 / 1810790
>>> iteratorDeref: 0.678233 = 9232869 / 13613114
>>> iteratorCopy: 0.742699 = 25652244 / 34539225
>>> iteratorEquals: 2.41643 = 4123881 / 1706601
>>> iteratorInc: 1.15617 = 6554374 / 5669028
>>> iteratorDec: 0.101004 = 539991 / 5346219
>>> iteratorDiff: 0.417093 = 959972 / 2301580
>>> iteratorSub: 0.0350046 = 8882 / 253738
>>> iteratorAdd: 0.0972571 = 64122 / 659304
>>> iteratorAddAssign: 1.2279 = 2247197 / 1830107
>>> iteratorArithmetic: 0.81605 = 14498419 / 17766577
>>> micros: 0.699386 = 554426 / 792732
>>>
>>> Distribution: increasing_unique(100000)
>>> Example: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>>>
>>> move: 1 = 91808 / 91808
>>> moveAssign: 1 = 91808 / 91808
>>> compare: 0.573591 = 224558 / 391495
>>> iteratorDeref: 0.722385 = 540924 / 748803
>>> iteratorCopy: 0.298175 = 976735 / 3275708
>>> iteratorEquals: 0.522302 = 116383 / 222827
>>> iteratorInc: 0.715444 = 132750 / 185549
>>> iteratorDec: 1 = 91808 / 91808
>>> iteratorDiff: 0.217455 = 40956 / 188342
>>> iteratorAdd: 1 = 16383 / 16383
>>> iteratorAddAssign: 0.299523 = 81884 / 273381
>>> iteratorArithmetic: 0.49082 = 480164 / 978290
>>> micros: 0.479792 = 22935 / 47802
>>>
>>> Distribution: increasing_dupes(100000)
>>> Example: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
>>>
>>> move: 1 = 91808 / 91808
>>> moveAssign: 1 = 91808 / 91808
>>> compare: 0.573591 = 224558 / 391495
>>> iteratorDeref: 0.722385 = 540924 / 748803
>>> iteratorCopy: 0.298175 = 976735 / 3275708
>>> iteratorEquals: 0.522302 = 116383 / 222827
>>> iteratorInc: 0.715444 = 132750 / 185549
>>> iteratorDec: 1 = 91808 / 91808
>>> iteratorDiff: 0.217455 = 40956 / 188342
>>> iteratorAdd: 1 = 16383 / 16383
>>> iteratorAddAssign: 0.299523 = 81884 / 273381
>>> iteratorArithmetic: 0.49082 = 480164 / 978290
>>> micros: 0.449154 = 20675 / 46031
>>>
>>> Distribution: mostly_increasing(100000)
>>> Example: [3, 2, 3, 4, 5, 5, 6, 7, 9, 9, 11, 13, 12, 14, 15, 17, 17, 18, 21, 20]
>>>
>>> move: 0.681093 = 814725 / 1196203
>>> moveAssign: 0.703421 = 1809563 / 2572519
>>> compare: 0.924016 = 965320 / 1044700
>>> iteratorDeref: 0.847765 = 3802634 / 4485480
>>> iteratorCopy: 0.804068 = 11859111 / 14748894
>>> iteratorEquals: 1.59826 = 1388752 / 868913
>>> iteratorInc: 1.25659 = 1818421 / 1447111
>>> iteratorDec: 0.383738 = 537108 / 1399674
>>> iteratorDiff: 0.486488 = 562273 / 1155779
>>> iteratorSub: 0.0638659 = 6666 / 104375
>>> iteratorAdd: 0.184194 = 55586 / 301780
>>> iteratorAddAssign: 1.30369 = 1105439 / 847932
>>> iteratorArithmetic: 0.893672 = 5474245 / 6125564
>>> micros: 0.790603 = 233953 / 295917
>>>
>>> Distribution: decreasing_unique(100000)
>>> Example: [19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>>
>>> move: 0.40597 = 995312 / 2451688
>>> moveAssign: 0.45785 = 2459840 / 5372592
>>> compare: 0.489247 = 151631 / 309927
>>> iteratorDeref: 0.524798 = 3415934 / 6509043
>>> iteratorCopy: 0.563628 = 5971275 / 10594360
>>> iteratorEquals: 1.18395 = 843533 / 712475
>>> iteratorInc: 0.864454 = 1460463 / 1689462
>>> iteratorDec: 0.373125 = 1606976 / 4306800
>>> iteratorDiff: 0.495226 = 235930 / 476409
>>> iteratorSub: 0.248254 = 13824 / 55685
>>> iteratorAdd: 0.590877 = 135839 / 229894
>>> iteratorAddAssign: 0.421852 = 119646 / 283621
>>> iteratorArithmetic: 0.569514 = 4416211 / 7754346
>>> micros: 0.543145 = 167865 / 309061
>>>
>>> Distribution: decreasing_dupes(100000)
>>> Example: [4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0]
>>>
>>> move: 0.741695 = 1455276 / 1962096
>>> moveAssign: 0.736048 = 2826598 / 3840238
>>> compare: 0.641679 = 282851 / 440798
>>> iteratorDeref: 0.740565 = 3386062 / 4572271
>>> iteratorCopy: 0.628798 = 5008472 / 7965150
>>> iteratorEquals: 1.50206 = 794354 / 528843
>>> iteratorInc: 1.39602 = 2164546 / 1550510
>>> iteratorDec: 0.336639 = 837728 / 2488504
>>> iteratorDiff: 0.313457 = 88552 / 282501
>>> iteratorSub: 0.165388 = 3880 / 23460
>>> iteratorAdd: 0.377644 = 27139 / 71864
>>> iteratorAddAssign: 0.551364 = 182383 / 330785
>>> iteratorArithmetic: 0.776766 = 4098582 / 5276467
>>> micros: 0.71282 = 160398 / 225019
>>>
>>> Distribution: mostly_decreasing(100000)
>>> Example: [20, 19, 20, 18, 18, 15, 13, 13, 14, 13, 10, 9, 10, 8, 5, 7,
>>> 6, 4, 3, 2]
>>>
>>> move: 0.760602 = 2197435 / 2889075
>>> moveAssign: 0.768473 = 4591313 / 5974593
>>> compare: 0.932913 = 977678 / 1047984
>>> iteratorDeref: 0.838641 = 6633169 / 7909424
>>> iteratorCopy: 0.829028 = 15447960 / 18633830
>>> iteratorEquals: 1.7009 = 2000686 / 1176251
>>> iteratorInc: 1.52771 = 3788203 / 2479657
>>> iteratorDec: 0.363843 = 1369990 / 3765332
>>> iteratorDiff: 0.489186 = 595472 / 1217270
>>> iteratorSub: 0.0877827 = 10925 / 124455
>>> iteratorAdd: 0.201991 = 69640 / 344767
>>> iteratorAddAssign: 1.32756 = 1133512 / 853830
>>> iteratorArithmetic: 0.900303 = 8968428 / 9961562
>>> micros: 0.846298 = 377504 / 446065
>>>
>>> Distribution: sawtooth_unique(100000)
>>> Example: [0, 3, 6, 9, 12, 15, 18, 1, 4, 7, 10, 13, 16, 19, 2, 5, 8, 11, 14, 17]
>>>
>>> move: 0.569953 = 2912465 / 5110007
>>> moveAssign: 0.577701 = 6012427 / 10407511
>>> compare: 0.824488 = 1604387 / 1945920
>>> iteratorDeref: 0.678979 = 9299794 / 13696724
>>> iteratorCopy: 0.72546 = 26157352 / 36056250
>>> iteratorEquals: 2.00683 = 4131766 / 2058856
>>> iteratorInc: 1.10728 = 6498257 / 5868688
>>> iteratorDec: 0.111225 = 552519 / 4967557
>>> iteratorDiff: 0.405328 = 988144 / 2437889
>>> iteratorSub: 0.026249 = 6327 / 241038
>>> iteratorAdd: 0.089034 = 55918 / 628052
>>> iteratorAddAssign: 1.18737 = 2297233 / 1934717
>>> iteratorArithmetic: 0.801143 = 14530164 / 18136797
>>> micros: 0.706282 = 583937 / 826776
>>>
>>> Distribution: sawtooth_dupes(100000)
>>> Example: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
>>>
>>> move: 0.547721 = 2348873 / 4288447
>>> moveAssign: 0.543252 = 4613817 / 8492965
>>> compare: 0.664642 = 841503 / 1266099
>>> iteratorDeref: 0.599656 = 6292282 / 10493159
>>> iteratorCopy: 0.60663 = 15336816 / 25281976
>>> iteratorEquals: 2.24803 = 2859760 / 1272118
>>> iteratorInc: 1.07142 = 4999546 / 4666276
>>> iteratorDec: 0.0490551 = 205953 / 4198405
>>> iteratorDiff: 0.314518 = 473157 / 1504390
>>> iteratorSub: 0.0371951 = 5998 / 161258
>>> iteratorAdd: 0.0875684 = 34672 / 395942
>>> iteratorAddAssign: 0.900009 = 1233077 / 1370072
>>> iteratorArithmetic: 0.72316 = 9812163 / 13568461
>>> micros: 0.610891 = 353102 / 578011
>>>
>>> Distribution: increasing_with_swap(100000)
>>> Example: [19, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 0]
>>>
>>> move: 1 = 291782 / 291782
>>> moveAssign: 1 = 491779 / 491779
>>> compare: 0.57449 = 225064 / 391763
>>> iteratorDeref: 0.819553 = 941922 / 1149312
>>> iteratorCopy: 0.377964 = 1392386 / 3683915
>>> iteratorEquals: 0.525758 = 117408 / 223312
>>> iteratorInc: 0.863708 = 333050 / 385605
>>> iteratorDec: 0.999753 = 291716 / 291788
>>> iteratorDiff: 0.224408 = 42528 / 189512
>>> iteratorSub: 0.888889 = 104 / 117
>>> iteratorAdd: 0.997151 = 16801 / 16849
>>> iteratorAddAssign: 0.302604 = 82864 / 273836
>>> iteratorArithmetic: 0.640448 = 884471 / 1381019
>>> micros: 0.616124 = 36233 / 58808
>>>
>>> Distribution: increasing_fast_slow(100000)
>>> Example: [0, 4, 8, 12, 0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18]
>>>
>>> move: 0.640348 = 825772 / 1289568
>>> moveAssign: 0.627081 = 1559792 / 2487384
>>> compare: 0.714235 = 431286 / 603843
>>> iteratorDeref: 0.694913 = 2421875 / 3485151
>>> iteratorCopy: 0.637176 = 5645877 / 8860786
>>> iteratorEquals: 1.7552 = 953880 / 543458
>>> iteratorInc: 1.08941 = 1676609 / 1539005
>>> iteratorDec: 0.135904 = 154533 / 1137073
>>> iteratorDiff: 0.388132 = 192787 / 496705
>>> iteratorSub: 0.00577964 = 141 / 24396
>>> iteratorAdd: 0.207287 = 17272 / 83324
>>> iteratorAddAssign: 0.849015 = 431320 / 508024
>>> iteratorArithmetic: 0.790987 = 3426542 / 4331985
>>> micros: 0.777699 = 153552 / 197444
>>>
>>>
>>>
>>> #include <iostream>
>>> #include <stdio.h>
>>> #include <vector>
>>> #include <cstdlib>
>>> #include <algorithm>
>>> #include <functional>
>>> #include <ranges>
>>> #include <chrono>
>>> #include <cmath>
>>> #include <numeric>
>>>
>>> using namespace std;
>>>
>>> #define LINE0 0
>>> #define cout_line (cout << "line " << __LINE__ - LINE0 << ": ")
>>> #define vout(v) #v << " = " << (v)
>>> #define print1(v) cout_line << vout(v) << endl
>>> #define print2(v1, v2) cout_line << vout(v1) << ", " << vout(v2) << endl
>>> #define print3(v1, v2, v3) cout_line << vout(v1) << ", " << vout(v2)
>>> << ", " << vout(v3) << endl
>>> #define GET_MACRO(_1,_2,_3,NAME,...) NAME
>>> #define print(...) GET_MACRO(__VA_ARGS__, print3, print2, print1)(__VA_ARGS__)
>>>
>>> template <class Ch, class Tr, class Container>
>>> basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
>>> Container const& x);
>>>
>>> template <class X, class Y>
>>> ostream & operator << (ostream & os, pair <X, Y> const& p) {
>>> return os << p.first << ":" << p.second;
>>> }
>>>
>>> template <class Ch, class Tr, class Container>
>>> basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
>>> Container const& x) {
>>> os << "[";
>>> bool first = true;
>>> for(auto const& y : x) {
>>> if (!first) os << ", ";
>>> os << y;
>>> first = false;
>>> }
>>> return os << "]";
>>> }
>>>
>>>
>>> struct Counter {
>>> int construct = 0;
>>> int copy = 0;
>>> int assign = 0;
>>> int move = 0;
>>> int moveAssign = 0;
>>> int compare = 0;
>>> int iteratorDeref = 0;
>>> int iteratorCopy = 0;
>>> int iteratorEquals = 0;
>>> int iteratorInc = 0;
>>> int iteratorDec = 0;
>>> int iteratorDiff = 0;
>>> int iteratorSub = 0;
>>> int iteratorAdd = 0;
>>> int iteratorAddAssign = 0;
>>> int iteratorArithmetic = 0;
>>> int micros = 0;
>>> };
>>>
>>> int next_element_id = 0;
>>>
>>> class Element {
>>> Counter* counter;
>>> public:
>>> int val;
>>> int id; // Distinguis identical values to verify stability.
>>> Element(int val, Counter* counter)
>>> : val(val), counter(counter), id(++next_element_id) {
>>> ++counter->construct;
>>> }
>>>
>>> Element(Element const& other)
>>> : val(other.val), counter(other.counter), id(other.id) {
>>> ++counter->copy;
>>> }
>>>
>>> Element& operator=(Element const& other) {
>>> val = other.val;
>>> counter = other.counter;
>>> id = other.id;
>>> ++counter->assign;
>>> return *this;
>>> }
>>>
>>> Element(Element&& other) noexcept
>>> : val(move(other.val)), counter(other.counter), id(other.id) {
>>> ++counter->move;
>>> }
>>>
>>> Element& operator=(Element&& other) noexcept {
>>> val = move(other.val);
>>> counter = other.counter;
>>> id = other.id;
>>> ++counter->moveAssign;
>>> return *this;
>>> }
>>>
>>> bool operator<(Element const& other) const {
>>> ++counter->compare;
>>> return val < other.val;
>>> }
>>>
>>> bool operator==(Element const& other) const {
>>> return val == other.val && id == other.id;
>>> }
>>> };
>>>
>>> ostream & operator << (ostream & os, Element const& e) {
>>> return os << e.val << "(" << e.id << ")";
>>> }
>>>
>>>
>>> class RandomIterator {
>>> typedef vector<Element>::iterator It;
>>> It it;
>>> Counter* counter;
>>>
>>> public:
>>> using iterator_category = std::random_access_iterator_tag;
>>> using difference_type = std::ptrdiff_t;
>>> using value_type = Element;
>>> using pointer = value_type*;
>>> using reference = value_type&;
>>>
>>> RandomIterator(It it, Counter* counter) : it(it), counter(counter) {
>>> }
>>>
>>> RandomIterator(RandomIterator const& other) : it(other.it),
>>> counter(other.counter) {
>>> counter->iteratorCopy += 1;
>>> }
>>>
>>> bool operator== (RandomIterator other) {
>>> counter->iteratorEquals += 1;
>>> counter->iteratorArithmetic += 1;
>>> return it == other.it;
>>> }
>>>
>>> RandomIterator operator++ () {
>>> counter->iteratorInc += 1;
>>> counter->iteratorArithmetic += 1;
>>> return RandomIterator(++it, counter);
>>> }
>>>
>>> RandomIterator operator-- () {
>>> counter->iteratorDec += 1;
>>> counter->iteratorArithmetic += 1;
>>> return RandomIterator(--it, counter);
>>> }
>>>
>>> Element& operator* () {
>>> counter->iteratorDeref += 1;
>>> return *it;
>>> }
>>>
>>> difference_type operator- (RandomIterator other) {
>>> counter->iteratorDiff += 1;
>>> counter->iteratorArithmetic += 1;
>>> return it - other.it;
>>> }
>>>
>>> RandomIterator operator- (difference_type diff) {
>>> counter->iteratorSub += 1;
>>> counter->iteratorArithmetic += 1;
>>> return RandomIterator(it - diff, counter);
>>> }
>>>
>>> RandomIterator operator+ (difference_type diff) {
>>> counter->iteratorAdd += 1;
>>> counter->iteratorArithmetic += 1;
>>> return RandomIterator(it + diff, counter);
>>> }
>>>
>>> RandomIterator operator+= (difference_type diff) {
>>> counter->iteratorAddAssign += 1;
>>> counter->iteratorArithmetic += 1;
>>> return RandomIterator(it += diff, counter);
>>> }
>>> };
>>>
>>>
>>> class TestCase {
>>> Counter counter;
>>> vector<Element> elements;
>>>
>>> public:
>>> TestCase(vector<int> const& data) {
>>> elements.reserve(data.size());
>>> for (int i: data) {
>>> elements.push_back(Element(i, &counter));
>>> }
>>> }
>>>
>>> Counter sort(
>>> void algo(RandomIterator, RandomIterator),
>>> vector<Element>::iterator begin,
>>> vector<Element>::iterator end) {
>>> counter = Counter();
>>> auto t0 = std::chrono::system_clock::now();
>>>
>>> algo(RandomIterator(begin, &counter), RandomIterator(end, &counter));
>>>
>>> auto t1 = std::chrono::system_clock::now();
>>> counter.micros =
>>> std::chrono::duration_cast<std::chrono::microseconds>(t1 -
>>> t0).count();
>>> return counter;
>>> }
>>>
>>> pair<Counter,Counter> benchmark(
>>> void new_algo(RandomIterator, RandomIterator),
>>> void cur_algo(RandomIterator, RandomIterator)) {
>>> vector<Element> toSort1 = elements;
>>> Counter counter1 = sort(new_algo, toSort1.begin(), toSort1.end());
>>>
>>> vector<Element> toSort2 = elements;
>>> Counter counter2 = sort(cur_algo, toSort2.begin(), toSort2.end());
>>>
>>> if (toSort1 != toSort2) {
>>> cout << "INCORRECT RESULT!" << endl;
>>> exit(1);
>>> }
>>>
>>> return make_pair(counter1, counter2);
>>> }
>>> };
>>>
>>>
>>> void relative_metric(string name, int m1, int m2) {
>>> if (m1 == 0 && m2 == 0) return;
>>> if (m2 == 0) {
>>> cout << name << ": inf = " << m1 << " / " << m2 << endl;
>>> } else {
>>> cout << name << ": " << m1 / (double) m2 << " = " << m1 << " / "
>>> << m2 << endl;
>>> }
>>> }
>>>
>>>
>>> void relative_report(Counter const& c1, Counter const& c2) {
>>> cout << endl;
>>> relative_metric("construct", c1.construct, c2.construct);
>>> relative_metric("copy", c1.copy, c2.copy);
>>> relative_metric("assign", c1.assign, c2.assign);
>>> relative_metric("move", c1.move, c2.move);
>>> relative_metric("moveAssign", c1.moveAssign, c2.moveAssign);
>>> relative_metric("compare", c1.compare, c2.compare);
>>> relative_metric("iteratorDeref", c1.iteratorDeref, c2.iteratorDeref);
>>> relative_metric("iteratorCopy", c1.iteratorCopy, c2.iteratorCopy);
>>> relative_metric("iteratorEquals", c1.iteratorEquals, c2.iteratorEquals);
>>> relative_metric("iteratorInc", c1.iteratorInc, c2.iteratorInc);
>>> relative_metric("iteratorDec", c1.iteratorDec, c2.iteratorDec);
>>> relative_metric("iteratorDiff", c1.iteratorDiff, c2.iteratorDiff);
>>> relative_metric("iteratorSub", c1.iteratorSub, c2.iteratorSub);
>>> relative_metric("iteratorAdd", c1.iteratorAdd, c2.iteratorAdd);
>>> relative_metric("iteratorAddAssign", c1.iteratorAddAssign,
>>> c2.iteratorAddAssign);
>>> relative_metric("iteratorArithmetic", c1.iteratorArithmetic,
>>> c2.iteratorArithmetic);
>>> relative_metric("micros", c1.micros, c2.micros);
>>> cout << endl;
>>> }
>>>
>>>
>>> /// This is a helper function for the merge routines.
>>> template<typename _BidirectionalIterator, typename _Distance, typename _Compare>
>>> void _my_merge_without_buffer(
>>> _BidirectionalIterator __first,
>>> _BidirectionalIterator __middle,
>>> _BidirectionalIterator __last,
>>> _Distance __len1, _Distance __len2,
>>> _Compare __comp) {
>>>
>>> if (__len1 == 0 || __len2 == 0)
>>> return;
>>>
>>> if (__len1 + __len2 == 2) {
>>> if (__comp(__middle, __first))
>>> std::iter_swap(__first, __middle);
>>> return;
>>> }
>>>
>>> _BidirectionalIterator __first_cut = __first;
>>> _BidirectionalIterator __second_cut = __middle;
>>> _Distance __len11 = 0;
>>> _Distance __len22 = 0;
>>>
>>>
>>> #define NEW_ALGO 1
>>>
>>> #ifdef NEW_ALGO
>>> // Prevent stack overflow
>>> if (__len1 > 4 * __len2 || __len2 > 4 * __len1) {
>>> #endif
>>> if (__len1 > __len2)
>>> {
>>> __len11 = __len1 / 2;
>>> std::advance(__first_cut, __len11);
>>> __second_cut
>>> = std::__lower_bound(__middle, __last, *__first_cut,
>>> __gnu_cxx::__ops::__iter_comp_val(__comp));
>>> __len22 = std::distance(__middle, __second_cut);
>>> }
>>> else
>>> {
>>> __len22 = __len2 / 2;
>>> std::advance(__second_cut, __len22);
>>> __first_cut
>>> = std::__upper_bound(__first, __middle, *__second_cut,
>>> __gnu_cxx::__ops::__val_comp_iter(__comp));
>>> __len11 = std::distance(__first, __first_cut);
>>> }
>>> #ifdef NEW_ALGO
>>> } else
>>> {
>>> _Distance __cut_range;
>>> // __cut_len is always equal to __middle - __first_cut (and eventually to
>>> // __second_cut - __middle) but we keep track of it to avoid calculating it
>>> // later which can be expensive for bidirectional iterators.
>>> _Distance __cut_len;
>>>
>>> if (__len1 > __len2)
>>> {
>>> std::advance(__first_cut, __len1 - __len2);
>>> __cut_range = __len2;
>>> }
>>> else
>>> {
>>> __cut_range = __len1;
>>> }
>>> __cut_len = __cut_range;
>>>
>>> while (__cut_range > 0) {
>>> _Distance __half_cut_range = __cut_range / 2;
>>> _BidirectionalIterator __mid_first_cut = __first_cut;
>>> std::advance(__mid_first_cut, __cut_range - __half_cut_range - 1);
>>> _BidirectionalIterator __mid_second_cut = __second_cut;
>>> std::advance(__mid_second_cut, __half_cut_range);
>>>
>>> if (__comp(__mid_second_cut, __mid_first_cut)) {
>>> __cut_range -= __half_cut_range + 1;
>>> __second_cut = __mid_second_cut;
>>> ++__second_cut;
>>> } else {
>>> __cut_len -= __cut_range - __half_cut_range;
>>> __cut_range = __half_cut_range;
>>> __first_cut = __mid_first_cut;
>>> ++__first_cut;
>>> }
>>> }
>>> __len22 = __cut_len;
>>> __len11 = __len1 - __cut_len;
>>> }
>>> #endif
>>>
>>> _BidirectionalIterator __new_middle
>>> = std::rotate(__first_cut, __middle, __second_cut);
>>> _my_merge_without_buffer(__first, __first_cut, __new_middle,
>>> __len11, __len22, __comp);
>>> _my_merge_without_buffer(__new_middle, __second_cut, __last,
>>> __len1 - __len11, __len2 - __len22, __comp);
>>> }
>>>
>>> /// This is a helper function for the stable sorting routines.
>>> template<typename _RandomAccessIterator, typename _Compare>
>>> void
>>> _my_inplace_stable_sort(_RandomAccessIterator __first,
>>> _RandomAccessIterator __last, _Compare __comp)
>>> {
>>>
>>> if (__last - __first < 15)
>>> {
>>> std::__insertion_sort(__first, __last, __comp);
>>> return;
>>> }
>>> _RandomAccessIterator __middle = __first + (__last - __first) / 2;
>>> _my_inplace_stable_sort(__first, __middle, __comp);
>>> _my_inplace_stable_sort(__middle, __last, __comp);
>>> _my_merge_without_buffer(__first, __middle, __last,
>>> __middle - __first,
>>> __last - __middle,
>>> __comp);
>>> }
>>>
>>>
>>> template<typename Iterator>
>>> void my_inplace_stable_sort(Iterator start, Iterator end) {
>>> _my_inplace_stable_sort(start, end, __gnu_cxx::__ops::__iter_less_iter());
>>> }
>>>
>>> template<typename Iterator>
>>> void std_inplace_stable_sort(Iterator start, Iterator end) {
>>> std::__inplace_stable_sort(start, end, __gnu_cxx::__ops::__iter_less_iter());
>>> }
>>>
>>> void benchmark(string const& dist_name,
>>> vector<int> dist_func(int),
>>> int n) {
>>> vector<int> v = dist_func(n);
>>> cout << "Distribution: " << dist_name << "(" << n << ")" << endl;
>>> cout << "Example: " << dist_func(20) << endl;
>>> TestCase testCase(v);
>>>
>>> pair<Counter, Counter> counter_pair =
>>> testCase.benchmark(
>>> my_inplace_stable_sort
>>> , std_inplace_stable_sort
>>> );
>>> Counter cKloet = counter_pair.first;
>>> Counter cInplace = counter_pair.second;
>>>
>>> relative_report(cKloet, cInplace);
>>> }
>>>
>>>
>>> vector<int> random_unique(int n) {
>>> vector<int> v(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = i;
>>> }
>>> for (int i = n - 1; i >= 0; --i) {
>>> swap(v[i], v[rand() % (i + 1)]);
>>> }
>>> return v;
>>> }
>>>
>>> vector<int> random_dupes(int n) {
>>> vector<int> v(n);
>>> int r = (int)sqrt(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = i / r;
>>> }
>>> for (int i = n - 1; i >= 0; --i) {
>>> swap(v[i], v[rand() % (i + 1)]);
>>> }
>>> return v;
>>> }
>>>
>>> vector<int> increasing_unique(int n) {
>>> vector<int> v(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = i;
>>> }
>>> return v;
>>> }
>>>
>>>
>>> vector<int> increasing_dupes(int n) {
>>> vector<int> v(n);
>>> int r = (int)sqrt(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = i / r;
>>> }
>>> return v;
>>> }
>>>
>>>
>>> vector<int> mostly_increasing(int n) {
>>> vector<int> v(n);
>>> int r = (int)sqrt(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = i + rand() % r;
>>> }
>>> return v;
>>> }
>>>
>>>
>>> vector<int> decreasing_unique(int n) {
>>> vector<int> v(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = n - 1 - i;
>>> }
>>> return v;
>>> }
>>>
>>> vector<int> decreasing_dupes(int n) {
>>> vector<int> v(n);
>>> int r = (int)sqrt(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = (n - 1 - i) / r;
>>> }
>>> return v;
>>> }
>>>
>>> vector<int> mostly_decreasing(int n) {
>>> vector<int> v(n);
>>> int r = (int)sqrt(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = n - 1 - i + rand() % r;
>>> }
>>> return v;
>>> }
>>>
>>> vector<int> sawtooth_unique(int n) {
>>> int r = (int)sqrt(n);
>>> while (gcd(r, n) != 1) ++r;
>>> vector<int> v(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[(i * r) % n] = i;
>>> }
>>> return v;
>>> }
>>>
>>> vector<int> sawtooth_dupes(int n) {
>>> int r = (int)sqrt(n);
>>> vector<int> v(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = i % r;
>>> }
>>> return v;
>>> }
>>>
>>> vector<int> increasing_with_swap(int n) {
>>> vector<int> v(n);
>>> for (int i = 0; i < n; ++i) {
>>> v[i] = i;
>>> }
>>> swap(v[0], v[n - 1]);
>>> return v;
>>> }
>>>
>>> vector<int> increasing_fast_slow(int n) {
>>> int r = (int)sqrt(n);
>>> vector<int> v(n);
>>> for (int i = 0; i < r; ++i) {
>>> v[i] = i * r;
>>> }
>>> for (int i = r; i < n; ++i) {
>>> v[i] = (i - r) * n / (n - r);
>>> }
>>> return v;
>>> }
>>>
>>>
>>> #define BENCHMARK(dist, n) benchmark(#dist, dist, n)
>>>
>>> int main() {
>>> int n = 100000;
>>> BENCHMARK(random_unique, n);
>>> BENCHMARK(random_dupes, n);
>>> BENCHMARK(increasing_unique, n);
>>> BENCHMARK(increasing_dupes, n);
>>> BENCHMARK(mostly_increasing, n);
>>> BENCHMARK(decreasing_unique, n);
>>> BENCHMARK(decreasing_dupes, n);
>>> BENCHMARK(mostly_decreasing, n);
>>> BENCHMARK(sawtooth_unique, n);
>>> BENCHMARK(sawtooth_dupes, n);
>>> BENCHMARK(increasing_with_swap, n);
>>> BENCHMARK(increasing_fast_slow, n);
>>>
>>> return 0;
>>> }
>>
next prev parent reply other threads:[~2021-11-25 5:35 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-18 12:58 David de Kloet
2021-11-24 18:33 ` François Dumont
2021-11-24 18:48 ` David de Kloet
2021-11-25 5:35 ` François Dumont [this message]
2021-11-25 19:51 ` David de Kloet
2021-12-10 15:31 ` David de Kloet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54dbf38f-6409-9971-bdec-457ac6d2473a@gmail.com \
--to=frs.dumont@gmail.com \
--cc=dskloet@gmail.com \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).