From: "François Dumont" <frs.dumont@gmail.com>
To: David de Kloet <dskloet@gmail.com>, libstdc++@gcc.gnu.org
Subject: Re: __inplace_stable_sort improvement
Date: Wed, 24 Nov 2021 19:33:28 +0100 [thread overview]
Message-ID: <9f7342af-398c-fa0f-4182-761644d6e5f3@gmail.com> (raw)
In-Reply-To: <CAKxRKifbXorTaN+66m1f9qG8WHLqqXpdmoWVzoZDvu1yTv=yxA@mail.gmail.com>
On 18/11/21 1:58 pm, David de Kloet via Libstdc++ wrote:
> Hello libstdc++,
>
> Apologies if this is the wrong list for my message. I believe I have an
> improvement for __inplace_stable_sort and I'm wondering if this is worth
> pursuing and if so how to proceed.
Of course, anyone interested in enhancing the library is welcome.
The best way to do so would be to provide a patch on the current library
code. I'm the last one to have touch to this algo and it would be easier
for me to understand if you try to implement it directly on the library.
The good news is that you are trying to enhance a piece of code rarely
used as it is only when the buffer allocation fails. So it is less
critical and so perhaps more open to modifications.
During my last work on it I've added a performance test:
performance/25_algorithms/inplace_merge.cc. It is also supposed to check
performance when we fail to get a buffer and so use
__merge_without_buffer. Use 'make check-performance' to run it, see
library manual guide. Do not hesitate to change it if you think that
your benches are better.
> __inplace_stable_sort does a merge sort where in each recursive step the 2
> sublists are merged in-place. The merge step is itself also recursive. In order
> to merge [A, B) and [B, C), a tail section S1 of [A, B) and a head section S2
> of [B, C) are found such that all elements in S1 are greater than all elements
> in S2 and then those sections are swapped/rotated and the shorter sections are
> merged recursively. This is done by the function __merge_without_buffer.
>
> In the current implementation, __merge_without_buffer takes the half point of
> the longer section and does a binary search (via lower_bound or upper_bound) to
> find the corresponding cut point on the other section. Then it rotates the
> sections.
>
> My proposal is to find the cut points such that both rotated sections have the
> same length such that they can always simply be swapped.
What do you mean by swapped ? I see in the code below that you are just
changing how the cut positions are computed and then call the same
std::rotate algo. I cannot see any reason why changing the position
would make the rotate faster.
> I have implemented
> this and it seems to perform better in general and in the special cases that
> I've tried.
>
> Unfortunately, since we are no longer guaranteed to halve the longer section in
> each step, there is a possibility for a stack overflow. To avoid this, the new
> algorithm falls back to the old behavior whenever the sections have
> significantly different lengths.
>
> Also, where the current algorithm performs the binary search implicitly via
> lower_bound/upper_bound, the new algorithm has to implement the binary search
> explicitly because it has to look at both sections to determine the next step
> in the search.
>
> I have created wrappers for my value objects as well as my iterators to count
> the number of operations performed by the current and my algorithm and run
> benchmarks on various distributions. My code is at the bottom but below is the
> output as run on my machine. For each operation it outputs the ratio between
> the number of operations performed by my algorithm over the number performed by
> the current algorithm.
>
> Thanks for your attention,
> David
>
>
> Distribution: random_unique(100000)
> Example: [19, 13, 5, 0, 10, 6, 15, 11, 16, 9, 14, 18, 4, 17, 1, 7, 8, 12, 3, 2]
>
> move: 0.588026 = 3331848 / 5666154
> moveAssign: 0.594789 = 6852819 / 11521431
> compare: 0.914997 = 2028977 / 2217469
> iteratorDeref: 0.718466 = 11001863 / 15312994
> iteratorCopy: 0.791332 = 33977189 / 42936712
> iteratorEquals: 2.27841 = 5174606 / 2271143
> iteratorInc: 1.2143 = 7697971 / 6339449
> iteratorDec: 0.0833529 = 473556 / 5681340
> iteratorDiff: 0.444865 = 1362598 / 3062950
> iteratorSub: 0.0200683 = 6272 / 312533
> iteratorAdd: 0.0695467 = 57149 / 821736
> iteratorAddAssign: 1.36491 = 3141949 / 2301950
> iteratorArithmetic: 0.861623 = 17914101 / 20791101
> micros: 0.663704 = 650825 / 980596
>
> Distribution: random_dupes(100000)
> Example: [4, 4, 1, 3, 0, 3, 2, 2, 4, 4, 0, 3, 1, 0, 3, 2, 0, 2, 1, 1]
>
> move: 0.570041 = 2949623 / 5174401
> moveAssign: 0.577736 = 6087819 / 10537375
> compare: 0.845419 = 1530877 / 1810790
> iteratorDeref: 0.678233 = 9232869 / 13613114
> iteratorCopy: 0.742699 = 25652244 / 34539225
> iteratorEquals: 2.41643 = 4123881 / 1706601
> iteratorInc: 1.15617 = 6554374 / 5669028
> iteratorDec: 0.101004 = 539991 / 5346219
> iteratorDiff: 0.417093 = 959972 / 2301580
> iteratorSub: 0.0350046 = 8882 / 253738
> iteratorAdd: 0.0972571 = 64122 / 659304
> iteratorAddAssign: 1.2279 = 2247197 / 1830107
> iteratorArithmetic: 0.81605 = 14498419 / 17766577
> micros: 0.699386 = 554426 / 792732
>
> Distribution: increasing_unique(100000)
> Example: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
>
> move: 1 = 91808 / 91808
> moveAssign: 1 = 91808 / 91808
> compare: 0.573591 = 224558 / 391495
> iteratorDeref: 0.722385 = 540924 / 748803
> iteratorCopy: 0.298175 = 976735 / 3275708
> iteratorEquals: 0.522302 = 116383 / 222827
> iteratorInc: 0.715444 = 132750 / 185549
> iteratorDec: 1 = 91808 / 91808
> iteratorDiff: 0.217455 = 40956 / 188342
> iteratorAdd: 1 = 16383 / 16383
> iteratorAddAssign: 0.299523 = 81884 / 273381
> iteratorArithmetic: 0.49082 = 480164 / 978290
> micros: 0.479792 = 22935 / 47802
>
> Distribution: increasing_dupes(100000)
> Example: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
>
> move: 1 = 91808 / 91808
> moveAssign: 1 = 91808 / 91808
> compare: 0.573591 = 224558 / 391495
> iteratorDeref: 0.722385 = 540924 / 748803
> iteratorCopy: 0.298175 = 976735 / 3275708
> iteratorEquals: 0.522302 = 116383 / 222827
> iteratorInc: 0.715444 = 132750 / 185549
> iteratorDec: 1 = 91808 / 91808
> iteratorDiff: 0.217455 = 40956 / 188342
> iteratorAdd: 1 = 16383 / 16383
> iteratorAddAssign: 0.299523 = 81884 / 273381
> iteratorArithmetic: 0.49082 = 480164 / 978290
> micros: 0.449154 = 20675 / 46031
>
> Distribution: mostly_increasing(100000)
> Example: [3, 2, 3, 4, 5, 5, 6, 7, 9, 9, 11, 13, 12, 14, 15, 17, 17, 18, 21, 20]
>
> move: 0.681093 = 814725 / 1196203
> moveAssign: 0.703421 = 1809563 / 2572519
> compare: 0.924016 = 965320 / 1044700
> iteratorDeref: 0.847765 = 3802634 / 4485480
> iteratorCopy: 0.804068 = 11859111 / 14748894
> iteratorEquals: 1.59826 = 1388752 / 868913
> iteratorInc: 1.25659 = 1818421 / 1447111
> iteratorDec: 0.383738 = 537108 / 1399674
> iteratorDiff: 0.486488 = 562273 / 1155779
> iteratorSub: 0.0638659 = 6666 / 104375
> iteratorAdd: 0.184194 = 55586 / 301780
> iteratorAddAssign: 1.30369 = 1105439 / 847932
> iteratorArithmetic: 0.893672 = 5474245 / 6125564
> micros: 0.790603 = 233953 / 295917
>
> Distribution: decreasing_unique(100000)
> Example: [19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>
> move: 0.40597 = 995312 / 2451688
> moveAssign: 0.45785 = 2459840 / 5372592
> compare: 0.489247 = 151631 / 309927
> iteratorDeref: 0.524798 = 3415934 / 6509043
> iteratorCopy: 0.563628 = 5971275 / 10594360
> iteratorEquals: 1.18395 = 843533 / 712475
> iteratorInc: 0.864454 = 1460463 / 1689462
> iteratorDec: 0.373125 = 1606976 / 4306800
> iteratorDiff: 0.495226 = 235930 / 476409
> iteratorSub: 0.248254 = 13824 / 55685
> iteratorAdd: 0.590877 = 135839 / 229894
> iteratorAddAssign: 0.421852 = 119646 / 283621
> iteratorArithmetic: 0.569514 = 4416211 / 7754346
> micros: 0.543145 = 167865 / 309061
>
> Distribution: decreasing_dupes(100000)
> Example: [4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0]
>
> move: 0.741695 = 1455276 / 1962096
> moveAssign: 0.736048 = 2826598 / 3840238
> compare: 0.641679 = 282851 / 440798
> iteratorDeref: 0.740565 = 3386062 / 4572271
> iteratorCopy: 0.628798 = 5008472 / 7965150
> iteratorEquals: 1.50206 = 794354 / 528843
> iteratorInc: 1.39602 = 2164546 / 1550510
> iteratorDec: 0.336639 = 837728 / 2488504
> iteratorDiff: 0.313457 = 88552 / 282501
> iteratorSub: 0.165388 = 3880 / 23460
> iteratorAdd: 0.377644 = 27139 / 71864
> iteratorAddAssign: 0.551364 = 182383 / 330785
> iteratorArithmetic: 0.776766 = 4098582 / 5276467
> micros: 0.71282 = 160398 / 225019
>
> Distribution: mostly_decreasing(100000)
> Example: [20, 19, 20, 18, 18, 15, 13, 13, 14, 13, 10, 9, 10, 8, 5, 7,
> 6, 4, 3, 2]
>
> move: 0.760602 = 2197435 / 2889075
> moveAssign: 0.768473 = 4591313 / 5974593
> compare: 0.932913 = 977678 / 1047984
> iteratorDeref: 0.838641 = 6633169 / 7909424
> iteratorCopy: 0.829028 = 15447960 / 18633830
> iteratorEquals: 1.7009 = 2000686 / 1176251
> iteratorInc: 1.52771 = 3788203 / 2479657
> iteratorDec: 0.363843 = 1369990 / 3765332
> iteratorDiff: 0.489186 = 595472 / 1217270
> iteratorSub: 0.0877827 = 10925 / 124455
> iteratorAdd: 0.201991 = 69640 / 344767
> iteratorAddAssign: 1.32756 = 1133512 / 853830
> iteratorArithmetic: 0.900303 = 8968428 / 9961562
> micros: 0.846298 = 377504 / 446065
>
> Distribution: sawtooth_unique(100000)
> Example: [0, 3, 6, 9, 12, 15, 18, 1, 4, 7, 10, 13, 16, 19, 2, 5, 8, 11, 14, 17]
>
> move: 0.569953 = 2912465 / 5110007
> moveAssign: 0.577701 = 6012427 / 10407511
> compare: 0.824488 = 1604387 / 1945920
> iteratorDeref: 0.678979 = 9299794 / 13696724
> iteratorCopy: 0.72546 = 26157352 / 36056250
> iteratorEquals: 2.00683 = 4131766 / 2058856
> iteratorInc: 1.10728 = 6498257 / 5868688
> iteratorDec: 0.111225 = 552519 / 4967557
> iteratorDiff: 0.405328 = 988144 / 2437889
> iteratorSub: 0.026249 = 6327 / 241038
> iteratorAdd: 0.089034 = 55918 / 628052
> iteratorAddAssign: 1.18737 = 2297233 / 1934717
> iteratorArithmetic: 0.801143 = 14530164 / 18136797
> micros: 0.706282 = 583937 / 826776
>
> Distribution: sawtooth_dupes(100000)
> Example: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
>
> move: 0.547721 = 2348873 / 4288447
> moveAssign: 0.543252 = 4613817 / 8492965
> compare: 0.664642 = 841503 / 1266099
> iteratorDeref: 0.599656 = 6292282 / 10493159
> iteratorCopy: 0.60663 = 15336816 / 25281976
> iteratorEquals: 2.24803 = 2859760 / 1272118
> iteratorInc: 1.07142 = 4999546 / 4666276
> iteratorDec: 0.0490551 = 205953 / 4198405
> iteratorDiff: 0.314518 = 473157 / 1504390
> iteratorSub: 0.0371951 = 5998 / 161258
> iteratorAdd: 0.0875684 = 34672 / 395942
> iteratorAddAssign: 0.900009 = 1233077 / 1370072
> iteratorArithmetic: 0.72316 = 9812163 / 13568461
> micros: 0.610891 = 353102 / 578011
>
> Distribution: increasing_with_swap(100000)
> Example: [19, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 0]
>
> move: 1 = 291782 / 291782
> moveAssign: 1 = 491779 / 491779
> compare: 0.57449 = 225064 / 391763
> iteratorDeref: 0.819553 = 941922 / 1149312
> iteratorCopy: 0.377964 = 1392386 / 3683915
> iteratorEquals: 0.525758 = 117408 / 223312
> iteratorInc: 0.863708 = 333050 / 385605
> iteratorDec: 0.999753 = 291716 / 291788
> iteratorDiff: 0.224408 = 42528 / 189512
> iteratorSub: 0.888889 = 104 / 117
> iteratorAdd: 0.997151 = 16801 / 16849
> iteratorAddAssign: 0.302604 = 82864 / 273836
> iteratorArithmetic: 0.640448 = 884471 / 1381019
> micros: 0.616124 = 36233 / 58808
>
> Distribution: increasing_fast_slow(100000)
> Example: [0, 4, 8, 12, 0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18]
>
> move: 0.640348 = 825772 / 1289568
> moveAssign: 0.627081 = 1559792 / 2487384
> compare: 0.714235 = 431286 / 603843
> iteratorDeref: 0.694913 = 2421875 / 3485151
> iteratorCopy: 0.637176 = 5645877 / 8860786
> iteratorEquals: 1.7552 = 953880 / 543458
> iteratorInc: 1.08941 = 1676609 / 1539005
> iteratorDec: 0.135904 = 154533 / 1137073
> iteratorDiff: 0.388132 = 192787 / 496705
> iteratorSub: 0.00577964 = 141 / 24396
> iteratorAdd: 0.207287 = 17272 / 83324
> iteratorAddAssign: 0.849015 = 431320 / 508024
> iteratorArithmetic: 0.790987 = 3426542 / 4331985
> micros: 0.777699 = 153552 / 197444
>
>
>
> #include <iostream>
> #include <stdio.h>
> #include <vector>
> #include <cstdlib>
> #include <algorithm>
> #include <functional>
> #include <ranges>
> #include <chrono>
> #include <cmath>
> #include <numeric>
>
> using namespace std;
>
> #define LINE0 0
> #define cout_line (cout << "line " << __LINE__ - LINE0 << ": ")
> #define vout(v) #v << " = " << (v)
> #define print1(v) cout_line << vout(v) << endl
> #define print2(v1, v2) cout_line << vout(v1) << ", " << vout(v2) << endl
> #define print3(v1, v2, v3) cout_line << vout(v1) << ", " << vout(v2)
> << ", " << vout(v3) << endl
> #define GET_MACRO(_1,_2,_3,NAME,...) NAME
> #define print(...) GET_MACRO(__VA_ARGS__, print3, print2, print1)(__VA_ARGS__)
>
> template <class Ch, class Tr, class Container>
> basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
> Container const& x);
>
> template <class X, class Y>
> ostream & operator << (ostream & os, pair <X, Y> const& p) {
> return os << p.first << ":" << p.second;
> }
>
> template <class Ch, class Tr, class Container>
> basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
> Container const& x) {
> os << "[";
> bool first = true;
> for(auto const& y : x) {
> if (!first) os << ", ";
> os << y;
> first = false;
> }
> return os << "]";
> }
>
>
> struct Counter {
> int construct = 0;
> int copy = 0;
> int assign = 0;
> int move = 0;
> int moveAssign = 0;
> int compare = 0;
> int iteratorDeref = 0;
> int iteratorCopy = 0;
> int iteratorEquals = 0;
> int iteratorInc = 0;
> int iteratorDec = 0;
> int iteratorDiff = 0;
> int iteratorSub = 0;
> int iteratorAdd = 0;
> int iteratorAddAssign = 0;
> int iteratorArithmetic = 0;
> int micros = 0;
> };
>
> int next_element_id = 0;
>
> class Element {
> Counter* counter;
> public:
> int val;
> int id; // Distinguis identical values to verify stability.
> Element(int val, Counter* counter)
> : val(val), counter(counter), id(++next_element_id) {
> ++counter->construct;
> }
>
> Element(Element const& other)
> : val(other.val), counter(other.counter), id(other.id) {
> ++counter->copy;
> }
>
> Element& operator=(Element const& other) {
> val = other.val;
> counter = other.counter;
> id = other.id;
> ++counter->assign;
> return *this;
> }
>
> Element(Element&& other) noexcept
> : val(move(other.val)), counter(other.counter), id(other.id) {
> ++counter->move;
> }
>
> Element& operator=(Element&& other) noexcept {
> val = move(other.val);
> counter = other.counter;
> id = other.id;
> ++counter->moveAssign;
> return *this;
> }
>
> bool operator<(Element const& other) const {
> ++counter->compare;
> return val < other.val;
> }
>
> bool operator==(Element const& other) const {
> return val == other.val && id == other.id;
> }
> };
>
> ostream & operator << (ostream & os, Element const& e) {
> return os << e.val << "(" << e.id << ")";
> }
>
>
> class RandomIterator {
> typedef vector<Element>::iterator It;
> It it;
> Counter* counter;
>
> public:
> using iterator_category = std::random_access_iterator_tag;
> using difference_type = std::ptrdiff_t;
> using value_type = Element;
> using pointer = value_type*;
> using reference = value_type&;
>
> RandomIterator(It it, Counter* counter) : it(it), counter(counter) {
> }
>
> RandomIterator(RandomIterator const& other) : it(other.it),
> counter(other.counter) {
> counter->iteratorCopy += 1;
> }
>
> bool operator== (RandomIterator other) {
> counter->iteratorEquals += 1;
> counter->iteratorArithmetic += 1;
> return it == other.it;
> }
>
> RandomIterator operator++ () {
> counter->iteratorInc += 1;
> counter->iteratorArithmetic += 1;
> return RandomIterator(++it, counter);
> }
>
> RandomIterator operator-- () {
> counter->iteratorDec += 1;
> counter->iteratorArithmetic += 1;
> return RandomIterator(--it, counter);
> }
>
> Element& operator* () {
> counter->iteratorDeref += 1;
> return *it;
> }
>
> difference_type operator- (RandomIterator other) {
> counter->iteratorDiff += 1;
> counter->iteratorArithmetic += 1;
> return it - other.it;
> }
>
> RandomIterator operator- (difference_type diff) {
> counter->iteratorSub += 1;
> counter->iteratorArithmetic += 1;
> return RandomIterator(it - diff, counter);
> }
>
> RandomIterator operator+ (difference_type diff) {
> counter->iteratorAdd += 1;
> counter->iteratorArithmetic += 1;
> return RandomIterator(it + diff, counter);
> }
>
> RandomIterator operator+= (difference_type diff) {
> counter->iteratorAddAssign += 1;
> counter->iteratorArithmetic += 1;
> return RandomIterator(it += diff, counter);
> }
> };
>
>
> class TestCase {
> Counter counter;
> vector<Element> elements;
>
> public:
> TestCase(vector<int> const& data) {
> elements.reserve(data.size());
> for (int i: data) {
> elements.push_back(Element(i, &counter));
> }
> }
>
> Counter sort(
> void algo(RandomIterator, RandomIterator),
> vector<Element>::iterator begin,
> vector<Element>::iterator end) {
> counter = Counter();
> auto t0 = std::chrono::system_clock::now();
>
> algo(RandomIterator(begin, &counter), RandomIterator(end, &counter));
>
> auto t1 = std::chrono::system_clock::now();
> counter.micros =
> std::chrono::duration_cast<std::chrono::microseconds>(t1 -
> t0).count();
> return counter;
> }
>
> pair<Counter,Counter> benchmark(
> void new_algo(RandomIterator, RandomIterator),
> void cur_algo(RandomIterator, RandomIterator)) {
> vector<Element> toSort1 = elements;
> Counter counter1 = sort(new_algo, toSort1.begin(), toSort1.end());
>
> vector<Element> toSort2 = elements;
> Counter counter2 = sort(cur_algo, toSort2.begin(), toSort2.end());
>
> if (toSort1 != toSort2) {
> cout << "INCORRECT RESULT!" << endl;
> exit(1);
> }
>
> return make_pair(counter1, counter2);
> }
> };
>
>
> void relative_metric(string name, int m1, int m2) {
> if (m1 == 0 && m2 == 0) return;
> if (m2 == 0) {
> cout << name << ": inf = " << m1 << " / " << m2 << endl;
> } else {
> cout << name << ": " << m1 / (double) m2 << " = " << m1 << " / "
> << m2 << endl;
> }
> }
>
>
> void relative_report(Counter const& c1, Counter const& c2) {
> cout << endl;
> relative_metric("construct", c1.construct, c2.construct);
> relative_metric("copy", c1.copy, c2.copy);
> relative_metric("assign", c1.assign, c2.assign);
> relative_metric("move", c1.move, c2.move);
> relative_metric("moveAssign", c1.moveAssign, c2.moveAssign);
> relative_metric("compare", c1.compare, c2.compare);
> relative_metric("iteratorDeref", c1.iteratorDeref, c2.iteratorDeref);
> relative_metric("iteratorCopy", c1.iteratorCopy, c2.iteratorCopy);
> relative_metric("iteratorEquals", c1.iteratorEquals, c2.iteratorEquals);
> relative_metric("iteratorInc", c1.iteratorInc, c2.iteratorInc);
> relative_metric("iteratorDec", c1.iteratorDec, c2.iteratorDec);
> relative_metric("iteratorDiff", c1.iteratorDiff, c2.iteratorDiff);
> relative_metric("iteratorSub", c1.iteratorSub, c2.iteratorSub);
> relative_metric("iteratorAdd", c1.iteratorAdd, c2.iteratorAdd);
> relative_metric("iteratorAddAssign", c1.iteratorAddAssign,
> c2.iteratorAddAssign);
> relative_metric("iteratorArithmetic", c1.iteratorArithmetic,
> c2.iteratorArithmetic);
> relative_metric("micros", c1.micros, c2.micros);
> cout << endl;
> }
>
>
> /// This is a helper function for the merge routines.
> template<typename _BidirectionalIterator, typename _Distance, typename _Compare>
> void _my_merge_without_buffer(
> _BidirectionalIterator __first,
> _BidirectionalIterator __middle,
> _BidirectionalIterator __last,
> _Distance __len1, _Distance __len2,
> _Compare __comp) {
>
> if (__len1 == 0 || __len2 == 0)
> return;
>
> if (__len1 + __len2 == 2) {
> if (__comp(__middle, __first))
> std::iter_swap(__first, __middle);
> return;
> }
>
> _BidirectionalIterator __first_cut = __first;
> _BidirectionalIterator __second_cut = __middle;
> _Distance __len11 = 0;
> _Distance __len22 = 0;
>
>
> #define NEW_ALGO 1
>
> #ifdef NEW_ALGO
> // Prevent stack overflow
> if (__len1 > 4 * __len2 || __len2 > 4 * __len1) {
> #endif
> if (__len1 > __len2)
> {
> __len11 = __len1 / 2;
> std::advance(__first_cut, __len11);
> __second_cut
> = std::__lower_bound(__middle, __last, *__first_cut,
> __gnu_cxx::__ops::__iter_comp_val(__comp));
> __len22 = std::distance(__middle, __second_cut);
> }
> else
> {
> __len22 = __len2 / 2;
> std::advance(__second_cut, __len22);
> __first_cut
> = std::__upper_bound(__first, __middle, *__second_cut,
> __gnu_cxx::__ops::__val_comp_iter(__comp));
> __len11 = std::distance(__first, __first_cut);
> }
> #ifdef NEW_ALGO
> } else
> {
> _Distance __cut_range;
> // __cut_len is always equal to __middle - __first_cut (and eventually to
> // __second_cut - __middle) but we keep track of it to avoid calculating it
> // later which can be expensive for bidirectional iterators.
> _Distance __cut_len;
>
> if (__len1 > __len2)
> {
> std::advance(__first_cut, __len1 - __len2);
> __cut_range = __len2;
> }
> else
> {
> __cut_range = __len1;
> }
> __cut_len = __cut_range;
>
> while (__cut_range > 0) {
> _Distance __half_cut_range = __cut_range / 2;
> _BidirectionalIterator __mid_first_cut = __first_cut;
> std::advance(__mid_first_cut, __cut_range - __half_cut_range - 1);
> _BidirectionalIterator __mid_second_cut = __second_cut;
> std::advance(__mid_second_cut, __half_cut_range);
>
> if (__comp(__mid_second_cut, __mid_first_cut)) {
> __cut_range -= __half_cut_range + 1;
> __second_cut = __mid_second_cut;
> ++__second_cut;
> } else {
> __cut_len -= __cut_range - __half_cut_range;
> __cut_range = __half_cut_range;
> __first_cut = __mid_first_cut;
> ++__first_cut;
> }
> }
> __len22 = __cut_len;
> __len11 = __len1 - __cut_len;
> }
> #endif
>
> _BidirectionalIterator __new_middle
> = std::rotate(__first_cut, __middle, __second_cut);
> _my_merge_without_buffer(__first, __first_cut, __new_middle,
> __len11, __len22, __comp);
> _my_merge_without_buffer(__new_middle, __second_cut, __last,
> __len1 - __len11, __len2 - __len22, __comp);
> }
>
> /// This is a helper function for the stable sorting routines.
> template<typename _RandomAccessIterator, typename _Compare>
> void
> _my_inplace_stable_sort(_RandomAccessIterator __first,
> _RandomAccessIterator __last, _Compare __comp)
> {
>
> if (__last - __first < 15)
> {
> std::__insertion_sort(__first, __last, __comp);
> return;
> }
> _RandomAccessIterator __middle = __first + (__last - __first) / 2;
> _my_inplace_stable_sort(__first, __middle, __comp);
> _my_inplace_stable_sort(__middle, __last, __comp);
> _my_merge_without_buffer(__first, __middle, __last,
> __middle - __first,
> __last - __middle,
> __comp);
> }
>
>
> template<typename Iterator>
> void my_inplace_stable_sort(Iterator start, Iterator end) {
> _my_inplace_stable_sort(start, end, __gnu_cxx::__ops::__iter_less_iter());
> }
>
> template<typename Iterator>
> void std_inplace_stable_sort(Iterator start, Iterator end) {
> std::__inplace_stable_sort(start, end, __gnu_cxx::__ops::__iter_less_iter());
> }
>
> void benchmark(string const& dist_name,
> vector<int> dist_func(int),
> int n) {
> vector<int> v = dist_func(n);
> cout << "Distribution: " << dist_name << "(" << n << ")" << endl;
> cout << "Example: " << dist_func(20) << endl;
> TestCase testCase(v);
>
> pair<Counter, Counter> counter_pair =
> testCase.benchmark(
> my_inplace_stable_sort
> , std_inplace_stable_sort
> );
> Counter cKloet = counter_pair.first;
> Counter cInplace = counter_pair.second;
>
> relative_report(cKloet, cInplace);
> }
>
>
> vector<int> random_unique(int n) {
> vector<int> v(n);
> for (int i = 0; i < n; ++i) {
> v[i] = i;
> }
> for (int i = n - 1; i >= 0; --i) {
> swap(v[i], v[rand() % (i + 1)]);
> }
> return v;
> }
>
> vector<int> random_dupes(int n) {
> vector<int> v(n);
> int r = (int)sqrt(n);
> for (int i = 0; i < n; ++i) {
> v[i] = i / r;
> }
> for (int i = n - 1; i >= 0; --i) {
> swap(v[i], v[rand() % (i + 1)]);
> }
> return v;
> }
>
> vector<int> increasing_unique(int n) {
> vector<int> v(n);
> for (int i = 0; i < n; ++i) {
> v[i] = i;
> }
> return v;
> }
>
>
> vector<int> increasing_dupes(int n) {
> vector<int> v(n);
> int r = (int)sqrt(n);
> for (int i = 0; i < n; ++i) {
> v[i] = i / r;
> }
> return v;
> }
>
>
> vector<int> mostly_increasing(int n) {
> vector<int> v(n);
> int r = (int)sqrt(n);
> for (int i = 0; i < n; ++i) {
> v[i] = i + rand() % r;
> }
> return v;
> }
>
>
> vector<int> decreasing_unique(int n) {
> vector<int> v(n);
> for (int i = 0; i < n; ++i) {
> v[i] = n - 1 - i;
> }
> return v;
> }
>
> vector<int> decreasing_dupes(int n) {
> vector<int> v(n);
> int r = (int)sqrt(n);
> for (int i = 0; i < n; ++i) {
> v[i] = (n - 1 - i) / r;
> }
> return v;
> }
>
> vector<int> mostly_decreasing(int n) {
> vector<int> v(n);
> int r = (int)sqrt(n);
> for (int i = 0; i < n; ++i) {
> v[i] = n - 1 - i + rand() % r;
> }
> return v;
> }
>
> vector<int> sawtooth_unique(int n) {
> int r = (int)sqrt(n);
> while (gcd(r, n) != 1) ++r;
> vector<int> v(n);
> for (int i = 0; i < n; ++i) {
> v[(i * r) % n] = i;
> }
> return v;
> }
>
> vector<int> sawtooth_dupes(int n) {
> int r = (int)sqrt(n);
> vector<int> v(n);
> for (int i = 0; i < n; ++i) {
> v[i] = i % r;
> }
> return v;
> }
>
> vector<int> increasing_with_swap(int n) {
> vector<int> v(n);
> for (int i = 0; i < n; ++i) {
> v[i] = i;
> }
> swap(v[0], v[n - 1]);
> return v;
> }
>
> vector<int> increasing_fast_slow(int n) {
> int r = (int)sqrt(n);
> vector<int> v(n);
> for (int i = 0; i < r; ++i) {
> v[i] = i * r;
> }
> for (int i = r; i < n; ++i) {
> v[i] = (i - r) * n / (n - r);
> }
> return v;
> }
>
>
> #define BENCHMARK(dist, n) benchmark(#dist, dist, n)
>
> int main() {
> int n = 100000;
> BENCHMARK(random_unique, n);
> BENCHMARK(random_dupes, n);
> BENCHMARK(increasing_unique, n);
> BENCHMARK(increasing_dupes, n);
> BENCHMARK(mostly_increasing, n);
> BENCHMARK(decreasing_unique, n);
> BENCHMARK(decreasing_dupes, n);
> BENCHMARK(mostly_decreasing, n);
> BENCHMARK(sawtooth_unique, n);
> BENCHMARK(sawtooth_dupes, n);
> BENCHMARK(increasing_with_swap, n);
> BENCHMARK(increasing_fast_slow, n);
>
> return 0;
> }
next prev parent reply other threads:[~2021-11-24 18:33 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-18 12:58 David de Kloet
2021-11-24 18:33 ` François Dumont [this message]
2021-11-24 18:48 ` David de Kloet
2021-11-25 5:35 ` François Dumont
2021-11-25 19:51 ` David de Kloet
2021-12-10 15:31 ` David de Kloet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9f7342af-398c-fa0f-4182-761644d6e5f3@gmail.com \
--to=frs.dumont@gmail.com \
--cc=dskloet@gmail.com \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).