public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* __inplace_stable_sort improvement
@ 2021-11-18 12:58 David de Kloet
  2021-11-24 18:33 ` François Dumont
  0 siblings, 1 reply; 6+ messages in thread
From: David de Kloet @ 2021-11-18 12:58 UTC (permalink / raw)
  To: libstdc++

Hello libstdc++,

Apologies if this is the wrong list for my message. I believe I have an
improvement for __inplace_stable_sort and I'm wondering if this is worth
pursuing and if so how to proceed.

__inplace_stable_sort does a merge sort where in each recursive step the 2
sublists are merged in-place. The merge step is itself also recursive. In order
to merge [A, B) and [B, C), a tail section S1 of [A, B) and a head section S2
of [B, C) are found such that all elements in S1 are greater than all elements
in S2 and then those sections are swapped/rotated and the shorter sections are
merged recursively. This is done by the function __merge_without_buffer.

In the current implementation, __merge_without_buffer takes the half point of
the longer section and does a binary search (via lower_bound or upper_bound) to
find the corresponding cut point on the other section. Then it rotates the
sections.

My proposal is to find the cut points such that both rotated sections have the
same length such that they can always simply be swapped. I have implemented
this and it seems to perform better in general and in the special cases that
I've tried.

Unfortunately, since we are no longer guaranteed to halve the longer section in
each step, there is a possibility for a stack overflow. To avoid this, the new
algorithm falls back to the old behavior whenever the sections have
significantly different lengths.

Also, where the current algorithm performs the binary search implicitly via
lower_bound/upper_bound, the new algorithm has to implement the binary search
explicitly because it has to look at both sections to determine the next step
in the search.

I have created wrappers for my value objects as well as my iterators to count
the number of operations performed by the current and my algorithm and run
benchmarks on various distributions. My code is at the bottom but below is the
output as run on my machine. For each operation it outputs the ratio between
the number of operations performed by my algorithm over the number performed by
the current algorithm.

Thanks for your attention,
David


Distribution: random_unique(100000)
Example: [19, 13, 5, 0, 10, 6, 15, 11, 16, 9, 14, 18, 4, 17, 1, 7, 8, 12, 3, 2]

move: 0.588026 = 3331848 / 5666154
moveAssign: 0.594789 = 6852819 / 11521431
compare: 0.914997 = 2028977 / 2217469
iteratorDeref: 0.718466 = 11001863 / 15312994
iteratorCopy: 0.791332 = 33977189 / 42936712
iteratorEquals: 2.27841 = 5174606 / 2271143
iteratorInc: 1.2143 = 7697971 / 6339449
iteratorDec: 0.0833529 = 473556 / 5681340
iteratorDiff: 0.444865 = 1362598 / 3062950
iteratorSub: 0.0200683 = 6272 / 312533
iteratorAdd: 0.0695467 = 57149 / 821736
iteratorAddAssign: 1.36491 = 3141949 / 2301950
iteratorArithmetic: 0.861623 = 17914101 / 20791101
micros: 0.663704 = 650825 / 980596

Distribution: random_dupes(100000)
Example: [4, 4, 1, 3, 0, 3, 2, 2, 4, 4, 0, 3, 1, 0, 3, 2, 0, 2, 1, 1]

move: 0.570041 = 2949623 / 5174401
moveAssign: 0.577736 = 6087819 / 10537375
compare: 0.845419 = 1530877 / 1810790
iteratorDeref: 0.678233 = 9232869 / 13613114
iteratorCopy: 0.742699 = 25652244 / 34539225
iteratorEquals: 2.41643 = 4123881 / 1706601
iteratorInc: 1.15617 = 6554374 / 5669028
iteratorDec: 0.101004 = 539991 / 5346219
iteratorDiff: 0.417093 = 959972 / 2301580
iteratorSub: 0.0350046 = 8882 / 253738
iteratorAdd: 0.0972571 = 64122 / 659304
iteratorAddAssign: 1.2279 = 2247197 / 1830107
iteratorArithmetic: 0.81605 = 14498419 / 17766577
micros: 0.699386 = 554426 / 792732

Distribution: increasing_unique(100000)
Example: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

move: 1 = 91808 / 91808
moveAssign: 1 = 91808 / 91808
compare: 0.573591 = 224558 / 391495
iteratorDeref: 0.722385 = 540924 / 748803
iteratorCopy: 0.298175 = 976735 / 3275708
iteratorEquals: 0.522302 = 116383 / 222827
iteratorInc: 0.715444 = 132750 / 185549
iteratorDec: 1 = 91808 / 91808
iteratorDiff: 0.217455 = 40956 / 188342
iteratorAdd: 1 = 16383 / 16383
iteratorAddAssign: 0.299523 = 81884 / 273381
iteratorArithmetic: 0.49082 = 480164 / 978290
micros: 0.479792 = 22935 / 47802

Distribution: increasing_dupes(100000)
Example: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]

move: 1 = 91808 / 91808
moveAssign: 1 = 91808 / 91808
compare: 0.573591 = 224558 / 391495
iteratorDeref: 0.722385 = 540924 / 748803
iteratorCopy: 0.298175 = 976735 / 3275708
iteratorEquals: 0.522302 = 116383 / 222827
iteratorInc: 0.715444 = 132750 / 185549
iteratorDec: 1 = 91808 / 91808
iteratorDiff: 0.217455 = 40956 / 188342
iteratorAdd: 1 = 16383 / 16383
iteratorAddAssign: 0.299523 = 81884 / 273381
iteratorArithmetic: 0.49082 = 480164 / 978290
micros: 0.449154 = 20675 / 46031

Distribution: mostly_increasing(100000)
Example: [3, 2, 3, 4, 5, 5, 6, 7, 9, 9, 11, 13, 12, 14, 15, 17, 17, 18, 21, 20]

move: 0.681093 = 814725 / 1196203
moveAssign: 0.703421 = 1809563 / 2572519
compare: 0.924016 = 965320 / 1044700
iteratorDeref: 0.847765 = 3802634 / 4485480
iteratorCopy: 0.804068 = 11859111 / 14748894
iteratorEquals: 1.59826 = 1388752 / 868913
iteratorInc: 1.25659 = 1818421 / 1447111
iteratorDec: 0.383738 = 537108 / 1399674
iteratorDiff: 0.486488 = 562273 / 1155779
iteratorSub: 0.0638659 = 6666 / 104375
iteratorAdd: 0.184194 = 55586 / 301780
iteratorAddAssign: 1.30369 = 1105439 / 847932
iteratorArithmetic: 0.893672 = 5474245 / 6125564
micros: 0.790603 = 233953 / 295917

Distribution: decreasing_unique(100000)
Example: [19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

move: 0.40597 = 995312 / 2451688
moveAssign: 0.45785 = 2459840 / 5372592
compare: 0.489247 = 151631 / 309927
iteratorDeref: 0.524798 = 3415934 / 6509043
iteratorCopy: 0.563628 = 5971275 / 10594360
iteratorEquals: 1.18395 = 843533 / 712475
iteratorInc: 0.864454 = 1460463 / 1689462
iteratorDec: 0.373125 = 1606976 / 4306800
iteratorDiff: 0.495226 = 235930 / 476409
iteratorSub: 0.248254 = 13824 / 55685
iteratorAdd: 0.590877 = 135839 / 229894
iteratorAddAssign: 0.421852 = 119646 / 283621
iteratorArithmetic: 0.569514 = 4416211 / 7754346
micros: 0.543145 = 167865 / 309061

Distribution: decreasing_dupes(100000)
Example: [4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0]

move: 0.741695 = 1455276 / 1962096
moveAssign: 0.736048 = 2826598 / 3840238
compare: 0.641679 = 282851 / 440798
iteratorDeref: 0.740565 = 3386062 / 4572271
iteratorCopy: 0.628798 = 5008472 / 7965150
iteratorEquals: 1.50206 = 794354 / 528843
iteratorInc: 1.39602 = 2164546 / 1550510
iteratorDec: 0.336639 = 837728 / 2488504
iteratorDiff: 0.313457 = 88552 / 282501
iteratorSub: 0.165388 = 3880 / 23460
iteratorAdd: 0.377644 = 27139 / 71864
iteratorAddAssign: 0.551364 = 182383 / 330785
iteratorArithmetic: 0.776766 = 4098582 / 5276467
micros: 0.71282 = 160398 / 225019

Distribution: mostly_decreasing(100000)
Example: [20, 19, 20, 18, 18, 15, 13, 13, 14, 13, 10, 9, 10, 8, 5, 7,
6, 4, 3, 2]

move: 0.760602 = 2197435 / 2889075
moveAssign: 0.768473 = 4591313 / 5974593
compare: 0.932913 = 977678 / 1047984
iteratorDeref: 0.838641 = 6633169 / 7909424
iteratorCopy: 0.829028 = 15447960 / 18633830
iteratorEquals: 1.7009 = 2000686 / 1176251
iteratorInc: 1.52771 = 3788203 / 2479657
iteratorDec: 0.363843 = 1369990 / 3765332
iteratorDiff: 0.489186 = 595472 / 1217270
iteratorSub: 0.0877827 = 10925 / 124455
iteratorAdd: 0.201991 = 69640 / 344767
iteratorAddAssign: 1.32756 = 1133512 / 853830
iteratorArithmetic: 0.900303 = 8968428 / 9961562
micros: 0.846298 = 377504 / 446065

Distribution: sawtooth_unique(100000)
Example: [0, 3, 6, 9, 12, 15, 18, 1, 4, 7, 10, 13, 16, 19, 2, 5, 8, 11, 14, 17]

move: 0.569953 = 2912465 / 5110007
moveAssign: 0.577701 = 6012427 / 10407511
compare: 0.824488 = 1604387 / 1945920
iteratorDeref: 0.678979 = 9299794 / 13696724
iteratorCopy: 0.72546 = 26157352 / 36056250
iteratorEquals: 2.00683 = 4131766 / 2058856
iteratorInc: 1.10728 = 6498257 / 5868688
iteratorDec: 0.111225 = 552519 / 4967557
iteratorDiff: 0.405328 = 988144 / 2437889
iteratorSub: 0.026249 = 6327 / 241038
iteratorAdd: 0.089034 = 55918 / 628052
iteratorAddAssign: 1.18737 = 2297233 / 1934717
iteratorArithmetic: 0.801143 = 14530164 / 18136797
micros: 0.706282 = 583937 / 826776

Distribution: sawtooth_dupes(100000)
Example: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]

move: 0.547721 = 2348873 / 4288447
moveAssign: 0.543252 = 4613817 / 8492965
compare: 0.664642 = 841503 / 1266099
iteratorDeref: 0.599656 = 6292282 / 10493159
iteratorCopy: 0.60663 = 15336816 / 25281976
iteratorEquals: 2.24803 = 2859760 / 1272118
iteratorInc: 1.07142 = 4999546 / 4666276
iteratorDec: 0.0490551 = 205953 / 4198405
iteratorDiff: 0.314518 = 473157 / 1504390
iteratorSub: 0.0371951 = 5998 / 161258
iteratorAdd: 0.0875684 = 34672 / 395942
iteratorAddAssign: 0.900009 = 1233077 / 1370072
iteratorArithmetic: 0.72316 = 9812163 / 13568461
micros: 0.610891 = 353102 / 578011

Distribution: increasing_with_swap(100000)
Example: [19, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 0]

move: 1 = 291782 / 291782
moveAssign: 1 = 491779 / 491779
compare: 0.57449 = 225064 / 391763
iteratorDeref: 0.819553 = 941922 / 1149312
iteratorCopy: 0.377964 = 1392386 / 3683915
iteratorEquals: 0.525758 = 117408 / 223312
iteratorInc: 0.863708 = 333050 / 385605
iteratorDec: 0.999753 = 291716 / 291788
iteratorDiff: 0.224408 = 42528 / 189512
iteratorSub: 0.888889 = 104 / 117
iteratorAdd: 0.997151 = 16801 / 16849
iteratorAddAssign: 0.302604 = 82864 / 273836
iteratorArithmetic: 0.640448 = 884471 / 1381019
micros: 0.616124 = 36233 / 58808

Distribution: increasing_fast_slow(100000)
Example: [0, 4, 8, 12, 0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 16, 17, 18]

move: 0.640348 = 825772 / 1289568
moveAssign: 0.627081 = 1559792 / 2487384
compare: 0.714235 = 431286 / 603843
iteratorDeref: 0.694913 = 2421875 / 3485151
iteratorCopy: 0.637176 = 5645877 / 8860786
iteratorEquals: 1.7552 = 953880 / 543458
iteratorInc: 1.08941 = 1676609 / 1539005
iteratorDec: 0.135904 = 154533 / 1137073
iteratorDiff: 0.388132 = 192787 / 496705
iteratorSub: 0.00577964 = 141 / 24396
iteratorAdd: 0.207287 = 17272 / 83324
iteratorAddAssign: 0.849015 = 431320 / 508024
iteratorArithmetic: 0.790987 = 3426542 / 4331985
micros: 0.777699 = 153552 / 197444



#include <iostream>
#include <stdio.h>
#include <vector>
#include <cstdlib>
#include <algorithm>
#include <functional>
#include <ranges>
#include <chrono>
#include <cmath>
#include <numeric>

using namespace std;

#define LINE0 0
#define cout_line (cout << "line " << __LINE__ - LINE0 << ": ")
#define vout(v) #v << " = " << (v)
#define print1(v) cout_line << vout(v) << endl
#define print2(v1, v2) cout_line << vout(v1) << ", " << vout(v2) << endl
#define print3(v1, v2, v3) cout_line << vout(v1) << ", " << vout(v2)
<< ", " << vout(v3) << endl
#define GET_MACRO(_1,_2,_3,NAME,...) NAME
#define print(...) GET_MACRO(__VA_ARGS__, print3, print2, print1)(__VA_ARGS__)

template <class Ch, class Tr, class Container>
basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
Container const& x);

template <class X, class Y>
ostream & operator << (ostream & os, pair <X, Y> const& p) {
return os << p.first << ":" << p.second;
}

template <class Ch, class Tr, class Container>
basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
Container const& x) {
    os << "[";
    bool first = true;
    for(auto const& y : x) {
        if (!first) os << ", ";
        os << y;
        first = false;
    }
    return os << "]";
}


struct Counter {
  int construct = 0;
  int copy = 0;
  int assign = 0;
  int move = 0;
  int moveAssign = 0;
  int compare = 0;
  int iteratorDeref = 0;
  int iteratorCopy = 0;
  int iteratorEquals = 0;
  int iteratorInc = 0;
  int iteratorDec = 0;
  int iteratorDiff = 0;
  int iteratorSub = 0;
  int iteratorAdd = 0;
  int iteratorAddAssign = 0;
  int iteratorArithmetic = 0;
  int micros = 0;
};

int next_element_id = 0;

class Element {
  Counter* counter;
public:
  int val;
  int id;  // Distinguis identical values to verify stability.
  Element(int val, Counter* counter)
      : val(val), counter(counter), id(++next_element_id) {
    ++counter->construct;
  }

  Element(Element const& other)
      : val(other.val), counter(other.counter), id(other.id) {
    ++counter->copy;
  }

  Element& operator=(Element const& other) {
    val = other.val;
    counter = other.counter;
    id = other.id;
    ++counter->assign;
    return *this;
  }

  Element(Element&& other) noexcept
      : val(move(other.val)), counter(other.counter), id(other.id) {
    ++counter->move;
  }

  Element& operator=(Element&& other) noexcept {
    val = move(other.val);
    counter = other.counter;
    id = other.id;
    ++counter->moveAssign;
    return *this;
  }

  bool operator<(Element const& other) const {
    ++counter->compare;
    return val < other.val;
  }

  bool operator==(Element const& other) const {
    return val == other.val && id == other.id;
  }
};

ostream & operator << (ostream & os, Element const& e) {
return os << e.val << "(" << e.id << ")";
}


class RandomIterator {
  typedef vector<Element>::iterator It;
  It it;
  Counter* counter;

public:
  using iterator_category = std::random_access_iterator_tag;
  using difference_type   = std::ptrdiff_t;
  using value_type        = Element;
  using pointer           = value_type*;
  using reference         = value_type&;

  RandomIterator(It it, Counter* counter) : it(it), counter(counter) {
  }

  RandomIterator(RandomIterator const& other) : it(other.it),
counter(other.counter) {
    counter->iteratorCopy += 1;
  }

  bool operator== (RandomIterator other) {
    counter->iteratorEquals += 1;
    counter->iteratorArithmetic += 1;
    return it == other.it;
  }

  RandomIterator operator++ () {
    counter->iteratorInc += 1;
    counter->iteratorArithmetic += 1;
    return RandomIterator(++it, counter);
  }

  RandomIterator operator-- () {
    counter->iteratorDec += 1;
    counter->iteratorArithmetic += 1;
    return RandomIterator(--it, counter);
  }

  Element& operator* () {
    counter->iteratorDeref += 1;
    return *it;
  }

  difference_type operator- (RandomIterator other) {
    counter->iteratorDiff += 1;
    counter->iteratorArithmetic += 1;
    return it - other.it;
  }

  RandomIterator operator- (difference_type diff) {
    counter->iteratorSub += 1;
    counter->iteratorArithmetic += 1;
    return RandomIterator(it - diff, counter);
  }

  RandomIterator operator+ (difference_type diff) {
    counter->iteratorAdd += 1;
    counter->iteratorArithmetic += 1;
    return RandomIterator(it + diff, counter);
  }

  RandomIterator operator+= (difference_type diff) {
    counter->iteratorAddAssign += 1;
    counter->iteratorArithmetic += 1;
    return RandomIterator(it += diff, counter);
  }
};


class TestCase {
  Counter counter;
  vector<Element> elements;

public:
  TestCase(vector<int> const& data) {
    elements.reserve(data.size());
    for (int i: data) {
      elements.push_back(Element(i, &counter));
    }
  }

  Counter sort(
      void algo(RandomIterator, RandomIterator),
      vector<Element>::iterator begin,
      vector<Element>::iterator end) {
    counter = Counter();
    auto t0 = std::chrono::system_clock::now();

    algo(RandomIterator(begin, &counter), RandomIterator(end, &counter));

    auto t1 = std::chrono::system_clock::now();
    counter.micros =
std::chrono::duration_cast<std::chrono::microseconds>(t1 -
t0).count();
    return counter;
  }

  pair<Counter,Counter> benchmark(
      void new_algo(RandomIterator, RandomIterator),
      void cur_algo(RandomIterator, RandomIterator)) {
    vector<Element> toSort1 = elements;
    Counter counter1 = sort(new_algo, toSort1.begin(), toSort1.end());

    vector<Element> toSort2 = elements;
    Counter counter2 = sort(cur_algo, toSort2.begin(), toSort2.end());

    if (toSort1 != toSort2) {
      cout << "INCORRECT RESULT!" << endl;
      exit(1);
    }

    return make_pair(counter1, counter2);
  }
};


void relative_metric(string name, int m1, int m2) {
  if (m1 == 0 && m2 == 0) return;
  if (m2 == 0) {
    cout << name << ": inf = " << m1 << " / " << m2 << endl;
  } else {
    cout << name << ": " << m1 / (double) m2 << " = " << m1 << " / "
<< m2 << endl;
  }
}


void relative_report(Counter const& c1, Counter const& c2) {
  cout << endl;
  relative_metric("construct", c1.construct, c2.construct);
  relative_metric("copy", c1.copy, c2.copy);
  relative_metric("assign", c1.assign, c2.assign);
  relative_metric("move", c1.move, c2.move);
  relative_metric("moveAssign", c1.moveAssign, c2.moveAssign);
  relative_metric("compare", c1.compare, c2.compare);
  relative_metric("iteratorDeref", c1.iteratorDeref, c2.iteratorDeref);
  relative_metric("iteratorCopy", c1.iteratorCopy, c2.iteratorCopy);
  relative_metric("iteratorEquals", c1.iteratorEquals, c2.iteratorEquals);
  relative_metric("iteratorInc", c1.iteratorInc, c2.iteratorInc);
  relative_metric("iteratorDec", c1.iteratorDec, c2.iteratorDec);
  relative_metric("iteratorDiff", c1.iteratorDiff, c2.iteratorDiff);
  relative_metric("iteratorSub", c1.iteratorSub, c2.iteratorSub);
  relative_metric("iteratorAdd", c1.iteratorAdd, c2.iteratorAdd);
  relative_metric("iteratorAddAssign", c1.iteratorAddAssign,
c2.iteratorAddAssign);
  relative_metric("iteratorArithmetic", c1.iteratorArithmetic,
c2.iteratorArithmetic);
  relative_metric("micros", c1.micros, c2.micros);
  cout << endl;
}


/// This is a helper function for the merge routines.
template<typename _BidirectionalIterator, typename _Distance, typename _Compare>
void _my_merge_without_buffer(
_BidirectionalIterator __first,
_BidirectionalIterator __middle,
_BidirectionalIterator __last,
_Distance __len1, _Distance __len2,
_Compare __comp) {

if (__len1 == 0 || __len2 == 0)
return;

if (__len1 + __len2 == 2) {
  if (__comp(__middle, __first))
    std::iter_swap(__first, __middle);
  return;
}

_BidirectionalIterator __first_cut = __first;
_BidirectionalIterator __second_cut = __middle;
_Distance __len11 = 0;
_Distance __len22 = 0;


#define NEW_ALGO 1

#ifdef NEW_ALGO
  // Prevent stack overflow
if (__len1 > 4 * __len2 || __len2 > 4 * __len1) {
#endif
    if (__len1 > __len2)
      {
        __len11 = __len1 / 2;
        std::advance(__first_cut, __len11);
        __second_cut
          = std::__lower_bound(__middle, __last, *__first_cut,
             __gnu_cxx::__ops::__iter_comp_val(__comp));
        __len22 = std::distance(__middle, __second_cut);
      }
    else
      {
        __len22 = __len2 / 2;
        std::advance(__second_cut, __len22);
        __first_cut
          = std::__upper_bound(__first, __middle, *__second_cut,
             __gnu_cxx::__ops::__val_comp_iter(__comp));
        __len11 = std::distance(__first, __first_cut);
      }
#ifdef NEW_ALGO
  } else
  {
    _Distance __cut_range;
    // __cut_len is always equal to __middle - __first_cut (and eventually to
    // __second_cut - __middle) but we keep track of it to avoid calculating it
    // later which can be expensive for bidirectional iterators.
    _Distance __cut_len;

if (__len1 > __len2)
{
std::advance(__first_cut, __len1 - __len2);
__cut_range = __len2;
}
else
{
__cut_range = __len1;
}
    __cut_len = __cut_range;

while (__cut_range > 0) {
_Distance __half_cut_range = __cut_range / 2;
_BidirectionalIterator __mid_first_cut = __first_cut;
std::advance(__mid_first_cut, __cut_range - __half_cut_range - 1);
_BidirectionalIterator __mid_second_cut = __second_cut;
std::advance(__mid_second_cut, __half_cut_range);

if (__comp(__mid_second_cut, __mid_first_cut)) {
__cut_range -= __half_cut_range + 1;
__second_cut = __mid_second_cut;
++__second_cut;
} else {
        __cut_len -= __cut_range - __half_cut_range;
__cut_range = __half_cut_range;
__first_cut = __mid_first_cut;
++__first_cut;
}
}
__len22 = __cut_len;
    __len11 = __len1 - __cut_len;
}
#endif

  _BidirectionalIterator __new_middle
      = std::rotate(__first_cut, __middle, __second_cut);
  _my_merge_without_buffer(__first, __first_cut, __new_middle,
      __len11, __len22, __comp);
  _my_merge_without_buffer(__new_middle, __second_cut, __last,
      __len1 - __len11, __len2 - __len22, __comp);
}

/// This is a helper function for the stable sorting routines.
template<typename _RandomAccessIterator, typename _Compare>
  void
  _my_inplace_stable_sort(_RandomAccessIterator __first,
      _RandomAccessIterator __last, _Compare __comp)
  {

    if (__last - __first < 15)
      {
        std::__insertion_sort(__first, __last, __comp);
        return;
      }
    _RandomAccessIterator __middle = __first + (__last - __first) / 2;
    _my_inplace_stable_sort(__first, __middle, __comp);
    _my_inplace_stable_sort(__middle, __last, __comp);
    _my_merge_without_buffer(__first, __middle, __last,
        __middle - __first,
        __last - __middle,
        __comp);
  }


template<typename Iterator>
void my_inplace_stable_sort(Iterator start, Iterator end) {
  _my_inplace_stable_sort(start, end, __gnu_cxx::__ops::__iter_less_iter());
}

template<typename Iterator>
void std_inplace_stable_sort(Iterator start, Iterator end) {
  std::__inplace_stable_sort(start, end, __gnu_cxx::__ops::__iter_less_iter());
}

void benchmark(string const& dist_name,
    vector<int> dist_func(int),
    int n) {
  vector<int> v = dist_func(n);
  cout << "Distribution: " << dist_name << "(" << n << ")" << endl;
  cout << "Example: " << dist_func(20) << endl;
  TestCase testCase(v);

  pair<Counter, Counter> counter_pair =
      testCase.benchmark(
          my_inplace_stable_sort
          , std_inplace_stable_sort
          );
  Counter cKloet = counter_pair.first;
  Counter cInplace = counter_pair.second;

  relative_report(cKloet, cInplace);
}


vector<int> random_unique(int n) {
  vector<int> v(n);
  for (int i = 0; i < n; ++i) {
    v[i] = i;
  }
  for (int i = n - 1; i >= 0; --i) {
    swap(v[i], v[rand() % (i + 1)]);
  }
  return v;
}

vector<int> random_dupes(int n) {
  vector<int> v(n);
  int r = (int)sqrt(n);
  for (int i = 0; i < n; ++i) {
    v[i] = i / r;
  }
  for (int i = n - 1; i >= 0; --i) {
    swap(v[i], v[rand() % (i + 1)]);
  }
  return v;
}

vector<int> increasing_unique(int n) {
  vector<int> v(n);
  for (int i = 0; i < n; ++i) {
    v[i] = i;
  }
  return v;
}


vector<int> increasing_dupes(int n) {
  vector<int> v(n);
  int r = (int)sqrt(n);
  for (int i = 0; i < n; ++i) {
    v[i] = i / r;
  }
  return v;
}


vector<int> mostly_increasing(int n) {
  vector<int> v(n);
  int r = (int)sqrt(n);
  for (int i = 0; i < n; ++i) {
    v[i] = i + rand() % r;
  }
  return v;
}


vector<int> decreasing_unique(int n) {
  vector<int> v(n);
  for (int i = 0; i < n; ++i) {
    v[i] = n - 1 - i;
  }
  return v;
}

vector<int> decreasing_dupes(int n) {
  vector<int> v(n);
  int r = (int)sqrt(n);
  for (int i = 0; i < n; ++i) {
    v[i] = (n - 1 - i) / r;
  }
  return v;
}

vector<int> mostly_decreasing(int n) {
  vector<int> v(n);
  int r = (int)sqrt(n);
  for (int i = 0; i < n; ++i) {
    v[i] = n - 1 - i + rand() % r;
  }
  return v;
}

vector<int> sawtooth_unique(int n) {
  int r = (int)sqrt(n);
  while (gcd(r, n) != 1) ++r;
  vector<int> v(n);
  for (int i = 0; i < n; ++i) {
    v[(i * r) % n] = i;
  }
  return v;
}

vector<int> sawtooth_dupes(int n) {
  int r = (int)sqrt(n);
  vector<int> v(n);
  for (int i = 0; i < n; ++i) {
    v[i] = i % r;
  }
  return v;
}

vector<int> increasing_with_swap(int n) {
  vector<int> v(n);
  for (int i = 0; i < n; ++i) {
    v[i] = i;
  }
  swap(v[0], v[n - 1]);
  return v;
}

vector<int> increasing_fast_slow(int n) {
  int r = (int)sqrt(n);
  vector<int> v(n);
  for (int i = 0; i < r; ++i) {
    v[i] = i * r;
  }
  for (int i = r; i < n; ++i) {
    v[i] = (i - r) * n / (n - r);
  }
  return v;
}


#define BENCHMARK(dist, n) benchmark(#dist, dist, n)

int main() {
  int n = 100000;
  BENCHMARK(random_unique, n);
  BENCHMARK(random_dupes, n);
  BENCHMARK(increasing_unique, n);
  BENCHMARK(increasing_dupes, n);
  BENCHMARK(mostly_increasing, n);
  BENCHMARK(decreasing_unique, n);
  BENCHMARK(decreasing_dupes, n);
  BENCHMARK(mostly_decreasing, n);
  BENCHMARK(sawtooth_unique, n);
  BENCHMARK(sawtooth_dupes, n);
  BENCHMARK(increasing_with_swap, n);
  BENCHMARK(increasing_fast_slow, n);

  return 0;
}

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-12-10 15:31 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-18 12:58 __inplace_stable_sort improvement David de Kloet
2021-11-24 18:33 ` François Dumont
2021-11-24 18:48   ` David de Kloet
2021-11-25  5:35     ` François Dumont
2021-11-25 19:51       ` David de Kloet
2021-12-10 15:31         ` David de Kloet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).