From: David de Kloet <dskloet@gmail.com>
To: "François Dumont" <frs.dumont@gmail.com>
Cc: libstdc++@gcc.gnu.org
Subject: Re: __inplace_stable_sort improvement
Date: Fri, 10 Dec 2021 16:31:00 +0100 [thread overview]
Message-ID: <CAKxRKidz=HXfnFbZSw8mXPuBt+83MfE=Y8G5iVEvrrVqP0wbbA@mail.gmail.com> (raw)
In-Reply-To: <CAKxRKie4MQ0q2B5eUenttoHj2vm-4ymmbQVkBOrozdsZoS-dBQ@mail.gmail.com>
I had another look at this and decided to try `make check-performance` as
you suggested, but I got
make: *** No rule to make target 'check-performance'. Stop.
Then I tried to run `configure` (from another directory) and got
Building GCC requires GMP 4.2+, MPFR 3.1.0+ and MPC 0.8.0+.
...
Since I have no idea what I'm doing, I thought I'd better check if I'm
missing something obvious before trying to figure it out.
On Thu, Nov 25, 2021 at 8:51 PM David de Kloet <dskloet@gmail.com> wrote:
> On Thu, Nov 25, 2021 at 6:35 AM François Dumont <frs.dumont@gmail.com>
> wrote:
> >
> > On 24/11/21 7:48 pm, David de Kloet wrote:
> > > On Wed, Nov 24, 2021 at 7:33 PM François Dumont <frs.dumont@gmail.com>
> wrote:
> > >>
> > >> On 18/11/21 1:58 pm, David de Kloet via Libstdc++ wrote:
> > >>> Hello libstdc++,
> > >>>
> > >>> Apologies if this is the wrong list for my message. I believe I have
> an
> > >>> improvement for __inplace_stable_sort and I'm wondering if this is
> worth
> > >>> pursuing and if so how to proceed.
> > >> Of course, anyone interested in enhancing the library is welcome.
> > >>
> > >> The best way to do so would be to provide a patch on the current
> library
> > >> code. I'm the last one to have touch to this algo and it would be
> easier
> > >> for me to understand if you try to implement it directly on the
> library.
> > > _my_merge_without_buffer below preprocesses exactly to
> > > __merge_without_buffer from the library if you undefine NEW_ALGO so
> > > you can see exactly what my changes are.
> > >
> > > I currently don't have much time, but when I do I'd be happy to
> > > provide it as a patch. If you find time before me, you're also more
> > > than welcome to use the code.
> > >
> > >> The good news is that you are trying to enhance a piece of code rarely
> > >> used as it is only when the buffer allocation fails. So it is less
> > >> critical and so perhaps more open to modifications.
> > >>
> > >> During my last work on it I've added a performance test:
> > >> performance/25_algorithms/inplace_merge.cc. It is also supposed to
> check
> > >> performance when we fail to get a buffer and so use
> > >> __merge_without_buffer. Use 'make check-performance' to run it, see
> > >> library manual guide. Do not hesitate to change it if you think that
> > >> your benches are better.
> > > I don't currently have the environment to `make check-performance`. I
> > > just copied the code from the web and adjusted it. But (again) when I
> > > have time, I'll look into that.
> > >
> > > If you know of any "contributing to stdlibc++" guides, that would be
> great too.
> >
> >
> https://gcc.gnu.org/onlinedocs/libstdc++/manual/appendix_contributing.html
> >
> >
> > >
> > >>> __inplace_stable_sort does a merge sort where in each recursive step
> the 2
> > >>> sublists are merged in-place. The merge step is itself also
> recursive. In order
> > >>> to merge [A, B) and [B, C), a tail section S1 of [A, B) and a head
> section S2
> > >>> of [B, C) are found such that all elements in S1 are greater than
> all elements
> > >>> in S2 and then those sections are swapped/rotated and the shorter
> sections are
> > >>> merged recursively. This is done by the function
> __merge_without_buffer.
> > >>>
> > >>> In the current implementation, __merge_without_buffer takes the half
> point of
> > >>> the longer section and does a binary search (via lower_bound or
> upper_bound) to
> > >>> find the corresponding cut point on the other section. Then it
> rotates the
> > >>> sections.
> > >>>
> > >>> My proposal is to find the cut points such that both rotated
> sections have the
> > >>> same length such that they can always simply be swapped.
> > >> What do you mean by swapped ?
> > > I mean the first element of section 1 gets swapped with the first
> > > element of section 2. The second element of section 1 gets swapped
> > > with the second element of section 2. Etc. for each index.
> > >
> > >> I see in the code below that you are just
> > >> changing how the cut positions are computed and then call the same
> > >> std::rotate algo. I cannot see any reason why changing the position
> > >> would make the rotate faster.
> > > The implementation of rotate does a series of such section swaps. But
> > > if both section have the same length, it's only a single swap of
> > > sections which makes rotate about 1.5-2 times faster.
> >
> > Ok, I see now, you try to use as much as possible the std::swap_ranges
> > in __rotate for random access iterator. Maybe it means that this
> > optimization should be limited to this type of iterator. We'll see
> > depending on benches.
>
> The binary search implementation is a bit awkward for the
> bidi-iterator, so having a separate implementation for random access
> iterators might improve the readability. However, if you look at my
> benchmarks, it also does fewer comparisons than the current algorithm.
> But rotate() doesn't do any comparisons. So there must be another
> benefit to choosing the cut points this way, but I'm not entirely sure
> why.
>
> >
> > I'll have a try then when I have time.
> >
> > Thanks
> >
> > >
> > > The purpose of the code I added is to choose the cut positions such
> > > that the rotated sections are of equal length, which makes rotate
> > > faster by doing fewer element swaps, resulting in fewer calls to the
> > > copy/assignment constructor.
> > >
> > >>> I have implemented
> > >>> this and it seems to perform better in general and in the special
> cases that
> > >>> I've tried.
> > >>>
> > >>> Unfortunately, since we are no longer guaranteed to halve the longer
> section in
> > >>> each step, there is a possibility for a stack overflow. To avoid
> this, the new
> > >>> algorithm falls back to the old behavior whenever the sections have
> > >>> significantly different lengths.
> > >>>
> > >>> Also, where the current algorithm performs the binary search
> implicitly via
> > >>> lower_bound/upper_bound, the new algorithm has to implement the
> binary search
> > >>> explicitly because it has to look at both sections to determine the
> next step
> > >>> in the search.
> > >>>
> > >>> I have created wrappers for my value objects as well as my iterators
> to count
> > >>> the number of operations performed by the current and my algorithm
> and run
> > >>> benchmarks on various distributions. My code is at the bottom but
> below is the
> > >>> output as run on my machine. For each operation it outputs the ratio
> between
> > >>> the number of operations performed by my algorithm over the number
> performed by
> > >>> the current algorithm.
> > >>>
> > >>> Thanks for your attention,
> > >>> David
> > >>>
> > >>>
> > >>> Distribution: random_unique(100000)
> > >>> Example: [19, 13, 5, 0, 10, 6, 15, 11, 16, 9, 14, 18, 4, 17, 1, 7,
> 8, 12, 3, 2]
> > >>>
> > >>> move: 0.588026 = 3331848 / 5666154
> > >>> moveAssign: 0.594789 = 6852819 / 11521431
> > >>> compare: 0.914997 = 2028977 / 2217469
> > >>> iteratorDeref: 0.718466 = 11001863 / 15312994
> > >>> iteratorCopy: 0.791332 = 33977189 / 42936712
> > >>> iteratorEquals: 2.27841 = 5174606 / 2271143
> > >>> iteratorInc: 1.2143 = 7697971 / 6339449
> > >>> iteratorDec: 0.0833529 = 473556 / 5681340
> > >>> iteratorDiff: 0.444865 = 1362598 / 3062950
> > >>> iteratorSub: 0.0200683 = 6272 / 312533
> > >>> iteratorAdd: 0.0695467 = 57149 / 821736
> > >>> iteratorAddAssign: 1.36491 = 3141949 / 2301950
> > >>> iteratorArithmetic: 0.861623 = 17914101 / 20791101
> > >>> micros: 0.663704 = 650825 / 980596
> > >>>
> > >>> Distribution: random_dupes(100000)
> > >>> Example: [4, 4, 1, 3, 0, 3, 2, 2, 4, 4, 0, 3, 1, 0, 3, 2, 0, 2, 1, 1]
> > >>>
> > >>> move: 0.570041 = 2949623 / 5174401
> > >>> moveAssign: 0.577736 = 6087819 / 10537375
> > >>> compare: 0.845419 = 1530877 / 1810790
> > >>> iteratorDeref: 0.678233 = 9232869 / 13613114
> > >>> iteratorCopy: 0.742699 = 25652244 / 34539225
> > >>> iteratorEquals: 2.41643 = 4123881 / 1706601
> > >>> iteratorInc: 1.15617 = 6554374 / 5669028
> > >>> iteratorDec: 0.101004 = 539991 / 5346219
> > >>> iteratorDiff: 0.417093 = 959972 / 2301580
> > >>> iteratorSub: 0.0350046 = 8882 / 253738
> > >>> iteratorAdd: 0.0972571 = 64122 / 659304
> > >>> iteratorAddAssign: 1.2279 = 2247197 / 1830107
> > >>> iteratorArithmetic: 0.81605 = 14498419 / 17766577
> > >>> micros: 0.699386 = 554426 / 792732
> > >>>
> > >>> Distribution: increasing_unique(100000)
> > >>> Example: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
> 17, 18, 19]
> > >>>
> > >>> move: 1 = 91808 / 91808
> > >>> moveAssign: 1 = 91808 / 91808
> > >>> compare: 0.573591 = 224558 / 391495
> > >>> iteratorDeref: 0.722385 = 540924 / 748803
> > >>> iteratorCopy: 0.298175 = 976735 / 3275708
> > >>> iteratorEquals: 0.522302 = 116383 / 222827
> > >>> iteratorInc: 0.715444 = 132750 / 185549
> > >>> iteratorDec: 1 = 91808 / 91808
> > >>> iteratorDiff: 0.217455 = 40956 / 188342
> > >>> iteratorAdd: 1 = 16383 / 16383
> > >>> iteratorAddAssign: 0.299523 = 81884 / 273381
> > >>> iteratorArithmetic: 0.49082 = 480164 / 978290
> > >>> micros: 0.479792 = 22935 / 47802
> > >>>
> > >>> Distribution: increasing_dupes(100000)
> > >>> Example: [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
> > >>>
> > >>> move: 1 = 91808 / 91808
> > >>> moveAssign: 1 = 91808 / 91808
> > >>> compare: 0.573591 = 224558 / 391495
> > >>> iteratorDeref: 0.722385 = 540924 / 748803
> > >>> iteratorCopy: 0.298175 = 976735 / 3275708
> > >>> iteratorEquals: 0.522302 = 116383 / 222827
> > >>> iteratorInc: 0.715444 = 132750 / 185549
> > >>> iteratorDec: 1 = 91808 / 91808
> > >>> iteratorDiff: 0.217455 = 40956 / 188342
> > >>> iteratorAdd: 1 = 16383 / 16383
> > >>> iteratorAddAssign: 0.299523 = 81884 / 273381
> > >>> iteratorArithmetic: 0.49082 = 480164 / 978290
> > >>> micros: 0.449154 = 20675 / 46031
> > >>>
> > >>> Distribution: mostly_increasing(100000)
> > >>> Example: [3, 2, 3, 4, 5, 5, 6, 7, 9, 9, 11, 13, 12, 14, 15, 17, 17,
> 18, 21, 20]
> > >>>
> > >>> move: 0.681093 = 814725 / 1196203
> > >>> moveAssign: 0.703421 = 1809563 / 2572519
> > >>> compare: 0.924016 = 965320 / 1044700
> > >>> iteratorDeref: 0.847765 = 3802634 / 4485480
> > >>> iteratorCopy: 0.804068 = 11859111 / 14748894
> > >>> iteratorEquals: 1.59826 = 1388752 / 868913
> > >>> iteratorInc: 1.25659 = 1818421 / 1447111
> > >>> iteratorDec: 0.383738 = 537108 / 1399674
> > >>> iteratorDiff: 0.486488 = 562273 / 1155779
> > >>> iteratorSub: 0.0638659 = 6666 / 104375
> > >>> iteratorAdd: 0.184194 = 55586 / 301780
> > >>> iteratorAddAssign: 1.30369 = 1105439 / 847932
> > >>> iteratorArithmetic: 0.893672 = 5474245 / 6125564
> > >>> micros: 0.790603 = 233953 / 295917
> > >>>
> > >>> Distribution: decreasing_unique(100000)
> > >>> Example: [19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4,
> 3, 2, 1, 0]
> > >>>
> > >>> move: 0.40597 = 995312 / 2451688
> > >>> moveAssign: 0.45785 = 2459840 / 5372592
> > >>> compare: 0.489247 = 151631 / 309927
> > >>> iteratorDeref: 0.524798 = 3415934 / 6509043
> > >>> iteratorCopy: 0.563628 = 5971275 / 10594360
> > >>> iteratorEquals: 1.18395 = 843533 / 712475
> > >>> iteratorInc: 0.864454 = 1460463 / 1689462
> > >>> iteratorDec: 0.373125 = 1606976 / 4306800
> > >>> iteratorDiff: 0.495226 = 235930 / 476409
> > >>> iteratorSub: 0.248254 = 13824 / 55685
> > >>> iteratorAdd: 0.590877 = 135839 / 229894
> > >>> iteratorAddAssign: 0.421852 = 119646 / 283621
> > >>> iteratorArithmetic: 0.569514 = 4416211 / 7754346
> > >>> micros: 0.543145 = 167865 / 309061
> > >>>
> > >>> Distribution: decreasing_dupes(100000)
> > >>> Example: [4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0]
> > >>>
> > >>> move: 0.741695 = 1455276 / 1962096
> > >>> moveAssign: 0.736048 = 2826598 / 3840238
> > >>> compare: 0.641679 = 282851 / 440798
> > >>> iteratorDeref: 0.740565 = 3386062 / 4572271
> > >>> iteratorCopy: 0.628798 = 5008472 / 7965150
> > >>> iteratorEquals: 1.50206 = 794354 / 528843
> > >>> iteratorInc: 1.39602 = 2164546 / 1550510
> > >>> iteratorDec: 0.336639 = 837728 / 2488504
> > >>> iteratorDiff: 0.313457 = 88552 / 282501
> > >>> iteratorSub: 0.165388 = 3880 / 23460
> > >>> iteratorAdd: 0.377644 = 27139 / 71864
> > >>> iteratorAddAssign: 0.551364 = 182383 / 330785
> > >>> iteratorArithmetic: 0.776766 = 4098582 / 5276467
> > >>> micros: 0.71282 = 160398 / 225019
> > >>>
> > >>> Distribution: mostly_decreasing(100000)
> > >>> Example: [20, 19, 20, 18, 18, 15, 13, 13, 14, 13, 10, 9, 10, 8, 5, 7,
> > >>> 6, 4, 3, 2]
> > >>>
> > >>> move: 0.760602 = 2197435 / 2889075
> > >>> moveAssign: 0.768473 = 4591313 / 5974593
> > >>> compare: 0.932913 = 977678 / 1047984
> > >>> iteratorDeref: 0.838641 = 6633169 / 7909424
> > >>> iteratorCopy: 0.829028 = 15447960 / 18633830
> > >>> iteratorEquals: 1.7009 = 2000686 / 1176251
> > >>> iteratorInc: 1.52771 = 3788203 / 2479657
> > >>> iteratorDec: 0.363843 = 1369990 / 3765332
> > >>> iteratorDiff: 0.489186 = 595472 / 1217270
> > >>> iteratorSub: 0.0877827 = 10925 / 124455
> > >>> iteratorAdd: 0.201991 = 69640 / 344767
> > >>> iteratorAddAssign: 1.32756 = 1133512 / 853830
> > >>> iteratorArithmetic: 0.900303 = 8968428 / 9961562
> > >>> micros: 0.846298 = 377504 / 446065
> > >>>
> > >>> Distribution: sawtooth_unique(100000)
> > >>> Example: [0, 3, 6, 9, 12, 15, 18, 1, 4, 7, 10, 13, 16, 19, 2, 5, 8,
> 11, 14, 17]
> > >>>
> > >>> move: 0.569953 = 2912465 / 5110007
> > >>> moveAssign: 0.577701 = 6012427 / 10407511
> > >>> compare: 0.824488 = 1604387 / 1945920
> > >>> iteratorDeref: 0.678979 = 9299794 / 13696724
> > >>> iteratorCopy: 0.72546 = 26157352 / 36056250
> > >>> iteratorEquals: 2.00683 = 4131766 / 2058856
> > >>> iteratorInc: 1.10728 = 6498257 / 5868688
> > >>> iteratorDec: 0.111225 = 552519 / 4967557
> > >>> iteratorDiff: 0.405328 = 988144 / 2437889
> > >>> iteratorSub: 0.026249 = 6327 / 241038
> > >>> iteratorAdd: 0.089034 = 55918 / 628052
> > >>> iteratorAddAssign: 1.18737 = 2297233 / 1934717
> > >>> iteratorArithmetic: 0.801143 = 14530164 / 18136797
> > >>> micros: 0.706282 = 583937 / 826776
> > >>>
> > >>> Distribution: sawtooth_dupes(100000)
> > >>> Example: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
> > >>>
> > >>> move: 0.547721 = 2348873 / 4288447
> > >>> moveAssign: 0.543252 = 4613817 / 8492965
> > >>> compare: 0.664642 = 841503 / 1266099
> > >>> iteratorDeref: 0.599656 = 6292282 / 10493159
> > >>> iteratorCopy: 0.60663 = 15336816 / 25281976
> > >>> iteratorEquals: 2.24803 = 2859760 / 1272118
> > >>> iteratorInc: 1.07142 = 4999546 / 4666276
> > >>> iteratorDec: 0.0490551 = 205953 / 4198405
> > >>> iteratorDiff: 0.314518 = 473157 / 1504390
> > >>> iteratorSub: 0.0371951 = 5998 / 161258
> > >>> iteratorAdd: 0.0875684 = 34672 / 395942
> > >>> iteratorAddAssign: 0.900009 = 1233077 / 1370072
> > >>> iteratorArithmetic: 0.72316 = 9812163 / 13568461
> > >>> micros: 0.610891 = 353102 / 578011
> > >>>
> > >>> Distribution: increasing_with_swap(100000)
> > >>> Example: [19, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
> 17, 18, 0]
> > >>>
> > >>> move: 1 = 291782 / 291782
> > >>> moveAssign: 1 = 491779 / 491779
> > >>> compare: 0.57449 = 225064 / 391763
> > >>> iteratorDeref: 0.819553 = 941922 / 1149312
> > >>> iteratorCopy: 0.377964 = 1392386 / 3683915
> > >>> iteratorEquals: 0.525758 = 117408 / 223312
> > >>> iteratorInc: 0.863708 = 333050 / 385605
> > >>> iteratorDec: 0.999753 = 291716 / 291788
> > >>> iteratorDiff: 0.224408 = 42528 / 189512
> > >>> iteratorSub: 0.888889 = 104 / 117
> > >>> iteratorAdd: 0.997151 = 16801 / 16849
> > >>> iteratorAddAssign: 0.302604 = 82864 / 273836
> > >>> iteratorArithmetic: 0.640448 = 884471 / 1381019
> > >>> micros: 0.616124 = 36233 / 58808
> > >>>
> > >>> Distribution: increasing_fast_slow(100000)
> > >>> Example: [0, 4, 8, 12, 0, 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15,
> 16, 17, 18]
> > >>>
> > >>> move: 0.640348 = 825772 / 1289568
> > >>> moveAssign: 0.627081 = 1559792 / 2487384
> > >>> compare: 0.714235 = 431286 / 603843
> > >>> iteratorDeref: 0.694913 = 2421875 / 3485151
> > >>> iteratorCopy: 0.637176 = 5645877 / 8860786
> > >>> iteratorEquals: 1.7552 = 953880 / 543458
> > >>> iteratorInc: 1.08941 = 1676609 / 1539005
> > >>> iteratorDec: 0.135904 = 154533 / 1137073
> > >>> iteratorDiff: 0.388132 = 192787 / 496705
> > >>> iteratorSub: 0.00577964 = 141 / 24396
> > >>> iteratorAdd: 0.207287 = 17272 / 83324
> > >>> iteratorAddAssign: 0.849015 = 431320 / 508024
> > >>> iteratorArithmetic: 0.790987 = 3426542 / 4331985
> > >>> micros: 0.777699 = 153552 / 197444
> > >>>
> > >>>
> > >>>
> > >>> #include <iostream>
> > >>> #include <stdio.h>
> > >>> #include <vector>
> > >>> #include <cstdlib>
> > >>> #include <algorithm>
> > >>> #include <functional>
> > >>> #include <ranges>
> > >>> #include <chrono>
> > >>> #include <cmath>
> > >>> #include <numeric>
> > >>>
> > >>> using namespace std;
> > >>>
> > >>> #define LINE0 0
> > >>> #define cout_line (cout << "line " << __LINE__ - LINE0 << ": ")
> > >>> #define vout(v) #v << " = " << (v)
> > >>> #define print1(v) cout_line << vout(v) << endl
> > >>> #define print2(v1, v2) cout_line << vout(v1) << ", " << vout(v2) <<
> endl
> > >>> #define print3(v1, v2, v3) cout_line << vout(v1) << ", " << vout(v2)
> > >>> << ", " << vout(v3) << endl
> > >>> #define GET_MACRO(_1,_2,_3,NAME,...) NAME
> > >>> #define print(...) GET_MACRO(__VA_ARGS__, print3, print2,
> print1)(__VA_ARGS__)
> > >>>
> > >>> template <class Ch, class Tr, class Container>
> > >>> basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
> > >>> Container const& x);
> > >>>
> > >>> template <class X, class Y>
> > >>> ostream & operator << (ostream & os, pair <X, Y> const& p) {
> > >>> return os << p.first << ":" << p.second;
> > >>> }
> > >>>
> > >>> template <class Ch, class Tr, class Container>
> > >>> basic_ostream <Ch, Tr> & operator << (basic_ostream <Ch, Tr> & os,
> > >>> Container const& x) {
> > >>> os << "[";
> > >>> bool first = true;
> > >>> for(auto const& y : x) {
> > >>> if (!first) os << ", ";
> > >>> os << y;
> > >>> first = false;
> > >>> }
> > >>> return os << "]";
> > >>> }
> > >>>
> > >>>
> > >>> struct Counter {
> > >>> int construct = 0;
> > >>> int copy = 0;
> > >>> int assign = 0;
> > >>> int move = 0;
> > >>> int moveAssign = 0;
> > >>> int compare = 0;
> > >>> int iteratorDeref = 0;
> > >>> int iteratorCopy = 0;
> > >>> int iteratorEquals = 0;
> > >>> int iteratorInc = 0;
> > >>> int iteratorDec = 0;
> > >>> int iteratorDiff = 0;
> > >>> int iteratorSub = 0;
> > >>> int iteratorAdd = 0;
> > >>> int iteratorAddAssign = 0;
> > >>> int iteratorArithmetic = 0;
> > >>> int micros = 0;
> > >>> };
> > >>>
> > >>> int next_element_id = 0;
> > >>>
> > >>> class Element {
> > >>> Counter* counter;
> > >>> public:
> > >>> int val;
> > >>> int id; // Distinguis identical values to verify stability.
> > >>> Element(int val, Counter* counter)
> > >>> : val(val), counter(counter), id(++next_element_id) {
> > >>> ++counter->construct;
> > >>> }
> > >>>
> > >>> Element(Element const& other)
> > >>> : val(other.val), counter(other.counter), id(other.id) {
> > >>> ++counter->copy;
> > >>> }
> > >>>
> > >>> Element& operator=(Element const& other) {
> > >>> val = other.val;
> > >>> counter = other.counter;
> > >>> id = other.id;
> > >>> ++counter->assign;
> > >>> return *this;
> > >>> }
> > >>>
> > >>> Element(Element&& other) noexcept
> > >>> : val(move(other.val)), counter(other.counter), id(other.id)
> {
> > >>> ++counter->move;
> > >>> }
> > >>>
> > >>> Element& operator=(Element&& other) noexcept {
> > >>> val = move(other.val);
> > >>> counter = other.counter;
> > >>> id = other.id;
> > >>> ++counter->moveAssign;
> > >>> return *this;
> > >>> }
> > >>>
> > >>> bool operator<(Element const& other) const {
> > >>> ++counter->compare;
> > >>> return val < other.val;
> > >>> }
> > >>>
> > >>> bool operator==(Element const& other) const {
> > >>> return val == other.val && id == other.id;
> > >>> }
> > >>> };
> > >>>
> > >>> ostream & operator << (ostream & os, Element const& e) {
> > >>> return os << e.val << "(" << e.id << ")";
> > >>> }
> > >>>
> > >>>
> > >>> class RandomIterator {
> > >>> typedef vector<Element>::iterator It;
> > >>> It it;
> > >>> Counter* counter;
> > >>>
> > >>> public:
> > >>> using iterator_category = std::random_access_iterator_tag;
> > >>> using difference_type = std::ptrdiff_t;
> > >>> using value_type = Element;
> > >>> using pointer = value_type*;
> > >>> using reference = value_type&;
> > >>>
> > >>> RandomIterator(It it, Counter* counter) : it(it),
> counter(counter) {
> > >>> }
> > >>>
> > >>> RandomIterator(RandomIterator const& other) : it(other.it),
> > >>> counter(other.counter) {
> > >>> counter->iteratorCopy += 1;
> > >>> }
> > >>>
> > >>> bool operator== (RandomIterator other) {
> > >>> counter->iteratorEquals += 1;
> > >>> counter->iteratorArithmetic += 1;
> > >>> return it == other.it;
> > >>> }
> > >>>
> > >>> RandomIterator operator++ () {
> > >>> counter->iteratorInc += 1;
> > >>> counter->iteratorArithmetic += 1;
> > >>> return RandomIterator(++it, counter);
> > >>> }
> > >>>
> > >>> RandomIterator operator-- () {
> > >>> counter->iteratorDec += 1;
> > >>> counter->iteratorArithmetic += 1;
> > >>> return RandomIterator(--it, counter);
> > >>> }
> > >>>
> > >>> Element& operator* () {
> > >>> counter->iteratorDeref += 1;
> > >>> return *it;
> > >>> }
> > >>>
> > >>> difference_type operator- (RandomIterator other) {
> > >>> counter->iteratorDiff += 1;
> > >>> counter->iteratorArithmetic += 1;
> > >>> return it - other.it;
> > >>> }
> > >>>
> > >>> RandomIterator operator- (difference_type diff) {
> > >>> counter->iteratorSub += 1;
> > >>> counter->iteratorArithmetic += 1;
> > >>> return RandomIterator(it - diff, counter);
> > >>> }
> > >>>
> > >>> RandomIterator operator+ (difference_type diff) {
> > >>> counter->iteratorAdd += 1;
> > >>> counter->iteratorArithmetic += 1;
> > >>> return RandomIterator(it + diff, counter);
> > >>> }
> > >>>
> > >>> RandomIterator operator+= (difference_type diff) {
> > >>> counter->iteratorAddAssign += 1;
> > >>> counter->iteratorArithmetic += 1;
> > >>> return RandomIterator(it += diff, counter);
> > >>> }
> > >>> };
> > >>>
> > >>>
> > >>> class TestCase {
> > >>> Counter counter;
> > >>> vector<Element> elements;
> > >>>
> > >>> public:
> > >>> TestCase(vector<int> const& data) {
> > >>> elements.reserve(data.size());
> > >>> for (int i: data) {
> > >>> elements.push_back(Element(i, &counter));
> > >>> }
> > >>> }
> > >>>
> > >>> Counter sort(
> > >>> void algo(RandomIterator, RandomIterator),
> > >>> vector<Element>::iterator begin,
> > >>> vector<Element>::iterator end) {
> > >>> counter = Counter();
> > >>> auto t0 = std::chrono::system_clock::now();
> > >>>
> > >>> algo(RandomIterator(begin, &counter), RandomIterator(end,
> &counter));
> > >>>
> > >>> auto t1 = std::chrono::system_clock::now();
> > >>> counter.micros =
> > >>> std::chrono::duration_cast<std::chrono::microseconds>(t1 -
> > >>> t0).count();
> > >>> return counter;
> > >>> }
> > >>>
> > >>> pair<Counter,Counter> benchmark(
> > >>> void new_algo(RandomIterator, RandomIterator),
> > >>> void cur_algo(RandomIterator, RandomIterator)) {
> > >>> vector<Element> toSort1 = elements;
> > >>> Counter counter1 = sort(new_algo, toSort1.begin(),
> toSort1.end());
> > >>>
> > >>> vector<Element> toSort2 = elements;
> > >>> Counter counter2 = sort(cur_algo, toSort2.begin(),
> toSort2.end());
> > >>>
> > >>> if (toSort1 != toSort2) {
> > >>> cout << "INCORRECT RESULT!" << endl;
> > >>> exit(1);
> > >>> }
> > >>>
> > >>> return make_pair(counter1, counter2);
> > >>> }
> > >>> };
> > >>>
> > >>>
> > >>> void relative_metric(string name, int m1, int m2) {
> > >>> if (m1 == 0 && m2 == 0) return;
> > >>> if (m2 == 0) {
> > >>> cout << name << ": inf = " << m1 << " / " << m2 << endl;
> > >>> } else {
> > >>> cout << name << ": " << m1 / (double) m2 << " = " << m1 << " /
> "
> > >>> << m2 << endl;
> > >>> }
> > >>> }
> > >>>
> > >>>
> > >>> void relative_report(Counter const& c1, Counter const& c2) {
> > >>> cout << endl;
> > >>> relative_metric("construct", c1.construct, c2.construct);
> > >>> relative_metric("copy", c1.copy, c2.copy);
> > >>> relative_metric("assign", c1.assign, c2.assign);
> > >>> relative_metric("move", c1.move, c2.move);
> > >>> relative_metric("moveAssign", c1.moveAssign, c2.moveAssign);
> > >>> relative_metric("compare", c1.compare, c2.compare);
> > >>> relative_metric("iteratorDeref", c1.iteratorDeref,
> c2.iteratorDeref);
> > >>> relative_metric("iteratorCopy", c1.iteratorCopy,
> c2.iteratorCopy);
> > >>> relative_metric("iteratorEquals", c1.iteratorEquals,
> c2.iteratorEquals);
> > >>> relative_metric("iteratorInc", c1.iteratorInc, c2.iteratorInc);
> > >>> relative_metric("iteratorDec", c1.iteratorDec, c2.iteratorDec);
> > >>> relative_metric("iteratorDiff", c1.iteratorDiff,
> c2.iteratorDiff);
> > >>> relative_metric("iteratorSub", c1.iteratorSub, c2.iteratorSub);
> > >>> relative_metric("iteratorAdd", c1.iteratorAdd, c2.iteratorAdd);
> > >>> relative_metric("iteratorAddAssign", c1.iteratorAddAssign,
> > >>> c2.iteratorAddAssign);
> > >>> relative_metric("iteratorArithmetic", c1.iteratorArithmetic,
> > >>> c2.iteratorArithmetic);
> > >>> relative_metric("micros", c1.micros, c2.micros);
> > >>> cout << endl;
> > >>> }
> > >>>
> > >>>
> > >>> /// This is a helper function for the merge routines.
> > >>> template<typename _BidirectionalIterator, typename _Distance,
> typename _Compare>
> > >>> void _my_merge_without_buffer(
> > >>> _BidirectionalIterator __first,
> > >>> _BidirectionalIterator __middle,
> > >>> _BidirectionalIterator __last,
> > >>> _Distance __len1, _Distance __len2,
> > >>> _Compare __comp) {
> > >>>
> > >>> if (__len1 == 0 || __len2 == 0)
> > >>> return;
> > >>>
> > >>> if (__len1 + __len2 == 2) {
> > >>> if (__comp(__middle, __first))
> > >>> std::iter_swap(__first, __middle);
> > >>> return;
> > >>> }
> > >>>
> > >>> _BidirectionalIterator __first_cut = __first;
> > >>> _BidirectionalIterator __second_cut = __middle;
> > >>> _Distance __len11 = 0;
> > >>> _Distance __len22 = 0;
> > >>>
> > >>>
> > >>> #define NEW_ALGO 1
> > >>>
> > >>> #ifdef NEW_ALGO
> > >>> // Prevent stack overflow
> > >>> if (__len1 > 4 * __len2 || __len2 > 4 * __len1) {
> > >>> #endif
> > >>> if (__len1 > __len2)
> > >>> {
> > >>> __len11 = __len1 / 2;
> > >>> std::advance(__first_cut, __len11);
> > >>> __second_cut
> > >>> = std::__lower_bound(__middle, __last, *__first_cut,
> > >>> __gnu_cxx::__ops::__iter_comp_val(__comp));
> > >>> __len22 = std::distance(__middle, __second_cut);
> > >>> }
> > >>> else
> > >>> {
> > >>> __len22 = __len2 / 2;
> > >>> std::advance(__second_cut, __len22);
> > >>> __first_cut
> > >>> = std::__upper_bound(__first, __middle, *__second_cut,
> > >>> __gnu_cxx::__ops::__val_comp_iter(__comp));
> > >>> __len11 = std::distance(__first, __first_cut);
> > >>> }
> > >>> #ifdef NEW_ALGO
> > >>> } else
> > >>> {
> > >>> _Distance __cut_range;
> > >>> // __cut_len is always equal to __middle - __first_cut (and
> eventually to
> > >>> // __second_cut - __middle) but we keep track of it to avoid
> calculating it
> > >>> // later which can be expensive for bidirectional iterators.
> > >>> _Distance __cut_len;
> > >>>
> > >>> if (__len1 > __len2)
> > >>> {
> > >>> std::advance(__first_cut, __len1 - __len2);
> > >>> __cut_range = __len2;
> > >>> }
> > >>> else
> > >>> {
> > >>> __cut_range = __len1;
> > >>> }
> > >>> __cut_len = __cut_range;
> > >>>
> > >>> while (__cut_range > 0) {
> > >>> _Distance __half_cut_range = __cut_range / 2;
> > >>> _BidirectionalIterator __mid_first_cut = __first_cut;
> > >>> std::advance(__mid_first_cut, __cut_range - __half_cut_range - 1);
> > >>> _BidirectionalIterator __mid_second_cut = __second_cut;
> > >>> std::advance(__mid_second_cut, __half_cut_range);
> > >>>
> > >>> if (__comp(__mid_second_cut, __mid_first_cut)) {
> > >>> __cut_range -= __half_cut_range + 1;
> > >>> __second_cut = __mid_second_cut;
> > >>> ++__second_cut;
> > >>> } else {
> > >>> __cut_len -= __cut_range - __half_cut_range;
> > >>> __cut_range = __half_cut_range;
> > >>> __first_cut = __mid_first_cut;
> > >>> ++__first_cut;
> > >>> }
> > >>> }
> > >>> __len22 = __cut_len;
> > >>> __len11 = __len1 - __cut_len;
> > >>> }
> > >>> #endif
> > >>>
> > >>> _BidirectionalIterator __new_middle
> > >>> = std::rotate(__first_cut, __middle, __second_cut);
> > >>> _my_merge_without_buffer(__first, __first_cut, __new_middle,
> > >>> __len11, __len22, __comp);
> > >>> _my_merge_without_buffer(__new_middle, __second_cut, __last,
> > >>> __len1 - __len11, __len2 - __len22, __comp);
> > >>> }
> > >>>
> > >>> /// This is a helper function for the stable sorting routines.
> > >>> template<typename _RandomAccessIterator, typename _Compare>
> > >>> void
> > >>> _my_inplace_stable_sort(_RandomAccessIterator __first,
> > >>> _RandomAccessIterator __last, _Compare __comp)
> > >>> {
> > >>>
> > >>> if (__last - __first < 15)
> > >>> {
> > >>> std::__insertion_sort(__first, __last, __comp);
> > >>> return;
> > >>> }
> > >>> _RandomAccessIterator __middle = __first + (__last - __first)
> / 2;
> > >>> _my_inplace_stable_sort(__first, __middle, __comp);
> > >>> _my_inplace_stable_sort(__middle, __last, __comp);
> > >>> _my_merge_without_buffer(__first, __middle, __last,
> > >>> __middle - __first,
> > >>> __last - __middle,
> > >>> __comp);
> > >>> }
> > >>>
> > >>>
> > >>> template<typename Iterator>
> > >>> void my_inplace_stable_sort(Iterator start, Iterator end) {
> > >>> _my_inplace_stable_sort(start, end,
> __gnu_cxx::__ops::__iter_less_iter());
> > >>> }
> > >>>
> > >>> template<typename Iterator>
> > >>> void std_inplace_stable_sort(Iterator start, Iterator end) {
> > >>> std::__inplace_stable_sort(start, end,
> __gnu_cxx::__ops::__iter_less_iter());
> > >>> }
> > >>>
> > >>> void benchmark(string const& dist_name,
> > >>> vector<int> dist_func(int),
> > >>> int n) {
> > >>> vector<int> v = dist_func(n);
> > >>> cout << "Distribution: " << dist_name << "(" << n << ")" << endl;
> > >>> cout << "Example: " << dist_func(20) << endl;
> > >>> TestCase testCase(v);
> > >>>
> > >>> pair<Counter, Counter> counter_pair =
> > >>> testCase.benchmark(
> > >>> my_inplace_stable_sort
> > >>> , std_inplace_stable_sort
> > >>> );
> > >>> Counter cKloet = counter_pair.first;
> > >>> Counter cInplace = counter_pair.second;
> > >>>
> > >>> relative_report(cKloet, cInplace);
> > >>> }
> > >>>
> > >>>
> > >>> vector<int> random_unique(int n) {
> > >>> vector<int> v(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = i;
> > >>> }
> > >>> for (int i = n - 1; i >= 0; --i) {
> > >>> swap(v[i], v[rand() % (i + 1)]);
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> random_dupes(int n) {
> > >>> vector<int> v(n);
> > >>> int r = (int)sqrt(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = i / r;
> > >>> }
> > >>> for (int i = n - 1; i >= 0; --i) {
> > >>> swap(v[i], v[rand() % (i + 1)]);
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> increasing_unique(int n) {
> > >>> vector<int> v(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = i;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>>
> > >>> vector<int> increasing_dupes(int n) {
> > >>> vector<int> v(n);
> > >>> int r = (int)sqrt(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = i / r;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>>
> > >>> vector<int> mostly_increasing(int n) {
> > >>> vector<int> v(n);
> > >>> int r = (int)sqrt(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = i + rand() % r;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>>
> > >>> vector<int> decreasing_unique(int n) {
> > >>> vector<int> v(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = n - 1 - i;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> decreasing_dupes(int n) {
> > >>> vector<int> v(n);
> > >>> int r = (int)sqrt(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = (n - 1 - i) / r;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> mostly_decreasing(int n) {
> > >>> vector<int> v(n);
> > >>> int r = (int)sqrt(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = n - 1 - i + rand() % r;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> sawtooth_unique(int n) {
> > >>> int r = (int)sqrt(n);
> > >>> while (gcd(r, n) != 1) ++r;
> > >>> vector<int> v(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[(i * r) % n] = i;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> sawtooth_dupes(int n) {
> > >>> int r = (int)sqrt(n);
> > >>> vector<int> v(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = i % r;
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> increasing_with_swap(int n) {
> > >>> vector<int> v(n);
> > >>> for (int i = 0; i < n; ++i) {
> > >>> v[i] = i;
> > >>> }
> > >>> swap(v[0], v[n - 1]);
> > >>> return v;
> > >>> }
> > >>>
> > >>> vector<int> increasing_fast_slow(int n) {
> > >>> int r = (int)sqrt(n);
> > >>> vector<int> v(n);
> > >>> for (int i = 0; i < r; ++i) {
> > >>> v[i] = i * r;
> > >>> }
> > >>> for (int i = r; i < n; ++i) {
> > >>> v[i] = (i - r) * n / (n - r);
> > >>> }
> > >>> return v;
> > >>> }
> > >>>
> > >>>
> > >>> #define BENCHMARK(dist, n) benchmark(#dist, dist, n)
> > >>>
> > >>> int main() {
> > >>> int n = 100000;
> > >>> BENCHMARK(random_unique, n);
> > >>> BENCHMARK(random_dupes, n);
> > >>> BENCHMARK(increasing_unique, n);
> > >>> BENCHMARK(increasing_dupes, n);
> > >>> BENCHMARK(mostly_increasing, n);
> > >>> BENCHMARK(decreasing_unique, n);
> > >>> BENCHMARK(decreasing_dupes, n);
> > >>> BENCHMARK(mostly_decreasing, n);
> > >>> BENCHMARK(sawtooth_unique, n);
> > >>> BENCHMARK(sawtooth_dupes, n);
> > >>> BENCHMARK(increasing_with_swap, n);
> > >>> BENCHMARK(increasing_fast_slow, n);
> > >>>
> > >>> return 0;
> > >>> }
> > >>
> >
>
prev parent reply other threads:[~2021-12-10 15:31 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-18 12:58 David de Kloet
2021-11-24 18:33 ` François Dumont
2021-11-24 18:48 ` David de Kloet
2021-11-25 5:35 ` François Dumont
2021-11-25 19:51 ` David de Kloet
2021-12-10 15:31 ` David de Kloet [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAKxRKidz=HXfnFbZSw8mXPuBt+83MfE=Y8G5iVEvrrVqP0wbbA@mail.gmail.com' \
--to=dskloet@gmail.com \
--cc=frs.dumont@gmail.com \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).