From: chang jc <r97922153@gmail.com>
To: Jason Merrill <jason@redhat.com>
Cc: gcc-patches List <gcc-patches@gcc.gnu.org>,
"libstdc++" <libstdc++@gcc.gnu.org>
Subject: Re: [C++ PATCH] Speed up inplace_merge algorithm & fix inefficient logic(PR c++/83938)
Date: Thu, 25 Jan 2018 23:10:00 -0000
Message-ID: <CALnYPH__r8ragj4j0nAY2DKOUFZafYoEEbsjJmMp0Cw-3x=-2w@mail.gmail.com>
In-Reply-To: <CADzB+2kDq7A_h_+YdgLHQsqO+rhZabndHzxLmqQ5+bf3jHobxw@mail.gmail.com>
Hi:
1. As you pointed out, __len = (__len + 1) / 2; needs to be changed to
__len = (__len == 1) ? 0 : ((__len + 1) / 2);
2. The performance gain has already been shown in PR c++/83938; I re-post the figures here:
Buffer size (log2)   21         20         19         18         17         16
First series         0.471136   0.625695   0.767262   0.907461   1.04838    1.19508
Second series        0.340845   0.48651    0.639139   0.770133   0.898454   1.04632
These figures are for merging [0, 4325376, 16777216): A is a sorted run of
4325376 integers and B a sorted run of 12451840 integers, 16M integers in total.
The proposed method gives a speedup for each given buffer size, i.e.
2^16, 2^17, ... 2^21 in units of sizeof(int); for example, 2^16 means
a buffer of sizeof(int)*64K bytes.
3. As you suggested, _TmpBuf __buf should be rewritten.
4. It reflects the fact that the intuitive idea of splitting the larger
part is wrong.
For example, suppose the input is two sorted arrays A & B, where A has 8 integers
& B has 24 integers, and the tmp buffer has a capacity of 4 integers.
Currently it tries to split B, right?
Then we have:
A1 | A2 | B1 | B2
B1 & B2 have 12 integers each, right?
The current algorithm selects the 13th integer of B as the pivot, right?
Suppose the corresponding upper bound in A is the 6th integer.
Then it splits into
A1 = 5 | A2 = 3 | B1 = 12 | B2 = 12
After the rotate, we have two pairs of arrays to merge:
[A1 = 5 | B1 = 12] & [A2 = 3 | B2 = 12]
Great, [A2 = 3 | B2 = 12] can use the tmp buffer to merge.
Sadly, [A1 = 5 | B1 = 12] CANNOT.
So we have to split, rotate, and merge the two resulting sub-arrays of
[A1 = 5 | B1 = 12] again.
But wait: what if we always split the smaller one instead of the larger one?
After the rotate, each of the two split arrays is guaranteed to contain at most
ceiling[small/2] elements of the smaller part.
The tmp buffer is also allocated starting from sizeof(small) and is
recursively reduced to ceiling[small/2^(# of failed allocations)].
This means the allocated tmp buffer is guaranteed to be sufficient at
recursion level (# of failed allocations).
In contrast, when splitting the larger part, at level (# of failed allocations)
several split arrays still CANNOT use the tmp buffer for a buffered merge.
As you know, a buffered merge is far faster than (split, rotate, and
merge two sub-arrays) (PR c++/83938 gives the profiling figures),
so this approach should provide a speedup.
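The 8/24 example above can be replayed with a small toy model. This is only a sketch under stated assumptions, not the libstdc++ code: toy_merge_adaptive, merge_runs and MergeStats are hypothetical names, the buffered merge is replaced by a plain std::merge through a full copy, and a capacity of 0 is treated as 1. It counts how many split-and-rotate steps each policy needs before every sub-merge fits the buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counters for the toy model; the names are illustrative, not libstdc++'s.
struct MergeStats {
  std::size_t buffered_merges = 0;  // sub-merges whose smaller run fit the buffer
  std::size_t rotations = 0;        // split-and-rotate steps that were needed
};

// Merges the sorted runs [first, middle) and [middle, last) of v.
// cap is the scratch capacity in elements; split_smaller picks the policy.
void toy_merge_adaptive(std::vector<int>& v, std::size_t first,
                        std::size_t middle, std::size_t last, std::size_t cap,
                        bool split_smaller, MergeStats& stats) {
  const std::size_t len1 = middle - first;
  const std::size_t len2 = last - middle;
  if (len1 == 0 || len2 == 0) return;
  // Assumption: treat cap 0 as 1 so that this recursion always terminates.
  cap = std::max<std::size_t>(cap, 1);

  if (std::min(len1, len2) <= cap) {
    // Stand-in for the buffered merge. It copies the whole range for
    // brevity; only the decision "fits the buffer" is being modelled.
    std::vector<int> tmp(v.begin() + first, v.begin() + last);
    std::merge(tmp.begin(), tmp.begin() + len1, tmp.begin() + len1, tmp.end(),
               v.begin() + first);
    ++stats.buffered_merges;
    return;
  }

  // Choose which run to halve: the smaller one (proposed) or the larger one.
  const bool split_first = split_smaller ? (len1 < len2) : (len1 > len2);
  std::size_t cut1, cut2;
  if (split_first) {
    cut1 = first + len1 / 2;
    cut2 = static_cast<std::size_t>(
        std::lower_bound(v.begin() + middle, v.begin() + last, v[cut1]) -
        v.begin());
  } else {
    cut2 = middle + len2 / 2;
    cut1 = static_cast<std::size_t>(
        std::upper_bound(v.begin() + first, v.begin() + middle, v[cut2]) -
        v.begin());
  }
  // Swap the blocks [cut1, middle) and [middle, cut2).
  std::rotate(v.begin() + cut1, v.begin() + middle, v.begin() + cut2);
  ++stats.rotations;
  const std::size_t new_middle = cut1 + (cut2 - middle);
  toy_merge_adaptive(v, first, cut1, new_middle, cap, split_smaller, stats);
  toy_merge_adaptive(v, new_middle, cut2, last, cap, split_smaller, stats);
}

// Concatenates a and b, merges them with the toy model, returns the result.
std::vector<int> merge_runs(const std::vector<int>& a,
                            const std::vector<int>& b, std::size_t cap,
                            bool split_smaller, MergeStats& stats) {
  std::vector<int> v(a);
  v.insert(v.end(), b.begin(), b.end());
  toy_merge_adaptive(v, 0, a.size(), v.size(), cap, split_smaller, stats);
  return v;
}
```

With A = {1, 3, 5, 7, 9, 13, 15, 17}, B = 0..23 and a capacity of 4, the upper bound in A falls on the 6th integer as in the example. In this model, splitting the larger run then takes two rotations and three buffered merges, while splitting the smaller run takes one rotation and two buffered merges.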
Thanks.
On 24/01/2018 18:23, François Dumont wrote:
Hi
It sounds like a very sensitive change to make, but nothing beats figures.
Do you have any benchmark showing the size of the gain?
At least the memory usage optimization is obvious.
On 19/01/2018 10:43, chang jc wrote:
The current std::inplace_merge() suffers from a performance issue caused by
inefficient logic under limited memory.
This leads to degraded performance.
Please help review it.
Index: include/bits/stl_algo.h
===================================================================
--- include/bits/stl_algo.h (revision 256871)
+++ include/bits/stl_algo.h (working copy)
@@ -2437,7 +2437,7 @@
_BidirectionalIterator __second_cut = __middle;
_Distance __len11 = 0;
_Distance __len22 = 0;
- if (__len1 > __len2)
+ if (__len1 < __len2)
{
__len11 = __len1 / 2;
std::advance(__first_cut, __len11);
@@ -2539,9 +2539,15 @@
const _DistanceType __len1 = std::distance(__first, __middle);
const _DistanceType __len2 = std::distance(__middle, __last);
+
typedef _Temporary_buffer<_BidirectionalIterator, _ValueType>
_TmpBuf;
- _TmpBuf __buf(__first, __last);
-
+ _BidirectionalIterator __start, __end;
+ if (__len1 < __len2) {
+ __start = __first; __end = __middle;
+ } else {
+ __start = __middle; __end = __last;
+ }
+ _TmpBuf __buf(__start, __end);
Note another possible optimization: we could introduce a _Temporary_buffer<> constructor
in order to write:
_TmpBuf __buf(std::min(__len1, __len2), __first);
so that std::distance does not need to be called again.
if (__buf.begin() == 0)
std::__merge_without_buffer
(__first, __middle, __last, __len1, __len2, __comp);
Index: include/bits/stl_tempbuf.h
===================================================================
--- include/bits/stl_tempbuf.h (revision 256871)
+++ include/bits/stl_tempbuf.h (working copy)
@@ -95,7 +95,7 @@
std::nothrow));
if (__tmp != 0)
return std::pair<_Tp*, ptrdiff_t>(__tmp, __len);
- __len /= 2;
+ __len = (__len + 1) / 2;
This part is problematic: if __len is 1 and the allocation fails, it will loop
forever.
It doesn't really seem necessary for your patch.
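The looping concern can be checked with a small sketch that models only the length updates of the retry loop, assuming every allocation fails. The function names below are hypothetical; the three step functions correspond to the original `__len /= 2`, the patched `(__len + 1) / 2`, and the guarded form proposed at the top of this message.

```cpp
#include <cstddef>

// Original step in the retry loop: floor halving.
std::ptrdiff_t halve_floor(std::ptrdiff_t len) { return len / 2; }

// Step from the first version of the patch: ceiling halving.
// Note that 1 maps to 1, so the length never reaches 0.
std::ptrdiff_t halve_ceil(std::ptrdiff_t len) { return (len + 1) / 2; }

// Guarded ceiling halving: 1 maps to 0 so the loop can give up.
std::ptrdiff_t halve_ceil_guarded(std::ptrdiff_t len) {
  return (len == 1) ? 0 : (len + 1) / 2;
}

// Counts the failed attempts before the length reaches 0.
// Returns -1 if the length is still positive after `limit` attempts,
// which stands in for "loops forever" in this model.
int failures_until_give_up(std::ptrdiff_t len,
                           std::ptrdiff_t (*next)(std::ptrdiff_t),
                           int limit) {
  int failures = 0;
  while (len > 0) {
    if (failures == limit) return -1;
    len = next(len);  // the allocation of `len` elements failed; shrink
    ++failures;
  }
  return failures;
}
```

Starting from a length of 5, the floor version gives up after 3 failures (5, 2, 1), the guarded ceiling version after 4 (5, 3, 2, 1), and the unguarded ceiling version never does, since it stays at 1.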
2018-01-20 4:05 GMT+08:00 Jason Merrill <jason@redhat.com>:
> This is a libstdc++ bug and patch, not the C++ front end. So I'm
> adding the libstdc++ list to CC.
>
> On Fri, Jan 19, 2018 at 3:02 AM, chang jc <r97922153@gmail.com> wrote:
> > The current std::inplace_merge() suffers from a performance issue caused
> > by inefficient logic under limited memory.
> >
> > This leads to degraded performance.
> >
> > Please help to review it.
> >
> > Index: include/bits/stl_algo.h
> > ===================================================================
> > --- include/bits/stl_algo.h (revision 256871)
> > +++ include/bits/stl_algo.h (working copy)
> > @@ -2437,7 +2437,7 @@
> > _BidirectionalIterator __second_cut = __middle;
> > _Distance __len11 = 0;
> > _Distance __len22 = 0;
> > - if (__len1 > __len2)
> > + if (__len1 < __len2)
> > {
> > __len11 = __len1 / 2;
> > std::advance(__first_cut, __len11);
> > @@ -2539,9 +2539,15 @@
> > const _DistanceType __len1 = std::distance(__first, __middle);
> > const _DistanceType __len2 = std::distance(__middle, __last);
> >
> > +
> > typedef _Temporary_buffer<_BidirectionalIterator, _ValueType>
> > _TmpBuf;
> > - _TmpBuf __buf(__first, __last);
> > -
> > + _BidirectionalIterator __start, __end;
> > + if (__len1 < __len2) {
> > + __start = __first; __end = __middle;
> > + } else {
> > + __start = __middle; __end = __last;
> > + }
> > + _TmpBuf __buf(__start, __end);
> > if (__buf.begin() == 0)
> > std::__merge_without_buffer
> > (__first, __middle, __last, __len1, __len2, __comp);
> > Index: include/bits/stl_tempbuf.h
> > ===================================================================
> > --- include/bits/stl_tempbuf.h (revision 256871)
> > +++ include/bits/stl_tempbuf.h (working copy)
> > @@ -95,7 +95,7 @@
> > std::nothrow));
> > if (__tmp != 0)
> > return std::pair<_Tp*, ptrdiff_t>(__tmp, __len);
> > - __len /= 2;
> > + __len = (__len + 1) / 2;
> > }
> > return std::pair<_Tp*, ptrdiff_t>(static_cast<_Tp*>(0), 0);
> > }
> >
> >
> >
> >
> > Thanks
>