On Mon, 22 Jan 2024, 21:17 François Dumont, <frs.dumont@gmail.com> wrote:

> Hi
>
> Here is what I compiled from your proposals with this commit message:
>
> Author: Huanghui Nie <nnnjkk@gmail.com>
> Date:   Mon Jan 22 06:45:48 2024 +0100
>
>      libstdc++: [_Hashtable] Remove useless check for _M_before_begin
>
>      When removing the first node of a bucket it is useless to check if
> this bucket
>      is the one containing the _M_before_begin node. The bucket
> before-begin node is
>      already transfered to the next pointed-to bucket regardeless if it
> is the container
>      before-begin node.
>
>      libstdc++-v3/ChangeLog:
>
>              * include/bits/hashtable.h
> (_Hahstable<>::_M_remove_bucket_begin): Remove
>              _M_before_begin check and cleanup implementation.
>
>      Co-authored-by: Théo Papadopoulo <papadopoulo@gmail.com>
>
> Let me know if ok to commit ?
>

OK, thanks



> François
>
>
> On 22/01/2024 14:42, Théo Papadopoulo wrote:
> > On 1/22/24 14:29, Jonathan Wakely wrote:
> >> On Mon, 22 Jan 2024 at 13:22, Théo Papadopoulo
> >> <papadopoulo@gmail.com> wrote:
> >>> On 1/22/24 11:07, Jonathan Wakely wrote:
> >>>> On Thu, 18 Jan 2024 at 15:59, Théo Papadopoulo
> >>>> <papadopoulo@gmail.com> wrote:
> >>>>> Wouldn't it be clearer if written as:
> >>>>>
> >>>>> if (!__next_n) {
> >>>>>       _M_buckets[__bkt] = nullptr;
> >>>>> } else if (__next_bkt != __bkt) {
> >>>>>       _M_buckets[__next_bkt] = _M_buckets[__bkt];
> >>>>>       _M_buckets[__bkt] = nullptr;
> >>>>> }
> >>>>>
> >>>>> I think this is strictly equivalent and clearer.
> >>>> But it does two branches, when the second one isn't necessary. If
> >>>> __next_bkt == __bkt the assignment is harmless.
> >>>>
> >>>> I think it's much clearer the way Huanghui Nie originally suggested.
> >>> But there are two branches in the original code as well (the ifs are
> >>> just nested)...
> >>> And they are sequential in both cases (with the !next_n on the
> >>> faster path).
> >>>    The condition are just a bit simpler in the proposed new form.
> >>
> >> Ah, yes, so you mean also replacing the earlier condition:
> >>          if (!__next_n || __next_bkt != __bkt)
> >>
> >> I was only looking at the changed code not the earlier context.
> >> /facepalm
> >>
> >> Yes, looking at the full context I like your suggestion.
> >
> > Similarly, with your response, I too understand better your answer !!!
> >
> >
> >>
> >>> Anyway, as a (the) maintainer, your opinion is the most important
> >>> and if
> >>> you are more at ease
> >>> with one form than the other, so be it.
> >>>
> >>>       Theo.
> >>>
> >>>
> >>>
> >>>>>       Theo.
> >>>>>
> >>>>>
> >>>>> On 1/18/24 10:26, Huanghui Nie wrote:
> >>>>>
> >>>>> Yes, I have. I did a benchmark today.
> >>>>>
> >>>>> The conclusion is: the time consumption can be reduced by 0.4% ~
> >>>>> 1.2% when unordered_set erase(begin()), and 1.2% ~ 2.4% when
> >>>>> erase(begin(), end()).
> >>>>>
> >>>>>
> >>>>> My test environment:
> >>>>>
> >>>>> CPU: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 2393.365 MHz, 56 CPUs
> >>>>>
> >>>>> MEM: 256G
> >>>>>
> >>>>> OS: CentOS-8.2
> >>>>>
> >>>>> g++: gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)
> >>>>>
> >>>>> Compile flags: -O3 -std=c++17
> >>>>>
> >>>>>
> >>>>> Test conclusion data (time taken to delete every 100 million
> >>>>> elements):
> >>>>>
> >>>>> erase(begin()):
> >>>>>
> >>>>> |size of unordered_set   |100     |1,000   |10,000 |100,000
> >>>>> |1,000,000|10,000,000|
> >>>>>
> >>>>> |base time consuming
> >>>>> (ms)|3827.736|3807.725|3830.168|3807.373|3798.713 |3854.168  |
> >>>>>
> >>>>> |test time consuming
> >>>>> (ms)|3783.406|3789.460|3791.146|3778.033|3783.494 |3808.137  |
> >>>>>
> >>>>> |Time-consuming reduction|1.16%   |0.48%   |1.02% |0.77%
> >>>>> |0.40%    |1.19%     |
> >>>>>
> >>>>> erase(begin(),end()):
> >>>>>
> >>>>> |size of unordered_set   |100     |1,000   |10,000 |100,000
> >>>>> |1,000,000|10,000,000|
> >>>>>
> >>>>> |base time consuming
> >>>>> (ms)|2779.229|2768.550|2795.778|2767.385|2761.521 |2804.099  |
> >>>>>
> >>>>> |test time consuming
> >>>>> (ms)|2712.759|2726.578|2752.224|2732.140|2718.953 |2739.727  |
> >>>>>
> >>>>> |Time-consuming reduction|2.39%   |1.52%   |1.56% |1.27%
> >>>>> |1.54%    |2.30%     |
> >>>>>
> >>>>>
> >>>>> Please see the attachment for test code and detailed test result.
> >>>>>
> >>>>>
> >>>>> 2024年1月18日(木) 4:04 François Dumont <frs.dumont@gmail.com>:
> >>>>>> Hi
> >>>>>>
> >>>>>> Looks like a great finding to me, this is indeed a useless check,
> >>>>>> thanks!
> >>>>>>
> >>>>>> Have you any figures on the performance enhancement ? It might
> >>>>>> help to get proper approval as gcc is currently in dev stage 4
> >>>>>> that is to say only bug fixes normally.
> >>>>>>
> >>>>>> François
> >>>>>>
> >>>>>> On 17/01/2024 09:11, Huanghui Nie wrote:
> >>>>>>
> >>>>>> Hi.
> >>>>>>
> >>>>>> When I implemented a hash table with reference to the C++ STL, I
> >>>>>> found that when the hash table in the C++ STL deletes elements,
> >>>>>> if the first element deleted is the begin element, the before
> >>>>>> begin node is repeatedly assigned. This creates unnecessary
> >>>>>> performance overhead.
> >>>>>>
> >>>>>>
> >>>>>> First, let’s see the code implementation:
> >>>>>>
> >>>>>> In _M_remove_bucket_begin, _M_before_begin._M_nxt is assigned
> >>>>>> when &_M_before_begin == _M_buckets[__bkt]. That also means
> >>>>>> _M_buckets[__bkt]->_M_nxt is assigned under some conditions.
> >>>>>>
> >>>>>> _M_remove_bucket_begin is called by _M_erase and _M_extract_node:
> >>>>>>
> >>>>>> Case _M_erase a range: _M_remove_bucket_begin is called in a for
> >>>>>> loop when __is_bucket_begin is true. And if __is_bucket_begin is
> >>>>>> true and &_M_before_begin == _M_buckets[__bkt], __prev_n must be
> >>>>>> &_M_before_begin. __prev_n->_M_nxt is always assigned in
> >>>>>> _M_erase. That means _M_before_begin._M_nxt is always assigned,
> >>>>>> if _M_remove_bucket_begin is called and &_M_before_begin ==
> >>>>>> _M_buckets[__bkt]. So there’s no need to assign
> >>>>>> _M_before_begin._M_nxt in _M_remove_bucket_begin.
> >>>>>> Other cases: _M_remove_bucket_begin is called when __prev_n ==
> >>>>>> _M_buckets[__bkt]. And __prev_n->_M_nxt is always assigned in
> >>>>>> _M_erase and _M_before_begin. That means
> >>>>>> _M_buckets[__bkt]->_M_nxt is always assigned. So there's no need
> >>>>>> to assign _M_buckets[__bkt]->_M_nxt in _M_remove_bucket_begin.
> >>>>>>
> >>>>>> In summary, there’s no need to check &_M_before_begin ==
> >>>>>> _M_buckets[__bkt] and assign _M_before_begin._M_nxt in
> >>>>>> _M_remove_bucket_begin.
> >>>>>>
> >>>>>>
> >>>>>> Then let’s see the responsibility of each method:
> >>>>>>
> >>>>>> The hash table in the C++ STL is composed of hash buckets and a
> >>>>>> node list. The update of the node list is responsible for
> >>>>>> _M_erase and _M_extract_node method. _M_remove_bucket_begin
> >>>>>> method only needs to update the hash buckets. The update of
> >>>>>> _M_before_begin belongs to the update of the node list. So
> >>>>>> _M_remove_bucket_begin doesn’t need to update _M_before_begin.
> >>>>>>
> >>>>>>
> >>>>>> Existing tests listed below cover this change:
> >>>>>>
> >>>>>> 23_containers/unordered_set/allocator/copy.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/allocator/copy_assign.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/allocator/move.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/allocator/move_assign.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/allocator/swap.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/erase/1.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/erase/24061-set.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/modifiers/extract.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/operations/count.cc
> >>>>>>
> >>>>>> 23_containers/unordered_set/requirements/exception/basic.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/allocator/copy.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/allocator/copy_assign.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/allocator/move.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/allocator/move_assign.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/allocator/swap.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/erase/1.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/erase/24061-map.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/modifiers/extract.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/modifiers/move_assign.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/operations/count.cc
> >>>>>>
> >>>>>> 23_containers/unordered_map/requirements/exception/basic.cc
> >>>>>>
> >>>>>>
> >>>>>> Regression tested on x86_64-pc-linux-gnu. Is it OK to commit?
> >>>>>>
> >>>>>>
> >>>>>> ---
> >>>>>>
> >>>>>> ChangeLog:
> >>>>>>
> >>>>>>
> >>>>>> libstdc++: hashtable: No need to update before begin node in
> >>>>>> _M_remove_bucket_begin
> >>>>>>
> >>>>>>
> >>>>>> 2024-01-16  Huanghui Nie  <nnnjkk@gmail.com>
> >>>>>>
> >>>>>>
> >>>>>> gcc/
> >>>>>>
> >>>>>> * libstdc++-v3/include/bits/hashtable.h
> >>>>>>
> >>>>>>
> >>>>>> ---
> >>>>>>
> >>>>>>
> >>>>>> diff --git a/libstdc++-v3/include/bits/hashtable.h
> >>>>>> b/libstdc++-v3/include/bits/hashtable.h
> >>>>>>
> >>>>>> index b48610036fa..6056639e663 100644
> >>>>>>
> >>>>>> --- a/libstdc++-v3/include/bits/hashtable.h
> >>>>>>
> >>>>>> +++ b/libstdc++-v3/include/bits/hashtable.h
> >>>>>>
> >>>>>> @@ -872,13 +872,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>>>>>
> >>>>>>           if (!__next_n || __next_bkt != __bkt)
> >>>>>>
> >>>>>>             {
> >>>>>>
> >>>>>>               // Bucket is now empty
> >>>>>>
> >>>>>> -           // First update next bucket if any
> >>>>>>
> >>>>>> +           // Update next bucket if any
> >>>>>>
> >>>>>>               if (__next_n)
> >>>>>>
> >>>>>>                 _M_buckets[__next_bkt] = _M_buckets[__bkt];
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -           // Second update before begin node if necessary
> >>>>>>
> >>>>>> -           if (&_M_before_begin == _M_buckets[__bkt])
> >>>>>>
> >>>>>> -             _M_before_begin._M_nxt = __next_n;
> >>>>>>
> >>>>>>               _M_buckets[__bkt] = nullptr;
> >>>>>>
> >>>>>>             }
> >>>>>>
> >>>>>>          }
> >>>>>>
> >>>>>>
> >