On Mon, 22 Jan 2024, 21:17 François Dumont, wrote: > Hi > > Here is what I compiled from your proposals with this commit message: > > Author: Huanghui Nie > Date: Mon Jan 22 06:45:48 2024 +0100 > > libstdc++: [_Hashtable] Remove useless check for _M_before_begin > > When removing the first node of a bucket it is useless to check if > this bucket > is the one containing the _M_before_begin node. The bucket > before-begin node is > already transfered to the next pointed-to bucket regardeless if it > is the container > before-begin node. > > libstdc++-v3/ChangeLog: > > * include/bits/hashtable.h > (_Hahstable<>::_M_remove_bucket_begin): Remove > _M_before_begin check and cleanup implementation. > > Co-authored-by: Théo Papadopoulo > > Let me know if ok to commit ? > OK, thanks > François > > > On 22/01/2024 14:42, Théo Papadopoulo wrote: > > On 1/22/24 14:29, Jonathan Wakely wrote: > >> On Mon, 22 Jan 2024 at 13:22, Théo Papadopoulo > >> wrote: > >>> On 1/22/24 11:07, Jonathan Wakely wrote: > >>>> On Thu, 18 Jan 2024 at 15:59, Théo Papadopoulo > >>>> wrote: > >>>>> Wouldn't it be clearer if written as: > >>>>> > >>>>> if (!__next_n) { > >>>>> _M_buckets[__bkt] = nullptr; > >>>>> } else if (__next_bkt != __bkt) { > >>>>> _M_buckets[__next_bkt] = _M_buckets[__bkt]; > >>>>> _M_buckets[__bkt] = nullptr; > >>>>> } > >>>>> > >>>>> I think this is strictly equivalent and clearer. > >>>> But it does two branches, when the second one isn't necessary. If > >>>> __next_bkt == __bkt the assignment is harmless. > >>>> > >>>> I think it's much clearer the way Huanghui Nie originally suggested. > >>> But there are two branches in the original code as well (the ifs are > >>> just nested)... > >>> And they are sequential in both cases (with the !next_n on the > >>> faster path). > >>> The condition are just a bit simpler in the proposed new form. > >> > >> Ah, yes, so you mean also replacing the earlier condition: > >> if (!__next_n || __next_bkt != __bkt) > >> > >> I was only looking at the changed code not the earlier context. > >> /facepalm > >> > >> Yes, looking at the full context I like your suggestion. > > > > Similarly, with your response, I too understand better your answer !!! > > > > > >> > >>> Anyway, as a (the) maintainer, your opinion is the most important > >>> and if > >>> you are more at ease > >>> with one form than the other, so be it. > >>> > >>> Theo. > >>> > >>> > >>> > >>>>> Theo. > >>>>> > >>>>> > >>>>> On 1/18/24 10:26, Huanghui Nie wrote: > >>>>> > >>>>> Yes, I have. I did a benchmark today. > >>>>> > >>>>> The conclusion is: the time consumption can be reduced by 0.4% ~ > >>>>> 1.2% when unordered_set erase(begin()), and 1.2% ~ 2.4% when > >>>>> erase(begin(), end()). > >>>>> > >>>>> > >>>>> My test environment: > >>>>> > >>>>> CPU: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 2393.365 MHz, 56 CPUs > >>>>> > >>>>> MEM: 256G > >>>>> > >>>>> OS: CentOS-8.2 > >>>>> > >>>>> g++: gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC) > >>>>> > >>>>> Compile flags: -O3 -std=c++17 > >>>>> > >>>>> > >>>>> Test conclusion data (time taken to delete every 100 million > >>>>> elements): > >>>>> > >>>>> erase(begin()): > >>>>> > >>>>> |size of unordered_set |100 |1,000 |10,000 |100,000 > >>>>> |1,000,000|10,000,000| > >>>>> > >>>>> |base time consuming > >>>>> (ms)|3827.736|3807.725|3830.168|3807.373|3798.713 |3854.168 | > >>>>> > >>>>> |test time consuming > >>>>> (ms)|3783.406|3789.460|3791.146|3778.033|3783.494 |3808.137 | > >>>>> > >>>>> |Time-consuming reduction|1.16% |0.48% |1.02% |0.77% > >>>>> |0.40% |1.19% | > >>>>> > >>>>> erase(begin(),end()): > >>>>> > >>>>> |size of unordered_set |100 |1,000 |10,000 |100,000 > >>>>> |1,000,000|10,000,000| > >>>>> > >>>>> |base time consuming > >>>>> (ms)|2779.229|2768.550|2795.778|2767.385|2761.521 |2804.099 | > >>>>> > >>>>> |test time consuming > >>>>> (ms)|2712.759|2726.578|2752.224|2732.140|2718.953 |2739.727 | > >>>>> > >>>>> |Time-consuming reduction|2.39% |1.52% |1.56% |1.27% > >>>>> |1.54% |2.30% | > >>>>> > >>>>> > >>>>> Please see the attachment for test code and detailed test result. > >>>>> > >>>>> > >>>>> 2024年1月18日(木) 4:04 François Dumont : > >>>>>> Hi > >>>>>> > >>>>>> Looks like a great finding to me, this is indeed a useless check, > >>>>>> thanks! > >>>>>> > >>>>>> Have you any figures on the performance enhancement ? It might > >>>>>> help to get proper approval as gcc is currently in dev stage 4 > >>>>>> that is to say only bug fixes normally. > >>>>>> > >>>>>> François > >>>>>> > >>>>>> On 17/01/2024 09:11, Huanghui Nie wrote: > >>>>>> > >>>>>> Hi. > >>>>>> > >>>>>> When I implemented a hash table with reference to the C++ STL, I > >>>>>> found that when the hash table in the C++ STL deletes elements, > >>>>>> if the first element deleted is the begin element, the before > >>>>>> begin node is repeatedly assigned. This creates unnecessary > >>>>>> performance overhead. > >>>>>> > >>>>>> > >>>>>> First, let’s see the code implementation: > >>>>>> > >>>>>> In _M_remove_bucket_begin, _M_before_begin._M_nxt is assigned > >>>>>> when &_M_before_begin == _M_buckets[__bkt]. That also means > >>>>>> _M_buckets[__bkt]->_M_nxt is assigned under some conditions. > >>>>>> > >>>>>> _M_remove_bucket_begin is called by _M_erase and _M_extract_node: > >>>>>> > >>>>>> Case _M_erase a range: _M_remove_bucket_begin is called in a for > >>>>>> loop when __is_bucket_begin is true. And if __is_bucket_begin is > >>>>>> true and &_M_before_begin == _M_buckets[__bkt], __prev_n must be > >>>>>> &_M_before_begin. __prev_n->_M_nxt is always assigned in > >>>>>> _M_erase. That means _M_before_begin._M_nxt is always assigned, > >>>>>> if _M_remove_bucket_begin is called and &_M_before_begin == > >>>>>> _M_buckets[__bkt]. So there’s no need to assign > >>>>>> _M_before_begin._M_nxt in _M_remove_bucket_begin. > >>>>>> Other cases: _M_remove_bucket_begin is called when __prev_n == > >>>>>> _M_buckets[__bkt]. And __prev_n->_M_nxt is always assigned in > >>>>>> _M_erase and _M_before_begin. That means > >>>>>> _M_buckets[__bkt]->_M_nxt is always assigned. So there's no need > >>>>>> to assign _M_buckets[__bkt]->_M_nxt in _M_remove_bucket_begin. > >>>>>> > >>>>>> In summary, there’s no need to check &_M_before_begin == > >>>>>> _M_buckets[__bkt] and assign _M_before_begin._M_nxt in > >>>>>> _M_remove_bucket_begin. > >>>>>> > >>>>>> > >>>>>> Then let’s see the responsibility of each method: > >>>>>> > >>>>>> The hash table in the C++ STL is composed of hash buckets and a > >>>>>> node list. The update of the node list is responsible for > >>>>>> _M_erase and _M_extract_node method. _M_remove_bucket_begin > >>>>>> method only needs to update the hash buckets. The update of > >>>>>> _M_before_begin belongs to the update of the node list. So > >>>>>> _M_remove_bucket_begin doesn’t need to update _M_before_begin. > >>>>>> > >>>>>> > >>>>>> Existing tests listed below cover this change: > >>>>>> > >>>>>> 23_containers/unordered_set/allocator/copy.cc > >>>>>> > >>>>>> 23_containers/unordered_set/allocator/copy_assign.cc > >>>>>> > >>>>>> 23_containers/unordered_set/allocator/move.cc > >>>>>> > >>>>>> 23_containers/unordered_set/allocator/move_assign.cc > >>>>>> > >>>>>> 23_containers/unordered_set/allocator/swap.cc > >>>>>> > >>>>>> 23_containers/unordered_set/erase/1.cc > >>>>>> > >>>>>> 23_containers/unordered_set/erase/24061-set.cc > >>>>>> > >>>>>> 23_containers/unordered_set/modifiers/extract.cc > >>>>>> > >>>>>> 23_containers/unordered_set/operations/count.cc > >>>>>> > >>>>>> 23_containers/unordered_set/requirements/exception/basic.cc > >>>>>> > >>>>>> 23_containers/unordered_map/allocator/copy.cc > >>>>>> > >>>>>> 23_containers/unordered_map/allocator/copy_assign.cc > >>>>>> > >>>>>> 23_containers/unordered_map/allocator/move.cc > >>>>>> > >>>>>> 23_containers/unordered_map/allocator/move_assign.cc > >>>>>> > >>>>>> 23_containers/unordered_map/allocator/swap.cc > >>>>>> > >>>>>> 23_containers/unordered_map/erase/1.cc > >>>>>> > >>>>>> 23_containers/unordered_map/erase/24061-map.cc > >>>>>> > >>>>>> 23_containers/unordered_map/modifiers/extract.cc > >>>>>> > >>>>>> 23_containers/unordered_map/modifiers/move_assign.cc > >>>>>> > >>>>>> 23_containers/unordered_map/operations/count.cc > >>>>>> > >>>>>> 23_containers/unordered_map/requirements/exception/basic.cc > >>>>>> > >>>>>> > >>>>>> Regression tested on x86_64-pc-linux-gnu. Is it OK to commit? > >>>>>> > >>>>>> > >>>>>> --- > >>>>>> > >>>>>> ChangeLog: > >>>>>> > >>>>>> > >>>>>> libstdc++: hashtable: No need to update before begin node in > >>>>>> _M_remove_bucket_begin > >>>>>> > >>>>>> > >>>>>> 2024-01-16 Huanghui Nie > >>>>>> > >>>>>> > >>>>>> gcc/ > >>>>>> > >>>>>> * libstdc++-v3/include/bits/hashtable.h > >>>>>> > >>>>>> > >>>>>> --- > >>>>>> > >>>>>> > >>>>>> diff --git a/libstdc++-v3/include/bits/hashtable.h > >>>>>> b/libstdc++-v3/include/bits/hashtable.h > >>>>>> > >>>>>> index b48610036fa..6056639e663 100644 > >>>>>> > >>>>>> --- a/libstdc++-v3/include/bits/hashtable.h > >>>>>> > >>>>>> +++ b/libstdc++-v3/include/bits/hashtable.h > >>>>>> > >>>>>> @@ -872,13 +872,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >>>>>> > >>>>>> if (!__next_n || __next_bkt != __bkt) > >>>>>> > >>>>>> { > >>>>>> > >>>>>> // Bucket is now empty > >>>>>> > >>>>>> - // First update next bucket if any > >>>>>> > >>>>>> + // Update next bucket if any > >>>>>> > >>>>>> if (__next_n) > >>>>>> > >>>>>> _M_buckets[__next_bkt] = _M_buckets[__bkt]; > >>>>>> > >>>>>> > >>>>>> > >>>>>> - // Second update before begin node if necessary > >>>>>> > >>>>>> - if (&_M_before_begin == _M_buckets[__bkt]) > >>>>>> > >>>>>> - _M_before_begin._M_nxt = __next_n; > >>>>>> > >>>>>> _M_buckets[__bkt] = nullptr; > >>>>>> > >>>>>> } > >>>>>> > >>>>>> } > >>>>>> > >>>>>> > >