public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "François Dumont" <frs.dumont@gmail.com>
To: Huanghui Nie <nnnjkk@gmail.com>
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libstdc++: hashtable: No need to update before begin node in _M_remove_bucket_begin
Date: Mon, 22 Jan 2024 07:14:31 +0100	[thread overview]
Message-ID: <74164a9e-ae0b-4960-ae2c-ca84cd88eb91@gmail.com> (raw)
In-Reply-To: <CAOv15-XF6EbMe0f6oxO5Yq2h-FZb+Smd6oj5FMQKGPZLd1kYKg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 6868 bytes --]

Thanks, nice result, I'll try to run the performance benchmarks that are 
coming with libstdc++ to see if they spot anything.

That's tests in testsuite/performance folder in case you want to have a 
try yourself.

François


On 18/01/2024 10:26, Huanghui Nie wrote:
>
> Yes, I have. I did a benchmark today.
>
> The conclusion is: the time consumption can be reduced by 0.4% ~ 1.2% 
> when unordered_set erase(begin()), and 1.2% ~ 2.4% when erase(begin(), 
> end()).
>
>
> My test environment:
>
> CPU: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 2393.365 MHz, 56 CPUs
>
> MEM: 256G
>
> OS: CentOS-8.2
>
> g++: gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)
>
> Compile flags: -O3 -std=c++17
>
>
> Test conclusion data (time taken to delete every 100 million elements):
>
> erase(begin()):
>
> |size of unordered_set |100 |1,000 |10,000|100,000 |1,000,000|10,000,000|
>
> |base time consuming (ms)|3827.736|3807.725|3830.168|3807.373|3798.713 
> |3854.168|
>
> |test time consuming (ms)|3783.406|3789.460|3791.146|3778.033|3783.494 
> |3808.137|
>
> |Time-consuming reduction|1.16% |0.48% |1.02% |0.77% |0.40%|1.19% |
>
> erase(begin(),end()):
>
> |size of unordered_set |100 |1,000 |10,000|100,000 |1,000,000|10,000,000|
>
> |base time consuming (ms)|2779.229|2768.550|2795.778|2767.385|2761.521 
> |2804.099|
>
> |test time consuming (ms)|2712.759|2726.578|2752.224|2732.140|2718.953 
> |2739.727|
>
> |Time-consuming reduction|2.39% |1.52% |1.56% |1.27% |1.54%|2.30% |
>
>
> Please see the attachment for test code and detailed test result.
>
>
> 2024年1月18日(木) 4:04 François Dumont <frs.dumont@gmail.com>:
>
>     Hi
>
>     Looks like a great finding to me, this is indeed a useless check,
>     thanks!
>
>     Have you any figures on the performance enhancement ? It might
>     help to get proper approval as gcc is currently in dev stage 4
>     that is to say only bug fixes normally.
>
>     François
>
>     On 17/01/2024 09:11, Huanghui Nie wrote:
>>
>>     Hi.
>>
>>     When I implemented a hash table with reference to the C++ STL, I
>>     found that when the hash table in the C++ STL deletes elements,
>>     if the first element deleted is the begin element, the before
>>     begin node is repeatedly assigned. This creates unnecessary
>>     performance overhead.
>>
>>
>>     First, let’s see the code implementation:
>>
>>     In _M_remove_bucket_begin, _M_before_begin._M_nxt is assigned
>>     when &_M_before_begin == _M_buckets[__bkt]. That also means
>>     _M_buckets[__bkt]->_M_nxt is assigned under some conditions.
>>
>>     _M_remove_bucket_begin is called by _M_erase and _M_extract_node:
>>
>>      1. Case _M_erase a range: _M_remove_bucket_begin is called in a
>>         for loop when __is_bucket_begin is true. And if
>>         __is_bucket_begin is true and &_M_before_begin ==
>>         _M_buckets[__bkt], __prev_n must be &_M_before_begin.
>>         __prev_n->_M_nxt is always assigned in _M_erase. That means
>>         _M_before_begin._M_nxt is always assigned, if
>>         _M_remove_bucket_begin is called and &_M_before_begin ==
>>         _M_buckets[__bkt]. So there’s no need to assign
>>         _M_before_begin._M_nxt in _M_remove_bucket_begin.
>>      2. Other cases: _M_remove_bucket_begin is called when __prev_n
>>         == _M_buckets[__bkt]. And __prev_n->_M_nxt is always assigned
>>         in _M_erase and _M_before_begin. That means
>>         _M_buckets[__bkt]->_M_nxt is always assigned. So there's no
>>         need to assign _M_buckets[__bkt]->_M_nxt in
>>         _M_remove_bucket_begin.
>>
>>     In summary, there’s no need to check &_M_before_begin ==
>>     _M_buckets[__bkt] and assign _M_before_begin._M_nxt in
>>     _M_remove_bucket_begin.
>>
>>
>>     Then let’s see the responsibility of each method:
>>
>>     The hash table in the C++ STL is composed of hash buckets and a
>>     node list. The update of the node list is responsible for
>>     _M_erase and _M_extract_node method. _M_remove_bucket_begin
>>     method only needs to update the hash buckets. The update of
>>     _M_before_begin belongs to the update of the node list. So
>>     _M_remove_bucket_begin doesn’t need to update _M_before_begin.
>>
>>
>>     Existing tests listed below cover this change:
>>
>>     23_containers/unordered_set/allocator/copy.cc
>>
>>     23_containers/unordered_set/allocator/copy_assign.cc
>>
>>     23_containers/unordered_set/allocator/move.cc
>>
>>     23_containers/unordered_set/allocator/move_assign.cc
>>
>>     23_containers/unordered_set/allocator/swap.cc
>>
>>     23_containers/unordered_set/erase/1.cc
>>
>>     23_containers/unordered_set/erase/24061-set.cc
>>
>>     23_containers/unordered_set/modifiers/extract.cc
>>
>>     23_containers/unordered_set/operations/count.cc
>>
>>     23_containers/unordered_set/requirements/exception/basic.cc
>>
>>     23_containers/unordered_map/allocator/copy.cc
>>
>>     23_containers/unordered_map/allocator/copy_assign.cc
>>
>>     23_containers/unordered_map/allocator/move.cc
>>
>>     23_containers/unordered_map/allocator/move_assign.cc
>>
>>     23_containers/unordered_map/allocator/swap.cc
>>
>>     23_containers/unordered_map/erase/1.cc
>>
>>     23_containers/unordered_map/erase/24061-map.cc
>>
>>     23_containers/unordered_map/modifiers/extract.cc
>>
>>     23_containers/unordered_map/modifiers/move_assign.cc
>>
>>     23_containers/unordered_map/operations/count.cc
>>
>>     23_containers/unordered_map/requirements/exception/basic.cc
>>
>>
>>     Regression tested on x86_64-pc-linux-gnu. Is it OK to commit?
>>
>>
>>     ---
>>
>>     ChangeLog:
>>
>>
>>     libstdc++: hashtable: No need to update before begin node in
>>     _M_remove_bucket_begin
>>
>>
>>     2024-01-16Huanghui Nie<nnnjkk@gmail.com>
>>
>>
>>     gcc/
>>
>>     * libstdc++-v3/include/bits/hashtable.h
>>
>>
>>     ---
>>
>>
>>     diff --git a/libstdc++-v3/include/bits/hashtable.h
>>     b/libstdc++-v3/include/bits/hashtable.h
>>
>>     index b48610036fa..6056639e663 100644
>>
>>     --- a/libstdc++-v3/include/bits/hashtable.h
>>
>>     +++ b/libstdc++-v3/include/bits/hashtable.h
>>
>>     @@ -872,13 +872,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>
>>           if (!__next_n || __next_bkt != __bkt)
>>
>>             {
>>
>>               // Bucket is now empty
>>
>>     -         // First update next bucket if any
>>
>>     +         // Update next bucket if any
>>
>>               if (__next_n)
>>
>>                 _M_buckets[__next_bkt] = _M_buckets[__bkt];
>>
>>     -         // Second update before begin node if necessary
>>
>>     -         if (&_M_before_begin == _M_buckets[__bkt])
>>
>>     -           _M_before_begin._M_nxt = __next_n;
>>
>>               _M_buckets[__bkt] = nullptr;
>>
>>             }
>>
>>         }
>>
>>

  reply	other threads:[~2024-01-22  6:14 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-17  8:11 Huanghui Nie
2024-01-17 20:04 ` François Dumont
2024-01-18  9:26   ` Huanghui Nie
2024-01-22  6:14     ` François Dumont [this message]
  -- strict thread matches above, loose matches on Subject: below --
2024-01-17  4:19 Huanghui Nie
2024-01-17  4:39 ` Sam James
2024-01-17  8:12   ` Huanghui Nie
2024-01-17  8:18     ` Jonathan Wakely
2024-01-17 10:30       ` Huanghui Nie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=74164a9e-ae0b-4960-ae2c-ca84cd88eb91@gmail.com \
    --to=frs.dumont@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    --cc=nnnjkk@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).