From: Huanghui Nie <nnnjkk@gmail.com>
To: "François Dumont" <frs.dumont@gmail.com>
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] libstdc++: hashtable: No need to update before begin node in _M_remove_bucket_begin
Date: Thu, 18 Jan 2024 17:26:08 +0800 [thread overview]
Message-ID: <CAOv15-XF6EbMe0f6oxO5Yq2h-FZb+Smd6oj5FMQKGPZLd1kYKg@mail.gmail.com> (raw)
In-Reply-To: <5b87ddea-ba56-4173-bda3-652f038dcacc@gmail.com>
[-- Attachment #1.1: Type: text/plain, Size: 6190 bytes --]
Yes, I have. I did a benchmark today.
The conclusion is: the time consumption can be reduced by 0.4% ~ 1.2% when
unordered_set erase(begin()), and 1.2% ~ 2.4% when erase(begin(), end()).
My test environment:
CPU: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 2393.365 MHz, 56 CPUs
MEM: 256G
OS: CentOS-8.2
g++: gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)
Compile flags: -O3 -std=c++17
Test conclusion data (time taken to delete every 100 million elements):
erase(begin()):
|size of unordered_set |100 |1,000 |10,000 |100,000
|1,000,000|10,000,000|
|base time consuming (ms)|3827.736|3807.725|3830.168|3807.373|3798.713
|3854.168 |
|test time consuming (ms)|3783.406|3789.460|3791.146|3778.033|3783.494
|3808.137 |
|Time-consuming reduction|1.16% |0.48% |1.02% |0.77% |0.40% |1.19%
|
erase(begin(),end()):
|size of unordered_set |100 |1,000 |10,000 |100,000
|1,000,000|10,000,000|
|base time consuming (ms)|2779.229|2768.550|2795.778|2767.385|2761.521
|2804.099 |
|test time consuming (ms)|2712.759|2726.578|2752.224|2732.140|2718.953
|2739.727 |
|Time-consuming reduction|2.39% |1.52% |1.56% |1.27% |1.54% |2.30%
|
Please see the attachment for test code and detailed test result.
2024年1月18日(木) 4:04 François Dumont <frs.dumont@gmail.com>:
> Hi
>
> Looks like a great finding to me, this is indeed a useless check, thanks!
>
> Have you any figures on the performance enhancement ? It might help to get
> proper approval as gcc is currently in dev stage 4 that is to say only bug
> fixes normally.
>
> François
> On 17/01/2024 09:11, Huanghui Nie wrote:
>
> Hi.
>
> When I implemented a hash table with reference to the C++ STL, I found
> that when the hash table in the C++ STL deletes elements, if the first
> element deleted is the begin element, the before begin node is repeatedly
> assigned. This creates unnecessary performance overhead.
>
>
> First, let’s see the code implementation:
>
> In _M_remove_bucket_begin, _M_before_begin._M_nxt is assigned when
> &_M_before_begin == _M_buckets[__bkt]. That also means
> _M_buckets[__bkt]->_M_nxt is assigned under some conditions.
>
> _M_remove_bucket_begin is called by _M_erase and _M_extract_node:
>
> 1. Case _M_erase a range: _M_remove_bucket_begin is called in a for
> loop when __is_bucket_begin is true. And if __is_bucket_begin is true and
> &_M_before_begin == _M_buckets[__bkt], __prev_n must be &_M_before_begin.
> __prev_n->_M_nxt is always assigned in _M_erase. That means
> _M_before_begin._M_nxt is always assigned, if _M_remove_bucket_begin is
> called and &_M_before_begin == _M_buckets[__bkt]. So there’s no need to
> assign _M_before_begin._M_nxt in _M_remove_bucket_begin.
> 2. Other cases: _M_remove_bucket_begin is called when __prev_n ==
> _M_buckets[__bkt]. And __prev_n->_M_nxt is always assigned in _M_erase and
> _M_before_begin. That means _M_buckets[__bkt]->_M_nxt is always assigned.
> So there's no need to assign _M_buckets[__bkt]->_M_nxt in
> _M_remove_bucket_begin.
>
> In summary, there’s no need to check &_M_before_begin == _M_buckets[__bkt]
> and assign _M_before_begin._M_nxt in _M_remove_bucket_begin.
>
>
> Then let’s see the responsibility of each method:
>
> The hash table in the C++ STL is composed of hash buckets and a node list.
> The update of the node list is responsible for _M_erase and _M_extract_node
> method. _M_remove_bucket_begin method only needs to update the hash
> buckets. The update of _M_before_begin belongs to the update of the node
> list. So _M_remove_bucket_begin doesn’t need to update _M_before_begin.
>
>
> Existing tests listed below cover this change:
>
> 23_containers/unordered_set/allocator/copy.cc
>
> 23_containers/unordered_set/allocator/copy_assign.cc
>
> 23_containers/unordered_set/allocator/move.cc
>
> 23_containers/unordered_set/allocator/move_assign.cc
>
> 23_containers/unordered_set/allocator/swap.cc
>
> 23_containers/unordered_set/erase/1.cc
>
> 23_containers/unordered_set/erase/24061-set.cc
>
> 23_containers/unordered_set/modifiers/extract.cc
>
> 23_containers/unordered_set/operations/count.cc
>
> 23_containers/unordered_set/requirements/exception/basic.cc
>
> 23_containers/unordered_map/allocator/copy.cc
>
> 23_containers/unordered_map/allocator/copy_assign.cc
>
> 23_containers/unordered_map/allocator/move.cc
>
> 23_containers/unordered_map/allocator/move_assign.cc
>
> 23_containers/unordered_map/allocator/swap.cc
>
> 23_containers/unordered_map/erase/1.cc
>
> 23_containers/unordered_map/erase/24061-map.cc
>
> 23_containers/unordered_map/modifiers/extract.cc
>
> 23_containers/unordered_map/modifiers/move_assign.cc
>
> 23_containers/unordered_map/operations/count.cc
>
> 23_containers/unordered_map/requirements/exception/basic.cc
>
>
> Regression tested on x86_64-pc-linux-gnu. Is it OK to commit?
>
>
> ---
>
> ChangeLog:
>
>
> libstdc++: hashtable: No need to update before begin node in
> _M_remove_bucket_begin
>
>
> 2024-01-16 Huanghui Nie <nnnjkk@gmail.com>
>
>
> gcc/
>
> * libstdc++-v3/include/bits/hashtable.h
>
>
> ---
>
>
> diff --git a/libstdc++-v3/include/bits/hashtable.h
> b/libstdc++-v3/include/bits/hashtable.h
>
> index b48610036fa..6056639e663 100644
>
> --- a/libstdc++-v3/include/bits/hashtable.h
>
> +++ b/libstdc++-v3/include/bits/hashtable.h
>
> @@ -872,13 +872,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
> if (!__next_n || __next_bkt != __bkt)
>
> {
>
> // Bucket is now empty
>
> - // First update next bucket if any
>
> + // Update next bucket if any
>
> if (__next_n)
>
> _M_buckets[__next_bkt] = _M_buckets[__bkt];
>
>
>
> - // Second update before begin node if necessary
>
> - if (&_M_before_begin == _M_buckets[__bkt])
>
> - _M_before_begin._M_nxt = __next_n;
>
> _M_buckets[__bkt] = nullptr;
>
> }
>
> }
>
>
[-- Attachment #2: test1.cc --]
[-- Type: application/octet-stream, Size: 630 bytes --]
#include <cstdio>
#include <cstdlib>
#include <unordered_set>
#include <chrono>
int main(int argc, char** argv) {
int size = ::atoi(argv[1]);
int loop = ::atoi(argv[2]);
std::unordered_set<int> data;
data.reserve(size);
double t;
for (int k = 0; k < loop; ++k) {
for (int i = 0; i < size; ++i) {
data.emplace(i);
}
auto t1 = std::chrono::steady_clock::now();
for (int i = 0; i < size; ++i) {
data.erase(data.begin());
}
auto t2 = std::chrono::steady_clock::now();
t += std::chrono::duration<double, std::micro>(t2 - t1).count();
}
printf("Cost: %lf\n", t);
return 0;
}
[-- Attachment #3: test2.cc --]
[-- Type: application/octet-stream, Size: 597 bytes --]
#include <cstdio>
#include <cstdlib>
#include <unordered_set>
#include <chrono>
int main(int argc, char** argv) {
int size = ::atoi(argv[1]);
int loop = ::atoi(argv[2]);
std::unordered_set<int> data;
data.reserve(size);
double t;
for (int k = 0; k < loop; ++k) {
for (int i = 0; i < size; ++i) {
data.emplace(i);
}
auto t1 = std::chrono::steady_clock::now();
data.erase(data.begin(), data.end());
auto t2 = std::chrono::steady_clock::now();
t += std::chrono::duration<double, std::micro>(t2 - t1).count();
}
printf("Cost: %lf\n", t);
return 0;
}
[-- Attachment #4: result.csv --]
[-- Type: text/csv, Size: 6212 bytes --]
hashtable version,"test type(1=erase(begin() 2=erase(begin(), end())",size of unordered_set,loop,cost time1,cost time2,cost time3,cost time4,cost time5,cost time6,cost time7,cost time8,cost time9,cost time10,cost time11,cost time12,cost time13,cost time14,cost time15,cost time16,cost time17,cost time18,cost time19,cost time20,cost time21,cost time22,cost time23,cost time24,average costtime
base,1,100,1000000,3823.493,3822.681,3823.487,3826.009,3827.419,3815.043,3824.576,3827.587,3823.115,3822.625,3828.422,3823.058,3826.549,3873.806,3815.655,3853.773,3816.512,3826.372,3807.932,3852.067,3820.759,3824.035,3833.077,3827.616,3827.736
test,1,100,1000000,3771.571,3769.245,3784.259,3770.356,3799.412,3773.946,3811.564,3770.921,3776.988,3814.076,3782.95,3796.984,3770.042,3783.934,3822.614,3786.986,3799.942,3769.783,3771.954,3767.856,3771.71,3771.4,3787.584,3775.674,3783.406
base,2,100,1000000,2809.389,2778.206,2776.783,2806.46,2773.297,2753.868,2779.01,2782.715,2801.758,2776.475,2775.62,2786.181,2772.866,2779.171,2776.704,2779.79,2778.756,2780.544,2783.637,2754.423,2780.021,2778.152,2758.39,2779.289,2779.229
test,2,100,1000000,2695.99,2706.163,2741.074,2702.181,2711.039,2697.633,2706.972,2734.776,2705.539,2704.209,2709.499,2727.339,2737.397,2748.108,2737.865,2705.216,2709.018,2706.612,2712.392,2702.284,2707.86,2685.334,2711.586,2700.123,2712.759
base,1,1000,100000,3797.022,3801.701,3845.955,3800.277,3799.269,3804.899,3798.337,3797.996,3803.812,3797.084,3799.005,3803.165,3795.779,3886.368,3805.567,3801.1,3796.808,3799.167,3801.139,3824.756,3797.933,3824.186,3799.616,3804.458,3807.725
test,1,1000,100000,3791.256,3790.226,3787.149,3787.383,3842.439,3787.531,3789.466,3788.159,3790.282,3777.703,3788.607,3788.634,3789.289,3783.478,3785.203,3796.558,3787.165,3789.253,3739.095,3809.067,3790.257,3789.134,3788.45,3791.262,3789.46
base,2,1000,100000,2770.653,2766.165,2766.432,2771.019,2764.841,2773.394,2768.809,2771.287,2766.246,2777.166,2769.634,2765.984,2766.127,2754.672,2768.229,2772.901,2769.123,2773.159,2774.906,2764.569,2766.496,2765.35,2767.608,2770.423,2768.55
test,2,1000,100000,2716.778,2714.804,2715.655,2720.923,2719.634,2731.495,2718.936,2742.902,2699.984,2723.99,2720.647,2724.48,2738.964,2720.266,2733.468,2715.76,2718.717,2716.562,2725.198,2717.34,2825.85,2732.884,2720.978,2721.668,2726.578
base,1,10000,10000,3822.724,3822.136,3832.718,3823.444,3822.066,3824.938,3845.033,3823.653,3825.097,3821.607,3856.073,3821.575,3821.405,3824.923,3821.77,3823.008,3821.993,3839.306,3820.877,3925.253,3817.731,3821.818,3821.173,3823.716,3830.168
test,1,10000,10000,3799.764,3785.985,3788.534,3789.304,3786.5,3788.131,3825.623,3785.362,3784.366,3801.078,3784.956,3784.937,3822.219,3785.595,3785.953,3788.438,3784.868,3791.152,3786.138,3786.018,3790.52,3790.296,3786.592,3785.18,3791.146
base,2,10000,10000,2792.236,2792.593,2793.781,2794.741,2794.203,2805.638,2795.744,2791.699,2816.771,2792.745,2791.476,2797.161,2790.68,2805.948,2790.781,2790.321,2790.225,2807.69,2790.333,2788.867,2796.099,2796.358,2795.142,2797.44,2795.778
test,2,10000,10000,2746.432,2742.164,2745.393,2768.938,2770.194,2747.573,2747.745,2745.096,2743.483,2744.135,2743.616,2811.89,2749.38,2745.868,2745.868,2744.948,2760.844,2743.756,2746.124,2745.905,2746.125,2774.749,2745.862,2747.297,2752.224
base,1,100000,1000,3803.815,3801.374,3805.723,3838.825,3807.848,3804.506,3809.31,3808.968,3809.804,3803.585,3804.104,3803.752,3833.906,3801.922,3801.311,3808.528,3801.909,3809.198,3800.951,3807.704,3802.195,3808.516,3806.399,3792.792,3807.373
test,1,100000,1000,3772.25,3773.281,3775.894,3773.233,3779.474,3776.377,3772.346,3776.045,3774.53,3777.022,3771.48,3774.368,3774.361,3774.061,3776.991,3784.857,3793.553,3773.227,3773.481,3806.12,3780.02,3791.458,3776.513,3771.853,3778.033
base,2,100000,1000,2774.227,2765.314,2770.277,2767.069,2763.217,2760.879,2770.374,2768.014,2765.732,2773.391,2765.237,2764.888,2766.775,2768.556,2772.143,2766.144,2765.849,2759.546,2768.855,2766.845,2766.124,2769.056,2768.17,2770.565,2767.385
test,2,100000,1000,2762.534,2688.875,2727.942,2732.512,2731.866,2730.009,2732.819,2731.785,2726.283,2730.371,2725.973,2737.078,2731.83,2730.165,2734.018,2727.441,2730.356,2729.162,2729.638,2781.079,2729.196,2730.84,2728.868,2730.714,2732.14
base,1,1000000,100,3795.995,3794.453,3798.35,3795.408,3797.449,3793.136,3799.852,3799.073,3795.84,3794.06,3796.174,3792.995,3795.656,3808.393,3795.261,3796.692,3794.845,3794.827,3793.469,3834.264,3796.209,3792.741,3816.559,3797.411,3798.713
test,1,1000000,100,3731.346,3779.127,3787.058,3763.331,3782.874,3787.526,3783.998,3794.579,3787.743,3780.275,3780.013,3778.24,3779.863,3785.267,3782.577,3795.678,3868.659,3782.761,3781.478,3781.01,3781.462,3780.29,3781.546,3767.153,3783.494
base,2,1000000,100,2761.417,2764.684,2761.706,2749.641,2759.581,2764.313,2760.867,2754.61,2760.823,2768.231,2759.123,2759.991,2759.151,2758.963,2765.045,2760.949,2762.091,2762.359,2770.512,2761.514,2759.618,2758.706,2769.368,2763.245,2761.521
test,2,1000000,100,2713.671,2715.116,2723.414,2713.218,2713.316,2726.65,2714.696,2688.076,2789.947,2712.222,2710.506,2691.592,2713.17,2712.399,2714.737,2713.679,2713.862,2713.766,2712.854,2724.32,2713.091,2713.861,2781.472,2715.245,2718.953
base,1,10000000,10,3840.799,3822.04,3823.061,3823.658,3822.138,4065.106,3836.565,3826.96,3821.787,3819.953,3822.437,3823.611,3823.497,4050.641,3820.995,3827.454,4061.5,3820.556,3824.042,3822.966,3822.598,3828.905,3822.462,3826.311,3854.168
test,1,10000000,10,3794.528,3794.328,3795.333,3798.696,3769.481,3793.053,3873.152,3799.613,3796.982,3798.711,3795.175,3795.941,4031.049,3809.053,3793.369,3793.605,3793.4,3793.058,3795.602,3794.969,3796.661,3797.499,3794.753,3797.278,3808.137
base,2,10000000,10,2789.121,2788.516,2776.764,2787.776,2789.507,2790.068,2777.693,2792.21,2790.414,2802.279,2789.662,2789.86,2785.881,2975.997,2788.857,2959.528,2784.704,2789.618,2790.713,2788.469,2788.762,2793.355,2797.491,2791.119,2804.099
test,2,10000000,10,2736.393,2740.613,2738.415,2736.739,2746.083,2750.259,2739.898,2736.715,2756.323,2740.862,2738.368,2717.114,2750.023,2739.277,2736.757,2737.928,2734.841,2736.417,2740.543,2739.228,2741.903,2740.973,2738.099,2739.668,2739.727
next prev parent reply other threads:[~2024-01-18 9:26 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-17 8:11 Huanghui Nie
2024-01-17 20:04 ` François Dumont
2024-01-18 9:26 ` Huanghui Nie [this message]
2024-01-22 6:14 ` François Dumont
-- strict thread matches above, loose matches on Subject: below --
2024-01-17 4:19 Huanghui Nie
2024-01-17 4:39 ` Sam James
2024-01-17 8:12 ` Huanghui Nie
2024-01-17 8:18 ` Jonathan Wakely
2024-01-17 10:30 ` Huanghui Nie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOv15-XF6EbMe0f6oxO5Yq2h-FZb+Smd6oj5FMQKGPZLd1kYKg@mail.gmail.com \
--to=nnnjkk@gmail.com \
--cc=frs.dumont@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).