From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 87B5D3858D20 for ; Mon, 22 Jan 2024 13:30:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 87B5D3858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 87B5D3858D20 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705930207; cv=none; b=Q4rQJUGGAGGZIFrKpD1uCJMy6CFRQyHbVaDydKNmPTfdsE4eT4D07BuWiUv5nJTlMcUb5puA8iiXMdSzp+fk5/z0GAvNB+rlSq1hA7L2LnJ9lCdC1j9wjyH06PSllgpr2J5E6wp4fqxSzAhag3Dwtlym28mr8cxDIiB2IETXaj4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1705930207; c=relaxed/simple; bh=0kXnpqP2+vOtbukY0h1Mig0Ptl8Hqmsgle8WEpvdGnI=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=l5cLiZ5yNOLTq9TOzBQ7uldcTnrDSjLO4W1Kb807twQInEwD4OBvmdAFKmjUpADKf198EIJ7QBkVMjD7y0X7Bhl17Oc+LJafnLOETt7ef2qWJO04qdiJzTGL1fBSSgCQwV3KOWzl8kvN+IwmmxJGI9Y8yH9YL4MjfG7ubnMAqKc= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1705930205; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fyzUKRN0AB1wiqIhS00Z+eiWwRKbmKRhOHqh2KIgTL0=; b=NxAG5N5nTZD5B3//rD0Yl4a7Qh1RcSiRB3BmU8DvqkuKy/bHfqJFa6eBkRh1uWHjneNU03 9QqYujsmGd3+sPImZeKVs7DT4fQWFjPI9PIUskmEhffgLmoHsDGxoy83vbE3kgjaY/I5lL sKVljIS+rIPmfZaX3DtCtABXqPX3tu4= Received: from mail-yw1-f198.google.com (mail-yw1-f198.google.com [209.85.128.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-259--tc4sLZ8OYiEgajDb10pjA-1; Mon, 22 Jan 2024 08:30:03 -0500 X-MC-Unique: -tc4sLZ8OYiEgajDb10pjA-1 Received: by mail-yw1-f198.google.com with SMTP id 00721157ae682-5f6c12872fbso48604737b3.1 for ; Mon, 22 Jan 2024 05:30:03 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705930203; x=1706535003; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fyzUKRN0AB1wiqIhS00Z+eiWwRKbmKRhOHqh2KIgTL0=; b=Xee1Brq97STI5RJVgUG9VjeIUfklYqfml3IDS9PlWxmAkDwzk+LrAj2MCVk/AhS2mO UrfPognVUYtV4oDpRqKddML5ugcwivNqVWzIiaJFpytbXQQBp/Dq8yDloQ26lABRIFyP +GtWMSMWuDrbHrhSuoyKxATRbJCfksfwDv8mxx9zxTBFAn/3CMcvYuf0TfkZUsD3cDUa SDN9Vzm3j4+9c7XBIU7/Fx3EsqgOSkqLhYrc8SWhD8UIgwbiteJe4O1I+BzdAMfbODXR d1p94Z6c5h+1DnoGyYUdKUrQbiN5DkDjX62Ro1PKCYGtPL/0EMsu9tK/AJTlofLpH3H2 Eq/Q== X-Gm-Message-State: AOJu0YyFgyBFVS9hgyz5Q08q3KWHgkrgsysxASAMZHYXxLs8AM8mDpcA af2e0/vZ68F2m3jix7pD9Pbxn+nDIqw2iNTZP+itz3IIRCgcdkEsrIM1JdwTc57425dLpIX+h0v /YZjBY1kLGDfOsk1OdnlKBmnvsnjwqw+rkVm9MW2+2lIvi1xvuz2542BG1hYSXYHXCl/Nd2Fgyf xU0YOeA0DCyd62c8ncc0CmgQqLdno= X-Received: by 2002:a25:aa83:0:b0:dc2:65da:d3af with SMTP id t3-20020a25aa83000000b00dc265dad3afmr2354124ybi.65.1705930203109; Mon, 22 Jan 2024 05:30:03 -0800 (PST) X-Google-Smtp-Source: AGHT+IH7vydvglcng58JwA6KPgwPAlT5SKAfiCExIGTGpjHLx6W5oem1DqFhRc6Xr+126HAk4RAl+t6L2UqJ6rl//QM= X-Received: by 2002:a25:aa83:0:b0:dc2:65da:d3af with SMTP id t3-20020a25aa83000000b00dc265dad3afmr2354119ybi.65.1705930202764; Mon, 22 Jan 2024 05:30:02 -0800 (PST) MIME-Version: 1.0 References: <5b87ddea-ba56-4173-bda3-652f038dcacc@gmail.com> <3915ea0c-a086-4b73-bda0-f81fbcbf63ca@gmail.com> In-Reply-To: <3915ea0c-a086-4b73-bda0-f81fbcbf63ca@gmail.com> From: Jonathan Wakely Date: Mon, 22 Jan 2024 13:29:46 +0000 Message-ID: Subject: Re: [PATCH] libstdc++: hashtable: No need to update before begin node in _M_remove_bucket_begin To: =?UTF-8?Q?Th=C3=A9o_Papadopoulo?= Cc: libstdc++@gcc.gnu.org X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Mon, 22 Jan 2024 at 13:22, Th=C3=A9o Papadopoulo = wrote: > > On 1/22/24 11:07, Jonathan Wakely wrote: > > On Thu, 18 Jan 2024 at 15:59, Th=C3=A9o Papadopoulo wrote: > >> Wouldn't it be clearer if written as: > >> > >> if (!__next_n) { > >> _M_buckets[__bkt] =3D nullptr; > >> } else if (__next_bkt !=3D __bkt) { > >> _M_buckets[__next_bkt] =3D _M_buckets[__bkt]; > >> _M_buckets[__bkt] =3D nullptr; > >> } > >> > >> I think this is strictly equivalent and clearer. > > But it does two branches, when the second one isn't necessary. If > > __next_bkt =3D=3D __bkt the assignment is harmless. > > > > I think it's much clearer the way Huanghui Nie originally suggested. > > But there are two branches in the original code as well (the ifs are > just nested)... > And they are sequential in both cases (with the !next_n on the faster pat= h). > The condition are just a bit simpler in the proposed new form. Ah, yes, so you mean also replacing the earlier condition: if (!__next_n || __next_bkt !=3D __bkt) I was only looking at the changed code not the earlier context. /facepalm Yes, looking at the full context I like your suggestion. > > Anyway, as a (the) maintainer, your opinion is the most important and if > you are more at ease > with one form than the other, so be it. > > Theo. > > > > > > >> Theo. > >> > >> > >> On 1/18/24 10:26, Huanghui Nie wrote: > >> > >> Yes, I have. I did a benchmark today. > >> > >> The conclusion is: the time consumption can be reduced by 0.4% ~ 1.2% = when unordered_set erase(begin()), and 1.2% ~ 2.4% when erase(begin(), end(= )). > >> > >> > >> My test environment: > >> > >> CPU: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 2393.365 MHz, 56 CPUs > >> > >> MEM: 256G > >> > >> OS: CentOS-8.2 > >> > >> g++: gcc version 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC) > >> > >> Compile flags: -O3 -std=3Dc++17 > >> > >> > >> Test conclusion data (time taken to delete every 100 million elements)= : > >> > >> erase(begin()): > >> > >> |size of unordered_set |100 |1,000 |10,000 |100,000 |1,000,00= 0|10,000,000| > >> > >> |base time consuming (ms)|3827.736|3807.725|3830.168|3807.373|3798.713= |3854.168 | > >> > >> |test time consuming (ms)|3783.406|3789.460|3791.146|3778.033|3783.494= |3808.137 | > >> > >> |Time-consuming reduction|1.16% |0.48% |1.02% |0.77% |0.40% = |1.19% | > >> > >> erase(begin(),end()): > >> > >> |size of unordered_set |100 |1,000 |10,000 |100,000 |1,000,00= 0|10,000,000| > >> > >> |base time consuming (ms)|2779.229|2768.550|2795.778|2767.385|2761.521= |2804.099 | > >> > >> |test time consuming (ms)|2712.759|2726.578|2752.224|2732.140|2718.953= |2739.727 | > >> > >> |Time-consuming reduction|2.39% |1.52% |1.56% |1.27% |1.54% = |2.30% | > >> > >> > >> Please see the attachment for test code and detailed test result. > >> > >> > >> 2024=E5=B9=B41=E6=9C=8818=E6=97=A5(=E6=9C=A8) 4:04 Fran=C3=A7ois Dumon= t : > >>> Hi > >>> > >>> Looks like a great finding to me, this is indeed a useless check, tha= nks! > >>> > >>> Have you any figures on the performance enhancement ? It might help t= o get proper approval as gcc is currently in dev stage 4 that is to say onl= y bug fixes normally. > >>> > >>> Fran=C3=A7ois > >>> > >>> On 17/01/2024 09:11, Huanghui Nie wrote: > >>> > >>> Hi. > >>> > >>> When I implemented a hash table with reference to the C++ STL, I foun= d that when the hash table in the C++ STL deletes elements, if the first el= ement deleted is the begin element, the before begin node is repeatedly ass= igned. This creates unnecessary performance overhead. > >>> > >>> > >>> First, let=E2=80=99s see the code implementation: > >>> > >>> In _M_remove_bucket_begin, _M_before_begin._M_nxt is assigned when &_= M_before_begin =3D=3D _M_buckets[__bkt]. That also means _M_buckets[__bkt]-= >_M_nxt is assigned under some conditions. > >>> > >>> _M_remove_bucket_begin is called by _M_erase and _M_extract_node: > >>> > >>> Case _M_erase a range: _M_remove_bucket_begin is called in a for loop= when __is_bucket_begin is true. And if __is_bucket_begin is true and &_M_b= efore_begin =3D=3D _M_buckets[__bkt], __prev_n must be &_M_before_begin. __= prev_n->_M_nxt is always assigned in _M_erase. That means _M_before_begin._= M_nxt is always assigned, if _M_remove_bucket_begin is called and &_M_befor= e_begin =3D=3D _M_buckets[__bkt]. So there=E2=80=99s no need to assign _M_b= efore_begin._M_nxt in _M_remove_bucket_begin. > >>> Other cases: _M_remove_bucket_begin is called when __prev_n =3D=3D _M= _buckets[__bkt]. And __prev_n->_M_nxt is always assigned in _M_erase and _M= _before_begin. That means _M_buckets[__bkt]->_M_nxt is always assigned. So = there's no need to assign _M_buckets[__bkt]->_M_nxt in _M_remove_bucket_beg= in. > >>> > >>> In summary, there=E2=80=99s no need to check &_M_before_begin =3D=3D = _M_buckets[__bkt] and assign _M_before_begin._M_nxt in _M_remove_bucket_beg= in. > >>> > >>> > >>> Then let=E2=80=99s see the responsibility of each method: > >>> > >>> The hash table in the C++ STL is composed of hash buckets and a node = list. The update of the node list is responsible for _M_erase and _M_extrac= t_node method. _M_remove_bucket_begin method only needs to update the hash = buckets. The update of _M_before_begin belongs to the update of the node li= st. So _M_remove_bucket_begin doesn=E2=80=99t need to update _M_before_begi= n. > >>> > >>> > >>> Existing tests listed below cover this change: > >>> > >>> 23_containers/unordered_set/allocator/copy.cc > >>> > >>> 23_containers/unordered_set/allocator/copy_assign.cc > >>> > >>> 23_containers/unordered_set/allocator/move.cc > >>> > >>> 23_containers/unordered_set/allocator/move_assign.cc > >>> > >>> 23_containers/unordered_set/allocator/swap.cc > >>> > >>> 23_containers/unordered_set/erase/1.cc > >>> > >>> 23_containers/unordered_set/erase/24061-set.cc > >>> > >>> 23_containers/unordered_set/modifiers/extract.cc > >>> > >>> 23_containers/unordered_set/operations/count.cc > >>> > >>> 23_containers/unordered_set/requirements/exception/basic.cc > >>> > >>> 23_containers/unordered_map/allocator/copy.cc > >>> > >>> 23_containers/unordered_map/allocator/copy_assign.cc > >>> > >>> 23_containers/unordered_map/allocator/move.cc > >>> > >>> 23_containers/unordered_map/allocator/move_assign.cc > >>> > >>> 23_containers/unordered_map/allocator/swap.cc > >>> > >>> 23_containers/unordered_map/erase/1.cc > >>> > >>> 23_containers/unordered_map/erase/24061-map.cc > >>> > >>> 23_containers/unordered_map/modifiers/extract.cc > >>> > >>> 23_containers/unordered_map/modifiers/move_assign.cc > >>> > >>> 23_containers/unordered_map/operations/count.cc > >>> > >>> 23_containers/unordered_map/requirements/exception/basic.cc > >>> > >>> > >>> Regression tested on x86_64-pc-linux-gnu. Is it OK to commit? > >>> > >>> > >>> --- > >>> > >>> ChangeLog: > >>> > >>> > >>> libstdc++: hashtable: No need to update before begin node in _M_remov= e_bucket_begin > >>> > >>> > >>> 2024-01-16 Huanghui Nie > >>> > >>> > >>> gcc/ > >>> > >>> * libstdc++-v3/include/bits/hashtable.h > >>> > >>> > >>> --- > >>> > >>> > >>> diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/inc= lude/bits/hashtable.h > >>> > >>> index b48610036fa..6056639e663 100644 > >>> > >>> --- a/libstdc++-v3/include/bits/hashtable.h > >>> > >>> +++ b/libstdc++-v3/include/bits/hashtable.h > >>> > >>> @@ -872,13 +872,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >>> > >>> if (!__next_n || __next_bkt !=3D __bkt) > >>> > >>> { > >>> > >>> // Bucket is now empty > >>> > >>> - // First update next bucket if any > >>> > >>> + // Update next bucket if any > >>> > >>> if (__next_n) > >>> > >>> _M_buckets[__next_bkt] =3D _M_buckets[__bkt]; > >>> > >>> > >>> > >>> - // Second update before begin node if necessary > >>> > >>> - if (&_M_before_begin =3D=3D _M_buckets[__bkt]) > >>> > >>> - _M_before_begin._M_nxt =3D __next_n; > >>> > >>> _M_buckets[__bkt] =3D nullptr; > >>> > >>> } > >>> > >>> } > >>> > >>> >