From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 451943858C41 for ; Thu, 9 Nov 2023 08:11:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 451943858C41 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 451943858C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699517510; cv=none; b=ZYIxI3QFc9G44p+9lc6bfO/4xaSTOh1pxBbY6VwNCBFB+OuyqXMrN970jez60JEfpib1BmsZOrTyq2uxTlee7yssJljU23qGkRTdzztDV7jhO2cGWOT9AS+xRQlGnCPGja3GNOL57rKYsP4hPNYj9zwKDvgDcHpm/srvF687+uY= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699517510; c=relaxed/simple; bh=nLnpLjBHI8Fiq3JJvY4jpepDDgsBKsm8C3jW4i+1B5g=; h=DKIM-Signature:MIME-Version:From:Date:Message-ID:Subject:To; b=Dw0AGCgm3P3x1NUWr5cmu+WPGam7ij6zo4GA1Y8+xF5HEZr7K9lDV1YUiZCDtv9diZan0mxtZZQ4G58FsayynVRz3gwVsW9ZNbuGtqpBAFYxJt1MXo7uLTaRRmpGggyN/jcjMSLAn/7fP7mtwY/2S0XI3fRzTpya09ryVPwsv7U= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1699517504; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QmwGZ4+nxUow9gQYj24xcvbc26PnYQaYBYeptW5z3aw=; b=FLkUr5VYAdA4fBAb2EjQTfynN4FYfJcXthWDyEIumDlen/NDFV4jFrOY5RNa+BrL7vd+bu g9X8xtOQ8mvfRmumdj0rOo9nzuYpRwbR46XvmRBM+9C62SW/N1kZK1G6i/znS8rBQAHStY zjuiPUpXPRdllJQ7nBgzq2NaMemKzHs= Received: from mail-yw1-f200.google.com (mail-yw1-f200.google.com [209.85.128.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-650-MZ1fXwMCM6KrWYL9QBwmwA-1; Thu, 09 Nov 2023 03:11:43 -0500 X-MC-Unique: MZ1fXwMCM6KrWYL9QBwmwA-1 Received: by mail-yw1-f200.google.com with SMTP id 00721157ae682-5a7b9e83b70so4953327b3.0 for ; Thu, 09 Nov 2023 00:11:43 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699517503; x=1700122303; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QmwGZ4+nxUow9gQYj24xcvbc26PnYQaYBYeptW5z3aw=; b=g8/LOlgCC6my8lNXafnHX9lb4Q5aXEJXnnUfdSgkl8of7wKPZuaKhNnCx7ZMSyBcRO BCbHacw9vQ8aALrl8e+WugdvZfJHgZeIOyXlZVjrOyO+aXrvCYlIigsjDbM9DIBPn0v5 Ew7AJjWFUQVpBi57gM2Ychrj4K1EYcIpDDT5oI52ee8npT8g0tFvRboWVpJMBZ02Gcqu eHxKPVzCZW+GRQMBBllZb5PN5rRLKfjGhoh73juvJgbjTr8/wDVd459brheTr4buCyC4 Zc0vT1Hbxg3GdIlFE8cdN04VGSImF7zNwVVyUcoyjiGYWQE7aGJnzvTOBeKi0w6fqh+c 5Mbg== X-Gm-Message-State: AOJu0Yyk+NNpMxaZFoyPx6BXhCYx/ppWW+h3EwX8x0Y8XUdKXnGEFRw9 ugpkM5cM0Ps0XaElqaQrRiWf3QJEGeJ3Zq8SfHJkw0+/ymilIn6JueIAGYwS4GJtXLlwa0g9cc7 bgyfefhFz3EDZYnk1zbBYy/sM8N7pz84= X-Received: by 2002:a0d:ddc2:0:b0:5a7:fb66:61ff with SMTP id g185-20020a0dddc2000000b005a7fb6661ffmr3331818ywe.21.1699517503190; Thu, 09 Nov 2023 00:11:43 -0800 (PST) X-Google-Smtp-Source: AGHT+IEHR5XD1dk5sXzKCZB8Gd3hSalHzj0rIJqQr8wN/C6/8QjCowGq4Cw9A/1GWGUeL8r/wpoWxhNuU1jmO96jzR8= X-Received: by 2002:a0d:ddc2:0:b0:5a7:fb66:61ff with SMTP id g185-20020a0dddc2000000b005a7fb6661ffmr3331813ywe.21.1699517502836; Thu, 09 Nov 2023 00:11:42 -0800 (PST) MIME-Version: 1.0 References: <7f61df18-dd99-4ff5-9fcd-8ca7820403d4@gmail.com> In-Reply-To: From: Jonathan Wakely Date: Thu, 9 Nov 2023 08:11:31 +0000 Message-ID: Subject: Re: [PATCH][_Hashtable] Use RAII to restore Rehash state To: =?UTF-8?Q?Fran=C3=A7ois_Dumont?= Cc: "libstdc++" , gcc-patches X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, 26 Oct 2023 at 21:52, Fran=C3=A7ois Dumont w= rote: > > > On 26/10/2023 12:43, Jonathan Wakely wrote: > > On 26/10/23 07:18 +0200, Fran=C3=A7ois Dumont wrote: > >> libstdc++: [_Hashtable] Use RAII type to manage rehash functor sta= te > >> > >> Replace usage of __try/__catch with a RAII type to restore rehash > >> functor > >> state when needed. > > > > Generally I really like replacing try-catch with RAII but I have some > > questions below. > > > >> libstdc++-v3/ChangeLog: > >> > >> * include/bits/hashtable_policy.h (_RehashStateGuard): New= . > >> (_Insert_base<>::_M_insert_range(_IIt, _IIt, const > >> _NodeGet&, false_type)): > >> Adapt. > >> * include/bits/hashtable.h (__rehash_guard_t): New. > >> (__rehash_state): Remove. > >> (_M_rehash): Remove. > >> (_M_rehash_aux): Rename into _M_rehash. > >> (_M_assign_elements, _M_insert_unique_node, > >> _M_insert_multi_node): Adapt. > >> (rehash): Adapt. > >> > >> > >> Tested under Linux x64. > >> > >> Ok to commit ? > >> > >> Fran=C3=A7ois > > > >> diff --git a/libstdc++-v3/include/bits/hashtable.h > >> b/libstdc++-v3/include/bits/hashtable.h > >> index 0857448f7ed..64071ac1fb2 100644 > >> --- a/libstdc++-v3/include/bits/hashtable.h > >> +++ b/libstdc++-v3/include/bits/hashtable.h > >> @@ -234,6 +234,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> _RehashPolicy, _Traits>; > >> using __enable_default_ctor > >> =3D _Hashtable_enable_default_ctor<_Equal, _Hash, _Alloc>; > >> + using __rehash_guard_t > >> + =3D __detail::_RehashStateGuard<_RehashPolicy>; > >> > >> public: > >> typedef _Key key_type; > >> @@ -264,7 +266,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> > >> private: > >> using __rehash_type =3D _RehashPolicy; > >> - using __rehash_state =3D typename __rehash_type::_State; > >> > >> using __unique_keys =3D typename __traits_type::__unique_keys; > >> > >> @@ -1200,14 +1201,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> > >> private: > >> // Helper rehash method used when keys are unique. > >> - void _M_rehash_aux(size_type __bkt_count, true_type __uks); > >> + void _M_rehash(size_type __bkt_count, true_type __uks); > >> > >> // Helper rehash method used when keys can be non-unique. > >> - void _M_rehash_aux(size_type __bkt_count, false_type __uks); > >> - > >> - // Unconditionally change size of bucket array to n, restore > >> - // hash policy state to __state on exception. > >> - void _M_rehash(size_type __bkt_count, const __rehash_state& > >> __state); > >> + void _M_rehash(size_type __bkt_count, false_type __uks); > >> }; > >> > >> // Definitions of class template _Hashtable's out-of-line member > >> functions. > >> @@ -1337,7 +1334,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> { > >> __buckets_ptr __former_buckets =3D nullptr; > >> std::size_t __former_bucket_count =3D _M_bucket_count; > >> - const __rehash_state& __former_state =3D _M_rehash_policy._M_stat= e(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> > >> if (_M_bucket_count !=3D __ht._M_bucket_count) > >> { > >> @@ -1359,6 +1356,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> _M_assign(std::forward<_Ht>(__ht), __roan); > >> if (__former_buckets) > >> _M_deallocate_buckets(__former_buckets, > >> __former_bucket_count); > >> + __rehash_guard._M_reset =3D false; > > > > I find this confusing. Usually "reset" means that something is > > cleared, so won't take an action in the destructor. e.g. if you use > > std::unique_ptr::reset() then the object is destroyed immediately, and > > then nothing happens in the destructor. Here it's the opposite, > > _M_reset=3Dtrue means that it _sould_ do something in the destructor. > > > > The problem is the ambiguity between "reset the state in the > > destructor later" and "reset the object to an empty state now". > > > > If the member was called _M_guarded then it might be clearer. > > _M_guarded=3Dtrue means the guard is active, and will restore the state > > later. Or _M_active, or _M_armed, or even _M_reset_in_dtor. Any of > > those names avoids the confusion with the semantics of > > std::unique_ptr::reset() and similar well-known APIs. > > > > Or what I usually do is store a pointer to the guarded object in the > > RAII guard type, and then just null the pointer to disarm the guard. > > That means you don't need a separate bool member variable. If > > _RehashStateGuard::_M_rehash_policy was called _M_guarded_obj and was > > a _RehashPolicy* instead of _RehashPolicy& then disarming it would be: > > > > __rehash_guard._M_guarded_obj =3D nullptr; > > > > This seems clear to me, as it says that the guard no longer has > > anything to guard, so won't do anything in the destructor. > > > Looks clearer to me too. > > I started with a _RehashStateScope and a _M_commit() method, it could > have been worst :-) > > > > > >> } > >> __catch(...) > >> { > >> @@ -1366,7 +1364,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> { > >> // Restore previous buckets. > >> _M_deallocate_buckets(); > >> - _M_rehash_policy._M_reset(__former_state); > >> _M_buckets =3D __former_buckets; > >> _M_bucket_count =3D __former_bucket_count; > >> } > >> @@ -2142,17 +2139,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> __node_ptr __node, size_type __n_elt) > >> -> iterator > >> { > >> - const __rehash_state& __saved_state =3D > >> _M_rehash_policy._M_state(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> std::pair __do_rehash > >> =3D _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_co= unt, > >> __n_elt); > >> > >> if (__do_rehash.first) > >> { > >> - _M_rehash(__do_rehash.second, __saved_state); > >> + _M_rehash(__do_rehash.second, true_type{}); > >> __bkt =3D _M_bucket_index(__code); > >> } > >> > >> + __rehash_guard._M_reset =3D false; > >> this->_M_store_code(*__node, __code); > > > > This changes behaviour. Previously the try-catch was inside the call to > > _M_rehash. Now we're starting to guard before calling _M_need_rehash, > > We were capturing _M_rehash_policy state before the call to > _M_need_rehash so it's not different. > > We really need to instantiate the guard before we call _M_need_rehash > cause it is the method changing the _M_rehash_policy state. > > > and we run the destructor unconditionally even if we never call > > _M_rehash. > > Here we only want to reset rehash state on exception. > > Note that with the 2 rehash policy implementations we provide it doesn't > matter if we reset or not when we do not call _M_rehash because their > state changes only when rehash is needed. But some user's fancy rehash > policy might update their state even if no rehash is requested and in > this case it is better to reset this state only on exception. > > > > I think that's OK, because _M_need_rehash is noexcept (but > > it's confusing because we have a const overload of _M_need_rehash > > which is noexcept(false) and a non-const overload which is > > noexcept(true) ... should they both be noexcept?) > > That's another subject but yes, _M_need_rehash should be non-const and > noexcept on both rehash policy implementations. I considered doing such > a thing but won't it impact abi as _Prime_rehash_policy._M_need_rehash > symbols is coming from the lib ? > > > > > > Can we do this instead: > > > > > > - const __rehash_state& __saved_state =3D _M_rehash_policy._M_stat= e(); > > std::pair __do_rehash > > =3D _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_co= unt, > > __n_elt); > > > > if (__do_rehash.first) > > { > > + __rehash_guard_t __rehash_guard(_M_rehash_policy); > > Clearly no, we need to capture state before _M_need_rehash. Like it was > done with the __saved_state above. > > > > - _M_rehash(__do_rehash.second, __saved_state); > > + _M_rehash(__do_rehash.second, true_type{}); > > + __rehash_guard._M_reset =3D false; > > __bkt =3D _M_bucket_index(__code); > > } > > > > this->_M_store_code(*__node, __code); > > > > This only creates a guard object if we're actually going to rehash, so > > we don't run its destructor (and check if we need t restore anything) > > when no rehash is needed. > > > > N.B. _M_bucket_index is also confusing. There are two overloads of it, > > one is noexcept and one isn't. The one that we call here is > > noexcept(false), so that call could throw ... is it OK that we don't > > restore the old state if that happens? Should that be guarded too? (I > > have no idea). > > > > Also, I see that the noexcept(true) overload of _M_bucket_index calls > > __hash_code_base::_M_bucket_index, which is noexcept(/* condition */), > > so the caller says it cannot throw, but it calls something that > > sometimes throws. Is that correct? > > It can be noexcept because usage of hash code cache depends on either > hash functor is noexcept or not. > > > > >> > >> // Always insert at the beginning of the bucket. > >> @@ -2172,13 +2170,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> __hash_code __code, __node_ptr __node) > >> -> iterator > >> { > >> - const __rehash_state& __saved_state =3D > >> _M_rehash_policy._M_state(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> std::pair __do_rehash > >> =3D _M_rehash_policy._M_need_rehash(_M_bucket_count, > >> _M_element_count, 1); > >> > >> if (__do_rehash.first) > >> - _M_rehash(__do_rehash.second, __saved_state); > >> + _M_rehash(__do_rehash.second, false_type{}); > >> > >> + __rehash_guard._M_reset =3D false; > > > > Again, we're creating the guard object in a larger scope than needed. > > > >> this->_M_store_code(*__node, __code); > >> const key_type& __k =3D _ExtractKey{}(__node->_M_v()); > >> size_type __bkt =3D _M_bucket_index(__code); > >> @@ -2509,39 +2508,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> rehash(size_type __bkt_count) > >> { > >> - const __rehash_state& __saved_state =3D > >> _M_rehash_policy._M_state(); > >> + __rehash_guard_t __rehash_guard(_M_rehash_policy); > >> __bkt_count > >> =3D std::max(_M_rehash_policy._M_bkt_for_elements(_M_element_count > >> + 1), > >> __bkt_count); > >> __bkt_count =3D _M_rehash_policy._M_next_bkt(__bkt_count); > >> > >> if (__bkt_count !=3D _M_bucket_count) > >> - _M_rehash(__bkt_count, __saved_state); > >> - else > >> - // No rehash, restore previous state to keep it consistent with > >> - // container state. > >> - _M_rehash_policy._M_reset(__saved_state); > >> - } > >> - > >> - template >> - typename _ExtractKey, typename _Equal, > >> - typename _Hash, typename _RangeHash, typename _Unused, > >> - typename _RehashPolicy, typename _Traits> > >> - void > >> - _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, > >> - _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> - _M_rehash(size_type __bkt_count, const __rehash_state& __state) > >> - { > >> - __try > >> - { > >> - _M_rehash_aux(__bkt_count, __unique_keys{}); > >> - } > >> - __catch(...) > >> { > >> - // A failure here means that buckets allocation failed. We only > >> - // have to restore hash policy previous state. > >> - _M_rehash_policy._M_reset(__state); > >> - __throw_exception_again; > >> + _M_rehash(__bkt_count, __unique_keys{}); > >> + __rehash_guard._M_reset =3D false; > > > > And a larger scope again. > > > >> } > >> } > >> > >> @@ -2553,7 +2529,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> void > >> _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, > >> _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> - _M_rehash_aux(size_type __bkt_count, true_type /* __uks */) > >> + _M_rehash(size_type __bkt_count, true_type /* __uks */) > >> { > >> __buckets_ptr __new_buckets =3D _M_allocate_buckets(__bkt_count)= ; > >> __node_ptr __p =3D _M_begin(); > >> @@ -2596,7 +2572,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > >> void > >> _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal, > >> _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>:: > >> - _M_rehash_aux(size_type __bkt_count, false_type /* __uks */) > >> + _M_rehash(size_type __bkt_count, false_type /* __uks */) > >> { > >> __buckets_ptr __new_buckets =3D _M_allocate_buckets(__bkt_count)= ; > >> __node_ptr __p =3D _M_begin(); > >> diff --git a/libstdc++-v3/include/bits/hashtable_policy.h > >> b/libstdc++-v3/include/bits/hashtable_policy.h > >> index 5d162463dc3..8b9626b1575 100644 > >> --- a/libstdc++-v3/include/bits/hashtable_policy.h > >> +++ b/libstdc++-v3/include/bits/hashtable_policy.h > >> @@ -715,6 +715,26 @@ namespace __detail > >> std::size_t _M_next_resize; > >> }; > >> > >> + template > >> + struct _RehashStateGuard > >> + { > >> + _RehashPolicy& _M_rehash_policy; > >> + typename _RehashPolicy::_State _M_prev_state; > >> + bool _M_reset =3D true; > > > > This could be: > > > > + _RehashPolicy* _M_guarded_obj; > > + typename _RehashPolicy::_State _M_prev_state; > > > >> + > >> + _RehashStateGuard(_RehashPolicy& __policy) > >> + : _M_rehash_policy(__policy) > >> + , _M_prev_state(__policy._M_state()) > > > > + _RehashStateGuard(_RehashPolicy& __policy) > > + : _M_guarded_obj(std::__addressof(__policy)) > > + , _M_prev_state(__policy._M_state()) > > > >> + { } > >> + _RehashStateGuard(const _RehashStateGuard&) =3D delete; > >> + > >> + ~_RehashStateGuard() > >> + { > >> + if (_M_reset) > >> + _M_rehash_policy._M_reset(_M_prev_state); > > > > + if (_M_guarded_obj) > > + _M_guarded_obj->_M_reset(_M_prev_state); > > > >> + } > >> + }; > >> + > >> // Base classes for std::_Hashtable. We define these base classes > >> // because in some cases we want to do different things depending on > >> // the value of a policy class. In some cases the policy class > >> @@ -1007,7 +1027,7 @@ namespace __detail > >> const _NodeGetter& __node_gen, false_type __uks) > >> { > >> using __rehash_type =3D typename __hashtable::__rehash_type; > >> - using __rehash_state =3D typename __hashtable::__rehash_state; > >> + using __rehash_guard_t =3D typename __hashtable::__rehash_guard_t= ; > >> using pair_type =3D std::pair; > >> > >> size_type __n_elt =3D __detail::__distance_fw(__first, __last); > >> @@ -1016,14 +1036,15 @@ namespace __detail > >> > >> __hashtable& __h =3D _M_conjure_hashtable(); > >> __rehash_type& __rehash =3D __h._M_rehash_policy; > >> - const __rehash_state& __saved_state =3D __rehash._M_state(); > >> + __rehash_guard_t __rehash_guard(__rehash); > >> pair_type __do_rehash =3D __rehash._M_need_rehash(__h._M_bucket_co= unt, > >> __h._M_element_count, > >> __n_elt); > >> > >> if (__do_rehash.first) > >> - __h._M_rehash(__do_rehash.second, __saved_state); > >> + __h._M_rehash(__do_rehash.second, __uks); > >> > >> + __rehash_guard._M_reset =3D false; > > > > A larger scope again. > > Note that there is only one place where we reset rehash policy state > even if no exception took place. It is in the std rehash method when the > user requests a rehash that has no effect. > > Here is the updated patch, ok to commit ? Thanks for the explanations, OK for trunk.