From: "François Dumont" <frs.dumont@gmail.com>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>
Subject: Re: libstdc++ PR 57272 Fancy pointer support in Hashtable
Date: Mon, 2 Nov 2020 22:33:51 +0100 [thread overview]
Message-ID: <c36e84a8-8a78-d9c4-f819-697cf6c37443@gmail.com> (raw)
In-Reply-To: <20201102141120.GR503596@redhat.com>
On 02/11/20 3:11 pm, Jonathan Wakely wrote:
> On 01/11/20 22:48 +0100, François Dumont via Libstdc++ wrote:
>> Here is an other attempt.
>>
>> This time I am storing the node using allocator pointer just in the
>> singly linked list of nodes. Buckets are still __node_base* so that
>> the custom pointer is not manipulated too much. Moreover iterators
>> are also using node raw pointers.
>
> There's no point doing it if you still use raw pointers.
>
> It either has to be done completely, or it's a waste of time and
> energy.
Is there a clear Standard point I am violating in doing so ? Is there a
chance that std::__to_address is ill formed ?
In hashtable we have somehow 2 data structures, the singly link list and
the buckets array. I thought it was rather a good idear to not store the
custom pointers in both. It makes more clear that the node are stored in
the singly linked list, not in the buckets.
>
>> As advised I introduced new types in case of custom pointers. But I
>> am also using those in gnu versioned namespace so that I could more
>> easily test this new code with all existing test cases.
>>
>> Note that as we are at introducing a new abi I am also changing node
>> memory layout in this case. I think it is better to put the hash code
>> cache before the node value rather than after. It will be closer to
>> the begining of the node and so accessible without mem page fault.
>
> I think that's a bad idea. if somebody is using a fancy pointer which
> is just a thin wrapper around a real pointer, and it has implicit
> conversions to/from the real pointer), then I think it might work OK
> today. For example, __gnu_cxx::_Ext_pointer in <ext/pointer.h>. If the
> size and alignment of the fancy pointer is just the same as the real
> pointer, and the layout of the node classes doesn't change order, I
> think that will probably Just Work.
Ok, good point, I'll remove this part of the patch.
> If you reorder the node members, it definitely won't work.
>
> Is this a realistic scenario? I don't know. It might be.
>
> If we want to do that only for the versioned namespace, that would be
> OK, but should be a separate patch.
>
> I'm also concerned about the number of differences that depend on
> _GLIBCXX_INLINE_VERSION. The code gets a lot less maintainable with so
> many differences, and they only exist to support a mode nobody uses.
Still, it helps to validate the code which is normally dedicated to
custom pointers.
>
> Wouldn't implementing https://wg21.link/P0809R0 or
> https://wg21.link/P0919R3 (and https://wg21.link/P1690R1) be a better
> use of time?
I thought support of custom pointers was.
But sure, I can have a look at those too. But I won't give up on this one !
>
> There are a couple more comments below, for things that I noticed
> while quickly skimming over the patch.
>
>
>> To be clear the node mem layout is:
>> - next node pointer
>> - node value_type
>> - hash code (optional)
>>
>> The new node mem layout is:
>> - next node pointer
>> - hash code (optional)
>> - node value_type
>>
>> Here is the git log in case you validate it.
>>
>> Â Â Â libstdc++: Store allocator::pointer in hashtable implementation
>>
>> Â Â Â Use allocator pointer type in _Hashtable implementation.
>>
>> Â Â Â Â Â Â Â Â Â Â Â * include/bits/hashtable_policy.h
>> Â Â Â Â Â Â Â Â Â Â Â (_ReuseOrAllocNode<>::__node_type): Remove.
>> Â Â Â Â Â Â Â Â Â Â Â (_ReuseOrAllocNode<>::__node_pointer): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_ReuseOrAllocNode(__node_pointer,
>> __hashtable_alloc&)): Adapt to use
>> Â Â Â Â Â Â Â Â Â Â Â latter.
>> (_ReuseOrAllocNode<>::operator()(_Arg&&)): Return latter.
>> Â Â Â Â Â Â Â Â Â Â Â (_AllocNode<>::__node_type): Remove.
>> Â Â Â Â Â Â Â Â Â Â Â (_AllocNode<>::__node_pointer): New.
>> (_AllocNode<>::operator()<>(_Arg&&)): Return latter.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_node_cust_ptr_base<>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_node_cache_value<typename _Value, bool
>> _Cache_hash_code>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_node<>::__node_base): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_node<>::__node_ptr): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_node<>::__node_type): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_node<>::__node_value_cache_type): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_node<>::_M_next_ptr()): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_cust_ptr_node<typename _Ptr, bool
>> _Cache_hash_code>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_iterator_base<typename
>> _NodeType>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Node_iterator_base<>): Inherits from latter.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_iterator<typename _NodeType, bool
>> __constant_iterators>):
>> Â Â Â Â Â Â Â Â Â Â Â New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_const_iterator<typename _NodeType,
>> bool __constant_iterators>):
>> Â Â Â Â Â Â Â Â Â Â Â New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Insert_base<>::__alloc_ptr): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Insert_base<>::__node_type): New. Define
>> conditionally to _Hash_node<>
>> Â Â Â Â Â Â Â Â Â Â Â or _Hash_cust_ptr_node<> depending on
>> __alloc_ptr being a raw pointer.
>> Â Â Â Â Â Â Â Â Â Â Â (_Insert_base<>::__node_alloc_type): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Insert_base<>::__hashtable_alloc): Remove.
>> Â Â Â Â Â Â Â Â Â Â Â (_Insert_base<>::iterator): Define
>> conditionally to _Node_iterator<>
>> Â Â Â Â Â Â Â Â Â Â Â or _Hashtable_iterator<> depending on
>> __alloc_ptr being a raw pointer.
>> Â Â Â Â Â Â Â Â Â Â Â (_Insert_base<>::const_iterator): Define
>> conditionally to
>> Â Â Â Â Â Â Â Â Â Â Â _Node_const_iterator<> or
>> _Hashtable_const_iterator<> depending on
>> Â Â Â Â Â Â Â Â Â Â Â __alloc_ptr being a raw pointer.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_local_iter_base<>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hash_code_base<>::_M_bucket_index(const
>> _Hash_node_cache_value<>&,
>> Â Â Â Â Â Â Â Â Â Â Â size_t)): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_local_iter_base<>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_local_iterator<>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_const_local_iterator<>): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_base<>::_M_equals(const _Key&,
>> __hash_code,
>> Â Â Â Â Â Â Â Â Â Â Â const _Hash_node_cache_value<>&): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_base<>::_M_node_equals(const
>> _Hash_node_cache_value<>&,
>> Â Â Â Â Â Â Â Â Â Â Â const _Hash_node_cache_value<>&)): New.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_alloc<>::__value_alloc_traits):
>> Remove.
>> Â Â Â Â Â Â Â Â Â Â Â (_Hashtable_alloc<>::__node_base_ptr): Remove.
>> Â Â Â Â Â Â Â Â Â Â Â * include/bits/hashtable.h (_Hashtable<>): Adapt.
>> Â Â Â Â Â Â Â Â Â Â Â *
>> testsuite/23_containers/unordered_map/allocator/ext_ptr.cc: New test.
>> Â Â Â Â Â Â Â Â Â Â Â *
>> testsuite/23_containers/unordered_multimap/allocator/ext_ptr.cc:
>> Â Â Â Â Â Â Â Â Â Â Â New test.
>> Â Â Â Â Â Â Â Â Â Â Â *
>> testsuite/23_containers/unordered_multiset/allocator/ext_ptr.cc:
>> Â Â Â Â Â Â Â Â Â Â Â New test.
>> Â Â Â Â Â Â Â Â Â Â Â *
>> testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Adapt.
>>
>> Tested under Linux x86_64 normal and version namespace modes.
>>
>> François
>>
>>
>> On 20/10/20 1:04 pm, Jonathan Wakely wrote:
>>> On 28/09/20 22:37 +0200, François Dumont via Libstdc++ wrote:
>>>> Following recent changes on _Hashtable I rebase the patch and
>>>> completely review it.
>>>>
>>>> I managed to integrate the allocator custom pointer type without
>>>> touching to _Hashtable base types like _Hash_code_base or
>>>> _Hashtable_base. However I cannot see how to use the custom pointer
>>>> type without impacting the node types like _Hash_node_base which
>>>> now takes a template parameter, the custom pointer type.
>>>>
>>>> On an abi point of view node types are different however the data
>>>> structure is the same. The only difference is that the
>>>> _Hash_node_base _M_nxt is now a _Hash_node<> custom pointer rather
>>>> than a simple _Hash_node_base*.
>>>>
>>>> Even if this patch can't go in because of the abi breaking change I
>>>> am going to adapt some of the code simplifications for master.
>>>> Especially the _Hash_code_base and _Local_iterator_base
>>>> simplifications.
>>>>
>>>> Let me know if you can think of a way to integrate the custom
>>>> pointer without impacting abi. Unless impacting node types and
>>>> associated iterator types is fine even if I already noticed that
>>>> pretty printer tests are broken with those changes.
>>>
>>> The approach I used for the other containers (which was never
>>> completed and committed) is something like:
>>>
>>> struct _Node_base
>>> {
>>> Â _Node_base* _M_next;
>>> };
>>>
>>> template<typename _Ptr>
>>> struct _Fancy_node_base
>>> {
>>> Â _Ptr _M_next;
>>> };
>>>
>>> template<typename _Ptr>
>>> Â using node_base = conditional_t<is_pointer<_Ptr>::value,
>>> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â
>>> _Node_base,
>>> _Fancy_node_base<_Ptr>>;
>>>
>>> This way all existing code that has allocators with non-fancy pointers
>>> continues to use the same type. Code using fancy pointers (which
>>> doesn't currently work properly anyway) changes to use the new types
>>> that depend on the pointer type.
>>>
>>
>
>> diff --git a/libstdc++-v3/include/bits/hashtable.h
>> b/libstdc++-v3/include/bits/hashtable.h
>> index 6c6c5edde0b..86644d447ca 100644
>> --- a/libstdc++-v3/include/bits/hashtable.h
>> +++ b/libstdc++-v3/include/bits/hashtable.h
>> @@ -182,8 +182,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> _RehashPolicy, _Traits>,
>> private __detail::_Hashtable_alloc<
>> __alloc_rebind<_Alloc,
>> - __detail::_Hash_node<_Value,
>> - _Traits::__hash_cached::value>>>
>> +#if _GLIBCXX_INLINE_VERSION
>> + __detail::_Hash_cust_ptr_node<
>> + __detail::__alloc_val_ptr<_Alloc, _Value>,
>> + _Traits::__hash_cached::value>>>
>> +#else
>> + typename std::conditional<
>> + std::__is_pointer<
>> + __detail::__alloc_val_ptr<_Alloc, _Value>>::__value,
>> + __detail::_Hash_node<_Value, _Traits::__hash_cached::value>,
>> + __detail::_Hash_cust_ptr_node<
>> + __detail::__alloc_val_ptr<_Alloc, _Value>,
>> + _Traits::__hash_cached::value>>::type>>
>> +#endif
>
> This ugliness should be hidden behind an alias template.
>
> Use is_pointer<P>::value, not std::__is_pointer<P>::__value.
> This is C++11 code, there's no need to use the C++98 traits. And you
> don't need the std:: qualification.
>
> _Hash_cust_ptr_node is not a good name if it's also used (sometimes)
> for normal pointers. How about _Hash_pnode, or something like that?
Ok.
>
>
>> +
>> + /**
>> + * Primary template struct _Hash_cust_ptr_node.
>
> This comment is not useful. It shouldn't a Doxygen comment, because
> this is not something we need to put in the API documentation for
> end-users. I can tell it's the primary template, because I can read
> C++. What is the type for? How is it different to _Hash_node? That's
> what I'd like to read here.
Yes, that's a bad copy/paster.
>
>> + */
>> + template<typename _Ptr, bool _Cache_hash_code>
>> + struct _Hash_cust_ptr_node
>> + : _Hash_node_cust_ptr_base<__ptr_rebind<_Ptr,
>> + _Hash_cust_ptr_node<_Ptr, _Cache_hash_code>>>
>> + , _Hash_node_cache_value<typename
>> std::pointer_traits<_Ptr>::element_type,
>> + _Cache_hash_code>
>
next prev parent reply other threads:[~2020-11-02 21:33 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-19 17:31 François Dumont
2020-05-15 21:12 ` François Dumont
2020-09-28 20:37 ` François Dumont
2020-10-20 11:04 ` Jonathan Wakely
2020-10-20 17:26 ` François Dumont
2020-11-01 21:48 ` François Dumont
2020-11-02 14:11 ` Jonathan Wakely
2020-11-02 21:33 ` François Dumont [this message]
2021-01-11 18:10 ` François Dumont
2021-06-10 17:22 ` François Dumont
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c36e84a8-8a78-d9c4-f819-697cf6c37443@gmail.com \
--to=frs.dumont@gmail.com \
--cc=jwakely@redhat.com \
--cc=libstdc++@gcc.gnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).