public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: "François Dumont" <frs.dumont@gmail.com>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>
Subject: Re: libstdc++ PR 57272 Fancy pointer support in Hashtable
Date: Mon, 2 Nov 2020 22:33:51 +0100	[thread overview]
Message-ID: <c36e84a8-8a78-d9c4-f819-697cf6c37443@gmail.com> (raw)
In-Reply-To: <20201102141120.GR503596@redhat.com>

On 02/11/20 3:11 pm, Jonathan Wakely wrote:
> On 01/11/20 22:48 +0100, François Dumont via Libstdc++ wrote:
>> Here is an other attempt.
>>
>> This time I am storing the node using allocator pointer just in the 
>> singly linked list of nodes. Buckets are still __node_base* so that 
>> the custom pointer is not manipulated too much. Moreover iterators 
>> are also using node raw pointers.
>
> There's no point doing it if you still use raw pointers.
>
> It either has to be done completely, or it's a waste of time and
> energy.

Is there a clear Standard point I am violating in doing so ? Is there a 
chance that std::__to_address is ill formed ?

In hashtable we have somehow 2 data structures, the singly link list and 
the buckets array. I thought it was rather a good idear to not store the 
custom pointers in both. It makes more clear that the node are stored in 
the singly linked list, not in the buckets.

>
>> As advised I introduced new types in case of custom pointers. But I 
>> am also using those in gnu versioned namespace so that I could more 
>> easily test this new code with all existing test cases.
>>
>> Note that as we are at introducing a new abi I am also changing node 
>> memory layout in this case. I think it is better to put the hash code 
>> cache before the node value rather than after. It will be closer to 
>> the begining of the node and so accessible without mem page fault.
>
> I think that's a bad idea. if somebody is using a fancy pointer which
> is just a thin wrapper around a real pointer, and it has implicit
> conversions to/from the real pointer), then I think it might work OK
> today. For example, __gnu_cxx::_Ext_pointer in <ext/pointer.h>. If the
> size and alignment of the fancy pointer is just the same as the real
> pointer, and the layout of the node classes doesn't change order, I
> think that will probably Just Work.

Ok, good point, I'll remove this part of the patch.

> If you reorder the node members, it definitely won't work.
>
> Is this a realistic scenario? I don't know. It might be.
>
> If we want to do that only for the versioned namespace, that would be
> OK, but should be a separate patch.
>
> I'm also concerned about the number of differences that depend on
> _GLIBCXX_INLINE_VERSION. The code gets a lot less maintainable with so
> many differences, and they only exist to support a mode nobody uses.
Still, it helps to validate the code which is normally dedicated to 
custom pointers.
>
> Wouldn't implementing https://wg21.link/P0809R0 or
> https://wg21.link/P0919R3 (and https://wg21.link/P1690R1) be a better
> use of time?

I thought support of custom pointers was.

But sure, I can have a look at those too. But I won't give up on this one !

>
> There are a couple more comments below, for things that I noticed
> while quickly skimming over the patch.
>
>
>> To be clear the node mem layout is:
>> - next node pointer
>> - node value_type
>> - hash code (optional)
>>
>> The new node mem layout is:
>> - next node pointer
>> - hash code (optional)
>> - node value_type
>>
>> Here is the git log in case you validate it.
>>
>>     libstdc++: Store allocator::pointer in hashtable implementation
>>
>>     Use allocator pointer type in _Hashtable implementation.
>>
>>             * include/bits/hashtable_policy.h
>>             (_ReuseOrAllocNode<>::__node_type): Remove.
>>            (_ReuseOrAllocNode<>::__node_pointer): New.
>>             (_ReuseOrAllocNode(__node_pointer, 
>> __hashtable_alloc&)): Adapt to use
>>             latter.
>> (_ReuseOrAllocNode<>::operator()(_Arg&&)): Return latter.
>>             (_AllocNode<>::__node_type): Remove.
>>             (_AllocNode<>::__node_pointer): New.
>> (_AllocNode<>::operator()<>(_Arg&&)): Return latter.
>>             (_Hash_node_cust_ptr_base<>): New.
>>             (_Hash_node_cache_value<typename _Value, bool 
>> _Cache_hash_code>): New.
>>             (_Hash_node<>::__node_base): New.
>>             (_Hash_node<>::__node_ptr): New.
>>             (_Hash_node<>::__node_type): New.
>>            (_Hash_node<>::__node_value_cache_type): New.
>>             (_Hash_node<>::_M_next_ptr()): New.
>>             (_Hash_cust_ptr_node<typename _Ptr, bool 
>> _Cache_hash_code>): New.
>>             (_Hashtable_iterator_base<typename 
>> _NodeType>): New.
>>             (_Node_iterator_base<>): Inherits from latter.
>>             (_Hashtable_iterator<typename _NodeType, bool 
>> __constant_iterators>):
>>             New.
>>             (_Hashtable_const_iterator<typename _NodeType, 
>> bool __constant_iterators>):
>>             New.
>>             (_Insert_base<>::__alloc_ptr): New.
>>             (_Insert_base<>::__node_type): New. Define 
>> conditionally to _Hash_node<>
>>             or _Hash_cust_ptr_node<> depending on 
>> __alloc_ptr being a raw pointer.
>>            (_Insert_base<>::__node_alloc_type): New.
>>            (_Insert_base<>::__hashtable_alloc): Remove.
>>             (_Insert_base<>::iterator): Define 
>> conditionally to _Node_iterator<>
>>             or _Hashtable_iterator<> depending on 
>> __alloc_ptr being a raw pointer.
>>             (_Insert_base<>::const_iterator): Define 
>> conditionally to
>>             _Node_const_iterator<> or 
>> _Hashtable_const_iterator<> depending on
>>             __alloc_ptr being a raw pointer.
>>             (_Hashtable_local_iter_base<>): New.
>>            (_Hash_code_base<>::_M_bucket_index(const 
>> _Hash_node_cache_value<>&,
>>             size_t)): New.
>>             (_Hashtable_local_iter_base<>): New.
>>             (_Hashtable_local_iterator<>): New.
>>            (_Hashtable_const_local_iterator<>): New.
>>             (_Hashtable_base<>::_M_equals(const _Key&, 
>> __hash_code,
>>             const _Hash_node_cache_value<>&): New.
>>            (_Hashtable_base<>::_M_node_equals(const 
>> _Hash_node_cache_value<>&,
>>             const _Hash_node_cache_value<>&)): New.
>>            (_Hashtable_alloc<>::__value_alloc_traits): 
>> Remove.
>>            (_Hashtable_alloc<>::__node_base_ptr): Remove.
>>             * include/bits/hashtable.h (_Hashtable<>): Adapt.
>>             * 
>> testsuite/23_containers/unordered_map/allocator/ext_ptr.cc: New test.
>>             * 
>> testsuite/23_containers/unordered_multimap/allocator/ext_ptr.cc:
>>             New test.
>>             * 
>> testsuite/23_containers/unordered_multiset/allocator/ext_ptr.cc:
>>             New test.
>>             * 
>> testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Adapt.
>>
>> Tested under Linux x86_64 normal and version namespace modes.
>>
>> François
>>
>>
>> On 20/10/20 1:04 pm, Jonathan Wakely wrote:
>>> On 28/09/20 22:37 +0200, François Dumont via Libstdc++ wrote:
>>>> Following recent changes on _Hashtable I rebase the patch and 
>>>> completely review it.
>>>>
>>>> I managed to integrate the allocator custom pointer type without 
>>>> touching to _Hashtable base types like _Hash_code_base or 
>>>> _Hashtable_base. However I cannot see how to use the custom pointer 
>>>> type without impacting the node types like _Hash_node_base which 
>>>> now takes a template parameter, the custom pointer type.
>>>>
>>>> On an abi point of view node types are different however the data 
>>>> structure is the same. The only difference is that the 
>>>> _Hash_node_base _M_nxt is now a _Hash_node<> custom pointer rather 
>>>> than a simple _Hash_node_base*.
>>>>
>>>> Even if this patch can't go in because of the abi breaking change I 
>>>> am going to adapt some of the code simplifications for master. 
>>>> Especially the _Hash_code_base and _Local_iterator_base 
>>>> simplifications.
>>>>
>>>> Let me know if you can think of a way to integrate the custom 
>>>> pointer without impacting abi. Unless impacting node types and 
>>>> associated iterator types is fine even if I already noticed that 
>>>> pretty printer tests are broken with those changes.
>>>
>>> The approach I used for the other containers (which was never
>>> completed and committed) is something like:
>>>
>>> struct _Node_base
>>> {
>>>   _Node_base* _M_next;
>>> };
>>>
>>> template<typename _Ptr>
>>> struct _Fancy_node_base
>>> {
>>>   _Ptr _M_next;
>>> };
>>>
>>> template<typename _Ptr>
>>>   using node_base = conditional_t<is_pointer<_Ptr>::value,
>>>                                  
>>> _Node_base,
>>> _Fancy_node_base<_Ptr>>;
>>>
>>> This way all existing code that has allocators with non-fancy pointers
>>> continues to use the same type. Code using fancy pointers (which
>>> doesn't currently work properly anyway) changes to use the new types
>>> that depend on the pointer type.
>>>
>>
>
>> diff --git a/libstdc++-v3/include/bits/hashtable.h 
>> b/libstdc++-v3/include/bits/hashtable.h
>> index 6c6c5edde0b..86644d447ca 100644
>> --- a/libstdc++-v3/include/bits/hashtable.h
>> +++ b/libstdc++-v3/include/bits/hashtable.h
>> @@ -182,8 +182,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>                  _RehashPolicy, _Traits>,
>>       private __detail::_Hashtable_alloc<
>>     __alloc_rebind<_Alloc,
>> -               __detail::_Hash_node<_Value,
>> - _Traits::__hash_cached::value>>>
>> +#if _GLIBCXX_INLINE_VERSION
>> +               __detail::_Hash_cust_ptr_node<
>> +             __detail::__alloc_val_ptr<_Alloc, _Value>,
>> +             _Traits::__hash_cached::value>>>
>> +#else
>> +      typename std::conditional<
>> +        std::__is_pointer<
>> +          __detail::__alloc_val_ptr<_Alloc, _Value>>::__value,
>> +        __detail::_Hash_node<_Value, _Traits::__hash_cached::value>,
>> +        __detail::_Hash_cust_ptr_node<
>> +          __detail::__alloc_val_ptr<_Alloc, _Value>,
>> +          _Traits::__hash_cached::value>>::type>>
>> +#endif
>
> This ugliness should be hidden behind an alias template.
>
> Use is_pointer<P>::value, not std::__is_pointer<P>::__value.
> This is C++11 code, there's no need to use the C++98 traits. And you
> don't need the std:: qualification.
>
> _Hash_cust_ptr_node is not a good name if it's also used (sometimes)
> for normal pointers. How about _Hash_pnode, or something like that?
Ok.
>
>
>> +
>> +  /**
>> +   *  Primary template struct _Hash_cust_ptr_node.
>
> This comment is not useful. It shouldn't a Doxygen comment, because
> this is not something we need to put in the API documentation for
> end-users. I can tell it's the primary template, because I can read
> C++. What is the type for? How is it different to _Hash_node? That's
> what I'd like to read here.
Yes, that's a bad copy/paster.
>
>> +   */
>> +  template<typename _Ptr, bool _Cache_hash_code>
>> +    struct _Hash_cust_ptr_node
>> +    : _Hash_node_cust_ptr_base<__ptr_rebind<_Ptr,
>> +                   _Hash_cust_ptr_node<_Ptr, _Cache_hash_code>>>
>> +    , _Hash_node_cache_value<typename 
>> std::pointer_traits<_Ptr>::element_type,
>> +                 _Cache_hash_code>
>


  reply	other threads:[~2020-11-02 21:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-19 17:31 François Dumont
2020-05-15 21:12 ` François Dumont
2020-09-28 20:37   ` François Dumont
2020-10-20 11:04     ` Jonathan Wakely
2020-10-20 17:26       ` François Dumont
2020-11-01 21:48       ` François Dumont
2020-11-02 14:11         ` Jonathan Wakely
2020-11-02 21:33           ` François Dumont [this message]
2021-01-11 18:10           ` François Dumont
2021-06-10 17:22           ` François Dumont

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c36e84a8-8a78-d9c4-f819-697cf6c37443@gmail.com \
    --to=frs.dumont@gmail.com \
    --cc=jwakely@redhat.com \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).