public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jonathan Wakely <jwakely@redhat.com>
To: "François Dumont" <frs.dumont@gmail.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>
Subject: Re: libstdc++ PR 57272 Fancy pointer support in Hashtable
Date: Mon, 2 Nov 2020 14:11:20 +0000	[thread overview]
Message-ID: <20201102141120.GR503596@redhat.com> (raw)
In-Reply-To: <528bdf1d-ed96-5d9d-89db-e1bf57268e24@gmail.com>

On 01/11/20 22:48 +0100, François Dumont via Libstdc++ wrote:
>Here is an other attempt.
>
>This time I am storing the node using allocator pointer just in the 
>singly linked list of nodes. Buckets are still __node_base* so that 
>the custom pointer is not manipulated too much. Moreover iterators are 
>also using node raw pointers.

There's no point doing it if you still use raw pointers.

It either has to be done completely, or it's a waste of time and
energy.

>As advised I introduced new types in case of custom pointers. But I am 
>also using those in gnu versioned namespace so that I could more 
>easily test this new code with all existing test cases.
>
>Note that as we are at introducing a new abi I am also changing node 
>memory layout in this case. I think it is better to put the hash code 
>cache before the node value rather than after. It will be closer to 
>the begining of the node and so accessible without mem page fault.

I think that's a bad idea. if somebody is using a fancy pointer which
is just a thin wrapper around a real pointer, and it has implicit
conversions to/from the real pointer), then I think it might work OK
today. For example, __gnu_cxx::_Ext_pointer in <ext/pointer.h>. If the
size and alignment of the fancy pointer is just the same as the real
pointer, and the layout of the node classes doesn't change order, I
think that will probably Just Work.

If you reorder the node members, it definitely won't work.

Is this a realistic scenario? I don't know. It might be.

If we want to do that only for the versioned namespace, that would be
OK, but should be a separate patch.

I'm also concerned about the number of differences that depend on
_GLIBCXX_INLINE_VERSION. The code gets a lot less maintainable with so
many differences, and they only exist to support a mode nobody uses.

Wouldn't implementing https://wg21.link/P0809R0 or
https://wg21.link/P0919R3 (and https://wg21.link/P1690R1) be a better
use of time?

There are a couple more comments below, for things that I noticed
while quickly skimming over the patch.


>To be clear the node mem layout is:
>- next node pointer
>- node value_type
>- hash code (optional)
>
>The new node mem layout is:
>- next node pointer
>- hash code (optional)
>- node value_type
>
>Here is the git log in case you validate it.
>
>    libstdc++: Store allocator::pointer in hashtable implementation
>
>    Use allocator pointer type in _Hashtable implementation.
>
>            * include/bits/hashtable_policy.h
>            (_ReuseOrAllocNode<>::__node_type): Remove.
>            (_ReuseOrAllocNode<>::__node_pointer): New.
>            (_ReuseOrAllocNode(__node_pointer, __hashtable_alloc&)): 
>Adapt to use
>            latter.
>(_ReuseOrAllocNode<>::operator()(_Arg&&)): Return latter.
>            (_AllocNode<>::__node_type): Remove.
>            (_AllocNode<>::__node_pointer): New.
>(_AllocNode<>::operator()<>(_Arg&&)): Return latter.
>            (_Hash_node_cust_ptr_base<>): New.
>            (_Hash_node_cache_value<typename _Value, bool 
>_Cache_hash_code>): New.
>            (_Hash_node<>::__node_base): New.
>            (_Hash_node<>::__node_ptr): New.
>            (_Hash_node<>::__node_type): New.
>            (_Hash_node<>::__node_value_cache_type): New.
>            (_Hash_node<>::_M_next_ptr()): New.
>            (_Hash_cust_ptr_node<typename _Ptr, bool 
>_Cache_hash_code>): New.
>            (_Hashtable_iterator_base<typename _NodeType>): New.
>            (_Node_iterator_base<>): Inherits from latter.
>            (_Hashtable_iterator<typename _NodeType, bool 
>__constant_iterators>):
>            New.
>            (_Hashtable_const_iterator<typename _NodeType, bool 
>__constant_iterators>):
>            New.
>            (_Insert_base<>::__alloc_ptr): New.
>            (_Insert_base<>::__node_type): New. Define conditionally 
>to _Hash_node<>
>            or _Hash_cust_ptr_node<> depending on __alloc_ptr being a 
>raw pointer.
>            (_Insert_base<>::__node_alloc_type): New.
>            (_Insert_base<>::__hashtable_alloc): Remove.
>            (_Insert_base<>::iterator): Define conditionally to 
>_Node_iterator<>
>            or _Hashtable_iterator<> depending on __alloc_ptr being a 
>raw pointer.
>            (_Insert_base<>::const_iterator): Define conditionally to
>            _Node_const_iterator<> or _Hashtable_const_iterator<> 
>depending on
>            __alloc_ptr being a raw pointer.
>            (_Hashtable_local_iter_base<>): New.
>            (_Hash_code_base<>::_M_bucket_index(const 
>_Hash_node_cache_value<>&,
>            size_t)): New.
>            (_Hashtable_local_iter_base<>): New.
>            (_Hashtable_local_iterator<>): New.
>            (_Hashtable_const_local_iterator<>): New.
>            (_Hashtable_base<>::_M_equals(const _Key&, __hash_code,
>            const _Hash_node_cache_value<>&): New.
>            (_Hashtable_base<>::_M_node_equals(const 
>_Hash_node_cache_value<>&,
>            const _Hash_node_cache_value<>&)): New.
>            (_Hashtable_alloc<>::__value_alloc_traits): Remove.
>            (_Hashtable_alloc<>::__node_base_ptr): Remove.
>            * include/bits/hashtable.h (_Hashtable<>): Adapt.
>            * 
>testsuite/23_containers/unordered_map/allocator/ext_ptr.cc: New test.
>            * 
>testsuite/23_containers/unordered_multimap/allocator/ext_ptr.cc:
>            New test.
>            * 
>testsuite/23_containers/unordered_multiset/allocator/ext_ptr.cc:
>            New test.
>            * 
>testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Adapt.
>
>Tested under Linux x86_64 normal and version namespace modes.
>
>François
>
>
>On 20/10/20 1:04 pm, Jonathan Wakely wrote:
>>On 28/09/20 22:37 +0200, François Dumont via Libstdc++ wrote:
>>>Following recent changes on _Hashtable I rebase the patch and 
>>>completely review it.
>>>
>>>I managed to integrate the allocator custom pointer type without 
>>>touching to _Hashtable base types like _Hash_code_base or 
>>>_Hashtable_base. However I cannot see how to use the custom 
>>>pointer type without impacting the node types like _Hash_node_base 
>>>which now takes a template parameter, the custom pointer type.
>>>
>>>On an abi point of view node types are different however the data 
>>>structure is the same. The only difference is that the 
>>>_Hash_node_base _M_nxt is now a _Hash_node<> custom pointer rather 
>>>than a simple _Hash_node_base*.
>>>
>>>Even if this patch can't go in because of the abi breaking change 
>>>I am going to adapt some of the code simplifications for master. 
>>>Especially the _Hash_code_base and _Local_iterator_base 
>>>simplifications.
>>>
>>>Let me know if you can think of a way to integrate the custom 
>>>pointer without impacting abi. Unless impacting node types and 
>>>associated iterator types is fine even if I already noticed that 
>>>pretty printer tests are broken with those changes.
>>
>>The approach I used for the other containers (which was never
>>completed and committed) is something like:
>>
>>struct _Node_base
>>{
>>  _Node_base* _M_next;
>>};
>>
>>template<typename _Ptr>
>>struct _Fancy_node_base
>>{
>>  _Ptr _M_next;
>>};
>>
>>template<typename _Ptr>
>>  using node_base = conditional_t<is_pointer<_Ptr>::value,
>>                                  _Node_base,
>>_Fancy_node_base<_Ptr>>;
>>
>>This way all existing code that has allocators with non-fancy pointers
>>continues to use the same type. Code using fancy pointers (which
>>doesn't currently work properly anyway) changes to use the new types
>>that depend on the pointer type.
>>
>

>diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
>index 6c6c5edde0b..86644d447ca 100644
>--- a/libstdc++-v3/include/bits/hashtable.h
>+++ b/libstdc++-v3/include/bits/hashtable.h
>@@ -182,8 +182,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> 				 _RehashPolicy, _Traits>,
>       private __detail::_Hashtable_alloc<
> 	__alloc_rebind<_Alloc,
>-		       __detail::_Hash_node<_Value,
>-					    _Traits::__hash_cached::value>>>
>+#if _GLIBCXX_INLINE_VERSION
>+		       __detail::_Hash_cust_ptr_node<
>+			 __detail::__alloc_val_ptr<_Alloc, _Value>,
>+			 _Traits::__hash_cached::value>>>
>+#else
>+	  typename std::conditional<
>+	    std::__is_pointer<
>+	      __detail::__alloc_val_ptr<_Alloc, _Value>>::__value,
>+	    __detail::_Hash_node<_Value, _Traits::__hash_cached::value>,
>+	    __detail::_Hash_cust_ptr_node<
>+	      __detail::__alloc_val_ptr<_Alloc, _Value>,
>+	      _Traits::__hash_cached::value>>::type>>
>+#endif

This ugliness should be hidden behind an alias template.

Use is_pointer<P>::value, not std::__is_pointer<P>::__value.
This is C++11 code, there's no need to use the C++98 traits. And you
don't need the std:: qualification.

_Hash_cust_ptr_node is not a good name if it's also used (sometimes)
for normal pointers. How about _Hash_pnode, or something like that?


>+
>+  /**
>+   *  Primary template struct _Hash_cust_ptr_node.

This comment is not useful. It shouldn't a Doxygen comment, because
this is not something we need to put in the API documentation for
end-users. I can tell it's the primary template, because I can read
C++. What is the type for? How is it different to _Hash_node? That's
what I'd like to read here.

>+   */
>+  template<typename _Ptr, bool _Cache_hash_code>
>+    struct _Hash_cust_ptr_node
>+    : _Hash_node_cust_ptr_base<__ptr_rebind<_Ptr,
>+				   _Hash_cust_ptr_node<_Ptr, _Cache_hash_code>>>
>+    , _Hash_node_cache_value<typename std::pointer_traits<_Ptr>::element_type,
>+			     _Cache_hash_code>


  reply	other threads:[~2020-11-02 14:11 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-19 17:31 François Dumont
2020-05-15 21:12 ` François Dumont
2020-09-28 20:37   ` François Dumont
2020-10-20 11:04     ` Jonathan Wakely
2020-10-20 17:26       ` François Dumont
2020-11-01 21:48       ` François Dumont
2020-11-02 14:11         ` Jonathan Wakely [this message]
2020-11-02 21:33           ` François Dumont
2021-01-11 18:10           ` François Dumont
2021-06-10 17:22           ` François Dumont

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201102141120.GR503596@redhat.com \
    --to=jwakely@redhat.com \
    --cc=frs.dumont@gmail.com \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).