From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by sourceware.org (Postfix) with ESMTPS id C73A03851C15 for ; Mon, 2 Nov 2020 21:33:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C73A03851C15 Received: by mail-wm1-x344.google.com with SMTP id c18so10808999wme.2 for ; Mon, 02 Nov 2020 13:33:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=uHHE1VRUHZJvYrcHfR1a0t76c3B4MukdO5t5xZNoXQw=; b=pLhRxOSSZHSKf954DA9vT5/1hl8h2iSRfoF96MkxYZDcTfwAgzuGKSBNZRqcUuyHj/ HGVoBtY72sCm1KIUU22wsqrjihzfrpqFtKRsxUPoAqJaDW3SSMjNwXZixrvmvtzXTZ2N 93TTDSxPHaD3us/NwTtNAE01ERH8YA4mhwp/UIAcIKFz9qjFXVpyeGvnZrDK05vlgfdB 250CUSTcSPtMkzeWyBFc/2cmaM7yFbebc5A3FIc0qcRKlQpSsrUDPkqXIWWJPB2MtpLY nJrqp8PHg+kl6u3qYwLWFi+F9JfnxigfTXt9JcoWgzsGGqp8RgPWCXMFoQOPfY0dJQLm NKaA== X-Gm-Message-State: AOAM53170qle6EQfLBwqRiSxQ1qnZ81Di1P8b74rqxUeb/3vKijH7BlC arFbMH/4pln5N8GDXfdismzq7L3o+gQ= X-Google-Smtp-Source: ABdhPJwXlKmUYbfdZJxexkMThi4ebKp7qekH28jwVbfMkAhSo7lHRgOIwDautVACfxFMz03PjQSDGw== X-Received: by 2002:a1c:3c84:: with SMTP id j126mr117795wma.151.1604352833394; Mon, 02 Nov 2020 13:33:53 -0800 (PST) Received: from ?IPv6:2a01:e0a:1dc:b1c0:3de5:afd:57a8:c1e9? ([2a01:e0a:1dc:b1c0:3de5:afd:57a8:c1e9]) by smtp.googlemail.com with ESMTPSA id j127sm788997wma.31.2020.11.02.13.33.52 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 02 Nov 2020 13:33:52 -0800 (PST) Subject: Re: libstdc++ PR 57272 Fancy pointer support in Hashtable To: Jonathan Wakely Cc: "libstdc++@gcc.gnu.org" References: <20201020110422.GA926136@redhat.com> <528bdf1d-ed96-5d9d-89db-e1bf57268e24@gmail.com> <20201102141120.GR503596@redhat.com> From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= Message-ID: Date: Mon, 2 Nov 2020 22:33:51 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20201102141120.GR503596@redhat.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: fr X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libstdc++@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libstdc++ mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Nov 2020 21:33:57 -0000 On 02/11/20 3:11 pm, Jonathan Wakely wrote: > On 01/11/20 22:48 +0100, François Dumont via Libstdc++ wrote: >> Here is an other attempt. >> >> This time I am storing the node using allocator pointer just in the >> singly linked list of nodes. Buckets are still __node_base* so that >> the custom pointer is not manipulated too much. Moreover iterators >> are also using node raw pointers. > > There's no point doing it if you still use raw pointers. > > It either has to be done completely, or it's a waste of time and > energy. Is there a clear Standard point I am violating in doing so ? Is there a chance that std::__to_address is ill formed ? In hashtable we have somehow 2 data structures, the singly link list and the buckets array. I thought it was rather a good idear to not store the custom pointers in both. It makes more clear that the node are stored in the singly linked list, not in the buckets. > >> As advised I introduced new types in case of custom pointers. But I >> am also using those in gnu versioned namespace so that I could more >> easily test this new code with all existing test cases. >> >> Note that as we are at introducing a new abi I am also changing node >> memory layout in this case. I think it is better to put the hash code >> cache before the node value rather than after. It will be closer to >> the begining of the node and so accessible without mem page fault. > > I think that's a bad idea. if somebody is using a fancy pointer which > is just a thin wrapper around a real pointer, and it has implicit > conversions to/from the real pointer), then I think it might work OK > today. For example, __gnu_cxx::_Ext_pointer in . If the > size and alignment of the fancy pointer is just the same as the real > pointer, and the layout of the node classes doesn't change order, I > think that will probably Just Work. Ok, good point, I'll remove this part of the patch. > If you reorder the node members, it definitely won't work. > > Is this a realistic scenario? I don't know. It might be. > > If we want to do that only for the versioned namespace, that would be > OK, but should be a separate patch. > > I'm also concerned about the number of differences that depend on > _GLIBCXX_INLINE_VERSION. The code gets a lot less maintainable with so > many differences, and they only exist to support a mode nobody uses. Still, it helps to validate the code which is normally dedicated to custom pointers. > > Wouldn't implementing https://wg21.link/P0809R0 or > https://wg21.link/P0919R3 (and https://wg21.link/P1690R1) be a better > use of time? I thought support of custom pointers was. But sure, I can have a look at those too. But I won't give up on this one ! > > There are a couple more comments below, for things that I noticed > while quickly skimming over the patch. > > >> To be clear the node mem layout is: >> - next node pointer >> - node value_type >> - hash code (optional) >> >> The new node mem layout is: >> - next node pointer >> - hash code (optional) >> - node value_type >> >> Here is the git log in case you validate it. >> >>     libstdc++: Store allocator::pointer in hashtable implementation >> >>     Use allocator pointer type in _Hashtable implementation. >> >>             * include/bits/hashtable_policy.h >>             (_ReuseOrAllocNode<>::__node_type): Remove. >>            (_ReuseOrAllocNode<>::__node_pointer): New. >>             (_ReuseOrAllocNode(__node_pointer, >> __hashtable_alloc&)): Adapt to use >>             latter. >> (_ReuseOrAllocNode<>::operator()(_Arg&&)): Return latter. >>             (_AllocNode<>::__node_type): Remove. >>             (_AllocNode<>::__node_pointer): New. >> (_AllocNode<>::operator()<>(_Arg&&)): Return latter. >>             (_Hash_node_cust_ptr_base<>): New. >>             (_Hash_node_cache_value> _Cache_hash_code>): New. >>             (_Hash_node<>::__node_base): New. >>             (_Hash_node<>::__node_ptr): New. >>             (_Hash_node<>::__node_type): New. >>            (_Hash_node<>::__node_value_cache_type): New. >>             (_Hash_node<>::_M_next_ptr()): New. >>             (_Hash_cust_ptr_node> _Cache_hash_code>): New. >>             (_Hashtable_iterator_base> _NodeType>): New. >>             (_Node_iterator_base<>): Inherits from latter. >>             (_Hashtable_iterator> __constant_iterators>): >>             New. >>             (_Hashtable_const_iterator> bool __constant_iterators>): >>             New. >>             (_Insert_base<>::__alloc_ptr): New. >>             (_Insert_base<>::__node_type): New. Define >> conditionally to _Hash_node<> >>             or _Hash_cust_ptr_node<> depending on >> __alloc_ptr being a raw pointer. >>            (_Insert_base<>::__node_alloc_type): New. >>            (_Insert_base<>::__hashtable_alloc): Remove. >>             (_Insert_base<>::iterator): Define >> conditionally to _Node_iterator<> >>             or _Hashtable_iterator<> depending on >> __alloc_ptr being a raw pointer. >>             (_Insert_base<>::const_iterator): Define >> conditionally to >>             _Node_const_iterator<> or >> _Hashtable_const_iterator<> depending on >>             __alloc_ptr being a raw pointer. >>             (_Hashtable_local_iter_base<>): New. >>            (_Hash_code_base<>::_M_bucket_index(const >> _Hash_node_cache_value<>&, >>             size_t)): New. >>             (_Hashtable_local_iter_base<>): New. >>             (_Hashtable_local_iterator<>): New. >>            (_Hashtable_const_local_iterator<>): New. >>             (_Hashtable_base<>::_M_equals(const _Key&, >> __hash_code, >>             const _Hash_node_cache_value<>&): New. >>            (_Hashtable_base<>::_M_node_equals(const >> _Hash_node_cache_value<>&, >>             const _Hash_node_cache_value<>&)): New. >>            (_Hashtable_alloc<>::__value_alloc_traits): >> Remove. >>            (_Hashtable_alloc<>::__node_base_ptr): Remove. >>             * include/bits/hashtable.h (_Hashtable<>): Adapt. >>             * >> testsuite/23_containers/unordered_map/allocator/ext_ptr.cc: New test. >>             * >> testsuite/23_containers/unordered_multimap/allocator/ext_ptr.cc: >>             New test. >>             * >> testsuite/23_containers/unordered_multiset/allocator/ext_ptr.cc: >>             New test. >>             * >> testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Adapt. >> >> Tested under Linux x86_64 normal and version namespace modes. >> >> François >> >> >> On 20/10/20 1:04 pm, Jonathan Wakely wrote: >>> On 28/09/20 22:37 +0200, François Dumont via Libstdc++ wrote: >>>> Following recent changes on _Hashtable I rebase the patch and >>>> completely review it. >>>> >>>> I managed to integrate the allocator custom pointer type without >>>> touching to _Hashtable base types like _Hash_code_base or >>>> _Hashtable_base. However I cannot see how to use the custom pointer >>>> type without impacting the node types like _Hash_node_base which >>>> now takes a template parameter, the custom pointer type. >>>> >>>> On an abi point of view node types are different however the data >>>> structure is the same. The only difference is that the >>>> _Hash_node_base _M_nxt is now a _Hash_node<> custom pointer rather >>>> than a simple _Hash_node_base*. >>>> >>>> Even if this patch can't go in because of the abi breaking change I >>>> am going to adapt some of the code simplifications for master. >>>> Especially the _Hash_code_base and _Local_iterator_base >>>> simplifications. >>>> >>>> Let me know if you can think of a way to integrate the custom >>>> pointer without impacting abi. Unless impacting node types and >>>> associated iterator types is fine even if I already noticed that >>>> pretty printer tests are broken with those changes. >>> >>> The approach I used for the other containers (which was never >>> completed and committed) is something like: >>> >>> struct _Node_base >>> { >>>   _Node_base* _M_next; >>> }; >>> >>> template >>> struct _Fancy_node_base >>> { >>>   _Ptr _M_next; >>> }; >>> >>> template >>>   using node_base = conditional_t::value, >>>                                  >>> _Node_base, >>> _Fancy_node_base<_Ptr>>; >>> >>> This way all existing code that has allocators with non-fancy pointers >>> continues to use the same type. Code using fancy pointers (which >>> doesn't currently work properly anyway) changes to use the new types >>> that depend on the pointer type. >>> >> > >> diff --git a/libstdc++-v3/include/bits/hashtable.h >> b/libstdc++-v3/include/bits/hashtable.h >> index 6c6c5edde0b..86644d447ca 100644 >> --- a/libstdc++-v3/include/bits/hashtable.h >> +++ b/libstdc++-v3/include/bits/hashtable.h >> @@ -182,8 +182,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >>                  _RehashPolicy, _Traits>, >>       private __detail::_Hashtable_alloc< >>     __alloc_rebind<_Alloc, >> -               __detail::_Hash_node<_Value, >> - _Traits::__hash_cached::value>>> >> +#if _GLIBCXX_INLINE_VERSION >> +               __detail::_Hash_cust_ptr_node< >> +             __detail::__alloc_val_ptr<_Alloc, _Value>, >> +             _Traits::__hash_cached::value>>> >> +#else >> +      typename std::conditional< >> +        std::__is_pointer< >> +          __detail::__alloc_val_ptr<_Alloc, _Value>>::__value, >> +        __detail::_Hash_node<_Value, _Traits::__hash_cached::value>, >> +        __detail::_Hash_cust_ptr_node< >> +          __detail::__alloc_val_ptr<_Alloc, _Value>, >> +          _Traits::__hash_cached::value>>::type>> >> +#endif > > This ugliness should be hidden behind an alias template. > > Use is_pointer

::value, not std::__is_pointer

::__value. > This is C++11 code, there's no need to use the C++98 traits. And you > don't need the std:: qualification. > > _Hash_cust_ptr_node is not a good name if it's also used (sometimes) > for normal pointers. How about _Hash_pnode, or something like that? Ok. > > >> + >> +  /** >> +   *  Primary template struct _Hash_cust_ptr_node. > > This comment is not useful. It shouldn't a Doxygen comment, because > this is not something we need to put in the API documentation for > end-users. I can tell it's the primary template, because I can read > C++. What is the type for? How is it different to _Hash_node? That's > what I'd like to read here. Yes, that's a bad copy/paster. > >> +   */ >> +  template >> +    struct _Hash_cust_ptr_node >> +    : _Hash_node_cust_ptr_base<__ptr_rebind<_Ptr, >> +                   _Hash_cust_ptr_node<_Ptr, _Cache_hash_code>>> >> +    , _Hash_node_cache_value> std::pointer_traits<_Ptr>::element_type, >> +                 _Cache_hash_code> >