From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jwakely@redhat.com>
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.129.124])
 by sourceware.org (Postfix) with ESMTPS id E88EF3858436
 for <libstdc++@gcc.gnu.org>; Wed,  5 Jan 2022 17:08:06 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E88EF3858436
Received: from mail-yb1-f197.google.com (mail-yb1-f197.google.com
 [209.85.219.197]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-37-wBfhsm7FPG63J0qbeeNwhg-1; Wed, 05 Jan 2022 12:07:56 -0500
X-MC-Unique: wBfhsm7FPG63J0qbeeNwhg-1
Received: by mail-yb1-f197.google.com with SMTP id
 u130-20020a254788000000b0060a9645f781so61506090yba.19
 for <libstdc++@gcc.gnu.org>; Wed, 05 Jan 2022 09:07:55 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=o+lRZMd6FWlDkfPoSjIZBpKusY6VDp4HEvT4g1QSBEQ=;
 b=mbHTVl9yiX3AT7Y5yQFnNfhKsDVATkr76SArJYjj+4xxI3JyUEI44+hiip4maVfUBh
 5J5HJ12AkUg3YtppxRlAwE0hDTpaybs3iOJtrnMGy9DB1uz6xP3A7r8rAf6VmBIbEnNQ
 MdLrZF7osYl/u5MGlbgIi/GCShACXWJUzsUJ5JqBjKtGS5Lo5AyL+3dNP/EJJsuNa2RK
 ELL6/0hoJaHmiTrUp15FM72En3duEXWJs3AHsjz9wXX9f23BBSw6ckaCaws1zCOWAiFn
 Sk3ErUeTD7nbO0aqoBEGGBdKe41Qme19980w1DR5vJj85CLYnK1erJmGzdK+vhaVmPSQ
 qcxQ==
X-Gm-Message-State: AOAM531sl+AOLVzkgBHKURkGn/iWnmzUdyGirkotDy946SjgP2QFzuLo
 4k3MtUYMsgYV5Z4mrbPX5yuFv/rOp/VXBo2YFOC4WkumVZvvWh0EpEXsm+4zYrcLWMio4mh6qao
 O6rSS9YXLo0SznuBcWWSF/k67sOXjB+I=
X-Received: by 2002:a25:dfcf:: with SMTP id
 w198mr43402919ybg.120.1641402475428; 
 Wed, 05 Jan 2022 09:07:55 -0800 (PST)
X-Google-Smtp-Source: ABdhPJwOwIfbCImrKOHMeOz2R9htz7S3zF6wUGLu075m5aqWlQviHUolv2seLAWl86BJTt/IIHokKVaHXt8WUStKjcE=
X-Received: by 2002:a25:dfcf:: with SMTP id
 w198mr43402903ybg.120.1641402475233; 
 Wed, 05 Jan 2022 09:07:55 -0800 (PST)
MIME-Version: 1.0
References: <88d3aa39-2146-3ea5-8bb2-a4d9b70accbd@gmail.com>
 <20200717125809.GC2827066@redhat.com>
 <0a5b6bcb-6475-2674-adfb-23f6793bfefc@gmail.com>
 <1491e266-fb74-dd9c-9955-8a7dd0334bb4@gmail.com>
 <73be2079-5046-c06f-f144-1af85052fc50@gmail.com>
 <YcRzoSSc534Lg+/F@redhat.com> <7b48215d-b56f-b726-c90d-87591a711c2a@gmail.com>
In-Reply-To: <7b48215d-b56f-b726-c90d-87591a711c2a@gmail.com>
From: Jonathan Wakely <jwakely@redhat.com>
Date: Wed, 5 Jan 2022 17:07:44 +0000
Message-ID: <CACb0b4=jObNkMRV0rLtOc97Fj4X1QEExr5fPicVsEOjmbGDaBg@mail.gmail.com>
Subject: Re: [PATCH][Hashtable 6/6] PR 68303 small size optimization
To: =?UTF-8?Q?Fran=C3=A7ois_Dumont?= <frs.dumont@gmail.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>,
 gcc-patches <gcc-patches@gcc.gnu.org>
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 HTML_MESSAGE, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL,
 SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.29
X-BeenThere: libstdc++@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libstdc++ mailing list <libstdc++.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/libstdc++/>
List-Post: <mailto:libstdc++@gcc.gnu.org>
List-Help: <mailto:libstdc++-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Wed, 05 Jan 2022 17:08:09 -0000

On Sat, 25 Dec 2021 at 21:39, Fran=C3=A7ois Dumont <frs.dumont@gmail.com> w=
rote:

> On 23/12/21 2:03 pm, Jonathan Wakely wrote:
> > On 21/12/21 07:07 +0100, Fran=C3=A7ois Dumont wrote:
> >> Hi
> >>
> >> ??? Is there a chance for this patch to be integrated for next gcc
> >> release ?
> >
> > Yes, I think this can still make it for GCC 12 (the patch was first
> > posted long ago in stage 1 and it's just me being slow to review it).
> >
> > But I have some questions and comments ...
> >
> >
> >> diff --git a/libstdc++-v3/include/bits/hashtable.h
> >> b/libstdc++-v3/include/bits/hashtable.h
> >> index 6e2d4c10cfe..6b2c6b99fef 100644
> >> --- a/libstdc++-v3/include/bits/hashtable.h
> >> +++ b/libstdc++-v3/include/bits/hashtable.h
> >> @@ -419,6 +419,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>       _M_uses_single_bucket() const
> >>       { return _M_uses_single_bucket(_M_buckets); }
> >>
> >> +      static constexpr size_t
> >> +      __small_size_threshold()
> >> +      {
> >> +    return
> >> + __detail::_Hashtable_hash_traits<_Hash>::__small_size_threshold();
> >> +      }
> >> +
> I've added the noexcept qualification as you asked.
> >>       __hashtable_alloc&
> >>       _M_base_alloc() { return *this; }
> >>
> >> @@ -788,6 +795,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>       _M_bucket_index(__hash_code __c) const
> >>       { return __hash_code_base::_M_bucket_index(__c,
> >> _M_bucket_count); }
> >>
> >> +      __node_base_ptr
> >> +      _M_find_before_node(const key_type&);
> >> +
> >>       // Find and insert helper functions and types
> >>       // Find the node before the one matching the criteria.
> >>       __node_base_ptr
> >> @@ -831,6 +841,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>       __node_base_ptr
> >>       _M_get_previous_node(size_type __bkt, __node_ptr __n);
> >>
> >> +      pair<const_iterator, __hash_code>
> >> +      _M_compute_hash_code(const_iterator __hint, const key_type&
> >> __k) const;
> >> +
> >>       // Insert node __n with hash code __code, in bucket __bkt if no
> >>       // rehash (assumes no element with same key already present).
> >>       // Takes ownership of __n if insertion succeeds, throws otherwis=
e.
> >> @@ -1126,7 +1139,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>       void _M_rehash(size_type __bkt_count, const __rehash_state&
> >> __state);
> >>     };
> >>
> >> -
> >>   // Definitions of class template _Hashtable's out-of-line member
> >> functions.
> >>   template<typename _Key, typename _Value, typename _Alloc,
> >>        typename _ExtractKey, typename _Equal,
> >> @@ -1628,6 +1640,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>     find(const key_type& __k)
> >>     -> iterator
> >>     {
> >> +      if (size() <=3D __small_size_threshold())
> >> +    {
> >> +      for (auto __it =3D begin(); __it !=3D end(); ++__it)
> >> +        if (this->_M_key_equals(__k, *__it._M_cur))
> >> +          return __it;
> >> +      return end();
> >> +    }
> >
> > This loop is repeated a few times, would it be better factored out
> > into its own function? _M_find_loop or something? The return type is
> > different in some cases, so maybe it's OK like this.
> Yes, I also thought about that but there is often small changes from one
> loop to another as you noticed.
> >
> >
> >
> >> +
> >>       __hash_code __code =3D this->_M_hash_code(__k);
> >>       std::size_t __bkt =3D _M_bucket_index(__code);
> >>       return iterator(_M_find_node(__bkt, __k, __code));
> >> @@ -1643,6 +1663,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>     find(const key_type& __k) const
> >>     -> const_iterator
> >>     {
> >> +      if (size() <=3D __small_size_threshold())
> >> +    {
> >> +      for (auto __it =3D begin(); __it !=3D end(); ++__it)
> >> +        if (this->_M_key_equals(__k, *__it._M_cur))
> >> +          return __it;
> >> +      return end();
> >> +    }
> >> +
> >>       __hash_code __code =3D this->_M_hash_code(__k);
> >>       std::size_t __bkt =3D _M_bucket_index(__code);
> >>       return const_iterator(_M_find_node(__bkt, __k, __code));
> >> @@ -1855,6 +1883,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>       }
> >> #endif
> >>
> >> +  // Find the node before the one whose key compares equal to k.
> >> +  // Return nullptr if no node is found.
> >> +  template<typename _Key, typename _Value, typename _Alloc,
> >> +       typename _ExtractKey, typename _Equal,
> >> +       typename _Hash, typename _RangeHash, typename _Unused,
> >> +       typename _RehashPolicy, typename _Traits>
> >> +    auto
> >> +    _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
> >> +           _Hash, _RangeHash, _Unused, _RehashPolicy, _Traits>::
> >> +    _M_find_before_node(const key_type& __k)
> >> +    -> __node_base_ptr
> >> +    {
> >> +      __node_base_ptr __prev_p =3D &_M_before_begin;
> >
> > This is OK now, but would need to be std::__addressof(_M_before_begin)
> > if/when the _Hash_code_base type becomes dependent on the allocator's
> > pointer.
> Yes, maybe after gcc release we will talk about those fancy pointer
> types again.
> >
> >      __n_last =3D __n_last->_M_next();
> >> diff --git a/libstdc++-v3/include/bits/hashtable_policy.h
> >> b/libstdc++-v3/include/bits/hashtable_policy.h
> >> index 0b5443fc187..b4a8af081ce 100644
> >> --- a/libstdc++-v3/include/bits/hashtable_policy.h
> >> +++ b/libstdc++-v3/include/bits/hashtable_policy.h
> >> @@ -246,6 +246,20 @@ namespace __detail
> >>       using __unique_keys =3D __bool_constant<_Unique_keys>;
> >>     };
> >>
> >> +  /**
> >> +   *  struct _Hashtable_hash_traits
> >> +   *
> >> +   *  Important traits for hash tables depending on associated hasher=
.
> >> +   *
> >> +   */
> >> +  template<typename _Hash>
> >> +    struct _Hashtable_hash_traits
> >> +    {
> >> +      static constexpr std::size_t
> >> +      __small_size_threshold()
> >> +      { return std::__is_fast_hash<_Hash>::value ? 0 : 20; }
> >> +    };
> >
> > Yet another trait that nobody is ever going to specialize make me sad.
> > I don't have a better idea though.
>
> Sure, but maybe I can document it ?
>
> I also wonder why you did not propose to make it a constant rather than
> requesting to add the noexcept.
>
> >
> >        {
> >> @@ -1263,6 +1279,14 @@ namespace __detail
> >>         const _Hash_node_value<_Value, __cache_hash_code>& __n) const
> >>     { return _M_hash_code(_ExtractKey{}(__n._M_v())); }
> >>
> >> +      __hash_code
> >> +      _M_hash_code(const _Hash_node_value<_Value, false>& __n) const
> >> +      { return _M_hash_code(_ExtractKey{}(__n._M_v())); }
> >> +
> >> +      __hash_code
> >> +      _M_hash_code(const _Hash_node_value<_Value, true>& __n) const
> >> +      { return __n._M_hash_code; }
> >> +
> >>       std::size_t
> >>       _M_bucket_index(__hash_code __c, std::size_t __bkt_count) const
> >>       { return _RangeHash{}(__c, __bkt_count); }
> >> @@ -1273,17 +1297,14 @@ namespace __detail
> >>     noexcept( noexcept(declval<const _Hash&>()(declval<const _Key&>())=
)
> >>           && noexcept(declval<const _RangeHash&>()((__hash_code)0,
> >>                                (std::size_t)0)) )
> >> -      {
> >> -    return _RangeHash{}(_M_hash_code(_ExtractKey{}(__n._M_v())),
> >> -                __bkt_count);
> >> -      }
> >> +      { return _M_bucket_index(_M_hash_code(__n), __bkt_count); }
> >
> > Why add this extra level of indirection (and overload resolution)?
> >
> > We know this is a _Hash_node_value<V, false>, why call _M_hash_code to
> > decide how to get the hash code for it?
>
> Just to avoid code duplication but indeed it introduces overload
> resolution so I reverted it.
>
> >
> >
> >> diff --git a/libstdc++-v3/testsuite/util/testsuite_performance.h
> >> b/libstdc++-v3/testsuite/util/testsuite_performance.h
> >> index cba3a0d4b17..4ca15ab0e71 100644
> >> --- a/libstdc++-v3/testsuite/util/testsuite_performance.h
> >> +++ b/libstdc++-v3/testsuite/util/testsuite_performance.h
> >> @@ -239,7 +239,7 @@ namespace __gnu_test
> >>     out << std::setw(4) << t.real_time() << "r" << space;
> >>     out << std::setw(4) << t.user_time() << "u" << space;
> >>     out << std::setw(4) << t.system_time() << "s" << space;
> >> -    out << std::setw(8) << r.allocated_memory() << "mem" << space;
> >> +    out << std::setw(9) << r.allocated_memory() << "mem" << space;
> >
> > One day I need to figure out why the reported memory is garbage so
> > often
>
>
> Ok with those changes ?
>
>
Yes, thanks very much (and bonne ann=C3=A9e).