From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-435829-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 95902 invoked by alias); 14 Sep 2016 09:00:50 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 94871 invoked by uid 89); 14 Sep 2016 09:00:49 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=pity, offer
X-Spam-User: qpsmtpd, 2 recipients
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 14 Sep 2016 09:00:39 +0000
Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))	(No client certificate requested)	by mx1.redhat.com (Postfix) with ESMTPS id 9D9C813D06;	Wed, 14 Sep 2016 09:00:37 +0000 (UTC)
Received: from localhost (ovpn-116-87.ams2.redhat.com [10.36.116.87])	by int-mx13.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u8E90aKm021931;	Wed, 14 Sep 2016 05:00:36 -0400
Date: Wed, 14 Sep 2016 09:01:00 -0000
From: Jonathan Wakely <jwakely@redhat.com>
To: =?iso-8859-1?Q?Fran=E7ois?= Dumont <frs.dumont@gmail.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>,        gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: debug container mutex association
Message-ID: <20160914090036.GC17376@redhat.com>
References: <46b4f297-c0da-b0b0-03ef-bba019b97ec7@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1; format=flowed
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <46b4f297-c0da-b0b0-03ef-bba019b97ec7@gmail.com>
X-Clacks-Overhead: GNU Terry Pratchett
User-Agent: Mutt/1.7.0 (2016-08-17)
X-SW-Source: 2016-09/txt/msg00792.txt.bz2

On 13/09/16 22:37 +0200, François Dumont wrote:
>Hi
>
>    When I proposed to change std::hash for pointers my plan was to 
>use this approach for the debug containers. So here is the patch 
>leveraging on this technique to avoid going through _Hash_impl to hash 
>address and get mutex index from it. I think it is obious that new 
>access is faster so I didn't wrote a performance test to demonstrate 
>it.

Is this actually a bottleneck that needs to be made faster?

I know it's nice if Debug Mode isn't very slow, but you've already
made it much better, and performance is not the primary goal of Debug
Mode.

>diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
>index 9b5bb23..c9a253e 100644
>--- a/libstdc++-v3/config/abi/pre/gnu.ver
>+++ b/libstdc++-v3/config/abi/pre/gnu.ver
>@@ -1947,6 +1947,21 @@ GLIBCXX_3.4.23 {
>     _ZNSsC[12]ERKSs[jmy]RKSaIcE;
>     _ZNSbIwSt11char_traitsIwESaIwEEC[12]ERKS2_mRKS1_;
> 
>+    # __gnu_debug::_Safe_sequence_base
>+    _ZN11__gnu_debug19_Safe_sequence_base18_M_detach_singularEm;

The 'm' here should be [jmy] because std::size_t mangles differently
on different targets.

>+    _ZN11__gnu_debug19_Safe_sequence_base22_M_revalidate_singularEm;
>+    _ZN11__gnu_debug19_Safe_sequence_base12_M_get_mutexEm;
>+    _ZN11__gnu_debug19_Safe_sequence_base7_M_swapERS0_m;
>+
>+    # __gnu_debug::_Safe_iterator_base
>+    _ZN11__gnu_debug19_Safe_iterator_base9_M_attachEPNS_19_Safe_sequence_baseEbm;
>+    _ZN11__gnu_debug19_Safe_iterator_base9_M_detachEm;
>+
>+    # __gnu_debug::_Safe_unordered_container_base and _Safe_local_iterator_base
>+    _ZN11__gnu_debug30_Safe_unordered_container_base7_M_swapERS0_m;
>+    _ZN11__gnu_debug25_Safe_local_iterator_base9_M_attachEPNS_19_Safe_sequence_baseEbm;
>+    _ZN11__gnu_debug25_Safe_local_iterator_base9_M_detachEm;
>+
> } GLIBCXX_3.4.22;

It's unfortunate to have to export and maintain new symbols when we
don't offer the same level of ABI guarantee for Debug Mode  :-(

>@@ -174,6 +191,18 @@ namespace __gnu_debug
> 		    const _Safe_iterator_base* __last)
>   { return __first->_M_can_compare(*__last); }
> 
>+  // Compute power of 2 not greater than __n.
>+  template<size_t __n>
>+    struct _Lowest_power_of_two
>+    {
>+      static const size_t __val
>+        = _Lowest_power_of_two< (__n >> 1) >::__val + 1;
>+    };
>+
>+  template<>
>+    struct _Lowest_power_of_two<1>
>+    { static const size_t __val = 1; };

The lowest power of two not greater than 1 is 2^0, but this gives the
result 1.

Similarly, the lowest power of two not greater than 2 is 2^1, i.e. 1.

And the name is misleading, it doesn't compute a power of two, it
computes the base-2 logarithm (then adds one), so it's a compile-time
version of floor(log2(n))+1.

>@@ -123,6 +125,36 @@ namespace __gnu_debug
>       template<typename _Predicate>
> 	void
> 	_M_transfer_from_if(_Safe_sequence& __from, _Predicate __pred);
>+
>+      static _GLIBCXX_CONSTEXPR std::size_t
>+      _S_alignment() _GLIBCXX_NOEXCEPT
>+      { return _Lowest_power_of_two<__alignof(_Sequence)>::__val; }

This function is called "alignment" but it returns log2(alignof(T))+1

Does it need to be constexpr? Is that for std::array?

You can get the same result without _Lowest_power_of_two:

  static _GLIBCXX_CONSTEXPR size_t
  _S_alignbits() _GLIBCXX_NOEXCEPT
  { return __builtin_ctz(__alignof(_Sequence)) + 1; }

But I'm not convinced the +1 is correct, see below.

>@@ -46,15 +47,30 @@ namespace
>   /** Returns different instances of __mutex depending on the passed address
>    *  in order to limit contention without breaking current library binary
>    *  compatibility. */
>+  const size_t mask = 0xf;
>+
>   __gnu_cxx::__mutex&
>-  get_safe_base_mutex(void* __address)
>+  get_mutex(size_t index)
>   {
>-    const size_t mask = 0xf;
>     static __gnu_cxx::__mutex safe_base_mutex[mask + 1];
>-    const size_t index = _Hash_impl::hash(__address) & mask;
>     return safe_base_mutex[index];
>   }

N.B. https://gcc.gnu.org/PR71312

> 
>+  __gnu_cxx::__mutex&
>+  get_safe_base_mutex(void* address)
>+  {
>+    const size_t index = _Hash_impl::hash(address) & mask;
>+    return get_mutex(index);
>+  }
>+
>+  __gnu_cxx::__mutex&
>+  get_safe_base_mutex(void* address, std::size_t alignment)
>+  {
>+    const size_t index
>+      = (reinterpret_cast<size_t>(address) >> alignment) & mask;
>+    return get_mutex(index);
>+  }

Since the "alignment" value (which is really log2(align)+1) is always
a positive value that means we always shift the address. But that
means we drop the least significant bit of all pointers, even when the
type has alignof(T)==1 and so all bits in the pointer are significant.
Is that intentional? Is the idea to only drop bits that are always
zero?

By changing the algorithm we use to map an address to a mutex do we
allow programs to get two different mutexes for the same object? That
can only happen when some objects are compiled with an older GCC and
so still use the old _M_get_mutex() function that doesn't take an
argument. We require all objects to be rebuilt with the same compiler
for Debug Mode (but maybe not if using __gnu_debug::vector etc.
explicitly in normal mode?) so that's not a big deal. But it's a pity
we have to keep the old symbols around if using them is unsafe.

>   void
>   swap_its(__gnu_debug::_Safe_sequence_base& __lhs,
> 	   __gnu_debug::_Safe_iterator_base*& __lhs_its,
>@@ -70,8 +86,8 @@ namespace
>   }
> 
>   void
>-  swap_seq(__gnu_debug::_Safe_sequence_base& __lhs,
>-	   __gnu_debug::_Safe_sequence_base& __rhs)
>+  swap_seq_single(__gnu_debug::_Safe_sequence_base& __lhs,
>+		  __gnu_debug::_Safe_sequence_base& __rhs)
>   {
>     swap(__lhs._M_version, __rhs._M_version);
>     swap_its(__lhs, __lhs._M_iterators,
>@@ -81,10 +97,41 @@ namespace
>   }
> 
>   void
>-  swap_ucont(__gnu_debug::_Safe_unordered_container_base& __lhs,
>-	    __gnu_debug::_Safe_unordered_container_base& __rhs)
>+  lock_and_run(__gnu_cxx::__mutex& lhs_mutex,
>+	       __gnu_cxx::__mutex& rhs_mutex,
>+	       std::function<void()> action)

Do all calls to this fit into the std::function small-object buffer,
so we don't need to allocate memory for it?