From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <frs.dumont@gmail.com>
Received: from mail-wr1-x429.google.com (mail-wr1-x429.google.com
 [IPv6:2a00:1450:4864:20::429])
 by sourceware.org (Postfix) with ESMTPS id C207B388B02A;
 Tue,  9 Jun 2020 20:35:29 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C207B388B02A
Received: by mail-wr1-x429.google.com with SMTP id h5so22805158wrc.7;
 Tue, 09 Jun 2020 13:35:29 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:cc:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding
 :content-language;
 bh=4h7ooGW8Sk5DD7AE0GgJZenH4wOINx/r4fLcTlrynDI=;
 b=PIPt3mAQ4CvO75OcWrL9EuGvzDmXH5Oaj6Eh3SxIxRjsmCdYwmOIV/kbWDP/Bwr3Om
 kW2dDbVsap3LQaLM8tyStMLvi/eK1NC3mMpzEhPkTvrLK1WN+aAKf6vQb0WHH8XSpP45
 WBxJQBIatRvr9z914R5fzWuGwSDi7IjBczwT4cyM3qzSE+2rr1SAaHD/FICcRj5IFagS
 1VCfPffqoOMlsOg1sXjMrTqW2gqewXgQKrhZqFDKeDu+UHjTcx51rRIix5OxQ1Jxf2U+
 b0nuUkQ7NzXzH3nT7iZj0XbEb/OeJ0fx7RCMpod/vZcX7tpWH+BLUsbcbksl8E2IqBX7
 nwpw==
X-Gm-Message-State: AOAM532UdUz1GQBjQPJxArooWine9Hn1kj/hrgHO5+QcN/Q0BUqg5xnd
 J3IG97x2kY0OUTucc4saz/wELN7WSPM=
X-Google-Smtp-Source: ABdhPJzkCGtuCv61mNEnVfJsGDtRWOPYeEXqQpyuIGcpu+Rlv36y0zM+jgeBdfsOL6lDeACwepWbRA==
X-Received: by 2002:adf:f0d2:: with SMTP id x18mr6318514wro.250.1591734928468; 
 Tue, 09 Jun 2020 13:35:28 -0700 (PDT)
Received: from ?IPv6:2a01:e0a:1dc:b1c0:ac43:34f3:4c5b:4208?
 ([2a01:e0a:1dc:b1c0:ac43:34f3:4c5b:4208])
 by smtp.googlemail.com with ESMTPSA id l1sm5172976wrb.31.2020.06.09.13.35.27
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Tue, 09 Jun 2020 13:35:27 -0700 (PDT)
Subject: Re: libstdc++: Extend memcmp optimization in
 std::lexicographical_compare
To: Jonathan Wakely <jwakely@redhat.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>,
 gcc-patches <gcc-patches@gcc.gnu.org>
References: <0cfba2ec-cd66-fe64-a41c-4028403d9dea@gmail.com>
 <20200608182040.GA352899@redhat.com> <20200609102437.GW4137376@redhat.com>
From: =?UTF-8?Q?Fran=c3=a7ois_Dumont?= <frs.dumont@gmail.com>
Message-ID: <d977f60d-c5b3-65c6-299e-5fcd67fc56c2@gmail.com>
Date: Tue, 9 Jun 2020 22:35:26 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.8.0
MIME-Version: 1.0
In-Reply-To: <20200609102437.GW4137376@redhat.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: fr
X-Spam-Status: No, score=-8.8 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM,
 GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
 server2.sourceware.org
X-BeenThere: libstdc++@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libstdc++ mailing list <libstdc++.gcc.gnu.org>
List-Unsubscribe: <http://gcc.gnu.org/mailman/options/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/libstdc++/>
List-Post: <mailto:libstdc++@gcc.gnu.org>
List-Help: <mailto:libstdc++-request@gcc.gnu.org?subject=help>
List-Subscribe: <http://gcc.gnu.org/mailman/listinfo/libstdc++>,
 <mailto:libstdc++-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Jun 2020 20:35:31 -0000

On 09/06/20 12:24 pm, Jonathan Wakely wrote:
> On 08/06/20 19:20 +0100, Jonathan Wakely wrote:
>> On 05/06/20 22:24 +0200, FranÃ§ois Dumont via Libstdc++ wrote:
>>> Hi
>>>
>>> Â Â Â  Here is the last of my algo patches this time to extend the 
>>> memcmp optimization to std::deque iterators and _GLIBCXX_DEBUG mode.
>>>
>>> Â Â Â  To do so I had to return int in implementation details of 
>>> lexicographical_compare to make sure we can treat a deque iterator 
>>> range by chunk in a performant way.
>>>
>>> Tested under Linux x86_64 normal and debug modes.
>>>
>>> Â Â Â Â Â Â Â Â Â Â Â  * include/bits/stl_algobase.h
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_impl): Return int.
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare::__lc): Likewise.
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux1(_II1, _II1, 
>>> _II2, _II2)): New.
>>> (__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,
>>> Â Â Â Â Â Â Â Â Â Â Â  _II2, _II2)): Declare.
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux1(_II1, _II1,
>>> Â Â Â Â Â Â Â Â Â Â Â  _Deque_iterator<>, _Deque_iterator<>)): Declare.
>>> (__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,
>>> Â Â Â Â Â Â Â Â Â Â Â  _Deque_iterator<>, _Deque_iterator<>)): Declare.
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux): Adapt, call 
>>> later.
>>
>> Is this meant to say "latter"? That's still not correct grammar
>> though. I think it would be better to name the function it calls
>> explicitly: "Call __lexicographical_compare_aux1."
>>
>>> Â Â Â Â Â Â Â Â Â Â Â 
>>> (__lexicographical_compare_aux(_Safe_iterator<>, _Safe_iterator<>,
>>> Â Â Â Â Â Â Â Â Â Â Â  _II2, _II2)): Declare.
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux(_II1, _II1,
>>> Â Â Â Â Â Â Â Â Â Â Â  _Safe_iterator<>, _Safe_iterator<>)): Declare.
>>> Â Â Â Â Â Â Â Â Â Â Â 
>>> (__lexicographical_compare_aux(_Safe_iterator<>, _Safe_iterator<>,
>>> Â Â Â Â Â Â Â Â Â Â Â  _Safe_iterator<>, _Safe_iterator<>)): Declare.
>>> Â Â Â Â Â Â Â Â Â Â Â  (std::lexicographical_compare): Adapt, call 
>>> later without __niter_base
>>> Â Â Â Â Â Â Â Â Â Â Â  usage.
>>> Â Â Â Â Â Â Â Â Â Â Â  * include/bits/deque.tcc (__lex_cmp_dit): New.
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux1): New.
>>> Â Â Â Â Â Â Â Â Â Â Â  * include/debug/safe_iterator.tcc
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux(const 
>>> _Safe_iterator<>&,
>>> Â Â Â Â Â Â Â Â Â Â Â  const _Safe_iterator<>&, _II2, _II2)): New.
>>> Â Â Â Â Â Â Â Â Â Â Â  (__lexicographical_compare_aux(
>>> Â Â Â Â Â Â Â Â Â Â Â  const _Safe_iterator<>&, const_Safe_iterator<>&,
>>> Â Â Â Â Â Â Â Â Â Â Â  const _Safe_iterator<>&, const 
>>> _Safe_iterator<>&)): New.
>>> Â Â Â Â Â Â Â Â Â Â Â  * 
>>> testsuite/25_algorithms/lexicographical_compare/1.cc (test6, test7):
>>> Â Â Â Â Â Â Â Â Â Â Â  New.
>>> Â Â Â Â Â Â Â Â Â Â Â  * 
>>> testsuite/25_algorithms/lexicographical_compare/deque_iterators/1.cc:
>>> Â Â Â Â Â Â Â Â Â Â Â  New test.
>>>
>>> Ok to commit ?
>>>
>>> FranÃ§ois
>>>
>>>
>>
>>> diff --git a/libstdc++-v3/include/bits/deque.tcc 
>>> b/libstdc++-v3/include/bits/deque.tcc
>>> index 1d32a1691c7..d7dbe64f3e1 100644
>>> --- a/libstdc++-v3/include/bits/deque.tcc
>>> +++ b/libstdc++-v3/include/bits/deque.tcc
>>> @@ -1261,6 +1261,98 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>>>      return true;
>>>    }
>>>
>>> +  template<typename _Tp, typename _Ref, typename _Ptr, typename _II>
>>
>> _II is the wrong name here, it mean InputIterator. All callers of this
>> function are constrained for random access iterators. This cannot be
>> called with an input iterator. Please use _RAIter.
>>
>>> +    int
>>> +    __lex_cmp_dit(
>>> +    const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __first1,
>>> +    const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __last1,
>>> +    _II __first2, _II __last2)
>>> +    {
>>> +      typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr> _Iter;
>>> +      typedef typename _Iter::difference_type difference_type;
>>> +
>>> +      if (__first1._M_node != __last1._M_node)
>>> +    {
>>> +      difference_type __len = __last2 - __first2;
>>> +      difference_type __flen
>>
>> What does "flen" mean? Why not use len2 for last2 - first2, as we do
>> elsewhere? And then use len for min(len1, len2)?
>>
>>
>>> +        = std::min(__len, __first1._M_last - __first1._M_cur);
>>> +      if (int __ret = std::__lexicographical_compare_aux1(
>>
>> This call (and the three later in this function) will do overload
>> resolution again on the full set of __lexicographical_compare_aux1
>> overloads, which includes the ones for deque iterators. But we know
>> that __first1._M_cur and __first1._M_last are not deque iterators.
>>
>> __first2 could be a deque iterator, but I'm not sure if we really want
>> one function that has to handle that case anyway.
>>
>>> +          __first1._M_cur, __first1._M_last, __first2, __first2 + 
>>> __flen))
>>> +        return __ret;
>>> +
>>> +      __first2 += __flen;
>>> +      __len -= __flen;
>>> +      __flen = std::min<size_t>(__len, _Iter::_S_buffer_size());
>>> +      for (typename _Iter::_Map_pointer __node = __first1._M_node + 1;
>>> +           __node != __last1._M_node;
>>> +           __first2 += __flen, __len -= __flen,
>>> +           __flen = std::min<size_t>(__len, _Iter::_S_buffer_size()),
>>> +           ++__node)
>>> +        if (int __ret = std::__lexicographical_compare_aux1(
>>> +          *__node, *__node + _Iter::_S_buffer_size(),
>>> +          __first2, __first2 + __flen))
>>> +          return __ret;
>>> +
>>> +      return std::__lexicographical_compare_aux1(
>>> +        __last1._M_first, __last1._M_cur, __first2, __last2);
>>> +    }
>>> +
>>> +      return std::__lexicographical_compare_aux1(
>>> +          __first1._M_cur, __last1._M_cur, __first2, __last2);
>>> +    }
>>> +
>>> +  template<typename _Tp1, typename _Ref1, typename _Ptr1,
>>> +       typename _II2>
>>
>> This is not an input iterator either.
>>
>>> +    typename __gnu_cxx::__enable_if<
>>> +      __is_random_access_iter<_II2>::__value, int>::__type
>>
>> This should be inline.
>>
>> Also, I'm curious why these overloads use __enable_if rather than the
>> conventional approach of tag dispatching using iterator_category.
>> (The same question applies to th __equal_aux1 and __copy_move_a1
>> overloads for deque iterators as well). It doesn't really matter
>> though.
>>
>>> +    __lexicographical_compare_aux1(
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __first1,
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __last1,
>>> +        _II2 __first2, _II2 __last2)
>>> +    { return std::__lex_cmp_dit(__first1, __last1, __first2, 
>>> __last2); }
>>> +
>>> +  template<typename _Tp1, typename _Ref1, typename _Ptr1,
>>> +       typename _Tp2, typename _Ref2, typename _Ptr2>
>>> +    int
>>
>> This should be inline.
>>
>>> +    __lexicographical_compare_aux1(
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __first1,
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp1, _Ref1, _Ptr1> __last1,
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __first2,
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __last2)
>>> +    { return std::__lex_cmp_dit(__first1, __last1, __first2, 
>>> __last2); }
>>
>> I'm not convinced it's useful for this function and the one above it
>> to share the same logic.
>>
>>> +  template<typename _II1,
>>> +       typename _Tp2, typename _Ref2, typename _Ptr2>
>>> +    typename __gnu_cxx::__enable_if<
>>> +      __is_random_access_iter<_II1>::__value, int>::__type
>>> +    __lexicographical_compare_aux1(
>>> +        _II1 __first1, _II1 __last1,
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __first2,
>>> +        _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> __last2)
>>> +    {
>>> +      typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp2, _Ref2, _Ptr2> 
>>> _Iter;
>>> +      typedef typename _Iter::difference_type difference_type;
>>> +
>>> +      difference_type __len = __last1 - __first1;
>>> +      while (__len > 0)
>>> +    {
>>> +      const difference_type __flen = __first2._M_node == 
>>> __last2._M_node
>>> +        ? __last2._M_cur - __first2._M_cur
>>> +        : __first2._M_last - __first2._M_cur;
>>> +      const difference_type __clen = std::min(__len, __flen);
>>
>> "flen"? "clen"?
>>
>>> +      if (int __ret = std::__lexicographical_compare_aux1(
>>> +                __first1, __first1 + __clen,
>>> +                __first2._M_cur, __first2._M_cur + __flen))
>>> +        return __ret;
>>> +
>>> +      __first1 += __clen;
>>> +      __len -= __clen;
>>> +      __first2 += __clen;
>>> +    }
>>> +
>>> +      return __first1 == __last1 ? (__first2 != __last2 ? -1 : 0) : 1;
>>
>> Isn't this function doing basically the same as __lex_cmp_dit? Why
>> does it not use the same logic?
>>
>> Can this whole function just negate the result of __lex_cmp_dit called
>> with the arguments switched?
>>
>>  return -std::__lex_cmp_dit(__first2, __last2, __first1, __last1);
>>
>> That would also fix the bug that makes the following loop forever:
>>
>>  std::deque<int> d;
>>  int i = 0;
>>  std::lexicographical_compare(&i, &i + 1, d.begin(), d.end());
>
> Same question for __equal_aux1, why is it implemented twice? Why
> doesn't __equal_aux1(_II f1, _II l1, _Deque_iterator<> f2) just call
> __equal_aux1(f2, f2 + (l1 - f1), f1) ?
>
Because I didn't thought about doing it this way. Yet another patch if 
you don't do it yourself.