public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Marc Glisse <marc.glisse@inria.fr>
To: Jonathan Wakely <jwakely@redhat.com>
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [libstdc++/61347] std::distance(list.first(),list.end()) in O(1)
Date: Mon, 13 Apr 2015 14:15:00 -0000	[thread overview]
Message-ID: <alpine.DEB.2.11.1504131549210.13447@stedding.saclay.inria.fr> (raw)
In-Reply-To: <20150413134025.GI9755@redhat.com>

On Mon, 13 Apr 2015, Jonathan Wakely wrote:

> On 13/04/15 13:42 +0200, Marc Glisse wrote:
>> this patch makes std::distance(list.first(),list.end()) a constant time 
>> operation when optimizing, with no penalty for other calls. We could do the 
>> test always (no __builtin_constant_p) but then it will add a small runtime 
>> penalty for other calls, someone else can take responsibility for that.
>
> I like the way you've done it. No penalty for other calls is great
> and IMHO it's OK that the optimisation only happens when optimising.
>
> (Does it work even at -Og?)

No, the testcase currently passes with -O1 but not -Og.

>> I could protect the whole overload with #ifdef __OPTIMIZE__ (at -O0 the 
>> compiler does not remove the test ++end==first as dead code), but I assume 
>> it is better to minimize differences.
>
> Agreed.
>
>> There are other ways to specialize distance, but overloading __distance 
>> seems to have the least drawbacks (most others involve a new extra file). 
>> From an optimization POV, it would be a bit better to avoid the while loop 
>> and instead call a separate function that does it (say the regular 
>> __distance), it would make the function smaller and thus easier to inline, 
>> but it is simpler this way and works.
>
> Is there any (dis)advantage to making one call the other, to avoid
> duplicating the function bodies?
>
> template<typename _Tp>
>   inline ptrdiff_t
>   __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
> 	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
>              input_iterator_tag __tag)
>   {
>     typedef _GLIBCXX_STD_C::_List_const_iterator<_Tp> _CIter;
>     return std::__distance(_CIter(__first), _CIter(__last), __tag);
>   }

I don't see any disadvantage, I'll do that.

>> We could do something similar for std::set, but C++ will not let us find 
>> the address of _Rb_tree_impl::_M_node_count from that of 
>> _Rb_tree_impl::_M_header, except when _Key_compare is pod, which luckily is 
>> an easily available information. Avoiding this complication is why I 
>> insisted on wrapping the size of std::list in a _List_node<size_t> for 
>> gcc-5, which may look a bit strange at first sight.
>
> Sadly, that node is going to look even stranger when I finish adding
> support for C++11 allocators, as the type of node becomes dependent on
> the allocator's pointer, which makes _List_node<size_t> much more
> complicated :-(

But then I assume _List_node_base is the part really getting more 
complicated, so without looking too closely it seems almost orthogonal.

> I'll have to remember to add additional __distance overloads to handle
> the new _List_ptr_iterator and _List_ptr_const_iterator types that
> will be used for fancy pointers (although if I forget the optimisation
> will still work for std::list<T, std::allocator<T>>, just not for the
> vanishingly small number of people using allocators with fancy
> pointers).
>
>> Index: include/bits/stl_iterator_base_funcs.h
>> ===================================================================
>> --- include/bits/stl_iterator_base_funcs.h	(revision 222041)
>> +++ include/bits/stl_iterator_base_funcs.h	(working copy)
>> @@ -107,22 +107,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>    *  n may be negative.
>>    *
>>    *  For random access iterators, this uses their @c + and @c - operations
>>    *  and are constant time.  For other %iterator classes they are linear 
>> time.
>>   */
>>   template<typename _InputIterator>
>>     inline typename iterator_traits<_InputIterator>::difference_type
>>     distance(_InputIterator __first, _InputIterator __last)
>>     {
>>       // concept requirements -- taken care of in __distance
>> -      return std::__distance(__first, __last,
>> -			     std::__iterator_category(__first));
>> +      return __distance(__first, __last, 
>> std::__iterator_category(__first));
>
> This should still be a qualified call to std::__distance, otherwise the
> compiler might end up instantiating things to figure out if there are
> any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we
> don't want to know T's base classes and rheir associated namespaces.

Then the approach will not work. C++ overload resolution will not consider 
the later __distance declarations in stl_list.h if the call is qualified 
(I didn't change it gratuitously). A forward declaration of list iterators 
and those __distance overloads in bits/stl_iterator_base_funcs.h is not 
very appealing but may work (or it may cause issues, I don't know). 
Otherwise, I guess we are back to creating a new file 
bits/list_iterator.h, which <list> includes if <iterator> has already been 
included and vice versa, and which provides overloads for distance 
directly.

>> +  // Detect when distance is used to compute the size of the whole list.
>> +  template<typename _Tp>
>> +    inline ptrdiff_t
>> +    __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
>> +	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
>> +               input_iterator_tag)
>> +    {
>> +      typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel;
>> +      _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last;
>> +      ++__beyond;
>> +      bool __whole = __first == __beyond;
>> +      if (__builtin_constant_p (__whole) && __whole)
>
> This is clever :-)
>
> This shouldn't interfere with any changes we might need to test before
> backporting to the gcc-5-branch, so with the std:: qualification
> restored on the call to __distance it's OK for trunk.
>
> (I'll trust your judgment/masurements for whether it's worth making
> the _List_iterator overload call the _List_const_iterator one.)

-- 
Marc Glisse

  reply	other threads:[~2015-04-13 14:15 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-13 11:42 Marc Glisse
2015-04-13 13:40 ` Jonathan Wakely
2015-04-13 14:15   ` Marc Glisse [this message]
2015-04-13 15:11     ` Jonathan Wakely
2015-04-14  8:24       ` Marc Glisse
2015-04-14  9:32         ` Jonathan Wakely

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.11.1504131549210.13447@stedding.saclay.inria.fr \
    --to=marc.glisse@inria.fr \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jwakely@redhat.com \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).