public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jonathan Wakely <jwakely@redhat.com>
To: Marc Glisse <marc.glisse@inria.fr>
Cc: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: [libstdc++/61347] std::distance(list.first(),list.end()) in O(1)
Date: Mon, 13 Apr 2015 13:40:00 -0000	[thread overview]
Message-ID: <20150413134025.GI9755@redhat.com> (raw)
In-Reply-To: <alpine.DEB.2.11.1504131033550.13447@stedding.saclay.inria.fr>

On 13/04/15 13:42 +0200, Marc Glisse wrote:
>this patch makes std::distance(list.first(),list.end()) a constant 
>time operation when optimizing, with no penalty for other calls. We 
>could do the test always (no __builtin_constant_p) but then it will 
>add a small runtime penalty for other calls, someone else can take 
>responsibility for that.

I like the way you've done it. No penalty for other calls is great
and IMHO it's OK that the optimisation only happens when optimising.

(Does it work even at -Og?)

>I could protect the whole overload with 
>#ifdef __OPTIMIZE__ (at -O0 the compiler does not remove the test 
>++end==first as dead code), but I assume it is better to minimize 
>differences.

Agreed.

>There are other ways to specialize distance, but 
>overloading __distance seems to have the least drawbacks (most others 
>involve a new extra file). From an optimization POV, it would be a bit 
>better to avoid the while loop and instead call a separate function 
>that does it (say the regular __distance), it would make the function 
>smaller and thus easier to inline, but it is simpler this way and 
>works.

Is there any (dis)advantage to making one call the other, to avoid
duplicating the function bodies?

  template<typename _Tp>
    inline ptrdiff_t
    __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
               input_iterator_tag __tag)
    {
      typedef _GLIBCXX_STD_C::_List_const_iterator<_Tp> _CIter;
      return std::__distance(_CIter(__first), _CIter(__last), __tag);
    }

>We could do something similar for std::set, but C++ will not let us 
>find the address of _Rb_tree_impl::_M_node_count from that of 
>_Rb_tree_impl::_M_header, except when _Key_compare is pod, which 
>luckily is an easily available information. Avoiding this complication 
>is why I insisted on wrapping the size of std::list in a 
>_List_node<size_t> for gcc-5, which may look a bit strange at first 
>sight.

Sadly, that node is going to look even stranger when I finish adding
support for C++11 allocators, as the type of node becomes dependent on
the allocator's pointer, which makes _List_node<size_t> much more
complicated :-(

I'll have to remember to add additional __distance overloads to handle
the new _List_ptr_iterator and _List_ptr_const_iterator types that
will be used for fancy pointers (although if I forget the optimisation
will still work for std::list<T, std::allocator<T>>, just not for the
vanishingly small number of people using allocators with fancy
pointers).

>Index: include/bits/stl_iterator_base_funcs.h
>===================================================================
>--- include/bits/stl_iterator_base_funcs.h	(revision 222041)
>+++ include/bits/stl_iterator_base_funcs.h	(working copy)
>@@ -107,22 +107,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>    *  n may be negative.
>    *
>    *  For random access iterators, this uses their @c + and @c - operations
>    *  and are constant time.  For other %iterator classes they are linear time.
>   */
>   template<typename _InputIterator>
>     inline typename iterator_traits<_InputIterator>::difference_type
>     distance(_InputIterator __first, _InputIterator __last)
>     {
>       // concept requirements -- taken care of in __distance
>-      return std::__distance(__first, __last,
>-			     std::__iterator_category(__first));
>+      return __distance(__first, __last, std::__iterator_category(__first));

This should still be a qualified call to std::__distance, otherwise the
compiler might end up instantiating things to figure out if there are
any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we
don't want to know T's base classes and rheir associated namespaces.

>+  // Detect when distance is used to compute the size of the whole list.
>+  template<typename _Tp>
>+    inline ptrdiff_t
>+    __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first,
>+	       _GLIBCXX_STD_C::_List_iterator<_Tp> __last,
>+               input_iterator_tag)
>+    {
>+      typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel;
>+      _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last;
>+      ++__beyond;
>+      bool __whole = __first == __beyond;
>+      if (__builtin_constant_p (__whole) && __whole)

This is clever :-)

This shouldn't interfere with any changes we might need to test before
backporting to the gcc-5-branch, so with the std:: qualification
restored on the call to __distance it's OK for trunk.

(I'll trust your judgment/masurements for whether it's worth making
the _List_iterator overload call the _List_const_iterator one.)

  reply	other threads:[~2015-04-13 13:40 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-13 11:42 Marc Glisse
2015-04-13 13:40 ` Jonathan Wakely [this message]
2015-04-13 14:15   ` Marc Glisse
2015-04-13 15:11     ` Jonathan Wakely
2015-04-14  8:24       ` Marc Glisse
2015-04-14  9:32         ` Jonathan Wakely

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150413134025.GI9755@redhat.com \
    --to=jwakely@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    --cc=marc.glisse@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).