public inbox for libstdc++@gcc.gnu.org
 help / color / mirror / Atom feed
* Small optimization of vector (or other container comparisons)
@ 2020-11-16 19:08 Theodore Papadopoulo
  2020-11-17 13:31 ` Jonathan Wakely
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Papadopoulo @ 2020-11-16 19:08 UTC (permalink / raw)
  To: libstdc++

     Hi,

     Sorry if this is a naive question...

I wonder whether it will be legal and/or interesting to modify vector 
comparison so that it returns early when the vectors have the same address
ie replace

template<typename _Tp, typename _Alloc>
inline bool
     operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp, 
_Alloc>& __y)
     { return (__x.size() == __y.size()
           && std::equal(__x.begin(), __x.end(), __y.begin())); }

by

template<typename _Tp, typename _Alloc>
inline bool
     operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp, 
_Alloc>& __y)
     { return (&__x==&__y) || (__x.size() == __y.size()
           && std::equal(__x.begin(), __x.end(), __y.begin()))); }

Of course, the exact gain depends on the ratio of same vector vs 
different vector comparisons and of the actual size of
the vectors, but it seems to add little extra cost and the address test 
may even also sometimes be removed by the compiler.

Obviously, it is always possible for a user to do it by itself, but 
integrating it is probably easier for the average user.....

     Thanks

         Theo.




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Small optimization of vector (or other container comparisons)
  2020-11-16 19:08 Small optimization of vector (or other container comparisons) Theodore Papadopoulo
@ 2020-11-17 13:31 ` Jonathan Wakely
  2020-11-17 19:33   ` Moritz Klammler
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Wakely @ 2020-11-17 13:31 UTC (permalink / raw)
  To: Theodore Papadopoulo; +Cc: libstdc++

On 16/11/20 20:08 +0100, Theodore Papadopoulo wrote:
>    Hi,
>
>    Sorry if this is a naive question...
>
>I wonder whether it will be legal and/or interesting to modify vector 
>comparison so that it returns early when the vectors have the same 
>address
>ie replace
>
>template<typename _Tp, typename _Alloc>
>inline bool
>    operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp, 
>_Alloc>& __y)
>    { return (__x.size() == __y.size()
>          && std::equal(__x.begin(), __x.end(), __y.begin())); }
>
>by
>
>template<typename _Tp, typename _Alloc>
>inline bool
>    operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp, 
>_Alloc>& __y)
>    { return (&__x==&__y) || (__x.size() == __y.size()
>          && std::equal(__x.begin(), __x.end(), __y.begin()))); }

N.B. this has to be std::addressof(__x) == std::addressof(__y) (but
that's only available for C++11 and later, so it has to be
__builtin_addressof), and should probably give a branch prediction
hint.

    {
       if (__builtin_expect(__builtin_addressof(__x)
                            == __builtin_addressof(__y), false))
         return true;
       return (__x.size() == __y.size()
          && std::equal(__x.begin(), __x.end(), __y.begin())));
     }

Another option would be to change std::equal so that it returns treue
if the itertors to the beginning of the sequences are equal. That
needs to be done carefully though, because comparing iterators to
distinct containers is undefined. We could do it for vector::iterator
and other std::CONTAINER iterators by extracting a pointer and then
comparing pointers.

>Of course, the exact gain depends on the ratio of same vector vs 
>different vector comparisons and of the actual size of
>the vectors, but it seems to add little extra cost and the address 
>test may even also sometimes be removed by the compiler.

Yes, I don't have enough information to judge how often it will
actually be beneficial.

My guess is that for the vast majority of comparisons, the two
operands are different objects. But without realistic benchmarks I
would be concerned about making the change.

>Obviously, it is always possible for a user to do it by itself, but 
>integrating it is probably easier for the average user.....

But if we leave it to users, they can decide to do it when they know
that comparing the vector to itself is a possibility. If we do it in
the library, they can't *not* do it even if they know it's not a
possibility.

You can always use a custom comparison function if you know it's a
possibility and measurement shows it benefits your use case, and then
you can use that for things like std::map<std::vector<...>, ...> or
std::find_if.




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Small optimization of vector (or other container comparisons)
  2020-11-17 13:31 ` Jonathan Wakely
@ 2020-11-17 19:33   ` Moritz Klammler
  2020-11-17 20:07     ` Jonathan Wakely
  0 siblings, 1 reply; 4+ messages in thread
From: Moritz Klammler @ 2020-11-17 19:33 UTC (permalink / raw)
  To: Theodore Papadopoulo; +Cc: libstdc++

On 11/17/20 2:31 PM, Jonathan Wakely via Libstdc++ wrote:
> On 16/11/20 20:08 +0100, Theodore Papadopoulo wrote:
>>     Hi,
>>
>>     Sorry if this is a naive question...
>>
>> I wonder whether it will be legal and/or interesting to modify vector
>> comparison so that it returns early when the vectors have the same
>> address
>> ie replace
>>
>> template<typename _Tp, typename _Alloc>
>> inline bool
>>     operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp,
>> _Alloc>& __y)
>>     { return (__x.size() == __y.size()
>>           && std::equal(__x.begin(), __x.end(), __y.begin())); }
>>
>> by
>>
>> template<typename _Tp, typename _Alloc>
>> inline bool
>>     operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp,
>> _Alloc>& __y)
>>     { return (&__x==&__y) || (__x.size() == __y.size()
>>           && std::equal(__x.begin(), __x.end(), __y.begin()))); }
> 
> N.B. this has to be std::addressof(__x) == std::addressof(__y) (but
> that's only available for C++11 and later, so it has to be
> __builtin_addressof), and should probably give a branch prediction
> hint.
> 
>     {
>       if (__builtin_expect(__builtin_addressof(__x)
>                            == __builtin_addressof(__y), false))
>         return true;
>       return (__x.size() == __y.size()
>           && std::equal(__x.begin(), __x.end(), __y.begin())));
>     }

Maybe this is not a valid example but the current observed behavior is
that the assertion will pass and this change would make it fail, no?

#include <cassert>
#include <cmath>
#include <vector>

int main()
{
    const auto problematic = std::vector{1.0f, NAN, 3.0f};
    assert(problematic != problematic);
}

At least I used to (maybe incorrectly) assume that this behavior can be
relied upon.

Of course, there are types like integers, pointers and certain standard
library types like std::string which we know not to have such
problematic behavior as floating-point types do (and arbitrary
user-defined types might) so the optimization could be enabled selectively.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Small optimization of vector (or other container comparisons)
  2020-11-17 19:33   ` Moritz Klammler
@ 2020-11-17 20:07     ` Jonathan Wakely
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Wakely @ 2020-11-17 20:07 UTC (permalink / raw)
  To: Moritz Klammler; +Cc: Theodore Papadopoulo, libstdc++

On 17/11/20 20:33 +0100, Moritz Klammler wrote:
>On 11/17/20 2:31 PM, Jonathan Wakely via Libstdc++ wrote:
>> On 16/11/20 20:08 +0100, Theodore Papadopoulo wrote:
>>>     Hi,
>>>
>>>     Sorry if this is a naive question...
>>>
>>> I wonder whether it will be legal and/or interesting to modify vector
>>> comparison so that it returns early when the vectors have the same
>>> address
>>> ie replace
>>>
>>> template<typename _Tp, typename _Alloc>
>>> inline bool
>>>     operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp,
>>> _Alloc>& __y)
>>>     { return (__x.size() == __y.size()
>>>           && std::equal(__x.begin(), __x.end(), __y.begin())); }
>>>
>>> by
>>>
>>> template<typename _Tp, typename _Alloc>
>>> inline bool
>>>     operator==(const vector<_Tp, _Alloc>& __x, const vector<_Tp,
>>> _Alloc>& __y)
>>>     { return (&__x==&__y) || (__x.size() == __y.size()
>>>           && std::equal(__x.begin(), __x.end(), __y.begin()))); }
>>
>> N.B. this has to be std::addressof(__x) == std::addressof(__y) (but
>> that's only available for C++11 and later, so it has to be
>> __builtin_addressof), and should probably give a branch prediction
>> hint.
>>
>>     {
>>       if (__builtin_expect(__builtin_addressof(__x)
>>                            == __builtin_addressof(__y), false))
>>         return true;
>>       return (__x.size() == __y.size()
>>           && std::equal(__x.begin(), __x.end(), __y.begin())));
>>     }
>
>Maybe this is not a valid example but the current observed behavior is
>that the assertion will pass and this change would make it fail, no?
>
>#include <cassert>
>#include <cmath>
>#include <vector>
>
>int main()
>{
>    const auto problematic = std::vector{1.0f, NAN, 3.0f};
>    assert(problematic != problematic);
>}
>
>At least I used to (maybe incorrectly) assume that this behavior can be
>relied upon.

Yes, good point.

>Of course, there are types like integers, pointers and certain standard
>library types like std::string which we know not to have such
>problematic behavior as floating-point types do (and arbitrary
>user-defined types might) so the optimization could be enabled selectively.
>


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-11-17 20:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-16 19:08 Small optimization of vector (or other container comparisons) Theodore Papadopoulo
2020-11-17 13:31 ` Jonathan Wakely
2020-11-17 19:33   ` Moritz Klammler
2020-11-17 20:07     ` Jonathan Wakely

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).