public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/90436] Redundant size checking in vector
       [not found] <bug-90436-4@http.gcc.gnu.org/bugzilla/>
@ 2020-06-19 22:35 ` glisse at gcc dot gnu.org
  2020-06-21 17:13 ` glisse at gcc dot gnu.org
  2020-06-21 18:03 ` glisse at gcc dot gnu.org
  2 siblings, 0 replies; 3+ messages in thread
From: glisse at gcc dot gnu.org @ 2020-06-19 22:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90436

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
(writing down some notes)

Calling

      size_type
      _M_check_len_one(const char* __s) const
      {
        if (max_size() - size() < 1)
          __throw_length_error(__N(__s));

        const size_type __len = size() + (std::max)(size(), (size_t)1);
        return (__len > max_size()) ? max_size() : __len;
      }

instead of _M_check_len reduces the running time of this micro-benchmark

#include <vector>
int main(){
  volatile int a=0;
  for(int i=0;i<1000000;++i){
    std::vector<int> v;
    for(int j=0;j<1000;++j){
      v.push_back(j);
    }
    a=v[a];
  }
}

from .88s to .66s at -O3. Two key elements (the perf gain only comes if we do
both) are removing the overflow check, and having the comparison between size
and max_size optimized to be done on byte length (not divided by the element
size).

I think the overflow check could be removed from the normal _M_check_len: we
have already checked that max_size() - size() >= __n so size() + __n cannot
overflow, and size() must be smaller than max_size(), which should be at most
SIZE_MAX/2, at least if ptrdiff_t and size_t have the same size, so size() +
size() cannot overflow either.
I should check if the compiler could help more. It is supposed to know how to
optimize .ADD_OVERFLOW based on the range of the operands.

I suspect that a single_use restriction explains why max_size() == size()
compares values without division while max_size() - size() < __n (for __n = 1)
doesn't.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug libstdc++/90436] Redundant size checking in vector
       [not found] <bug-90436-4@http.gcc.gnu.org/bugzilla/>
  2020-06-19 22:35 ` [Bug libstdc++/90436] Redundant size checking in vector glisse at gcc dot gnu.org
@ 2020-06-21 17:13 ` glisse at gcc dot gnu.org
  2020-06-21 18:03 ` glisse at gcc dot gnu.org
  2 siblings, 0 replies; 3+ messages in thread
From: glisse at gcc dot gnu.org @ 2020-06-21 17:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90436

--- Comment #3 from Marc Glisse <glisse at gcc dot gnu.org> ---
      // possibly assumes that ptrdiff_t and size_t have the same size
      size_type
      _M_check_len_one(const char* __str) const
      {
        ptrdiff_t __n = sizeof(_Tp);
        ptrdiff_t __ms = max_size(); __ms *= sizeof(_Tp);
        ptrdiff_t __s = size(); __s *= sizeof(_Tp);
        if (__s > (__ms - __n))
          __throw_length_error(__N(__str));

        const ptrdiff_t __len = __s + (std::max)(__s, __n);
        if (__len <= 0) __builtin_unreachable();
        ptrdiff_t __ret = (std::min)(__len, __ms);
        return (_Tp*)__ret-(_Tp*)0; // hack to generate divexact, so it
simplifies with * sizeof(_Tp)
      }

generates nicer code. But after those experiments, it seems clear that the
performance of this code is irrelevant (not surprising since it is followed by
a call to operator new), and its effect on global performance is random.
Possibly it causes something to get aligned differently, which can randomly get
this 25% speed-up, but can just as randomly go back to the slow version.
Anyway, I don't think I'll be submitting any patch for this.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug libstdc++/90436] Redundant size checking in vector
       [not found] <bug-90436-4@http.gcc.gnu.org/bugzilla/>
  2020-06-19 22:35 ` [Bug libstdc++/90436] Redundant size checking in vector glisse at gcc dot gnu.org
  2020-06-21 17:13 ` glisse at gcc dot gnu.org
@ 2020-06-21 18:03 ` glisse at gcc dot gnu.org
  2 siblings, 0 replies; 3+ messages in thread
From: glisse at gcc dot gnu.org @ 2020-06-21 18:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90436

--- Comment #4 from Marc Glisse <glisse at gcc dot gnu.org> ---
(side note not related to the redundant size checking)
It is surprising how, in the code from comment 2, adding v.reserve(1000) does
not help, it even slows the program down slightly here (yes, that's rather hard
to believe). To reap the benefits, I also need to add in the loop:
if(v.size()==v.capacity())__builtin_abort();
which enables the compiler to remove the reallocation code, and once that code
is removed it can actually prove that size never reaches capacity and remove
the call to abort! We don't even need __builtin_unreachable there. And once all
that dead code is removed, it can finally vectorize.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-06-21 18:03 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-90436-4@http.gcc.gnu.org/bugzilla/>
2020-06-19 22:35 ` [Bug libstdc++/90436] Redundant size checking in vector glisse at gcc dot gnu.org
2020-06-21 17:13 ` glisse at gcc dot gnu.org
2020-06-21 18:03 ` glisse at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).