public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/109022] New: Low performance of std::map constructor for already sorted data of std::pair<none-const key, value>
@ 2023-03-04 13:28 marekr22 at wp dot pl
  2023-03-06 17:38 ` [Bug libstdc++/109022] " redi at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: marekr22 at wp dot pl @ 2023-03-04 13:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109022

            Bug ID: 109022
           Summary: Low performance of std::map constructor for already
                    sorted data of std::pair<none-const key, value>
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marekr22 at wp dot pl
  Target Milestone: ---

Performance of constructor accepting iterators to sorted data of
std::pair<none-const key, value> is much lower then inserting same data for
std::pair<const key, value> (matches value_type of std::map).

Here is benchmark showing the issue:
```
#include <sstream>
#include <map>
#include <iomanip>
#include <algorithm>
#include <random>

constexpr size_t DataSizeStart = 8 << 0;
constexpr size_t DataSizeStop = 8 << 3;

using TestData = std::vector<std::pair<std::string, int>>;
using TestDataConst = std::vector<std::pair<const std::string, int>>;

std::string makeStringFor(size_t x)
{
  std::ostringstream out;
  out << std::setfill('0') << std::setw(6) << x;
  return out.str();
}

TestData sortedData(size_t n)
{
  TestData r;
  r.reserve(n);
  size_t i = 0;
  std::generate_n(std::back_inserter(r), n, [&i]() {
    return std::pair{makeStringFor(++i), i};
  });
  return r;
}

TestData shuffledData(size_t n)
{
  auto data = sortedData(n);
  std::random_device rd;
  std::mt19937 g(rd());

  std::shuffle(data.begin(), data.end(), g);
  return data;
}

TestDataConst sortedConstData(size_t n)
{
  auto r = sortedData(n);
  return {r.begin(), r.end()};
}

TestDataConst shuffledConstData(size_t n)
{
  auto r = shuffledData(n);
  return {r.begin(), r.end()};
}

template<auto Data>
static void CreateMapInsert(benchmark::State& state) {
  auto n = state.range(0);
  auto data = Data(n);
  for (auto _ : state) {
    benchmark::DoNotOptimize(data);
    std::map<std::string, int> m;

    for (auto& p : data) {
      m.insert(m.end(), p);
    }
    benchmark::DoNotOptimize(m);
  }
}
BENCHMARK(CreateMapInsert<sortedData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);
//
BENCHMARK(CreateMapInsert<shuffledData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);
BENCHMARK(CreateMapInsert<sortedConstData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);
//
BENCHMARK(CreateMapInsert<shuffledConstData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);

template<auto Data>
static void CreateMapDirectly(benchmark::State& state) {
  auto n = state.range(0);
  auto data = Data(n);
  for (auto _ : state) {
    benchmark::DoNotOptimize(data);
    std::map<std::string, int> m{data.begin(), data.end()};
    benchmark::DoNotOptimize(m);
  }
}
BENCHMARK(CreateMapDirectly<sortedData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);
//
BENCHMARK(CreateMapDirectly<shuffledData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);
BENCHMARK(CreateMapDirectly<sortedConstData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);
//
BENCHMARK(CreateMapDirectly<shuffledConstData>)->RangeMultiplier(2)->Range(DataSizeStart,
DataSizeStop);
```

Live demo gcc 12.2: https://quick-bench.com/q/H8lqugNgMG-sWQ6cKBsdB7EBRpU
Note that CreateMapInsert<sortedConstData> performs best since it feeds to
constructos to iterators to std::pair<const std::string, in> while
CreateMapInsert<sortedData> performs badly (looks like time complexity is
worse) adn only diffrence is that it provides iterators to std::<std::string,
int>.

Same issue is observed for clang 14.0 and libstdc++:
https://quick-bench.com/q/6VA3vRdKmqPEYrvTnFFla4gxwXk
but it is not a problem on clang 14.0 and libc++:
https://quick-bench.com/q/-S-lX3N8Y-8vL9CCl-5bW5jolWA

here is result for sing size input data (easier to read):
https://quick-bench.com/q/kbDOPA8PNZFfrzUH_-70cJgApeE

Issue was observed when filing answer on
https://stackoverflow.com/a/75535056/1387438

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [Bug libstdc++/109022] Low performance of std::map constructor for already sorted data of std::pair<none-const key, value>
  2023-03-04 13:28 [Bug libstdc++/109022] New: Low performance of std::map constructor for already sorted data of std::pair<none-const key, value> marekr22 at wp dot pl
@ 2023-03-06 17:38 ` redi at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: redi at gcc dot gnu.org @ 2023-03-06 17:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109022

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-03-06
     Ever confirmed|0                           |1

--- Comment #1 from Jonathan Wakely <redi at gcc dot gnu.org> ---
For any type except the container's own type we don't use any hint for
insertion, and so it makes no difference whether the input is sorted or not:

      template<typename _InputIterator>
        __enable_if_t<__same_value_type<_InputIterator>::value>
        _M_insert_range_unique(_InputIterator __first, _InputIterator __last)
        {
          _Alloc_node __an(*this);
          for (; __first != __last; ++__first)
            _M_insert_unique_(end(), *__first, __an);
        }

      template<typename _InputIterator>
        __enable_if_t<!__same_value_type<_InputIterator>::value>
        _M_insert_range_unique(_InputIterator __first, _InputIterator __last)
        {
          for (; __first != __last; ++__first)
            _M_emplace_unique(*__first);
        }

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-03-06 17:38 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-04 13:28 [Bug libstdc++/109022] New: Low performance of std::map constructor for already sorted data of std::pair<none-const key, value> marekr22 at wp dot pl
2023-03-06 17:38 ` [Bug libstdc++/109022] " redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).