public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/67922] New: std::unordered_map::clear should take time linear in the number of elements
@ 2015-10-10 22:33 yegor.derevenets at gmail dot com
2015-10-12 8:05 ` [Bug libstdc++/67922] " rguenth at gcc dot gnu.org
2015-10-12 9:06 ` yegor.derevenets at gmail dot com
0 siblings, 2 replies; 3+ messages in thread
From: yegor.derevenets at gmail dot com @ 2015-10-10 22:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922
Bug ID: 67922
Summary: std::unordered_map::clear should take time linear in
the number of elements
Product: gcc
Version: 5.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: yegor.derevenets at gmail dot com
Target Milestone: ---
std::unordered_map::clear internally clears the whole array of buckets using
memset. Consequently, the clearing time is proportional to the number of
buckets, instead of being proportional to the number of elements, which one
would expect, and what arguably should be the case according to the C++
Standard. (The Standard specifies that clear's complexity is linear, but
unfortunately does not say, linear in what.) This leads to terrible performance
of the following code:
#include <unordered_map>
int main() {
std::unordered_map<int, int> m;
for (int i = 0; i < 1000000; ++i) {
m[i] = i;
}
for (int i = 0; i < 1000; ++i) {
m.clear();
}
}
$ g++-5 -O2 test.cpp -std=c++11 && time ./a.out
real 0m4.009s
user 0m3.924s
sys 0m0.028s
$ clang++-3.7 -stdlib=libstdc++ -O2 test.cpp -std=c++11 && time ./a.out
real 0m4.153s
user 0m3.976s
sys 0m0.036s
If you build the same program with libc++, the problem goes away:
$ clang++-3.7 -stdlib=libc++ -O2 test.cpp -std=c++11 && time ./a.out
real 0m0.165s
user 0m0.128s
sys 0m0.036s
You can achieve similar performance with libstdc++ if you implement clear()
manually in the most naive way:
#include <unordered_map>
int main() {
std::unordered_map<int, int> m;
for (int i = 0; i < 1000000; ++i) {
m[i] = i;
}
for (int i = 0; i < 1000; ++i) {
while (m.begin() != m.end()) {
m.erase(m.begin());
}
}
}
$ g++-5 -O2 test.cpp -std=c++11 && time ./a.out
real 0m0.167s
user 0m0.132s
sys 0m0.028s
$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 5.2.1-21'
--with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 5.2.1 20151003 (Debian 5.2.1-21)
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Bug libstdc++/67922] std::unordered_map::clear should take time linear in the number of elements
2015-10-10 22:33 [Bug libstdc++/67922] New: std::unordered_map::clear should take time linear in the number of elements yegor.derevenets at gmail dot com
@ 2015-10-12 8:05 ` rguenth at gcc dot gnu.org
2015-10-12 9:06 ` yegor.derevenets at gmail dot com
1 sibling, 0 replies; 3+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-10-12 8:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2015-10-12
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
But then the issue is that clear () doesn't shrink the map. Otherwise
O(number of slots) == O(number of elements)
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Bug libstdc++/67922] std::unordered_map::clear should take time linear in the number of elements
2015-10-10 22:33 [Bug libstdc++/67922] New: std::unordered_map::clear should take time linear in the number of elements yegor.derevenets at gmail dot com
2015-10-12 8:05 ` [Bug libstdc++/67922] " rguenth at gcc dot gnu.org
@ 2015-10-12 9:06 ` yegor.derevenets at gmail dot com
1 sibling, 0 replies; 3+ messages in thread
From: yegor.derevenets at gmail dot com @ 2015-10-12 9:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67922
--- Comment #2 from Yegor Derevenets <yegor.derevenets at gmail dot com> ---
> But then the issue is that clear () doesn't shrink the map.
No, the issue is that clear() touches all the buckets, instead of touching only
those containing the elements. libc++'s implementation does not shrink the map
(does not decrease the number of buckets), and is still fast. Actually, if in
libstdc++ you do m.erase(m.begin(), m.end()) instead of m.clear(), it will be
fast too.
#include <iostream>
#include <unordered_map>
int main() {
std::unordered_map<int, int> m;
for (int i = 0; i < 1000000; ++i) {
m[i] = i;
}
std::cout << "Before clear(): " << m.bucket_count() << std::endl;
m.clear();
std::cout << "After clear(): " << m.bucket_count() << std::endl;
}
$ clang++-3.7 -stdlib=libstdc++ -O2 -std=c++11 test.cpp && time ./a.out
Before clear(): 1056323
After clear(): 1056323
real 0m0.074s
user 0m0.076s
sys 0m0.000s
$ clang++-3.7 -stdlib=libc++ -O2 -std=c++11 test.cpp && time ./a.out
Before clear(): 1646237
After clear(): 1646237
real 0m0.102s
user 0m0.080s
sys 0m0.016s
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2015-10-12 9:06 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-10 22:33 [Bug libstdc++/67922] New: std::unordered_map::clear should take time linear in the number of elements yegor.derevenets at gmail dot com
2015-10-12 8:05 ` [Bug libstdc++/67922] " rguenth at gcc dot gnu.org
2015-10-12 9:06 ` yegor.derevenets at gmail dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).