* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
@ 2012-07-23 23:08 ` likan_999.student at sina dot com
2012-07-23 23:09 ` likan_999.student at sina dot com
` (46 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: likan_999.student at sina dot com @ 2012-07-23 23:08 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #1 from likan_999.student at sina dot com 2012-07-23 23:08:07 UTC ---
Created attachment 27861
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27861
Profiling using google-perftools
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
2012-07-23 23:08 ` [Bug libstdc++/54075] " likan_999.student at sina dot com
@ 2012-07-23 23:09 ` likan_999.student at sina dot com
2012-07-23 23:12 ` paolo.carlini at oracle dot com
` (45 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: likan_999.student at sina dot com @ 2012-07-23 23:09 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #2 from likan_999.student at sina dot com 2012-07-23 23:09:43 UTC ---
Created attachment 27862
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27862
Profiling of gcc-4.6.2 using google-perftools
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
2012-07-23 23:08 ` [Bug libstdc++/54075] " likan_999.student at sina dot com
2012-07-23 23:09 ` likan_999.student at sina dot com
@ 2012-07-23 23:12 ` paolo.carlini at oracle dot com
2012-07-23 23:23 ` paolo.carlini at oracle dot com
` (44 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-07-23 23:12 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
Paolo Carlini <paolo.carlini at oracle dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2012-07-23
CC| |fdumont at gcc dot gnu.org
Ever Confirmed|0 |1
Severity|critical |normal
--- Comment #3 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-07-23 23:12:15 UTC ---
Francois, can you have a look? Thanks!
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (2 preceding siblings ...)
2012-07-23 23:12 ` paolo.carlini at oracle dot com
@ 2012-07-23 23:23 ` paolo.carlini at oracle dot com
2012-07-24 0:17 ` likan_999.student at sina dot com
` (43 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-07-23 23:23 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #4 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-07-23 23:23:27 UTC ---
I wonder, anyway, if the apparent slow down is just an artifact caused by a
different handling of the load factor in the reworked unordered containers
which we have been delivering since 4.7.0. I would suggest submitter to
experiment a bit with max_load_factor, and see if when 4.6.x seems faster at
insertion time actually the load factor is higher (too searching would be
slower).
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (3 preceding siblings ...)
2012-07-23 23:23 ` paolo.carlini at oracle dot com
@ 2012-07-24 0:17 ` likan_999.student at sina dot com
2012-07-24 0:29 ` paolo.carlini at oracle dot com
` (42 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: likan_999.student at sina dot com @ 2012-07-24 0:17 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #5 from likan_999.student at sina dot com 2012-07-24 00:17:10 UTC ---
@Paolo Carlini: can you talk more about how to experiment with max_load_factor?
As long as I use the same max_load_factor for 4.6 and 4.7, I can still see the
performance difference (3x slower is the best result):
max_load_factor gcc-4.6.2 gcc-4.7.1
0.2 1.600s 7.650s
0.5 1.125s 4.251s
1.0 1.004s 3.378s
2.0 0.914s 20.471s
5.0 0.917s 29.132s
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (4 preceding siblings ...)
2012-07-24 0:17 ` likan_999.student at sina dot com
@ 2012-07-24 0:29 ` paolo.carlini at oracle dot com
2012-07-24 0:43 ` likan_999.student at sina dot com
` (41 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-07-24 0:29 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #6 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-07-24 00:29:38 UTC ---
In some cases 4.6.x was handling max_load_factor incorrectly. Thus, the idea
isn't comparing 4.6.x to 4.7.x with the same max_load_factor (I don't think
it's useful), instead, actually measure load_factor(s), see if for some values
of max_load_factor, the actual load_factor(s) are different in 4.6.x vs 4.7.x.
In any case, measure the actual load_factor while the map grows.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (5 preceding siblings ...)
2012-07-24 0:29 ` paolo.carlini at oracle dot com
@ 2012-07-24 0:43 ` likan_999.student at sina dot com
2012-07-24 10:42 ` plasmahh at gmx dot net
` (40 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: likan_999.student at sina dot com @ 2012-07-24 0:43 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #7 from likan_999.student at sina dot com 2012-07-24 00:42:57 UTC ---
@Paolo Carlini: the problem is, with different max_load_factor in range
[0.2-5], the *best* result of 4.7.1 is still 2x slower than the *worst* of
4.6.2.
I printed out the average load factor during the insertion. Looks like 4.7.1's
load factor is very close to the max_load_factor, and the average for 4.6.2 is
about 1/4 of the max_load_factor. But what conclusion can we get from this
result?
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (6 preceding siblings ...)
2012-07-24 0:43 ` likan_999.student at sina dot com
@ 2012-07-24 10:42 ` plasmahh at gmx dot net
2012-07-24 20:15 ` fdumont at gcc dot gnu.org
` (39 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: plasmahh at gmx dot net @ 2012-07-24 10:42 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
Dennis Lubert <plasmahh at gmx dot net> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |plasmahh at gmx dot net
--- Comment #8 from Dennis Lubert <plasmahh at gmx dot net> 2012-07-24 10:41:54 UTC ---
I just stumbled over this bug while searching for something related to the max
load factor (it seems that since 4.7 the hashtable also shrinks when you set
the load factor higher, which is at least for me unfortunate).
I did just change the testcase to count the number of rehashes (by checking
bucket count before and after insert) and found:
gcc4.6 without reserve: 21
gcc4.6 with reserve: 1
gcc4.7 without reserve: 155
gcc4.7 with reserve: 160
Then in callgrind I had a look and most time seems to be spend in rehashing. So
I would assume that when the 4.7 version gets the number of rehashing down to
where it was, then we also get the speed down to where it was.
I would say that the reserve behaviour though is definetly broken, as it even
increases the amount of rehashings. I personally would just not have the
hashtable ever shrink on insert and/or load factor setting, just only ever on
explicit rehash() calls...
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (7 preceding siblings ...)
2012-07-24 10:42 ` plasmahh at gmx dot net
@ 2012-07-24 20:15 ` fdumont at gcc dot gnu.org
2012-07-24 20:19 ` fdumont at gcc dot gnu.org
` (38 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-07-24 20:15 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #9 from François Dumont <fdumont at gcc dot gnu.org> 2012-07-24 20:15:10 UTC ---
I confirm that the reserve method is broken. I had correctly handle the size
hint that can be given through the hashtable constructor, I set
_M_rehash_policy._M_prev_resize to 0 just to avoid the container to be shrink
until it reaches the targetted size. The same thing should be done in the
rehash method that is called from reserve one. I will submit a patch soon.
Regarding the shrink on insert calls, well, it can be easily avoided by calling
(normally) reserve so I considered that it is better to offer a container with
a symetric behavior.
I will also I think reconsider the grow factor. During one of my propositions
of hashtable redesign the number of empty buckets was a problem for
performance. With the current implementation it is not a problem anymore so we
could grow a little bit faster.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (8 preceding siblings ...)
2012-07-24 20:15 ` fdumont at gcc dot gnu.org
@ 2012-07-24 20:19 ` fdumont at gcc dot gnu.org
2012-07-25 9:56 ` paolo.carlini at oracle dot com
` (37 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-07-24 20:19 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
François Dumont <fdumont at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
AssignedTo|unassigned at gcc dot |fdumont at gcc dot gnu.org
|gnu.org |
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (9 preceding siblings ...)
2012-07-24 20:19 ` fdumont at gcc dot gnu.org
@ 2012-07-25 9:56 ` paolo.carlini at oracle dot com
2012-07-25 19:33 ` fdumont at gcc dot gnu.org
` (36 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-07-25 9:56 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #10 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-07-25 09:56:15 UTC ---
A patch is available here:
http://gcc.gnu.org/ml/libstdc++/2012-07/msg00051.html
Submitter and interested people can give it a try before it goes in.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (10 preceding siblings ...)
2012-07-25 9:56 ` paolo.carlini at oracle dot com
@ 2012-07-25 19:33 ` fdumont at gcc dot gnu.org
2012-07-26 12:30 ` plasmahh at gmx dot net
` (35 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-07-25 19:33 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #11 from François Dumont <fdumont at gcc dot gnu.org> 2012-07-25 19:32:53 UTC ---
Author: fdumont
Date: Wed Jul 25 19:32:48 2012
New Revision: 189863
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189863
Log:
2012-07-25 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable.h
(_Hashtable<>::_Hashtable(_InputIterator, _InputIterator,
size_type, ...): Remove std::max usage to guarantee that hashtable
state is consistent with hash policy state.
(_Hashtable<>::rehash): Likewise. Set _M_prev_resize to 0 to avoid
the hashtable to be shrinking on next insertion.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multiset/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_map/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multimap/modifiers/reserve.cc: New.
Added:
trunk/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/reserve.cc
trunk/libstdc++-v3/testsuite/23_containers/unordered_multimap/modifiers/reserve.cc
trunk/libstdc++-v3/testsuite/23_containers/unordered_multiset/modifiers/reserve.cc
trunk/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/hashtable.h
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (11 preceding siblings ...)
2012-07-25 19:33 ` fdumont at gcc dot gnu.org
@ 2012-07-26 12:30 ` plasmahh at gmx dot net
2012-07-26 12:32 ` fdumont at gcc dot gnu.org
` (34 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: plasmahh at gmx dot net @ 2012-07-26 12:30 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #12 from Dennis Lubert <plasmahh at gmx dot net> 2012-07-26 12:30:00 UTC ---
I can confirm that now the reserve works as I would expect it (causing no
further rehashes). However the amount of rehashes done in the testcase is still
155 (needing 4.5s), while gcc 4.6 only needs 21 (needing 1.5s).
Monitoring the bucket counts on resizes it seems that gcc 4.8 is now much more
fine grained than gcc 4.6. I am unsure if this is expected, deliberate and/or a
good idea.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (12 preceding siblings ...)
2012-07-26 12:30 ` plasmahh at gmx dot net
@ 2012-07-26 12:32 ` fdumont at gcc dot gnu.org
2012-07-26 17:36 ` paolo.carlini at oracle dot com
` (33 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-07-26 12:32 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #13 from François Dumont <fdumont at gcc dot gnu.org> 2012-07-26 12:31:56 UTC ---
Author: fdumont
Date: Thu Jul 26 12:31:50 2012
New Revision: 189889
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189889
Log:
2012-07-26 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable.h
(_Hashtable<>::_Hashtable(_InputIterator, _InputIterator,
size_type, ...): Remove std::max usage to guarantee that hashtable
state is consistent with hash policy state.
(_Hashtable<>::rehash): Likewise. Set _M_prev_resize to 0 to avoid
the hashtable shrinking on next insertion.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multiset/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_map/modifiers/reserve.cc: New.
* testsuite/23_containers/unordered_multimap/modifiers/reserve.cc: New.
Added:
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/reserve.cc
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multimap/modifiers/reserve.cc
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multiset/modifiers/reserve.cc
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc
Modified:
branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
branches/gcc-4_7-branch/libstdc++-v3/include/bits/hashtable.h
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (13 preceding siblings ...)
2012-07-26 12:32 ` fdumont at gcc dot gnu.org
@ 2012-07-26 17:36 ` paolo.carlini at oracle dot com
2012-07-26 22:10 ` likan_999.student at sina dot com
` (32 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-07-26 17:36 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #14 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-07-26 17:36:28 UTC ---
In any case, please add the testcase showing 4.5s vs 1.5s.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (14 preceding siblings ...)
2012-07-26 17:36 ` paolo.carlini at oracle dot com
@ 2012-07-26 22:10 ` likan_999.student at sina dot com
2012-07-26 22:50 ` chip at pobox dot com
` (31 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: likan_999.student at sina dot com @ 2012-07-26 22:10 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #15 from likan_999.student at sina dot com 2012-07-26 22:10:21 UTC ---
Tried the patch and just as Dennis Lubert pointed out, the number of rehashes
is not decreased. Is there any plan to fix this issue?
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (15 preceding siblings ...)
2012-07-26 22:10 ` likan_999.student at sina dot com
@ 2012-07-26 22:50 ` chip at pobox dot com
2012-07-26 22:55 ` paolo.carlini at oracle dot com
` (30 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: chip at pobox dot com @ 2012-07-26 22:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #16 from Chip Salzenberg <chip at pobox dot com> 2012-07-26 22:50:17 UTC ---
In my tests, with this patch, 4.7.1 is about 10% slower than 4.6 ... a vast
improvement but certainly not parity.
./bench46 1.75s user 0.82s system 99% cpu 2.577 total
./bench47 8.01s user 2.78s system 99% cpu 10.800 total
./bench47+patch 1.95s user 0.80s system 99% cpu 2.764 total
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (16 preceding siblings ...)
2012-07-26 22:50 ` chip at pobox dot com
@ 2012-07-26 22:55 ` paolo.carlini at oracle dot com
2012-07-26 23:39 ` [Bug libstdc++/54075] [4.7.1] unordered_map insert " chip at pobox dot com
` (29 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-07-26 22:55 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #17 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-07-26 22:55:15 UTC ---
Because of more rehashing, unrelated to reserve, I suppose?
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (17 preceding siblings ...)
2012-07-26 22:55 ` paolo.carlini at oracle dot com
@ 2012-07-26 23:39 ` chip at pobox dot com
2012-07-27 0:33 ` redi at gcc dot gnu.org
` (28 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: chip at pobox dot com @ 2012-07-26 23:39 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #18 from Chip Salzenberg <chip at pobox dot com> 2012-07-26 23:38:34 UTC ---
I couldn't say. I don't understand the issue, I'm just reporting results and
deploying packages for my fellow devs.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (18 preceding siblings ...)
2012-07-26 23:39 ` [Bug libstdc++/54075] [4.7.1] unordered_map insert " chip at pobox dot com
@ 2012-07-27 0:33 ` redi at gcc dot gnu.org
2012-07-27 1:00 ` chip at pobox dot com
` (27 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: redi at gcc dot gnu.org @ 2012-07-27 0:33 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #19 from Jonathan Wakely <redi at gcc dot gnu.org> 2012-07-27 00:32:54 UTC ---
Testcase please.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (19 preceding siblings ...)
2012-07-27 0:33 ` redi at gcc dot gnu.org
@ 2012-07-27 1:00 ` chip at pobox dot com
2012-07-27 7:58 ` fdumont at gcc dot gnu.org
` (26 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: chip at pobox dot com @ 2012-07-27 1:00 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #20 from Chip Salzenberg <chip at pobox dot com> 2012-07-27 01:00:14 UTC ---
Are you talking to me? 'cause I was providing results for the patch already
committed to svn, using the code in this very bug description.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (20 preceding siblings ...)
2012-07-27 1:00 ` chip at pobox dot com
@ 2012-07-27 7:58 ` fdumont at gcc dot gnu.org
2012-07-27 19:21 ` plasmahh at gmx dot net
` (25 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-07-27 7:58 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #21 from François Dumont <fdumont at gcc dot gnu.org> 2012-07-27 07:57:59 UTC ---
I haven't touch the grow speed for the moment. I prefer to fix the reserve
Standard conformity first.
Now I can restore the 4.6 grow speed as it seems to be a relatively correct
one. Note that 3x slower with so many more rehash operations doesn't seem so
bad ! Of course growing faster will also induce higher memory consumption which
is not shown by the simple benchmark proposed here.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (21 preceding siblings ...)
2012-07-27 7:58 ` fdumont at gcc dot gnu.org
@ 2012-07-27 19:21 ` plasmahh at gmx dot net
2012-07-29 16:44 ` fdumont at gcc dot gnu.org
` (24 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: plasmahh at gmx dot net @ 2012-07-27 19:21 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #22 from Dennis Lubert <plasmahh at gmx dot net> 2012-07-27 19:20:54 UTC ---
Am on vacation so I don't have the testcase at hand, but it is the same as
likan posted in the original bugreport, minus the reserve. The main difference
is that without reserve I see 21 rehashes in gcc 4.6 and 155 rehashes. I have
added some code that after each insert tests if the bucket count has changed, I
think that should be trivial to add for yourselves without me providing it.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (22 preceding siblings ...)
2012-07-27 19:21 ` plasmahh at gmx dot net
@ 2012-07-29 16:44 ` fdumont at gcc dot gnu.org
2012-07-29 17:06 ` fdumont at gcc dot gnu.org
` (23 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-07-29 16:44 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #23 from François Dumont <fdumont at gcc dot gnu.org> 2012-07-29 16:44:26 UTC ---
Author: fdumont
Date: Sun Jul 29 16:44:18 2012
New Revision: 189938
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189938
Log:
2012-07-29 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable_policy.h
(_Prime_rehash_policy::_M_next_bkt): Add a growth factor set to 2
to boost growth in the number of buckets.
* testsuite/performance/23_containers/insert/unordered_set.cc: New.
Added:
trunk/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_set.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/hashtable_policy.h
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (23 preceding siblings ...)
2012-07-29 16:44 ` fdumont at gcc dot gnu.org
@ 2012-07-29 17:06 ` fdumont at gcc dot gnu.org
2012-09-26 23:11 ` paolo.carlini at oracle dot com
` (22 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-07-29 17:06 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #24 from François Dumont <fdumont at gcc dot gnu.org> 2012-07-29 17:06:25 UTC ---
Author: fdumont
Date: Sun Jul 29 17:06:21 2012
New Revision: 189941
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189941
Log:
2012-07-29 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable_policy.h
(_Prime_rehash_policy::_M_next_bkt): Add a growth factor set to 2
to boost growth in the number of buckets.
* testsuite/performance/23_containers/insert/unordered_set.cc: New.
Added:
branches/gcc-4_7-branch/libstdc++-v3/testsuite/performance/23_containers/insert/unordered_set.cc
Modified:
branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
branches/gcc-4_7-branch/libstdc++-v3/include/bits/hashtable_policy.h
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (24 preceding siblings ...)
2012-07-29 17:06 ` fdumont at gcc dot gnu.org
@ 2012-09-26 23:11 ` paolo.carlini at oracle dot com
2012-10-22 20:50 ` cracauer at cons dot org
` (21 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-09-26 23:11 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
Paolo Carlini <paolo.carlini at oracle dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
--- Comment #25 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-09-26 23:11:28 UTC ---
Fixed.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (25 preceding siblings ...)
2012-09-26 23:11 ` paolo.carlini at oracle dot com
@ 2012-10-22 20:50 ` cracauer at cons dot org
2012-10-22 21:05 ` paolo.carlini at oracle dot com
` (20 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: cracauer at cons dot org @ 2012-10-22 20:50 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
Martin Cracauer <cracauer at cons dot org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |cracauer at cons dot org
--- Comment #26 from Martin Cracauer <cracauer at cons dot org> 2012-10-22 20:50:25 UTC ---
I'm afraid this doesn't quite do it.
I still observe a > 60% slowdown going from gcc-4.4 to gcc-4.7, with this fix
already in, specifically for insert(). It's a 320,000 member table I am
dealing with here.
I can make 4.7 be as fast as 4.4 by preemptively setting the reserve to what I
know (for this test) to be the maximum size I need, but measured resident
memory shoots up (not unexpected). And resident memory of the 4.7 build was
already higher than 4.4 so I don't think this can be the answer here.
Playing with the load factor resulted in a minor speedup with 0.4 (from 1.0),
but not reaching 4.4 performance. Other load factors (lower than 0.4 and higher
than 1.0) are even slower.
Is there more specific information available about the tradeoff numbers that
made GNU pick this new implementation?
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (26 preceding siblings ...)
2012-10-22 20:50 ` cracauer at cons dot org
@ 2012-10-22 21:05 ` paolo.carlini at oracle dot com
2012-10-22 22:04 ` cracauer at cons dot org
` (19 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-10-22 21:05 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #27 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-10-22 21:05:41 UTC ---
I can only recommend profiling (by gprof or other means).
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (27 preceding siblings ...)
2012-10-22 21:05 ` paolo.carlini at oracle dot com
@ 2012-10-22 22:04 ` cracauer at cons dot org
2012-10-22 22:47 ` paolo.carlini at oracle dot com
` (18 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: cracauer at cons dot org @ 2012-10-22 22:04 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #28 from Martin Cracauer <cracauer at cons dot org> 2012-10-22 22:04:27 UTC ---
I should clarify that I was pointed to this problem (with insert) by profiling
and for us nothing pops up as faster (or smaller for that matter). Hence the
question.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (28 preceding siblings ...)
2012-10-22 22:04 ` cracauer at cons dot org
@ 2012-10-22 22:47 ` paolo.carlini at oracle dot com
2012-10-24 19:28 ` fdumont at gcc dot gnu.org
` (17 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-10-22 22:47 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #29 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-10-22 22:47:31 UTC ---
Indeed, if we have to do something about that, we need to know those profiles.
That's my point. Otherwise, blindly, it could be anything, ie, not necessarily
rehashes which are the topic of this specific PR.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (29 preceding siblings ...)
2012-10-22 22:47 ` paolo.carlini at oracle dot com
@ 2012-10-24 19:28 ` fdumont at gcc dot gnu.org
2012-10-25 17:48 ` cracauer at cons dot org
` (16 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-10-24 19:28 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #30 from François Dumont <fdumont at gcc dot gnu.org> 2012-10-24 19:27:58 UTC ---
I am going to take a look to 4.4 implementation to see if I can explain the
difference. But waiting for that can you confirm that without the reserve the
number of rehash is similar to what you had with version 4.4.
Regarding the resident memory, I can't remember any overhead with the new
implementation but once again checking 4.4 implementation will point me to it.
Do you have it with the code originally posted, that is to say an
unordered_map<int, int> ?
For information the major reason to review the implementation was for Standard
conformity. The erase method had to return the iterator following the erased
one.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (30 preceding siblings ...)
2012-10-24 19:28 ` fdumont at gcc dot gnu.org
@ 2012-10-25 17:48 ` cracauer at cons dot org
2012-10-26 14:35 ` paolo.carlini at oracle dot com
` (15 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: cracauer at cons dot org @ 2012-10-25 17:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #31 from Martin Cracauer <cracauer at cons dot org> 2012-10-25 17:47:45 UTC ---
The case I reported is in a large system. I did an isolated test case which
sees the time shoots up from 32.54 seconds in gcc-4.4 to 42.86 s in gcc-4.7 and
back down to 32.27s when using gcc-4.7 with the tr1 hashtables.
As you can see there also is a reasonably catastrophic increase in minor page
faults.
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)
extra flags:
meh: 1635550270 4081940 288998848
0:32.54 32.54 real 30.50 user 1.97 sys 99% CPU 0/1394855 faults
text data bss dec hex filename
5456 688 3840032 3846176 3ab020 sdriver
gcc version 4.7.2 (Debian 4.7.2-4)
extra flags:
meh: 1635550270 4081940 288998848
0:42.86 42.86 real 40.21 user 2.56 sys 99% CPU 0/2101681 faults
text data bss dec hex filename
6822 760 3840032 3847614 3ab5be sdriver
gcc version 4.7.2 (Debian 4.7.2-4)
extra flags: -DCRA_USE_TR1
meh: 1635550270 4081940 288998848
0:32.27 32.27 real 30.29 user 1.92 sys 99% CPU 0/1394853 faults
text data bss dec hex filename
5426 736 3840032 3846194 3ab032 sdriver
Profiles for the three cases:
http://www.cons.org/gcc-hash-slowdown/
Source code for test case:
http://www.cons.org/gcc-hash-slowdown/reproduce-slow-hash.tar.gz
Please feel free to copy things if you want to append things to this pr.
My original case with worse slowdown is in a complicated system and driven from
Lisp. For no other reason than Murphy's law the profiler in there does not
penetrate into layers of C++ frames for this run, although it normally does. So
I'm not able to break up the insert() into it's parts right now. I have no
reason to believe that it is different than this test case, probably just a
slower hash. I have seen gcc-4.7 + tr1 hashtable faster than gcc-4.4 both in
the large system and in the test case so I'm pretty confident that we are
looking at the same thing here. I'm fixing that profiler, more later.
Hope this helps.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (31 preceding siblings ...)
2012-10-25 17:48 ` cracauer at cons dot org
@ 2012-10-26 14:35 ` paolo.carlini at oracle dot com
2012-11-03 15:28 ` fdumont at gcc dot gnu.org
` (14 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-10-26 14:35 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #32 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-10-26 14:34:37 UTC ---
Thanks. Francois, can you please further investigate this issue? In fact, if
the slowdown goes away with a preliminary reserve, it must be possible to
handle the problem rather easily, by tweaking the grow policy or something (We
should be a bit patient and consider that vs 4.4.x the implementation is almost
completely different)
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (32 preceding siblings ...)
2012-10-26 14:35 ` paolo.carlini at oracle dot com
@ 2012-11-03 15:28 ` fdumont at gcc dot gnu.org
2012-11-04 4:55 ` foom at fuhm dot net
` (13 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-11-03 15:28 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #33 from François Dumont <fdumont at gcc dot gnu.org> 2012-11-03 15:28:30 UTC ---
I fear that this performance issue is a normal drawback of the major
enhancement for PR 41975. Before this evolution the hashtable data model was
like a std::vector<std::forward_list>. During the insert operation it was easy
to find out if the inserted element was already in the targetted bucket by
simply checking for equality with each element of the bucket forwward_list. The
code was something like:
for (auto it = bucket.begin(); it != bucket.end(); ++it)
if (comp(*it, element))
// The element already exists
return false;
// Ok, no equivalent element
return true;
After the evolution the data model became a combination of a
std::forward_list containing all elements and a
std::vector<std::forward_list<>::iterator> representing buckets. Now to check
if an element is already in a bucket we still need to compare for equality with
each element of the bucket but the element of the bucket are harder to
identified. There is no more bucket end so we must also check that the element
we are about to compare is still in the bucket. The code became something like:
// Pre: bucket not empty.
for (auto it = buckets[n];)
{
if (comp(*it, element))
// The element already exists
return false;
++it;
if (it == end() || bucket(it) != n)
// We are not in bucket n anymore
return false;
}
// Ok, no equivalent element
return true;
The new design helps to iterate through the container because even if it is
almost empty we simply need to iterate within the forward_list. In the previous
implementation we needed to iterate within a bucket forward_list and then jump
above all empty buckets which could be very expensive. Try to run
testsuite/performance/23_containers/insert_erase/41975.cc, you can easily tweak
it to make it run with the tr1 implementation and you will notice the
difference. This test also show the insert overhead even if with a perfect
hasher like the one use in this test the impact is very limited.
To make bucket(it) not too expensive the new implementation caches most of the
time the hash code which explain the additional memory. You can try to ask for
it not to be cache but you might have to qualified your hasher with noexcept,
static assertions will tell you so.
So for the moment I can see a way to restore tr1 insert performance without
impacting erase performance. In my opinion, a Standard implementation needs to
behave correctly in all use cases and not to be perform very well in one use
case but very bad in an other.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert 3x slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (33 preceding siblings ...)
2012-11-03 15:28 ` fdumont at gcc dot gnu.org
@ 2012-11-04 4:55 ` foom at fuhm dot net
2012-11-05 12:55 ` [Bug libstdc++/54075] [4.7.1] unordered_map insert still " paolo.carlini at oracle dot com
` (12 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: foom at fuhm dot net @ 2012-11-04 4:55 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
James Y Knight <foom at fuhm dot net> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |foom at fuhm dot net
--- Comment #34 from James Y Knight <foom at fuhm dot net> 2012-11-04 04:55:30 UTC ---
Well, I haven't looked into this issue in detail, but, it looks like everyone
is about the same speed if you put a foo.{reserve or rehash}(n) before the
inserts.
Unfortunately, it doesn't look as simple as the new impl calling for a rehash
more often than the old, cause it's not. So, I don't know if the slowness is
because rehash is now a lot slower, or if insert is slower but only if there
aren't a huge number of empty buckets.
I'll also note that libc++'s implementation (seen here:
http://llvm.org/svn/llvm-project/libcxx/trunk/) appears to be getting the same
speed as the old libstdc++ implementation, while appearing to have
approximately the same bucket datastructure as the new libstdc++
implementation. That says to me that it should be *possible* somehow to make
the new version fast.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (34 preceding siblings ...)
2012-11-04 4:55 ` foom at fuhm dot net
@ 2012-11-05 12:55 ` paolo.carlini at oracle dot com
2012-11-05 22:12 ` tlawrence85 at gmail dot com
` (11 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-11-05 12:55 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
Paolo Carlini <paolo.carlini at oracle dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
CC|fdumont at gcc dot gnu.org |
Resolution|FIXED |
Summary|[4.7.1] unordered_map |[4.7.1] unordered_map
|insert 3x slower than 4.6.2 |insert still slower than
| |4.6.2
--- Comment #35 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-11-05 12:55:09 UTC ---
Ok, let's reopen this with an adjusted Summary.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (35 preceding siblings ...)
2012-11-05 12:55 ` [Bug libstdc++/54075] [4.7.1] unordered_map insert still " paolo.carlini at oracle dot com
@ 2012-11-05 22:12 ` tlawrence85 at gmail dot com
2012-11-06 0:59 ` paolo.carlini at oracle dot com
` (10 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: tlawrence85 at gmail dot com @ 2012-11-05 22:12 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
Lawrence <tlawrence85 at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |tlawrence85 at gmail dot
| |com
--- Comment #36 from Lawrence <tlawrence85 at gmail dot com> 2012-11-05 22:12:12 UTC ---
It seems that this commit doesn't fully fix this issue. If you call rehash
multiple times with the same size, the second call to rehash resets
_M_prev_resize to a non-zero value in _M_next_bkt(). Here is a sample program
that shows this behavior:
#include <stdio.h>
#include <unordered_map>
int main(void) {
std::unordered_map<int, int> myMap;
myMap.rehash(4000000);
myMap.rehash(4000000);
unsigned long long buckets = myMap.bucket_count();
int i = 0;
while (i < 2000000000) {
myMap.insert(std::make_pair(i, 0));
++i;
if (buckets != myMap.bucket_count()) {
printf("buckets %lu -> %lu\n", buckets, myMap.bucket_count());
buckets = myMap.bucket_count();
}
}
return 0;
}
(In reply to comment #13)
> Author: fdumont
> Date: Thu Jul 26 12:31:50 2012
> New Revision: 189889
>
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189889
> Log:
> 2012-07-26 François Dumont <fdumont@gcc.gnu.org>
>
> PR libstdc++/54075
> * include/bits/hashtable.h
> (_Hashtable<>::_Hashtable(_InputIterator, _InputIterator,
> size_type, ...): Remove std::max usage to guarantee that hashtable
> state is consistent with hash policy state.
> (_Hashtable<>::rehash): Likewise. Set _M_prev_resize to 0 to avoid
> the hashtable shrinking on next insertion.
> * testsuite/23_containers/unordered_set/modifiers/reserve.cc: New.
> * testsuite/23_containers/unordered_multiset/modifiers/reserve.cc: New.
> * testsuite/23_containers/unordered_map/modifiers/reserve.cc: New.
> * testsuite/23_containers/unordered_multimap/modifiers/reserve.cc: New.
>
> Added:
>
> branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/reserve.cc
>
> branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multimap/modifiers/reserve.cc
>
> branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_multiset/modifiers/reserve.cc
>
> branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc
> Modified:
> branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
> branches/gcc-4_7-branch/libstdc++-v3/include/bits/hashtable.h
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (36 preceding siblings ...)
2012-11-05 22:12 ` tlawrence85 at gmail dot com
@ 2012-11-06 0:59 ` paolo.carlini at oracle dot com
2012-11-06 21:23 ` fdumont at gcc dot gnu.org
` (9 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-11-06 0:59 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #37 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-11-06 00:58:37 UTC ---
Francois, can you please look further into this, possibly basing on the new
testcase? Thanks!
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (37 preceding siblings ...)
2012-11-06 0:59 ` paolo.carlini at oracle dot com
@ 2012-11-06 21:23 ` fdumont at gcc dot gnu.org
2012-11-06 21:34 ` paolo.carlini at oracle dot com
` (8 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-11-06 21:23 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #38 from François Dumont <fdumont at gcc dot gnu.org> 2012-11-06 21:22:48 UTC ---
Sure, I will. However I don't expect this problem to have any relation with the
performance subject of this PR.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (38 preceding siblings ...)
2012-11-06 21:23 ` fdumont at gcc dot gnu.org
@ 2012-11-06 21:34 ` paolo.carlini at oracle dot com
2012-11-07 22:03 ` frs.dumont at gmail dot com
` (7 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-11-06 21:34 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #39 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-11-06 21:33:57 UTC ---
Ok thanks. I guess depending on the complexity of the fixes we can apply some
only to mainline first and reconsider the 4_7 branch later. Please do your best
to work on both issues: we just entered Stage 3 thus no new features from now
on, we are all concentrated on bug fixes until the release.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (39 preceding siblings ...)
2012-11-06 21:34 ` paolo.carlini at oracle dot com
@ 2012-11-07 22:03 ` frs.dumont at gmail dot com
2012-11-08 0:59 ` jwakely.gcc at gmail dot com
` (6 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: frs.dumont at gmail dot com @ 2012-11-07 22:03 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #40 from frs.dumont at gmail dot com 2012-11-07 22:02:56 UTC ---
Here is the patch to fix the redundant rehash/reserve issue.
2012-11-07 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable.h (_Hashtable<>::rehash): Reset hash
policy state if no rehash.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc
(test02): New.
I had prepared and tested it in 4.7 branch but I can apply the same on
trunk.
Ok to commit ? If so, where ?
François
On 11/06/2012 10:33 PM, paolo.carlini at oracle dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
>
> --- Comment #39 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-11-06 21:33:57 UTC ---
> Ok thanks. I guess depending on the complexity of the fixes we can apply some
> only to mainline first and reconsider the 4_7 branch later. Please do your best
> to work on both issues: we just entered Stage 3 thus no new features from now
> on, we are all concentrated on bug fixes until the release.
>
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (40 preceding siblings ...)
2012-11-07 22:03 ` frs.dumont at gmail dot com
@ 2012-11-08 0:59 ` jwakely.gcc at gmail dot com
2012-11-08 1:56 ` paolo.carlini at oracle dot com
` (5 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: jwakely.gcc at gmail dot com @ 2012-11-08 0:59 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #41 from Jonathan Wakely <jwakely.gcc at gmail dot com> 2012-11-08 00:58:55 UTC ---
On 7 November 2012 22:02, François Dumont wrote:
>
> Ok to commit ? If so, where ?
That patch is OK for trunk and 4.7, thanks.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (41 preceding siblings ...)
2012-11-08 0:59 ` jwakely.gcc at gmail dot com
@ 2012-11-08 1:56 ` paolo.carlini at oracle dot com
2012-11-08 2:26 ` paolo.carlini at oracle dot com
` (4 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-11-08 1:56 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #42 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-11-08 01:56:15 UTC ---
On 11/08/2012 01:58 AM, Jonathan Wakely wrote:
> On 7 November 2012 22:02, François Dumont wrote:
>> Ok to commit ? If so, where ?
> That patch is OK for trunk and 4.7, thanks.
... I was having a look at this code - patch per se is straightforward
enough and can in any case go in now - and something is puzzling me a
lot: we always compute things, in _M_next_bkt etc, in terms of
__grown_n, thus __n * 2, until the final _M_rehash call.
On the other hand, the old-old code for rehash didn't use
_M_growth_factor in these computations, it just literally enforced the
post-conditions of the Standard. Are we sure we aren't so to speak
rehashing too much? For example, when the load factor is very low and
doesn't count, it looks like a current rehash(n) accomplishes the same
of an old rehash(n * 2)?!? Something seems wrong, can you double check
that? In any case the comments before _M_next_bkt would need fixing.
Thanks,
Paolo.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (42 preceding siblings ...)
2012-11-08 1:56 ` paolo.carlini at oracle dot com
@ 2012-11-08 2:26 ` paolo.carlini at oracle dot com
2012-11-08 20:06 ` fdumont at gcc dot gnu.org
` (3 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: paolo.carlini at oracle dot com @ 2012-11-08 2:26 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #43 from Paolo Carlini <paolo.carlini at oracle dot com> 2012-11-08 02:26:12 UTC ---
On 11/08/2012 02:56 AM, Paolo Carlini wrote:
> On the other hand, the old-old code for rehash didn't use
> _M_growth_factor in these computations, it just literally enforced the
> post-conditions of the Standard. Are we sure we aren't so to speak
> rehashing too much? For example, when the load factor is very low and
> doesn't count, it looks like a current rehash(n) accomplishes the same
> of an old rehash(n * 2)?!? Something seems wrong, can you double check
> that? In any case the comments before _M_next_bkt would need fixing.
... in other terms, I really think that the only uses of
_S_growth_factor should return to be inside _M_need_rehash, because
that's the function called by the inserts, when the container must
automatically grow the number of buckets. Elsewhere, like the
constructors, rehash, should not need to know about _S_growth_factor.
Paolo.
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (43 preceding siblings ...)
2012-11-08 2:26 ` paolo.carlini at oracle dot com
@ 2012-11-08 20:06 ` fdumont at gcc dot gnu.org
2012-11-08 20:16 ` fdumont at gcc dot gnu.org
` (2 subsequent siblings)
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-11-08 20:06 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #44 from François Dumont <fdumont at gcc dot gnu.org> 2012-11-08 20:06:08 UTC ---
Author: fdumont
Date: Thu Nov 8 20:06:00 2012
New Revision: 193335
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193335
Log:
2012-11-08 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable.h (_Hashtable<>::rehash): Reset hash
policy state if no rehash.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc
(test02): New.
Modified:
branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
branches/gcc-4_7-branch/libstdc++-v3/include/bits/hashtable.h
branches/gcc-4_7-branch/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (44 preceding siblings ...)
2012-11-08 20:06 ` fdumont at gcc dot gnu.org
@ 2012-11-08 20:16 ` fdumont at gcc dot gnu.org
2012-11-08 20:21 ` frs.dumont at gmail dot com
2012-11-08 21:19 ` frs.dumont at gmail dot com
47 siblings, 0 replies; 49+ messages in thread
From: fdumont at gcc dot gnu.org @ 2012-11-08 20:16 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #45 from François Dumont <fdumont at gcc dot gnu.org> 2012-11-08 20:16:15 UTC ---
Author: fdumont
Date: Thu Nov 8 20:16:04 2012
New Revision: 193339
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193339
Log:
2012-11-08 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable.h (_Hashtable<>::rehash): Reset hash
policy state if no rehash.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc
(test02): New.
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/bits/hashtable.h
trunk/libstdc++-v3/testsuite/23_containers/unordered_set/modifiers/reserve.cc
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (45 preceding siblings ...)
2012-11-08 20:16 ` fdumont at gcc dot gnu.org
@ 2012-11-08 20:21 ` frs.dumont at gmail dot com
2012-11-08 21:19 ` frs.dumont at gmail dot com
47 siblings, 0 replies; 49+ messages in thread
From: frs.dumont at gmail dot com @ 2012-11-08 20:21 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #46 from frs.dumont at gmail dot com 2012-11-08 20:21:15 UTC ---
Attached patch applied to trunk and 4.7 branch.
2012-11-08 François Dumont <fdumont@gcc.gnu.org>
PR libstdc++/54075
* include/bits/hashtable.h (_Hashtable<>::rehash): Reset hash
policy state if no rehash.
* testsuite/23_containers/unordered_set/modifiers/reserve.cc
(test02): New.
François
On 11/08/2012 01:58 AM, Jonathan Wakely wrote:
> On 7 November 2012 22:02, François Dumont wrote:
>> Ok to commit ? If so, where ?
> That patch is OK for trunk and 4.7, thanks.
>
^ permalink raw reply [flat|nested] 49+ messages in thread
* [Bug libstdc++/54075] [4.7.1] unordered_map insert still slower than 4.6.2
2012-07-23 23:07 [Bug libstdc++/54075] New: [4.7.1] unordered_map 3x slower than 4.6.2 likan_999.student at sina dot com
` (46 preceding siblings ...)
2012-11-08 20:21 ` frs.dumont at gmail dot com
@ 2012-11-08 21:19 ` frs.dumont at gmail dot com
47 siblings, 0 replies; 49+ messages in thread
From: frs.dumont at gmail dot com @ 2012-11-08 21:19 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54075
--- Comment #47 from frs.dumont at gmail dot com 2012-11-08 21:19:05 UTC ---
On 11/08/2012 03:25 AM, Paolo Carlini wrote:
> On 11/08/2012 02:56 AM, Paolo Carlini wrote:
>> On the other hand, the old-old code for rehash didn't use
>> _M_growth_factor in these computations, it just literally enforced
>> the post-conditions of the Standard. Are we sure we aren't so to
>> speak rehashing too much? For example, when the load factor is very
>> low and doesn't count, it looks like a current rehash(n) accomplishes
>> the same of an old rehash(n * 2)?!? Something seems wrong, can you
>> double check that? In any case the comments before _M_next_bkt would
>> need fixing.
> ... in other terms, I really think that the only uses of
> _S_growth_factor should return to be inside _M_need_rehash, because
> that's the function called by the inserts, when the container must
> automatically grow the number of buckets. Elsewhere, like the
> constructors, rehash, should not need to know about _S_growth_factor.
>
> Paolo.
>
I haven't yet considered the following emails but based on those
good remarks I have done the attached patch. Surprisingly it seems to
have a good impact on performance even if before it
testsuite/performance/23_containers/insert/unordered_set.cc was showing
that new implementation was better.
I have also starting to adapt tests so that it's possible to check
unordered performance with std or std::tr1 implementations. Can I
generalize it to other tests ? If so, is it a good approach ?
François
^ permalink raw reply [flat|nested] 49+ messages in thread