public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "temporal at gmail dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libstdc++/58153] unordered_multimap::erase(iterator) is not constant-time when many entries have the same key
Date: Wed, 14 Aug 2013 11:16:00 -0000	[thread overview]
Message-ID: <bug-58153-4-taxSHLukek@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-58153-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58153

--- Comment #2 from Kenton Varda <temporal at gmail dot com> ---
> The standard says average case O(1), worst case O(a.size()), so if every
> element in the container has the same key then it's O(n).

I'm not sure that follows.  Yes, the standard says "worst case O(n)", but why
should expect worst-case performance if I have lots of entries with the same
key?  In the absence of explicit language to the contrary, it seems entirely
reasonable to expect unordered_multimap<K, V> to perform similarly to
unordered_map<K, list<V>>, where in fact it turns out to be more like
unordered_map<K, slist<V>>.

I guess the real bug here is with all of the C++ reference sites that fail to
mention this rather large caveat...  that unordered_multimap supports duplicate
keys, but only as long as there aren't "too many" duplicates.

I think part of the problem might be that people are thinking: "Well, duh, hash
maps perform poorly if lots of keys have the same hash!".  But I don't think
that's quite the same thing.  Unique hash maps perform poorly in that case
because it's necessary to linearly scan through many entries to find the one
that matches the key.  But when looking for a key in a multimap, you stop at
the first match, so even if there are lots of entries with the same key, lookup
can still be O(1).  So, I don't think it's intrinsically true that a
hashtable-based multimap should perform badly when many entries have the same
key.


  parent reply	other threads:[~2013-08-14 11:16 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-14 10:11 [Bug libstdc++/58153] New: " temporal at gmail dot com
2013-08-14 10:35 ` [Bug libstdc++/58153] " redi at gcc dot gnu.org
2013-08-14 11:16 ` temporal at gmail dot com [this message]
2013-08-24 16:30 ` fdumont at gcc dot gnu.org
2013-08-24 19:16 ` temporal at gmail dot com
2013-08-26 20:06 ` fdumont at gcc dot gnu.org
2013-08-27  3:10 ` temporal at gmail dot com
2015-04-09 14:49 ` redi at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-58153-4-taxSHLukek@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).