public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libstdc++/23244] New: __gnu_cxx::hash_map inconsistent iterator behavior
@ 2005-08-05  2:57 jzampier at gmail dot com
  2005-08-05  2:58 ` [Bug libstdc++/23244] " jzampier at gmail dot com
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: jzampier at gmail dot com @ 2005-08-05  2:57 UTC (permalink / raw)
  To: gcc-bugs

The iterators over hash_map behave in a manner inconsistent with the other
std::map, std::list, etc... data structures. This can cause segfaults which are
hard to root-cause. Sample Code Included with additional information:

Specifically, the iterators over hash_map require the 'first' datamember to
be valid until the iterator has advanced past that point. (see code for clarity)
The iterator operator++ should not depend on being able to call the
hash functor on that data member to advance. Because, as in the example, 
you can't call the functor when the pointer has been deleted.

This behavior happens on other hosts of g++ (tested on 3.4.3, linux). (I just
have cygwin on here...)

The offending code is placed in #if 0 and #endif directives.

While this isn't a bug per se, it is a problem with the behavior of 
hash_map that should at least be documented in LARGE letters 
in the hash_map, if not fixed to work like the rest of the stl.

// $Id$
// Program: hashbroke
// Purpose: To demonstrate how __gnu_cxx::hash_map is broken. (sorta)
// Author: Jeffrey Zampieron
// License: GPL.

#include <iostream>
#include <ext/hash_map>
#include <map>

using namespace std;
using namespace __gnu_cxx;

typedef struct {
  size_t operator()( const string* str ) const {
    return __gnu_cxx::__stl_hash_string( str->c_str() );
  }
} strptrhash;

typedef struct {
  bool operator()( const string* lhs, const string* rhs ) const {
    return *lhs == *rhs;
  }
} strptrequal;

typedef struct {
  bool operator()( const string* lhs, const string* rhs ) const {
    return *lhs < *rhs;
  }
} strptrless;

typedef hash_map< string*, string*, strptrhash, strptrequal > StringPtrHash;
typedef std::map< string*, string*, strptrless > StringPtrMap;

int main() {
  StringPtrHash testhash;
  // test keys
  string* k1 = new string( "key1" );
  string* k2 = new string( "key2" );
  string* k3 = new string( "key3" );
  string* k4 = new string( "key4" );
  string* k5 = new string( "key5" );
  string* k6 = new string( "key6" );

  // test vals
  string* v1 = new string( "val1" );
  string* v2 = new string( "val2" );
  string* v3 = new string( "val3" );
  string* v4 = new string( "val4" );
  string* v5 = new string( "val5" );
  string* v6 = new string( "val6" );

  testhash[ k1 ] = v1;
  testhash[ k2 ] = v2;
  testhash[ k3 ] = v3;
  testhash[ k4 ] = v4;
  testhash[ k5 ] = v5;
  testhash[ k6 ] = v6;
  
  cout << "Hash:\n";
  for( StringPtrHash::iterator iter = testhash.begin(); 
       iter != testhash.end();
       iter++ ) {
    cout << "Key: " << *( iter->first ) << endl;
    cout << "Val: " << *( iter->second ) << endl;
  }

  // this SHOULD be valid.
  // and is for std::map
  // but causes a segfault here.
#if 0
  for( StringPtrHash::iterator iter = testhash.begin(); 
       iter != testhash.end();
       iter++ ) {
    delete( iter->first );
    delete( iter->second );
  }
#endif 

  // Work around:
  string* prev = __null;
  for( StringPtrHash::iterator iter = testhash.begin(); 
       iter != testhash.end();
       iter++ ) {
    delete( prev );
    delete( iter->second );
    prev = iter->first;
  }
  delete( prev );

  // That this works with map:
  StringPtrMap testmap;
  k1 = new string( "key1" );
  k2 = new string( "key2" );
  k3 = new string( "key3" );
  k4 = new string( "key4" );
  k5 = new string( "key5" );
  k6 = new string( "key6" );
  
  // test vals
  v1 = new string( "val1" );
  v2 = new string( "val2" );
  v3 = new string( "val3" );
  v4 = new string( "val4" );
  v5 = new string( "val5" );
  v6 = new string( "val6" );
  

  testmap[ k1 ] = v1;
  testmap[ k2 ] = v2;
  testmap[ k3 ] = v3;
  testmap[ k4 ] = v4;
  testmap[ k5 ] = v5;
  testmap[ k6 ] = v6;

  cout << "\nMap:\n";
  for( StringPtrMap::iterator iter = testmap.begin(); 
       iter != testmap.end();
       iter++ ) {
    cout << "Key: " << *( iter->first ) << endl;
    cout << "Val: " << *( iter->second ) << endl;
  }

  // works like map should.
  for( StringPtrMap::iterator iter = testmap.begin(); 
       iter != testmap.end();
       iter++ ) {
    delete( iter->first );
    delete( iter->second );
  }
 
  // The problem seems to be that the 
  // operator++ in the hash_map iterator
  // needs to call the function provided 
  // by strptrhash in order to advance...
  // but in my example that pointer has been deleted...
  // hence the segfault.
  // This is apparent if you run valgrind on the exec.
  // (I don't have it on this machine, or I'd include the output)

  // Why is this the case? Iterating over a map
  // should not require computation of a hash key.
  // While not really a bug, this is a caveat that should
  // be documented somewhere, or ideally fixed
  // Notice that map does not call strptrcmp to advance the iterator

}

-- 
           Summary: __gnu_cxx::hash_map inconsistent iterator behavior
           Product: gcc
           Version: 3.4.4
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libstdc++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: jzampier at gmail dot com
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i686-pc-cygwin
  GCC host triplet: i686-pc-cygwin
GCC target triplet: i686-pc-cygwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23244


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug libstdc++/23244] __gnu_cxx::hash_map inconsistent iterator behavior
  2005-08-05  2:57 [Bug libstdc++/23244] New: __gnu_cxx::hash_map inconsistent iterator behavior jzampier at gmail dot com
@ 2005-08-05  2:58 ` jzampier at gmail dot com
  2005-08-05  8:30 ` chris at bubblescope dot net
  2005-08-09  8:41 ` pcarlini at suse dot de
  2 siblings, 0 replies; 4+ messages in thread
From: jzampier at gmail dot com @ 2005-08-05  2:58 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From jzampier at gmail dot com  2005-08-05 02:58 -------
Created an attachment (id=9434)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9434&action=view)
The offending code.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23244


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug libstdc++/23244] __gnu_cxx::hash_map inconsistent iterator behavior
  2005-08-05  2:57 [Bug libstdc++/23244] New: __gnu_cxx::hash_map inconsistent iterator behavior jzampier at gmail dot com
  2005-08-05  2:58 ` [Bug libstdc++/23244] " jzampier at gmail dot com
@ 2005-08-05  8:30 ` chris at bubblescope dot net
  2005-08-09  8:41 ` pcarlini at suse dot de
  2 siblings, 0 replies; 4+ messages in thread
From: chris at bubblescope dot net @ 2005-08-05  8:30 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From chris at bubblescope dot net  2005-08-05 08:29 -------
While it doesn't fix your bug, if you want a hash_map I'd advise looking at tr1/unordered_map, which 
I'd expect to work better. I think however you'll find that all implementations of hash_map strictly tell 
you to not change how the indexing data member is changed, so I suspect this is generally in the world 
of undefined behaviour anyway (although like you say, fixing the undefined behaviour wouldn't be 
easy). For example, while you might get away with this in map or set, if you then did anything (like add 
new or erase existing elements), things blow up horribly.

This would also be very hard to "fix", even if we wanted to, because at the moment when we reach the 
end of one bucket there is no way of knowing which bucket we are in without hashing the current value. 
We would probably have the either redesign the buckets so the end of each pointed to the beginning of 
the next (which might actually not be a totally bad idea, but would require some major redesign) or 
doubling (at least) the size of the iterators, which I doubt will happen.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23244


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug libstdc++/23244] __gnu_cxx::hash_map inconsistent iterator behavior
  2005-08-05  2:57 [Bug libstdc++/23244] New: __gnu_cxx::hash_map inconsistent iterator behavior jzampier at gmail dot com
  2005-08-05  2:58 ` [Bug libstdc++/23244] " jzampier at gmail dot com
  2005-08-05  8:30 ` chris at bubblescope dot net
@ 2005-08-09  8:41 ` pcarlini at suse dot de
  2 siblings, 0 replies; 4+ messages in thread
From: pcarlini at suse dot de @ 2005-08-09  8:41 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pcarlini at suse dot de  2005-08-09 08:41 -------
Yeah, I concur with Chris. And we already agreed (with Matt too) that there is
no point in spending time on the legacy hashed containers.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WONTFIX


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23244


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2005-08-09  8:41 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-08-05  2:57 [Bug libstdc++/23244] New: __gnu_cxx::hash_map inconsistent iterator behavior jzampier at gmail dot com
2005-08-05  2:58 ` [Bug libstdc++/23244] " jzampier at gmail dot com
2005-08-05  8:30 ` chris at bubblescope dot net
2005-08-09  8:41 ` pcarlini at suse dot de

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).