>Number: 5444
>Category: libstdc++
>Synopsis: in multi-processor environment basic_string is not thread safe
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: unassigned
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Mon Jan 21 12:36:00 PST 2002
>Closed-Date:
>Last-Modified:
>Originator: markus.breuer@materna.de
>Release: shipped with gcc 3.0.3 (3.0.951ß9
>Organization:
>Environment: SUN Solaris 2.8 multi-processor machine
>Description:
This bug is not a clone of libstdc++/5037.

I tested gcc 3.0.3 with a small sample application, and its memory allocation became corrupt. After some time the process size grows up to the maximum of available memory. Then the application runs out of memory and tries to write a core dump. Because there is too little free disk space, the core is not completely written.

I also applied the fix for atomicity (not yet released, but in the 3.1 stream), but it does not seem to solve the problem. When the application is started on a single-processor machine, there are no problems.

In my opinion there are two possible problems:

1. The atomicity works fine, but because of the mass of strings the reference counter overflows. I could not see any protection against this in basic_string. But I also could not find any hint (within the debugger) that an overflow occurs: with a conditional breakpoint on the reference counter, it never reached a value of 1000 or more.

2. The atomicity does not work correctly, or there are accesses which are not synchronized. Modifying accesses are done through atomicity.h, but simple read accesses are done directly (see, for example, _M_is_shared() in the header bits/basic_string.h).

I already mailed Nathan Myers, who was involved in the basic_string implementation. I am not sure whether he found any error, but I am really sure that gcc 3.0.3 (with the patch you described) is not stable when running my application. Currently I am using a mix of 2.95.3 and the 3.x atomicity code. This mix runs without any problem.
But because nearly everything in the new implementation is new, I am not able to apply my changes to it.

While fixing gcc 2.95.3 I found out that using only one global synchronizer for all strings may be a bottleneck. So I decided to remove the global string synchronizer and add one to the Rep structure instead. Since it works with chunks of 16 bytes, my additional byte does not really require more effort or memory. As a result, shared Reps use their own synchronizing atomic char. After this change my test application ran much faster; I could see it on the terminal. There was also less CPU usage.

I think the atomic implementation's behavior is like the following:

    /* !!! atomic uses more than one instruction to implement the mutex !!! */
    /* lock */
    do {
        current = atomics_value;
        // a context switch to a thread running on another processor
        // will force the other thread to loop until the current thread
        // unlocks the mutex => active waiting
    } while (current != 0);

>How-To-Repeat:
It seems there is a relationship between processor idle time and the core dumps. To repeat the error, try to start two or three instances of the application. Open a further shell running the top command (interval 1s) and watch the applications' memory usage. After about half a minute, the memory usage of one process will grow.
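
For illustration only, the per-Rep synchronizer idea from the Description (one small lock inside each Rep instead of one global lock, with active waiting) might be sketched as follows in present-day C++. This is not the 2.95.3 patch and not the libstdc++ code: `Rep`, `rep_lock`, `rep_unlock`, `rep_grab`, `rep_release`, `rep_is_shared` and `stress` are hypothetical names, and `std::atomic_flag` is assumed as a stand-in for the "atomic char" mentioned above. Note that the read in `rep_is_shared` also takes the lock, which is the point raised in item 2.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shared string representation header.
struct Rep {
    std::atomic_flag lock = ATOMIC_FLAG_INIT; // per-Rep synchronizer (assumption)
    long refs = 1;                            // reference count, guarded by `lock`
};

// Spin until this thread owns the flag: active waiting, one lock per Rep.
inline void rep_lock(Rep& r) {
    while (r.lock.test_and_set(std::memory_order_acquire)) {
        std::this_thread::yield(); // let the current owner make progress
    }
}

inline void rep_unlock(Rep& r) {
    r.lock.clear(std::memory_order_release);
}

// Take one more reference to the shared representation.
inline void rep_grab(Rep& r) {
    rep_lock(r);
    ++r.refs;
    rep_unlock(r);
}

// Drop one reference and return how many remain.
inline long rep_release(Rep& r) {
    rep_lock(r);
    long left = --r.refs;
    rep_unlock(r);
    return left;
}

// Read access goes through the same lock as the modifying accesses.
inline bool rep_is_shared(Rep& r) {
    rep_lock(r);
    bool shared = r.refs > 1;
    rep_unlock(r);
    return shared;
}

// Hammer one Rep from several threads; each grab is paired with a release,
// so the count returned after all threads have joined should equal the
// count the Rep started with.
inline long stress(Rep& r, int threads, int iterations) {
    std::vector<std::thread> pool;
    pool.reserve(static_cast<std::size_t>(threads));
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&r, iterations] {
            for (int i = 0; i < iterations; ++i) {
                rep_grab(r);
                rep_release(r);
            }
        });
    }
    for (std::thread& th : pool) {
        th.join();
    }
    return r.refs;
}
```

Calling `stress(r, 4, 20000)` on a fresh `Rep` and comparing the result with the starting count of 1 gives a quick check on a multi-processor machine. If the lock calls are removed, the count may drift, which is the kind of corruption suspected in this report.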