From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 52101 invoked by alias); 1 Sep 2015 14:56:45 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 52033 invoked by uid 89); 1 Sep 2015 14:56:45 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_00,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_LOW,SPF_PASS,T_RP_MATCHES_RCVD,URIBL_BLACK autolearn=no version=3.3.2 X-HELO: mail-wi0-f177.google.com Received: from mail-wi0-f177.google.com (HELO mail-wi0-f177.google.com) (209.85.212.177) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Tue, 01 Sep 2015 14:56:43 +0000 Received: by wicmc4 with SMTP id mc4so36180909wic.0 for ; Tue, 01 Sep 2015 07:56:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=sXpl04FnjkMpVQXlhwkzap2L2soN/l5n3aOjqbUP+ns=; b=aqFQWmpqr4mQ4WtRy3pNAsbo9zsz0DarZ1QDvOVVvQ8zZd6ARCSA4t4REZW/CDKQjB FvoW5mqTyBOUoAptx8G00Xhp1aDvd7fUYvcJ2CxQFkuxVwEoGIcFLPwU6HutE/fHfZAl fBgv9cwt6+LQyRwFo3wmnWPKITXliteNkoQZE8y2YD83ZTobVye3ulBaVx2t4OD5UTE1 RotrEpfPfW6MfBt5YQDDIHFuqvXbl3RSuZepl7DoZt0j0fdSdLaTiQ1iPOnG0frZtLK7 6AOlQDQ9C+y7KK+cYw3v0ZXx1y7a3kOGyei6g8heBQXw1B77V9wMl9xdLFV8jQM36n9M orBQ== X-Gm-Message-State: ALoCoQlx3JTGlIRAsxVb+8AcQVZ9NkqS/BpQJr8U8phSO9He867AFj/UUHduwBlRk8HNzmTRHjrM X-Received: by 10.194.205.37 with SMTP id ld5mr37206795wjc.14.1441119399821; Tue, 01 Sep 2015 07:56:39 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.64.193 with HTTP; Tue, 1 Sep 2015 07:56:20 -0700 (PDT) In-Reply-To: <20150901142713.GG2631@redhat.com> References: <20150901142713.GG2631@redhat.com> From: Dmitry Vyukov Date: Tue, 01 Sep 2015 14:56:00 -0000 Message-ID: Subject: Re: [Patch, libstdc++] Fix data races in basic_string implementation To: Jonathan Wakely Cc: GCC Patches , libstdc++@gcc.gnu.org, Alexander Potapenko , Kostya Serebryany , Torvald Riegel Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2015-09/txt/msg00065.txt.bz2 On Tue, Sep 1, 2015 at 4:27 PM, Jonathan Wakely wrote: > On 01/09/15 14:51 +0200, Dmitry Vyukov wrote: >> >> Hello, >> >> The refcounted basic_string implementation contains several data races >> on _M_refcount: >> 1. _M_is_leaked loads _M_refcount concurrently with mutations of >> _M_refcount. This loads needs to be memory_order_relaxed load, as >> _M_is_leaked predicate does not change under the feet. >> 2. _M_is_shared loads _M_refcount concurrently with mutations of >> _M_refcount. This loads needs to be memory_order_acquire, as another >> thread can drop _M_refcount to zero concurrently which makes the >> string non-shared and so the current thread can mutate the string. We >> need reads of the string in another thread (while it was shared) to >> happen-before the writes to the string in this thread (now that it is >> non-shared). >> >> This patch adds __gnu_cxx::__atomic_load_dispatch function to do the >> loads of _M_refcount. The function does an acquire load. Acquire is >> non needed for _M_is_leaked, but for simplicity as still do acquire >> (hopefully the refcounted impl will go away in future). > > > It's unlikely to go away for a long time. ack >> This patch also uses the new function to do loads of _M_refcount in >> string implementation. >> >> I did non update doc/xml/manual/concurrency_extensions.xml to document >> __gnu_cxx::__atomic_load_dispatch, because I am not sure we want to >> extend that api now that we have language-level atomics. If you still >> want me to update it, please say how to regenerate >> doc/html/manual/ext_concurrency.html. > > > I don't know if we need the new __atomic_load_dispatch function at all, > we could just use atomic loads directly e.g. > > bool > _M_is_leaked() const _GLIBCXX_NOEXCEPT > #if defined(__GTHREADS) && defined(_GLIBCXX_ATOMIC_BUILTINS) > + { return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; > } > +#else > { return this->_M_refcount > 0; } > +#endif > > bool > _M_is_shared() const _GLIBCXX_NOEXCEPT > #if defined(__GTHREADS) && defined(_GLIBCXX_ATOMIC_BUILTINS) > + { return __atomic_load(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0; } > +#else > { return this->_M_refcount > 0; } > +#endif Looks better to me. I wasn't sure why that dispatch indirection is there at all, so I did what the current does. > The __atomic_xxx_dispatch functions check __gthread_active_p() as an > optimisation to avoid potentially expensive atomic stores, but I don't > think we need that for atomic loads here, especially the relaxed load. > We could always add the dispatch in later if we get complaints about > single-threaded performance on targets where the loads are expensive. > > This doesn't fix the problem for targets that don't define > _GLIBCXX_ATOMIC_BUILTINS but I don't know how many of them there are. > We could make it work on more targets by adding a new configure check > just for __atomic_load(int*, ...), because _GLIBCXX_ATOMIC_BUILTINS > requires several builtins to support various object sizes, but here we > don't need all of that. I don't understand how a new gcc may not support __atomic builtins on ints. How it is even possible? That's a portable API provided by recent gcc's... >> The race was detected with ThreadSanitizer on the following program: >> >> #define _GLIBCXX_USE_CXX11_ABI 0 >> #include >> #include >> #include >> #include >> >> int main() { >> std::string s = "foo"; >> std::thread t([=](){ >> std::string x = s; >> std::cout << &x << std::endl; >> }); >> std::this_thread::sleep_for(std::chrono::seconds(1)); >> std::cout << &s[0] << std::endl; >> t.join(); >> } >> >> $ g++ string.cc -std=c++11 -pthread -O1 -g -fsanitize=thread >> $ ./a.out >> >> WARNING: ThreadSanitizer: data race (pid=98135) >> Read of size 4 at 0x7d080000efd0 by main thread: >> #0 std::string::_Rep::_M_is_leaked() const >> include/c++/6.0.0/bits/basic_string.h:2605 (a.out+0x000000401d7c) >> #1 std::string::_M_leak() >> include/c++/6.0.0/bits/basic_string.h:2730 (a.out+0x000000401d7c) >> #2 std::string::operator[](unsigned long) >> include/c++/6.0.0/bits/basic_string.h:3274 (a.out+0x000000401d7c) >> #3 main /tmp/string.cc:14 (a.out+0x000000401d7c) >> >> Previous atomic write of size 4 at 0x7d080000efd0 by thread T1: >> #0 __tsan_atomic32_fetch_add >> ../../../../libsanitizer/tsan/tsan_interface_atomic.cc:611 >> (libtsan.so.0+0x00000005811f) >> #1 __exchange_and_add include/c++/6.0.0/ext/atomicity.h:59 >> (a.out+0x000000401a19) >> #2 __exchange_and_add_dispatch >> include/c++/6.0.0/ext/atomicity.h:92 (a.out+0x000000401a19) >> #3 std::string::_Rep::_M_dispose(std::allocator const&) >> include/c++/6.0.0/bits/basic_string.h:2659 (a.out+0x000000401a19) >> #4 std::basic_string, >> std::allocator >::~basic_string() >> include/c++/6.0.0/bits/basic_string.h:2961 (a.out+0x000000401a19) >> #5 ~ /tmp/string.cc:9 (a.out+0x000000401a19) >> #6 ~_Head_base include/c++/6.0.0/tuple:102 (a.out+0x000000401a19) >> #7 ~_Tuple_impl include/c++/6.0.0/tuple:338 (a.out+0x000000401a19) >> #8 ~tuple include/c++/6.0.0/tuple:521 (a.out+0x000000401a19) >> #9 ~_Bind_simple include/c++/6.0.0/functional:1503 >> (a.out+0x000000401a19) >> #10 ~_Impl include/c++/6.0.0/thread:107 (a.out+0x000000401a19) >> #11 destroy()> >>> >>> > include/c++/6.0.0/ext/new_allocator.h:124 (a.out+0x0000004015c7) >> >> #12 >> _S_destroy()> >>> >>> >, std::thread::_Impl()> > > >> >> include/c++/6.0.0/bits/alloc_traits.h:236 (a.out+0x0000004015c7) >> #13 destroy()> >>> >>> > include/c++/6.0.0/bits/alloc_traits.h:336 (a.out+0x0000004015c7) >> >> #14 _M_dispose include/c++/6.0.0/bits/shared_ptr_base.h:529 >> (a.out+0x0000004015c7) >> #15 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() >> >> /usr/local/google/home/dvyukov/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/shared_ptr_base.h:150 >> (libstdc++.so.6+0x0000000b6da4) >> #16 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() >> >> /usr/local/google/home/dvyukov/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/shared_ptr_base.h:662 >> (libstdc++.so.6+0x0000000b6da4) >> #17 std::__shared_ptr> (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() >> >> /usr/local/google/home/dvyukov/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/shared_ptr_base.h:928 >> (libstdc++.so.6+0x0000000b6da4) >> #18 std::shared_ptr::~shared_ptr() >> >> /usr/local/google/home/dvyukov/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/shared_ptr.h:93 >> (libstdc++.so.6+0x0000000b6da4) >> #19 execute_native_thread_routine >> libstdc++-v3/src/c++11/thread.cc:79 (libstdc++.so.6+0x0000000b6da4) >> >> Location is heap block of size 28 at 0x7d080000efc0 allocated by main >> thread: >> #0 operator new(unsigned long) >> ../../../../libsanitizer/tsan/tsan_interceptors.cc:571 >> (libtsan.so.0+0x0000000255a3) >> #1 __gnu_cxx::new_allocator::allocate(unsigned long, void >> const*) >> /usr/local/google/home/dvyukov/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:104 >> (libstdc++.so.6+0x0000000cbaa8) >> #2 std::string::_Rep::_S_create(unsigned long, unsigned long, >> std::allocator const&) >> >> /usr/local/google/home/dvyukov/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:1051 >> (libstdc++.so.6+0x0000000cbaa8) >> #3 __libc_start_main (libc.so.6+0x000000021ec4) >> >> Thread T1 (tid=98137, finished) created by main thread at: >> #0 pthread_create >> ../../../../libsanitizer/tsan/tsan_interceptors.cc:895 >> (libtsan.so.0+0x000000026d04) >> #1 __gthread_create >> >> /usr/local/google/home/dvyukov/src/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:662 >> (libstdc++.so.6+0x0000000b6e52) >> #2 >> std::thread::_M_start_thread(std::shared_ptr, >> void (*)()) libstdc++-v3/src/c++11/thread.cc:149 >> (libstdc++.so.6+0x0000000b6e52) >> #3 __libc_start_main (libc.so.6+0x000000021ec4) >> >> OK for trunk? > > >> Index: include/bits/basic_string.h >> =================================================================== >> --- include/bits/basic_string.h (revision 227363) >> +++ include/bits/basic_string.h (working copy) >> @@ -2601,11 +2601,11 @@ >> >> bool >> _M_is_leaked() const _GLIBCXX_NOEXCEPT >> - { return this->_M_refcount < 0; } >> + { return __gnu_cxx::__atomic_load_dispatch(&this->_M_refcount) < >> 0; } >> >> bool >> _M_is_shared() const _GLIBCXX_NOEXCEPT >> - { return this->_M_refcount > 0; } >> + { return __gnu_cxx::__atomic_load_dispatch(&this->_M_refcount) > >> 0; } >> >> void >> _M_set_leaked() _GLIBCXX_NOEXCEPT >> Index: include/ext/atomicity.h >> =================================================================== >> --- include/ext/atomicity.h (revision 227363) >> +++ include/ext/atomicity.h (working copy) >> @@ -35,6 +35,16 @@ >> #include >> #include >> >> +// Even if the CPU doesn't need a memory barrier, we need to ensure >> +// that the compiler doesn't reorder memory accesses across the >> +// barriers. >> +#ifndef _GLIBCXX_READ_MEM_BARRIER >> +#define _GLIBCXX_READ_MEM_BARRIER __atomic_thread_fence >> (__ATOMIC_ACQUIRE) >> +#endif >> +#ifndef _GLIBCXX_WRITE_MEM_BARRIER >> +#define _GLIBCXX_WRITE_MEM_BARRIER __atomic_thread_fence >> (__ATOMIC_RELEASE) >> +#endif >> + >> namespace __gnu_cxx _GLIBCXX_VISIBILITY(default) >> { >> _GLIBCXX_BEGIN_NAMESPACE_VERSION >> @@ -50,7 +60,7 @@ >> >> static inline void >> __atomic_add(volatile _Atomic_word* __mem, int __val) >> - { __atomic_fetch_add(__mem, __val, __ATOMIC_ACQ_REL); } >> + { __atomic_fetch_add(__mem, __val, __ATOMIC_RELEASE); } >> #else >> _Atomic_word >> __attribute__ ((__unused__)) >> @@ -101,17 +111,27 @@ >> #endif >> } >> >> + static inline _Atomic_word >> + __attribute__ ((__unused__)) >> + __atomic_load_dispatch(const _Atomic_word* __mem) >> + { >> +#ifdef __GTHREADS >> + if (__gthread_active_p()) >> + { >> +#ifdef _GLIBCXX_ATOMIC_BUILTINS >> + return __atomic_load_n(__mem, __ATOMIC_ACQUIRE); >> +#else >> + // The best we can get with an old compiler. > > > We don't support old compilers in libstdc++ trunk, so this comment is > misleading. The fallback is needed for targets without support for all > the builtins, not for old compilers. > >> + _Atomic_word v = *(volatile _Atomic_word*)__mem; >> + _GLIBCXX_READ_MEM_BARRIER; > > > If a target doesn't define _GLIBCXX_ATOMIC_BUILTINS do we know it will > define __atomic_thread_fence ? > > I guess those targets should be redefining _GLIBCXX_READ_MEM_BARRIER > anyway. Yeah, that was my understanding. _GLIBCXX_READ_MEM_BARRIER is already intended to be a portable thing whatever way it is implemented. >> + return v; >> +#endif >> + } >> +#endif >> + return *__mem; >> + } >> + >> _GLIBCXX_END_NAMESPACE_VERSION >> } // namespace >> >> -// Even if the CPU doesn't need a memory barrier, we need to ensure >> -// that the compiler doesn't reorder memory accesses across the >> -// barriers. >> -#ifndef _GLIBCXX_READ_MEM_BARRIER >> -#define _GLIBCXX_READ_MEM_BARRIER __atomic_thread_fence >> (__ATOMIC_ACQUIRE) >> -#endif >> -#ifndef _GLIBCXX_WRITE_MEM_BARRIER >> -#define _GLIBCXX_WRITE_MEM_BARRIER __atomic_thread_fence >> (__ATOMIC_RELEASE) >> -#endif >> - >> #endif > >