Subject: Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()
From: Mark Wielaard
To: Jonathon Anderson
Cc: elfutils-devel@sourceware.org, Srdan Milakovic
Date: Sun, 25 Aug 2019 10:05:00 -0000

Hi Jonathon,

On Sat, 2019-08-24 at 20:10 -0500, Jonathon Anderson wrote:
> On Sat, Aug 24, 2019 at 6:24 PM, Mark Wielaard wrote:
> > But what if realloc moves the block?
> > Then all dbg->mem_tails[thread_id] pointers become invalid.
> > And this function drops the lock before returning a libdw_memblock*.
>
> Not quite, dbg->mem_tails is an array of pointers (note the ** in its
> definition, and the use of the plural "tails"). So while the
> dbg->mem_tails pointer becomes invalid, the dbg->mem_tails[thread_id]
> pointers don't.
>
> It would be a different story if dbg->mem_tails was an array of
> libdw_memblocks, but it's not.

Aha. Yes, they are pointers to the mem_blocks, not the mem_blocks
themselves. The pointer values get copied to the new array, but the
values themselves never change. So this is indeed fine. I was confused.

> That would increase the "memory leak"
> issue mentioned previously (to ~4K per dead thread) and require more
> work on the part of the reallocing thread to initialize the new entries
> (which at the moment should reduce to a memset, assuming the compiler
> is smart enough and NULL == 0).
>
> > So I think the lock needs to extend beyond this function somehow. Or
> > we need to prevent another thread reallocing while we are dealing with
> > the assigned memblock.
>
> No need, in fact we want to drop the lock as soon as possible to let
> new threads in for reallocs. Under the assumption that we don't need
> to allocate new blocks (extremely) often, the extra cost to relock when
> we do should be relatively small. Also see __libdw_allocate.

Yes, now that I have a better (correct) mental picture of the data
structure, this makes total sense.
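Just to make sure my mental picture really matches yours, this is
roughly how I now understand the scheme. It is only a sketch, with a
plain mutex and made-up sketch_* names; only mem_tails, libdw_memblock
and the __libdw_* functions come from your patch discussion. But it
shows why handing out the block pointer and then dropping the lock is
safe even if another thread grows the array right afterwards:

#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only; the real struct and function names in the patch differ.  */
typedef struct sketch_memblock
{
  size_t size;
  size_t remaining;
  char mem[];                    /* the actual allocation area */
} sketch_memblock;

struct sketch_dwarf
{
  pthread_mutex_t mem_lock;      /* protects the mem_tails array itself */
  size_t mem_nthreads;           /* current length of mem_tails */
  sketch_memblock **mem_tails;   /* one tail block pointer per thread */
};

/* Hand out the tail block for this thread.  The lock only covers the
   array; the memblocks it points to never move, so the returned
   pointer stays valid after we unlock, even if another thread
   reallocs mem_tails right away.  (Error handling omitted.)  */
static sketch_memblock *
sketch_alloc_tail (struct sketch_dwarf *dbg, size_t thread_id)
{
  pthread_mutex_lock (&dbg->mem_lock);
  if (thread_id >= dbg->mem_nthreads)
    {
      size_t n = thread_id + 1;
      dbg->mem_tails = realloc (dbg->mem_tails,
                                n * sizeof dbg->mem_tails[0]);
      /* New entries start out NULL; this is the memset you mentioned,
         relying on NULL == all-zero bits.  */
      memset (dbg->mem_tails + dbg->mem_nthreads, 0,
              (n - dbg->mem_nthreads) * sizeof dbg->mem_tails[0]);
      dbg->mem_nthreads = n;
    }
  sketch_memblock *tail = dbg->mem_tails[thread_id];
  pthread_mutex_unlock (&dbg->mem_lock);
  return tail;                   /* may be NULL; the caller then allocates
                                    a fresh block (cf. __libdw_allocate)
                                    and stores it back under the lock.  */
}

If that is (roughly) what __libdw_alloc_tail does, then I agree the
early unlock is fine.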
> As mentioned in other mails, this management scheme isn't (always)
> optimally memory efficient, but its speed is good under parallel
> stress. Far better than wrapping the whole thing with a single
> pthread_mutex_t.

I wouldn't tweak it too much if you know it is working correctly now.
We do have to make sure it doesn't slow down the single-threaded use
case (since most programs are still single-threaded). There is the new
locking, which slows things down. But the memory use should be about
the same, since you don't duplicate the mem_blocks until there is
access from multiple threads.

If there isn't much parallel stress or allocation in practice, that is,
if even in multi-threaded programs there is still just one thread that
does most/all of the allocations, then you could maybe optimize a bit
by having one "owner" thread per Dwarf. That would be the first thread
that hits __libdw_alloc_tail. For that "owner" thread you then set up
an owner thread_id field and have one special mem_block allocated. You
only expand the mem_blocks once another thread does an allocation. Then
the memory use in multi-threaded programs where only one thread handles
a Dwarf object (or happens to allocate) would be the same as for
single-threaded programs.

But this depends on the usage pattern, which might vary a lot between
programs. So don't try it until you know it actually helps for a real
world program. And the extra memory use might not really matter that
much in practice anyway.

Cheers,

Mark
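P.S. To make the "owner" thread idea above a bit more concrete, here is
how I imagine it could look. All the names here are invented and the
code is untested, so take it purely as an illustration of the idea, not
as a proposal against your patch:

#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct sketch_memblock sketch_memblock;  /* as in the earlier sketch */

/* Hypothetical fields for the "owner thread" variant.  */
struct sketch_dwarf_owner
{
  pthread_mutex_t mem_lock;
  bool have_owner;
  size_t owner_tid;              /* thread_id of the first allocator */
  sketch_memblock *owner_tail;   /* its single, pre-set-up block */
  size_t mem_nthreads;           /* 0 until a second thread allocates */
  sketch_memblock **mem_tails;   /* NULL until a second thread allocates */
};

/* Return the current tail block for this thread.  The common case,
   only one thread ever allocating from this Dwarf, never touches
   mem_tails at all, so the memory use matches the single-threaded
   case.  (Error handling omitted.)  */
static sketch_memblock *
sketch_owner_tail (struct sketch_dwarf_owner *dbg, size_t thread_id)
{
  pthread_mutex_lock (&dbg->mem_lock);

  if (!dbg->have_owner)
    {
      /* The first thread to hit this becomes the owner.  */
      dbg->have_owner = true;
      dbg->owner_tid = thread_id;
    }

  sketch_memblock *tail;
  if (thread_id == dbg->owner_tid)
    tail = dbg->owner_tail;                      /* fast path */
  else
    {
      /* A second thread allocates: only now pay for the array.  */
      if (thread_id >= dbg->mem_nthreads)
        {
          size_t n = thread_id + 1;
          dbg->mem_tails = realloc (dbg->mem_tails,
                                    n * sizeof dbg->mem_tails[0]);
          memset (dbg->mem_tails + dbg->mem_nthreads, 0,
                  (n - dbg->mem_nthreads) * sizeof dbg->mem_tails[0]);
          dbg->mem_nthreads = n;
        }
      tail = dbg->mem_tails[thread_id];
    }

  pthread_mutex_unlock (&dbg->mem_lock);
  return tail;
}

Whether that is worth the extra branch and bookkeeping only a
real-world workload can tell, which is why I wouldn't do it until we
see a program where the mem_tails memory actually shows up.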