Date: Sat, 26 Oct 2019 16:45:00 -0000
From: Jonathon Anderson
Subject: Re: [PATCH] libdw: add thread-safety to dwarf_getabbrev()
To: Florian Weimer
Cc: Mark Wielaard, elfutils-devel@sourceware.org, Srdan Milakovic
Message-Id: <1572108332.6121.0@rice.edu>
In-Reply-To: <87imobfpqg.fsf@mid.deneb.enyo.de>
Hello Florian Weimer,

I'm the original author of this patch, so I'll try to answer what I can.

For some overall perspective: this patch replaces the original libdw allocator with a thread-safe variant. The original acts both as a suballocator (to avoid paying the malloc tax on frequent small allocations) and as a garbage-collection list (to free internal structures on dwarf_end). The patch attempts to replicate the same overall behavior in the more volatile parallel case.

On Sat, Oct 26, 2019 at 18:14, Florian Weimer wrote:
> * Mark Wielaard:
>
>> I'll see if I can create a case where that is a problem. Then we can
>> see how to adjust things to use less pthread_keys. Is there a
>> different pattern we can use?
>
> It's unclear what purpose thread-local storage serves in this context.

The thread-local storage provides the suballocator side: for each Dwarf, each thread has its own "top block" to perform allocations from. To keep this simple, each Dwarf has its own key, giving threads local storage specific to that Dwarf. Or at least that was the intent; I didn't think to consider the key limit, since we never ran into it in our use cases. There may be other ways to handle this, and I'm considering potential alternatives at a high level (with more atomics, of course). The difficulty is mostly in providing the same performance in the single-threaded case.

> You already have a Dwarf *. I would consider adding some sort of
> clone function which creates a shallow Dwarf * with its own embedded
> allocator or something like that.
The downside is that it's an API addition, which we (the Dyninst + HPCToolkit projects) would need to adopt. That isn't a huge deal for us, but I will need to make a case to those teams to make the shift. On the upside, it does provide a very understandable semantic in the case of parallelism. For an API without synchronization clauses, this would put our work back into the realm of "reasonably correct" (from "technically incorrect, but works").

> This assumes that memory allocation
> is actually a performance problem, otherwise you could let malloc
> handle the details.

In our (Dyninst + HPCToolkit) work, we have found that malloc tends to be slow in the multithreaded case, in particular with many small allocations. The glibc implementation (which most of our clients use) uses a full mutex to provide thread-safety. While we could do a lot better in our own projects with regard to memory management, the fact remains that malloc alone is a notable factor in the performance of libdw.

Hopefully this helps shed a little light on the issue.

-Jonathon