From: Jonathon Anderson
Date: Thu, 5 May 2022 14:56:10 -0500
Subject: Re: LD_AUDIT: Not enough space in static TLS block
To: Florian Weimer
Cc: Carlos O'Donell, Ben Woodard, Adhemerval Zanella, "Legendre, Matthew P.", libc-alpha@sourceware.org, John Mellor-Crummey
References: <87fsmiiw3x.fsf@oldenburg.str.redhat.com> <875ymnxepr.fsf@oldenburg.str.redhat.com> <87k0azlwea.fsf@oldenburg.str.redhat.com>
In-Reply-To: <87k0azlwea.fsf@oldenburg.str.redhat.com>
List-Id: Libc-alpha mailing list

On 5/5/22 12:30, Florian Weimer wrote:
> * Florian Weimer:
>
>> * Florian Weimer:
>>
>>> * Jonathon Anderson:
>>>
>>>> Hello all,
>>>>
>>>> We (the HPCToolkit team) have encountered another critical LD_AUDIT
>>>> bug. When LD_AUDIT is specified, the allocation of the static TLS
>>>> block does not account for the TLS requirements of executable
>>>> dependencies or of the auditors themselves. If:
>>>>  - an executable accesses a thread-local variable in a linked library
>>>> with sufficiently large TLS requirements, or
>>>>  - an auditor itself uses sufficiently large TLS and optimizes access
>>>> with `-ftls-model=initial-exec`,
>>>>
>>>> then the process or auditor will fail with the error "cannot allocate
>>>> memory in static TLS block."
>>> We have a tunable that can be used as a workaround.
>>> Your reproducer passes for me with our 2.28 backport
>>> (glibc-2.28-164.el8) if I run it like this:
>>>
>>>   GLIBC_TUNABLES=glibc.rtld.optional_static_tls=4000 make
>>>
>>> The best we can do in the short term would be an increase of the default
>>> limit. On 64-bit platforms, defaulting to a dozen or so kilobytes per
>>> thread should not be a problem as far as virtual address space
>>> consumption is concerned. We can also add an additional reservation of
>>> similar size for every auditor that is loaded, to compensate for the
>>> lack of auto-tuning of the TLS allocation size in auditing mode.
>>>
>>> The fundamental issue is that there is always going to be a hard limit
>>> for initial-exec TLS. Initial-exec TLS requires a fixed offset from the
>>> thread pointer, and we cannot relocate TLS variables because they are
>>> ordinary C objects with an observable address. There are some other
>>> things we can try to improve auto-tuning, but in the end, there is
>>> always going to be a fixed-size reserved area dedicated to initial-exec
>>> TLS set up at process startup, and with dlopen, that might not be enough
>>> even without any auditor use.
>> Jonathon,
>>
>> does setting the environment variable work for you?
> Do you have any additional feedback here?

Sorry for the delayed response (it's ECP conference week). *This tunable
works for us as a stopgap until a long-term solution can be implemented.*

I had a separate (email) chat with Ben Woodard bouncing ideas for a
long-term solution. A major difficulty is that LD_AUDIT currently
introduces a cyclic dependency:
 - auditors must be loaded before searching for the application's
   dependencies (since la_objsearch may modify the results), and
 - dependency searches must complete before the static TLS auto-tuning
   (since the TLS sizes of the initial link-map must be known), but
 - the static TLS block must be allocated before auditors are loaded
   (since auditors may also use initial-exec TLS).
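
To make the cycle explicit, here is a minimal sketch in C that encodes the three constraints above as "must happen before" edges and checks them with a depth-first search. This is only a model of the ordering problem, not glibc code: the step names, the `has_cycle` helper, and the `audit_constraints` table are all hypothetical, and the edge from sizing the static TLS block to allocating it is an assumption added here (the size has to be known before the block is allocated).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Startup steps named in the message; the identifiers are hypothetical. */
enum step {
    LOAD_AUDITORS,
    SEARCH_DEPS,
    SIZE_STATIC_TLS,   /* static TLS auto-tuning */
    ALLOC_STATIC_TLS,
    NUM_STEPS
};

/* One ordering constraint: `first` must complete before `then` starts. */
struct constraint { enum step first, then; };

enum mark { UNVISITED, IN_PROGRESS, DONE };

/* Depth-first search; returns true if a cycle is reachable from `s`. */
static bool visit(enum step s, const struct constraint *c, size_t n,
                  enum mark *marks)
{
    if (marks[s] == IN_PROGRESS)
        return true;            /* reached a step still on the DFS stack */
    if (marks[s] == DONE)
        return false;
    marks[s] = IN_PROGRESS;
    for (size_t i = 0; i < n; i++)
        if (c[i].first == s && visit(c[i].then, c, n, marks))
            return true;
    marks[s] = DONE;
    return false;
}

/* Returns true if no ordering of the steps satisfies all n constraints. */
bool has_cycle(const struct constraint *c, size_t n)
{
    enum mark marks[NUM_STEPS] = { UNVISITED };
    for (size_t i = 0; i < n; i++) {
        assert((int)c[i].first >= 0 && (int)c[i].first < NUM_STEPS);
        assert((int)c[i].then >= 0 && (int)c[i].then < NUM_STEPS);
    }
    for (int s = 0; s < NUM_STEPS; s++)
        if (visit((enum step)s, c, n, marks))
            return true;
    return false;
}

/* The constraints from the list above, in the same order. */
const struct constraint audit_constraints[] = {
    { LOAD_AUDITORS,    SEARCH_DEPS },      /* la_objsearch may alter results */
    { SEARCH_DEPS,      SIZE_STATIC_TLS },  /* TLS sizes must be known */
    { SIZE_STATIC_TLS,  ALLOC_STATIC_TLS }, /* assumption: size before alloc */
    { ALLOC_STATIC_TLS, LOAD_AUDITORS },    /* auditors may use IE TLS */
};
```

With all four entries, `has_cycle(audit_constraints, 4)` returns true. Passing only the first three (that is, dropping the requirement that the static TLS block exist before auditors are loaded) returns false, so in this model that last constraint is the one that closes the loop.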

So, I'm not hopeful for a long-term solution that does not involve
another LAV_CURRENT bump. We (Ben and I) came up with a couple of initial
solutions: disallowing initial-exec TLS in auditors, or per-auditor
static TLS blocks (i.e. TLS namespaces). Comments and ideas are welcome.
(I would love to have a detailed LD_AUDIT discussion at STW in June.)

> In the meantime, we have updated Fedora rawhide with the bug fix to
> enable early usage from auditors, and the new RTLD_DI_PHDR
> dlinfo is included as well. If you could test glibc-2.35.9000-16 or
> later, that would be great.

Thanks! Our reproducer for the early dl* bug passes with the latest
Fedora Rawhide. I'll look into using RTLD_DI_PHDR in HPCToolkit in the
coming weeks.

-Jonathon