Subject: Re: A per-user or per-application ld.so.cache?
From: "Carlos O'Donell"
To: Florian Weimer
Cc: libc-alpha@sourceware.org, Ben Woodard
Date: Mon, 08 Feb 2016 20:19:00 -0000
Message-ID: <56B8F860.6060707@redhat.com>
In-Reply-To: <56B8E810.1040609@redhat.com>
References: <56B8E105.8030906@redhat.com> <56B8E810.1040609@redhat.com>

On 02/08/2016 02:10 PM, Florian Weimer wrote:
> On 02/08/2016 07:40 PM, Carlos O'Donell wrote:
>> Under what conditions might it make sense to implement
>> a per-user ld.so.cache?
>>
>> At Red Hat we have some customers, particularly in HPC,
>> who deploy quite large applications across systems that
>> they don't themselves maintain. In this case the given
>> application could have thousands of DSOs. When you load
>> up such an application the normal search paths apply,
>> and that's not very optimal.
>
> Are these processes short-lived?

No. See [1].

> Is symbol lookup performance an issue as well?

Yes. So are the various O(n^2) algorithms we need to fix inside the
loader, particularly the DSO sorts we use.

> What's the total size of all relevant DSOs, combined?
> What does the directory structure look like?

I don't know. We should ask Ben Woodard to get us that data. Ben?

> Which ELF dynamic linking features are used?

I don't know.

> Is the bulk of those DSOs pulled in with dlopen, after the initial
> dynamic link? If yes, does this happen directly (many DSOs dlopen'ed
> individually) or indirectly (few of them pull in a huge cascade of
> dependencies)?

I do not believe the bulk of the DSOs are pulled in with dlopen. Though
for Python code it may be the reverse, with each Python module being a
DSO that is loaded by the interpreter. Which means we probably have two
cases:

* Long chains of DSOs (non-Python applications).
* Short single-DSO chains, but lots of them (Python modules).

> If the processes are not short-lived and most of the DSOs are loaded
> after user code has started executing, I doubt an on-disk cache is the
> right solution.

Why would a long-lived process that uses dlopen fail to benefit from an
on-disk cache? The on-disk cache, as it is today, is already used for a
similar situation, so why not extend it? The biggest difference is that
we trust the cache we have today and mmap it into memory. We would have
to harden the code that processes that cache, but that should not be too
hard. Would you mind expanding on your concern that the solution would
not work?

Cheers,
Carlos.

[1] http://computation.llnl.gov/projects/spindle/spindle-paper.pdf