From: Florian Weimer
To: Jakub Jelinek
Cc: libc-alpha@sourceware.org, gcc-patches@gcc.gnu.org, Jason Merrill
Subject: Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function
Date: Thu, 18 Nov 2021 15:09:19 +0100
Message-ID: <87bl2hh7xs.fsf@oldenburg.str.redhat.com>
In-Reply-To: <20211118134303.GB2646553@tucnak> (Jakub Jelinek's message of "Thu, 18 Nov 2021 14:43:03 +0100")
References: <20211118134303.GB2646553@tucnak>
List-Id: Libc-alpha mailing list

* Jakub Jelinek:

> On Wed, Nov 03, 2021 at 05:28:02PM +0100, Florian Weimer wrote:
>> This function is similar to __gnu_Unwind_Find_exidx as used on arm.
>> It can be used to speed up the libgcc unwinder.
>
> I'm little bit worried that this trades the speed of exceptions for
> speed of dlopen/dlclose and extra memory use in each process.
> I admit I haven't been paying close attention to how many shared libraries
> apps typically link against and how many dlopen/dlclose calls they do
> in the last decade and half, but I'd think more applications don't use
> exceptions compared to apps that do use them, and of many of those that do
> use them don't use them for really exceptional cases, so speeding those
> is a good thing.

dlopen has many sources of quadratic behavior already, and many involve
chasing pointers.  The new data structure is very compact, so the new
work during dlopen does not show up prominently in profiles.

> So, I'd wonder, could this overhead be added lazily, when _dl_find_eh_frame
> is called for the first time just take the rtld lock, prepare anything you
> populate right now already from the process start up and every
> dlopen/dlclose before the first _dl_find_eh_frame call and only since then
> keep it updated on dlopen/dlclose?

I think it's possible to do this lazily (except the memory allocation).
But I don't want to do this unless we have performance numbers that
suggest it is actually required.

> Thus, for the expected majority of apps that aren't using exceptions at all
> nothing would change for dlopen/dlclose overhead, while all but the first
> _dl_find_eh_frame would be faster and with no locking?

One thing I'd like to do is to use the data structure in
_dl_find_dso_for_object, and that is actually called during dlopen to
determine the caller DSO.  _dl_find_dso_for_object can show up in
profiles with a lot of dlopen calls, particularly if an object loaded
later calls dlopen, so that the current implementation takes more time
to find the object.  _dl_find_dso_for_object is also used in dlsym,
although we skip it if the caller passes an explicit handle (but
RTLD_DEFAULT, RTLD_NEXT, etc. definitely need it).

We can also replace the soname and file identity lookup with a hash
table.  *That* will definitely recover any losses from
_dl_find_eh_frame_update.  In my profiles strcmp always shows up higher
than _dl_find_eh_frame_update.

Thanks,
Florian
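
To illustrate why an address-to-object lookup gets more expensive for
objects loaded later, and how a compact sorted structure could avoid
that, here is a minimal sketch.  It is not the code from the patch: the
record layout and the names struct range, find_linear and find_sorted
are hypothetical, and the sketch assumes that each object covers a
single address range and that ranges do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one loaded object (not glibc's layout).  */
struct range
{
  uintptr_t start;   /* First address covered by the object.  */
  uintptr_t end;     /* One past the last covered address.  */
  const char *name;  /* Illustrative label only.  */
};

/* Scan in load order.  The cost grows with the position of the
   object, so callers in objects loaded late pay the most.  */
static const struct range *
find_linear (const struct range *r, size_t n, uintptr_t pc)
{
  for (size_t i = 0; i < n; ++i)
    if (pc >= r[i].start && pc < r[i].end)
      return &r[i];
  return NULL;
}

/* Binary search.  Assumes R is sorted by start address and that the
   ranges do not overlap.  The cost is logarithmic in N regardless of
   load order.  */
static const struct range *
find_sorted (const struct range *r, size_t n, uintptr_t pc)
{
  assert (r != NULL || n == 0);

  size_t lo = 0;
  size_t hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (r[mid].start <= pc)
        lo = mid + 1;
      else
        hi = mid;
    }

  /* LO now counts the ranges whose start is <= PC, so the only
     candidate is the last of them.  */
  if (lo == 0)
    return NULL;
  const struct range *candidate = &r[lo - 1];
  return pc < candidate->end ? candidate : NULL;
}
```

Both functions return the same result on a sorted table; the difference
is only in how many entries they touch.  Keeping such a table sorted
costs extra work in dlopen/dlclose, which is the trade-off discussed
above.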