From: Milian Wolff
To: elfutils-devel@sourceware.org
Subject: Optimizing elfutils usage for unwinding
Date: Thu, 01 Feb 2024 12:34:30 +0100
Message-ID: <2741613.GmtGnDTZHg@milian-workstation>

Hey all,

I'm working on perfparser/hotspot, which ingests perf.data files and performs unwinding, symbolication, etc. We got a bug report from a user [1] that exposes a worst-case performance situation in our usage of elfutils. I do not know how to handle it, hence this mail.

The problem is that the perf.data file for the workload there contains samples for tens of thousands of short-lived processes, but overall only about a hundred _different_ binaries are being executed.

When we analyze the data, we currently have one dwfl* per process. We already employ extensive caching on the symbolication end, which allows us to look at inline frames and do demangling only once per executable or library, instead of once per process. But this kind of caching across dwfl* is not possible for what dwfl_thread_getframes does internally.

Profiling our analysis, I see that most of the time is spent in this stack:

```
dwfl_thread_getframes
__libdwfl_frame_unwind
handle_cfi
dwarf_cfi_addrframe
__libdw_find_fde
intern_fde
```

Another big chunk is spent later on, when `dwfl_end` cleans up the modules and we get into `__libdw_destroy_frame_cache`.

So there already seems to be a cache of sorts being built, but it's tied to the `dwfl*` structure. In our case we have tens of thousands of these structures, each very short-lived (as the processes underneath are short-lived).

I understand that each process will have its own custom address space mapping, but is the CFI/FDE data also tied to such process-specific data? Or could it, in theory, be reused across `dwfl*` instances?

Thanks

[1]: https://github.com/KDAB/hotspot/issues/394

-- 
Milian Wolff
mail@milianw.de
http://milianw.de
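
For illustration, the per-binary caching on the symbolication end mentioned above could look roughly like the sketch below. This is not the actual perfparser code: `SymbolInfo`, `PerBinarySymbolCache`, the resolver callback, and the choice of a string such as a path or build-id as the key are all hypothetical. The idea is to key results by binary and file-relative address rather than by process, so that processes mapping the same binary at different load addresses share entries. It says nothing about whether the CFI/FDE cache inside elfutils can be shared in the same way.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical result of symbolicating one address inside a binary.
struct SymbolInfo {
    std::string name;
};

// Caches symbolication results per binary instead of per process.
// Entries are keyed by (binary identifier, file-relative address).
class PerBinarySymbolCache {
public:
    // Hypothetical callback that does the expensive work (inline frames, demangling).
    using Resolver = std::function<SymbolInfo(const std::string&, std::uint64_t)>;

    explicit PerBinarySymbolCache(Resolver resolver)
        : m_resolver(std::move(resolver)) {}

    // processAddr is an address in one process; loadBias is where that
    // process mapped the binary. Subtracting it makes the key process-independent.
    const SymbolInfo& lookup(const std::string& binaryId,
                             std::uint64_t processAddr,
                             std::uint64_t loadBias)
    {
        const std::uint64_t relAddr = processAddr - loadBias;
        auto& perBinary = m_cache[binaryId];
        auto it = perBinary.find(relAddr);
        if (it == perBinary.end()) {
            // First time this address of this binary is seen in any process.
            ++m_misses;
            it = perBinary.emplace(relAddr, m_resolver(binaryId, relAddr)).first;
        }
        return it->second;
    }

    std::size_t misses() const { return m_misses; }

private:
    Resolver m_resolver;
    std::unordered_map<std::string,
                       std::unordered_map<std::uint64_t, SymbolInfo>> m_cache;
    std::size_t m_misses = 0;
};
```

With such a cache, two short-lived processes that run the same binary at different load addresses only trigger the resolver once per distinct file-relative address. The open question in this mail is whether the unwinding side can be structured similarly.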