From mboxrd@z Thu Jan 1 00:00:00 1970
From: Milian Wolff
To: elfutils-devel@sourceware.org
Cc: Mark Wielaard
Subject: Re: How to associate Elf with Dwfl_Module returned by dwfl_report_module
Date: Wed, 21 Mar 2018 13:01:00 -0000
Message-ID: <1946852.ajpeOdNFGP@agathebauer>
In-Reply-To: <20180320220549.GD6269@wildebeest.org>
References: <3517953.ztkfjMdy38@agathebauer> <20180320220549.GD6269@wildebeest.org>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="nextPart2420787.1TBufLSNFU"; micalg="pgp-sha256"; protocol="application/pgp-signature"

--nextPart2420787.1TBufLSNFU
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="iso-8859-1"

On Tuesday, 20 March 2018 23:05:49 CET Mark Wielaard wrote:
> Hi Milian,

Hey Mark :)

> On Sat, Mar 17, 2018 at 02:14:48PM +0100, Milian Wolff wrote:
> > a recurring issue in the libdwfl integration of perf and perfparser are
> > supposedly overlapping modules. The perf data file contains the exact
> > mappings for all files corresponding to the actual mmap events that
> > occurred during runtime, e.g.:
> >
> > ```
> > $ perf script --show-mmap-events | grep MMAP | grep stdc
> > heaptrack_print 13962 87163.483450: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aea84000(0x387000) @ 0 08:03 413039 3864781083]: r-xp
> > /usr/lib/libstdc++.so.6.0.24
> > heaptrack_print 13962 87163.483454: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aebfc000(0x1ff000) @ 0x178000 08:03 413039 3864781083]: ---p
> > /usr/lib/libstdc++.so.6.0.24
> > heaptrack_print 13962 87163.483458: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aedfb000(0xd000) @ 0x177000 08:03 413039 3864781083]: rw-p
> > /usr/lib/libstdc++.so.6.0.24
> > heaptrack_print 13962 87163.484860: PERF_RECORD_MMAP2 13962/13962:
> > [0x7fd0aedfb000(0xc000) @ 0x177000 08:03 413039 3864781083]: r--p
> > /usr/lib/libstdc++.so.6.0.24
> > ```
> > So far, both perf and perfparser are using dwfl_report_elf to report the
> > file. But that API is deducing the mapping addresses internally, which
> > may or may not be what we saw at runtime. I suspect that this is the
> > reason for some issues we are seeing, such as supposedly overlapping
> > modules.
>
> How exactly are you calling dwfl_report_elf?

Here's the code for the perf tools:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/unwind-libdw.c?h=perf/core#n52

Here's the code for the perfparser:

http://code.qt.io/cgit/qt-creator/perfparser.git/tree/app/perfsymboltable.cpp#n479

Let's concentrate on perf for now, but perfparser has similar logic:

We parse the mmap events in the perf.data file and store that information.
Note that the perf.data file does not contain events for munmap calls. Then,
while unwinding the callstack of a perf sample, we look up the most recent
mmap event for every given instruction pointer address, and ensure that the
corresponding ELF was registered with libdw.

> Specifically are you using false for the add_p_vaddr argument?

Yes, we are.

> And could you provide some example where the reported address is
> wrong/different from the start address of the Dwfl_Module?

I don't think it's the start address that is wrong, rather it's the end
address. But it's hard for me to come up with a small self-contained example
at this stage. I am regularly seeing broken backtraces for samples where I
have the gut feeling that missing reported ELFs are to blame. But we report
everything, except for scenarios where the mmap events seemingly overlap.
This overlapping is, as far as I can see, actually a side effect of remapping
taking place in the dynamic linker (i.e. a single dlopen'ed or dynamically
linked library can yield multiple mmap events). One way or another, we end up
with a situation where we cannot report an ELF to dwfl due to two issues:

a) either dwfl tells us we are overlapping some module and just stops, which
is bad, since we would actually much prefer the newly reported ELF to take
precedence

b) or we find an mmap event with a non-zero pgoff, have no clue how to call
dwfl_report_elf for it, and just give up.

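To make the lookup described above concrete, here is a minimal sketch in C.
It is not the code from perf or perfparser: `struct mmap_event`,
`recorded_events` and `find_mmap` are hypothetical names used only for
illustration, and the table simply holds the four events from the
`perf script` output quoted earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type; the fields mirror the perf script output,
   not perf's internal structures. */
struct mmap_event {
    uint64_t start;   /* start address of the mapping */
    uint64_t len;     /* length of the mapping in bytes */
    uint64_t pgoff;   /* offset into the mapped file */
    const char *path; /* mapped file */
};

/* The four PERF_RECORD_MMAP2 events quoted earlier, in recording order. */
const struct mmap_event recorded_events[] = {
    { 0x7fd0aea84000, 0x387000, 0x0,      "/usr/lib/libstdc++.so.6.0.24" },
    { 0x7fd0aebfc000, 0x1ff000, 0x178000, "/usr/lib/libstdc++.so.6.0.24" },
    { 0x7fd0aedfb000, 0xd000,   0x177000, "/usr/lib/libstdc++.so.6.0.24" },
    { 0x7fd0aedfb000, 0xc000,   0x177000, "/usr/lib/libstdc++.so.6.0.24" },
};

/* Return the most recently recorded event whose address range contains ip,
   or NULL if no recorded range contains it. Since there are no munmap
   events, older mappings are never dropped; a later event simply wins. */
const struct mmap_event *find_mmap(const struct mmap_event *ev, size_t n,
                                   uint64_t ip)
{
    assert(ev != NULL || n == 0);
    /* Walk backwards so that the latest matching event is found first. */
    for (size_t i = n; i-- > 0; ) {
        if (ip >= ev[i].start && ip - ev[i].start < ev[i].len)
            return &ev[i];
    }
    return NULL;
}
```

With this table, the address 0x7fd0aedfb000 lies inside the first, third and
fourth recorded ranges, and the lookup returns the fourth. The second event
is the kind of mapping from case b): it has a non-zero pgoff and lies
entirely inside the range of the first event.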
In both cases, I was hoping for dwfl_report_module to help, since it
seemingly allows me to exactly recreate the mapping that was traced
originally.

> > Looking at the Dwfl API, I cannot figure out how to feed the mapping
> > directly. There's dwfl_report_module, but how would I associate an Elf*
> > and int fd with it, as done by dwfl_report_elf?
>
> When using dwfl_report_module the find_elf callback will be called that
> you registered with dwfl_begin. That callback is called "lazily" the
> first time dwfl_module_getelf is called. The callback can then set the
> Elf*. But that does mean you have to keep track yourself (or immediately
> call dwfl_module_getelf).

Ah, thanks!

> I would like to understand better what is really going wrong with
> dwfl_report_elf before diving into using dwfl_module_report.

See above, I would very much value your input. I'm still far from having
fully grasped this situation.

Thanks
-- 
Milian Wolff
mail@milianw.de
http://milianw.de

--nextPart2420787.1TBufLSNFU
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: This is a digitally signed message part.

--nextPart2420787.1TBufLSNFU--