From: Mark Wielaard <mark@klomp.org>
To: Aaron Merey <amerey@redhat.com>, elfutils-devel@sourceware.org
Subject: Re: [PATCH] libdwfl: Correctly handle corefile non-contiguous segments
Date: Tue, 14 Nov 2023 15:03:33 +0100 [thread overview]
Message-ID: <5b0d1e19085a0b29a9b9ca6c14c21746dff8995a.camel@klomp.org> (raw)
In-Reply-To: <20231112201645.25762-1-amerey@redhat.com>
Hi Aaron,
On Sun, 2023-11-12 at 15:16 -0500, Aaron Merey wrote:
> It is possible for segments of different shared libaries to be interleaved
> in memory such that the segments of one library are located in between
> non-contiguous segments of another library.
>
> For example, this can be seen with firefox on RHEL 7.9 where multiple
> shared libraries could be mapped in between ld-2.17.so segments:
>
> [...]
> 7f0972082000-7f09720a4000 00000000 139264 /usr/lib64/ld-2.17.so
> 7f09720a4000-7f09720a5000 00000000 4096 /memfd:mozilla-ipc (deleted)
> 7f09720a5000-7f09720a7000 00000000 8192 /memfd:mozilla-ipc (deleted)
> 7f09720a7000-7f09720a9000 00000000 8192 /memfd:mozilla-ipc (deleted)
> 7f0972134000-7f0972136000 00000000 8192 /usr/lib64/firefox/libmozwayland.so
> 7f0972136000-7f0972137000 00002000 4096 /usr/lib64/firefox/libmozwayland.so
> 7f0972137000-7f0972138000 00003000 4096 /usr/lib64/firefox/libmozwayland.so
> 7f0972138000-7f0972139000 00003000 4096 /usr/lib64/firefox/libmozwayland.so
> 7f097213a000-7f0972147000 00000000 53248 /usr/lib64/firefox/libmozsqlite3.so
> 7f0972147000-7f097221e000 0000d000 880640 /usr/lib64/firefox/libmozsqlite3.so
> 7f097221e000-7f0972248000 000e4000 172032 /usr/lib64/firefox/libmozsqlite3.so
> 7f0972248000-7f0972249000 0010e000 4096 /usr/lib64/firefox/libmozsqlite3.so
> 7f0972249000-7f097224c000 0010e000 12288 /usr/lib64/firefox/libmozsqlite3.so
> 7f097224c000-7f0972250000 00111000 16384 /usr/lib64/firefox/libmozsqlite3.so
> 7f0972250000-7f0972253000 00000000 12288 /usr/lib64/firefox/liblgpllibs.so
> [...]
> 7f09722a3000-7f09722a4000 00021000 4096 /usr/lib64/ld-2.17.so
> 7f09722a4000-7f09722a5000 00022000 4096 /usr/lib64/ld-2.17.so
>
> dwfl_segment_report_module did not account for the possibility of
> interleaving non-contiguous segments, resulting in premature closure
> of modules as well as failing to report modules.
Nice description of the issue.
> Fix this by removing segment skipping in dwfl_segment_report_module.
> When dwfl_segment_report_module reported a module, it would return
> the index of the segment immediately following the end address of the
> current module. Since there's a chance that other modules might fall
> within this address range, dwfl_segment_report_module instead returns
> the index of the next segment.
This makes sense.
> This patch also fixes premature module closure that can occur in
> dwfl_segment_report_module when interleaving non-contiguous segments
> are found. Previously modules with start and end addresses that overlap
> with the current segment would have their build-id compared the build-id
> associated with the current segment. If there was a mismatch, that module
> would be closed. Avoid closing modules in this case when mismatching
> build-ids correspond to distinct modules.
Nice find.
> A couple caveats should be mentioned. First, start and end addresses
> of reported modules cannot be assumed to contain segments from only
> that module. This has always been the case however.
There is dwfl_addrmodule/dwfl_addrsegment to find the module that
covers a specific address. Defined in libdwfl/segment.c. I think this
should handle this by checking the closes load address. But I have not
tested it.
Normally only kernel modules (.ko ET_REL files) have multiple segments.
So it might make sense to double check none of this impacts systemtap.
> Second, the testcases in this patch use a firefox corefile that is
> fairly large. The .bz2 corefile is about 47M. A clean elfutils repo
> is currently about 42M, so this corefile more than doubles the size of
> the elfutils repo. I looked for a much smaller process with
> interleaving non-contiguous shared library sections but was not able
> to find one. I've included the corefile and tests in this patch but
> they can be removed if we'd prefer to not approx. double the size of
> the repo.
I really appreciate the testcase, but it really is too big. It is 736M
bunzip2ed (which would happen on any make check). I think this makes
the repo and the build/check a little too heavy. Also we try to include
instructions to recreate any binary test files and that isn't really
possible in this case.
> https://sourceware.org/bugzilla/show_bug.cgi?id=30975
>
> Signed-off-by: Aaron Merey <amerey@redhat.com>
> ---
> libdwfl/dwfl_segment_report_module.c | 37 +++++--
> tests/Makefile.am | 5 +-
> tests/run-unstrip-noncontig.sh | 155 +++++++++++++++++++++++++++
> tests/testcore-noncontig.bz2 | Bin 0 -> 49146091 bytes
> 4 files changed, 184 insertions(+), 13 deletions(-)
> create mode 100755 tests/run-unstrip-noncontig.sh
> create mode 100644 tests/testcore-noncontig.bz2
>
> diff --git a/libdwfl/dwfl_segment_report_module.c b/libdwfl/dwfl_segment_report_module.c
> index 3ef62a7d..09ee37b3 100644
> --- a/libdwfl/dwfl_segment_report_module.c
> +++ b/libdwfl/dwfl_segment_report_module.c
> @@ -737,17 +737,34 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
> && invalid_elf (module->elf, module->disk_file_has_build_id,
> &build_id))
> {
> - elf_end (module->elf);
> - close (module->fd);
> - module->elf = NULL;
> - module->fd = -1;
> + /* If MODULE's build-id doesn't match the disk file's
> + build-id, close ELF only if MODULE and ELF refer to
> + different builds of files with the same name. This
> + prevents premature closure of the correct ELF in cases
> + where segments of a module are non-contiguous in memory. */
> + if (name != NULL && module->name[0] != '\0'
> + && strcmp (basename (module->name), basename (name)) == 0)
> + {
> + elf_end (module->elf);
> + close (module->fd);
> + module->elf = NULL;
> + module->fd = -1;
> + }
> }
Nice.
> - if (module->elf != NULL)
> + else if (module->elf != NULL)
> {
> - /* Ignore this found module if it would conflict in address
> - space with any already existing module of DWFL. */
> + /* This module has already been reported. */
> skip_this_module = true;
> }
> + else
> + {
> + /* Only report this module if we haven't already done so. */
> + for (Dwfl_Module *mod = dwfl->modulelist; mod != NULL;
> + mod = mod->next)
> + if (mod->low_addr == module_start
> + && mod->high_addr == module_end)
> + skip_this_module = true;
> + }
> }
> if (skip_this_module)
> goto out;
OK. So now we only skip modules if we have already seen this exact
module.
> @@ -781,10 +798,6 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
> }
> }
>
> - /* Our return value now says to skip the segments contained
> - within the module. */
> - ndx = addr_segndx (dwfl, segment, module_end, true);
> -
> /* Examine its .dynamic section to get more interesting details.
> If it has DT_SONAME, we'll use that as the module name.
> If it has a DT_DEBUG, then it's actually a PIE rather than a DSO.
> @@ -929,6 +942,8 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
> ndx = -1;
> goto out;
> }
> + else
> + ndx++;
>
> /* We have reported the module. Now let the caller decide whether we
> should read the whole thing in right now. */
Right, so addr_segndx would normally return this ndx because next ==
true. But now we just always just report the next ndx.
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index 7fb8efb1..a12df1d3 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -212,7 +212,7 @@ TESTS = run-arextract.sh run-arsymtest.sh run-ar.sh newfile test-nlist \
> $(asm_TESTS) run-disasm-bpf.sh run-low_high_pc-dw-form-indirect.sh \
> run-nvidia-extended-linemap-libdw.sh run-nvidia-extended-linemap-readelf.sh \
> run-readelf-dw-form-indirect.sh run-strip-largealign.sh \
> - run-readelf-Dd.sh
> + run-readelf-Dd.sh run-unstrip-noncontig.sh
>
> if !BIARCH
> export ELFUTILS_DISABLE_BIARCH = 1
> @@ -632,7 +632,8 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \
> run-nvidia-extended-linemap-libdw.sh run-nvidia-extended-linemap-readelf.sh \
> testfile_nvidia_linemap.bz2 \
> testfile-largealign.o.bz2 run-strip-largealign.sh \
> - run-funcretval++11.sh
> + run-funcretval++11.sh \
> + run-unstrip-noncontig.sh testcore-noncontig.bz2
>
>
> if USE_VALGRIND
I really like to have a testcase, but only if we can find/construct
something much smaller.
Thanks,
Mark
next prev parent reply other threads:[~2023-11-14 14:03 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-12 20:16 Aaron Merey
2023-11-13 14:44 ` Aaron Merey
2023-11-14 14:03 ` Mark Wielaard [this message]
2023-11-14 18:46 ` Aaron Merey
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5b0d1e19085a0b29a9b9ca6c14c21746dff8995a.camel@klomp.org \
--to=mark@klomp.org \
--cc=amerey@redhat.com \
--cc=elfutils-devel@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).