public inbox for elfutils@sourceware.org
From: Mark Wielaard <mark@klomp.org>
To: Aaron Merey <amerey@redhat.com>, elfutils-devel@sourceware.org
Subject: Re: [PATCH] libdwfl: Correctly handle corefile non-contiguous segments
Date: Tue, 14 Nov 2023 15:03:33 +0100
Message-ID: <5b0d1e19085a0b29a9b9ca6c14c21746dff8995a.camel@klomp.org>
In-Reply-To: <20231112201645.25762-1-amerey@redhat.com>

Hi Aaron,

On Sun, 2023-11-12 at 15:16 -0500, Aaron Merey wrote:
> It is possible for segments of different shared libraries to be interleaved
> in memory such that the segments of one library are located in between
> non-contiguous segments of another library.
> 
> For example, this can be seen with firefox on RHEL 7.9 where multiple
> shared libraries could be mapped in between ld-2.17.so segments:
> 
>       [...]
>       7f0972082000-7f09720a4000 00000000 139264      /usr/lib64/ld-2.17.so
>       7f09720a4000-7f09720a5000 00000000 4096        /memfd:mozilla-ipc (deleted)
>       7f09720a5000-7f09720a7000 00000000 8192        /memfd:mozilla-ipc (deleted)
>       7f09720a7000-7f09720a9000 00000000 8192        /memfd:mozilla-ipc (deleted)
>       7f0972134000-7f0972136000 00000000 8192        /usr/lib64/firefox/libmozwayland.so
>       7f0972136000-7f0972137000 00002000 4096        /usr/lib64/firefox/libmozwayland.so
>       7f0972137000-7f0972138000 00003000 4096        /usr/lib64/firefox/libmozwayland.so
>       7f0972138000-7f0972139000 00003000 4096        /usr/lib64/firefox/libmozwayland.so
>       7f097213a000-7f0972147000 00000000 53248       /usr/lib64/firefox/libmozsqlite3.so
>       7f0972147000-7f097221e000 0000d000 880640      /usr/lib64/firefox/libmozsqlite3.so
>       7f097221e000-7f0972248000 000e4000 172032      /usr/lib64/firefox/libmozsqlite3.so
>       7f0972248000-7f0972249000 0010e000 4096        /usr/lib64/firefox/libmozsqlite3.so
>       7f0972249000-7f097224c000 0010e000 12288       /usr/lib64/firefox/libmozsqlite3.so
>       7f097224c000-7f0972250000 00111000 16384       /usr/lib64/firefox/libmozsqlite3.so
>       7f0972250000-7f0972253000 00000000 12288       /usr/lib64/firefox/liblgpllibs.so
>       [...]
>       7f09722a3000-7f09722a4000 00021000 4096        /usr/lib64/ld-2.17.so
>       7f09722a4000-7f09722a5000 00022000 4096        /usr/lib64/ld-2.17.so
> 
> dwfl_segment_report_module did not account for the possibility of
> interleaving non-contiguous segments, resulting in premature closure
> of modules as well as failing to report modules.

Nice description of the issue.

> Fix this by removing segment skipping in dwfl_segment_report_module.
> When dwfl_segment_report_module reported a module, it would return
> the index of the segment immediately following the end address of the
> current module.  Since there's a chance that other modules might fall
> within this address range, dwfl_segment_report_module instead returns
> the index of the next segment.

This makes sense.

> This patch also fixes premature module closure that can occur in
> dwfl_segment_report_module when interleaving non-contiguous segments
> are found.  Previously modules with start and end addresses that overlap
> with the current segment would have their build-id compared to the build-id
> associated with the current segment.  If there was a mismatch, that module
> would be closed.  Avoid closing modules in this case when mismatching
> build-ids correspond to distinct modules.

Nice find.

> A couple caveats should be mentioned.  First, start and end addresses
> of reported modules cannot be assumed to contain segments from only
> that module.  This has always been the case however.

There is dwfl_addrmodule/dwfl_addrsegment to find the module that
covers a specific address, defined in libdwfl/segment.c. I think it
should handle this case by checking the closest load address, but I
have not tested it.
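
To make the caveat concrete, here is a toy sketch of why an address
lookup has to go through the segments rather than through a module's
[start, end) range. It is not libdwfl code: the toy_* names and the
struct are hypothetical, and it does not claim to mirror what
libdwfl/segment.c does internally. The addresses are taken from the
mapping quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: this type is hypothetical, not libdwfl's.  */
struct toy_segment
{
  uint64_t start;	/* Inclusive.  */
  uint64_t end;		/* Exclusive.  */
  int module_id;	/* 0 = ld-2.17.so, 1 = libmozwayland.so.  */
};

/* Three segments from the quoted mapping, sorted by start address.
   libmozwayland.so sits between two ld-2.17.so segments.  */
static const struct toy_segment toy_map[] = {
  { UINT64_C (0x7f0972082000), UINT64_C (0x7f09720a4000), 0 },
  { UINT64_C (0x7f0972134000), UINT64_C (0x7f0972136000), 1 },
  { UINT64_C (0x7f09722a3000), UINT64_C (0x7f09722a4000), 0 },
};

/* Return the module id owning ADDR, or -1 if no segment covers it.
   SEGS must be sorted by start address and must not overlap.  */
static int
toy_addr_module (const struct toy_segment *segs, size_t nsegs,
		 uint64_t addr)
{
  assert (segs != NULL || nsegs == 0);
  size_t lo = 0, hi = nsegs;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (addr < segs[mid].start)
	hi = mid;
      else if (addr >= segs[mid].end)
	lo = mid + 1;
      else
	return segs[mid].module_id;
    }
  return -1;
}
```

In this model 0x7f0972135000 lies inside the overall ld-2.17.so range
but resolves to module 1, because the lookup is per segment.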

Normally only kernel modules (.ko ET_REL files) have multiple segments.
So it might make sense to double check none of this impacts systemtap.

> Second, the testcases in this patch use a firefox corefile that is
> fairly large.  The .bz2 corefile is about 47M.  A clean elfutils repo
> is currently about 42M, so this corefile more than doubles the size of
> the elfutils repo.  I looked for a much smaller process with
> interleaving non-contiguous shared library sections but was not able
> to find one.  I've included the corefile and tests in this patch but
> they can be removed if we'd prefer to not approx. double the size of
> the repo.

I really appreciate the testcase, but it really is too big. It is 736M
bunzip2ed (which would happen on any make check). I think this makes
the repo and the build/check a little too heavy. Also we try to include
instructions to recreate any binary test files and that isn't really
possible in this case.

> https://sourceware.org/bugzilla/show_bug.cgi?id=30975
> 
> Signed-off-by: Aaron Merey <amerey@redhat.com>
> ---
>  libdwfl/dwfl_segment_report_module.c |  37 +++++--
>  tests/Makefile.am                    |   5 +-
>  tests/run-unstrip-noncontig.sh       | 155 +++++++++++++++++++++++++++
>  tests/testcore-noncontig.bz2         | Bin 0 -> 49146091 bytes
>  4 files changed, 184 insertions(+), 13 deletions(-)
>  create mode 100755 tests/run-unstrip-noncontig.sh
>  create mode 100644 tests/testcore-noncontig.bz2
> 
> diff --git a/libdwfl/dwfl_segment_report_module.c b/libdwfl/dwfl_segment_report_module.c
> index 3ef62a7d..09ee37b3 100644
> --- a/libdwfl/dwfl_segment_report_module.c
> +++ b/libdwfl/dwfl_segment_report_module.c
> @@ -737,17 +737,34 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
>  	        && invalid_elf (module->elf, module->disk_file_has_build_id,
>  				&build_id))
>  	      {
> -		elf_end (module->elf);
> -		close (module->fd);
> -		module->elf = NULL;
> -		module->fd = -1;
> +		/* If MODULE's build-id doesn't match the disk file's
> +		   build-id, close ELF only if MODULE and ELF refer to
> +		   different builds of files with the same name.  This
> +		   prevents premature closure of the correct ELF in cases
> +		   where segments of a module are non-contiguous in memory.  */
> +		if (name != NULL && module->name[0] != '\0'
> +		    && strcmp (basename (module->name), basename (name)) == 0)
> +		  {
> +		    elf_end (module->elf);
> +		    close (module->fd);
> +		    module->elf = NULL;
> +		    module->fd = -1;
> +		  }
>  	      }

Nice.

> -	    if (module->elf != NULL)
> +	    else if (module->elf != NULL)
>  	      {
> -		/* Ignore this found module if it would conflict in address
> -		   space with any already existing module of DWFL.  */
> +		/* This module has already been reported.  */
>  		skip_this_module = true;
>  	      }
> +	    else
> +	      {
> +		/* Only report this module if we haven't already done so.  */
> +		for (Dwfl_Module *mod = dwfl->modulelist; mod != NULL;
> +		     mod = mod->next)
> +		  if (mod->low_addr == module_start
> +		      && mod->high_addr == module_end)
> +		    skip_this_module = true;
> +	      }
>  	  }
>        if (skip_this_module)
>  	goto out;

OK. So now we only skip modules if we have already seen this exact
module.
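
As a stand-alone illustration of that check, here is a minimal
sketch, assuming a simplified module list. The toy_* names are
hypothetical and only mirror the low_addr/high_addr/next fields used
in the hunk above; the real Dwfl_Module carries much more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of Dwfl_Module used here.  */
struct toy_module
{
  uint64_t low_addr;
  uint64_t high_addr;
  struct toy_module *next;
};

/* Two already-reported modules, with ranges from the quoted mapping:
   ld-2.17.so spans the whole area, libmozwayland.so lies inside it.  */
static struct toy_module toy_ld = {
  UINT64_C (0x7f0972082000), UINT64_C (0x7f09722a5000), NULL
};
static struct toy_module toy_wayland = {
  UINT64_C (0x7f0972134000), UINT64_C (0x7f0972139000), &toy_ld
};

/* Return true only if a module with exactly the range [START, END)
   is already on LIST.  A range that merely overlaps an existing
   module does not count as already reported.  */
static bool
toy_already_reported (const struct toy_module *list,
		      uint64_t start, uint64_t end)
{
  assert (start <= end);
  for (const struct toy_module *mod = list; mod != NULL; mod = mod->next)
    if (mod->low_addr == start && mod->high_addr == end)
      return true;
  return false;
}
```

With this list, the libmozsqlite3.so range 7f097213a000-7f0972250000
overlaps ld-2.17.so but is not an exact match, so it would still be
reported.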

> @@ -781,10 +798,6 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
>  	}
>      }
>  
> -  /* Our return value now says to skip the segments contained
> -     within the module.  */
> -  ndx = addr_segndx (dwfl, segment, module_end, true);
> -
>    /* Examine its .dynamic section to get more interesting details.
>       If it has DT_SONAME, we'll use that as the module name.
>       If it has a DT_DEBUG, then it's actually a PIE rather than a DSO.
> @@ -929,6 +942,8 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
>        ndx = -1;
>        goto out;
>      }
> +  else
> +    ndx++;
>  
>    /* We have reported the module.  Now let the caller decide whether we
>       should read the whole thing in right now.  */

Right, so addr_segndx would normally return this ndx because next ==
true. But now we always just report the next ndx.
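
A toy comparison of the two return strategies, under loud
assumptions: toy_next_ndx_skip is a simplified stand-in for what the
removed addr_segndx call achieved (jump to the first segment at or
past the module end), not a reimplementation of addr_segndx, and all
toy_* names are hypothetical. Segment start addresses come from the
mapping in the patch description.

```c
#include <assert.h>
#include <stdint.h>

/* Start addresses of four segments in address order.  */
static const uint64_t toy_seg_starts[] = {
  UINT64_C (0x7f0972082000),	/* ld-2.17.so */
  UINT64_C (0x7f0972134000),	/* libmozwayland.so */
  UINT64_C (0x7f097213a000),	/* libmozsqlite3.so */
  UINT64_C (0x7f09722a3000),	/* ld-2.17.so again */
};

/* Old strategy (simplified): return the index of the first segment
   starting at or after MODULE_END, or NSEGS if there is none.  */
static int
toy_next_ndx_skip (const uint64_t *starts, int nsegs, uint64_t module_end)
{
  assert (nsegs >= 0);
  for (int i = 0; i < nsegs; i++)
    if (starts[i] >= module_end)
      return i;
  return nsegs;
}

/* New strategy: always continue with the very next segment.  */
static int
toy_next_ndx_step (int ndx)
{
  return ndx + 1;
}
```

In this model, reporting ld-2.17.so from segment 0 with end address
0x7f09722a5000 makes the skipping variant jump past every segment, so
the two interleaved libraries are never examined. The stepping variant
continues at segment 1.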

> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index 7fb8efb1..a12df1d3 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -212,7 +212,7 @@ TESTS = run-arextract.sh run-arsymtest.sh run-ar.sh newfile test-nlist \
>  	$(asm_TESTS) run-disasm-bpf.sh run-low_high_pc-dw-form-indirect.sh \
>  	run-nvidia-extended-linemap-libdw.sh run-nvidia-extended-linemap-readelf.sh \
>  	run-readelf-dw-form-indirect.sh run-strip-largealign.sh \
> -	run-readelf-Dd.sh
> +	run-readelf-Dd.sh run-unstrip-noncontig.sh
>  
>  if !BIARCH
>  export ELFUTILS_DISABLE_BIARCH = 1
> @@ -632,7 +632,8 @@ EXTRA_DIST = run-arextract.sh run-arsymtest.sh run-ar.sh \
>  	     run-nvidia-extended-linemap-libdw.sh run-nvidia-extended-linemap-readelf.sh \
>  	     testfile_nvidia_linemap.bz2 \
>  	     testfile-largealign.o.bz2 run-strip-largealign.sh \
> -	     run-funcretval++11.sh
> +	     run-funcretval++11.sh \
> +	     run-unstrip-noncontig.sh testcore-noncontig.bz2
>  
>  
>  if USE_VALGRIND

I would really like to have a testcase, but only if we can
find/construct something much smaller.

Thanks,

Mark
