public inbox for elfutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] libdwfl: Read no more than required to parse dynamic sections
@ 2022-11-29  6:26 gavin
  2022-11-29 15:25 ` Mark Wielaard
  0 siblings, 1 reply; 6+ messages in thread
From: gavin @ 2022-11-29  6:26 UTC (permalink / raw)
  To: elfutils-devel; +Cc: Mark Wielaard, Gavin Li

From: Gavin Li <gavin@matician.com>

Since size checking has been moved to dwfl_elf_phdr_memory_callback(),
there is no longer a need for dwfl_segment_report_module() to enforce
the same. Reading beyond the end of the dynamic section actually causes
issues when passing the data to elfXX_xlatetom() because it is possible
that src->d_size is not a multiple of recsize (for ELF_T_DYN, recsize is
16 while the minimum required alignment is 8), causing elfXX_xlatetom()
to return ELF_E_INVALID_DATA.

Signed-off-by: Gavin Li <gavin@matician.com>
---
 libdwfl/dwfl_segment_report_module.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/libdwfl/dwfl_segment_report_module.c b/libdwfl/dwfl_segment_report_module.c
index 287fc002..08aca0eb 100644
--- a/libdwfl/dwfl_segment_report_module.c
+++ b/libdwfl/dwfl_segment_report_module.c
@@ -821,12 +821,6 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
       && ! read_portion (&read_state, &dyn_data, &dyn_data_size,
 			 start, segment, dyn_vaddr, dyn_filesz))
     {
-      /* dyn_data_size will be zero if we got everything from the initial
-         buffer, otherwise it will be the size of the new buffer that
-         could be read.  */
-      if (dyn_data_size != 0)
-	dyn_filesz = dyn_data_size;
-
       if ((dyn_filesz / dyn_entsize) == 0
 	  || dyn_filesz > (SIZE_MAX / dyn_entsize))
 	goto out;
-- 
2.38.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libdwfl: Read no more than required to parse dynamic sections
  2022-11-29  6:26 [PATCH] libdwfl: Read no more than required to parse dynamic sections gavin
@ 2022-11-29 15:25 ` Mark Wielaard
  2022-11-29 21:48   ` Gavin Li
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Wielaard @ 2022-11-29 15:25 UTC (permalink / raw)
  To: gavin, elfutils-devel

Hi Gavin,

On Mon, 2022-11-28 at 22:26 -0800, gavin@matician.com wrote:
> Since size checking has been moved to
> dwfl_elf_phdr_memory_callback(),
> there is no longer a need for dwfl_segment_report_module() to enforce
> the same. Reading beyond the end of the dynamic section actually causes
> issues when passing the data to elfXX_xlatetom() because it is possible
> that src->d_size is not a multiple of recsize (for ELF_T_DYN, recsize is
> 16 while the minimum required alignment is 8), causing elfXX_xlatetom()
> to return ELF_E_INVALID_DATA.

I don't fully follow this logic.

The code as written doesn't seem to guarantee that
dwfl_segment_report_module will always be called with
dwfl_elf_phdr_memory_callback as memory_callback. Although it probably
will be in practice.

So you are removing this check:

>       && ! read_portion (&read_state, &dyn_data, &dyn_data_size,
>  			 start, segment, dyn_vaddr, dyn_filesz))
>      {
> -      /* dyn_data_size will be zero if we got everything from the initial
> -         buffer, otherwise it will be the size of the new buffer that
> -         could be read.  */
> -      if (dyn_data_size != 0)
> -	dyn_filesz = dyn_data_size;
> -

Reading read_portion it shows dyn_data_size being set to zero if
read_state->buffer_available has everything (dyn_filesz) requested.
Otherwise memory_callback (we assume dwfl_elf_phdr_memory_callback) is
called with *data_size = dyn_filesz. Which will then be set to the
actual buffer size being read.

So dyn_data_size might be bigger than the dynfilesz we are requesting?
Or smaller I assume.

If you are protecting against it becoming bigger, should the check be
changed to:

	if (dyn_data_size != 0 && dyn_data_size < dyn_filesz)
	  dyn_filesz = dyn_data_size;

?

Thanks,

Mark

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libdwfl: Read no more than required to parse dynamic sections
  2022-11-29 15:25 ` Mark Wielaard
@ 2022-11-29 21:48   ` Gavin Li
  2022-11-30 23:14     ` Mark Wielaard
  0 siblings, 1 reply; 6+ messages in thread
From: Gavin Li @ 2022-11-29 21:48 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: elfutils-devel

Hi Mark,

Thanks for looking over this patch. Responses are inline.

> The code as written doesn't seem to guarantee that
> dwfl_segment_report_module will always be called with
> dwfl_elf_phdr_memory_callback as memory_callback. Although it probably
> will be in practice.

All file/line references relate to commit
98bdf533c4990728f0db653ab4e98a503d7654ce.

dwfl_segment_report_module is an internal function that is currently
only called from
dwfl_core_file_report. Because of this, I think it would be fine to enforce that
memory_callback implementations must enforce minread or return an error.
dwfl_elf_phdr_memory_callback does return an error at core-file.c:336 if
the amount that is able to be read is less than minread.
dwfl_segment_report_module.c:340
does not buffer_available either, since it assumes that
memory_callback will return an error
if it is unable to read sizeof(Elf64_Ehdr) bytes.

The main issue I am currently seeing is that
dwfl_elf_phdr_memory_callback can return
a *buffer_available that is sometimes much larger than minread,
especially if the ELF file
is mmaped (elf->map_address != NULL). See core-file.c:324-325.

For example, on my core file, opened with elf_begin(fd, ELF_C_READ_MMAP, NULL),
dyn_data_size would be set to about 130000, representing all the data
between the point
at which the dynamic section is found in the core file, and the end of
the core file itself
(which is around 454KB). We then pass the 130KB buffer to
elf64_xlatetom, which if
we're fortunate, returns an error because the buffer size is not a
multiple of sizeof(Elf64_Dyn),
but if we're unfortunate, it treats the whole 130KB buffer as
Elf64_Dyn entries and fills
xlateto.d_buf with garbage.

Similar behavior likely occurs everywhere read_portion() is used:
dwfl_segment_report_module.c
lines 447-453, 545-546.

> Reading read_portion it shows dyn_data_size being set to zero if
> read_state->buffer_available has everything (dyn_filesz) requested.
> Otherwise memory_callback (we assume dwfl_elf_phdr_memory_callback) is
> called with *data_size = dyn_filesz. Which will then be set to the
> actual buffer size being read.

dwfl_elf_phdr_memory_callback() may read much more than minread or *buffer_size
if the ELF file is already mapped in as described above.

> If you are protecting against it becoming bigger, should the check be
> changed to:
>
>         if (dyn_data_size != 0 && dyn_data_size < dyn_filesz)
>           dyn_filesz = dyn_data_size;
>

I think for the purposes of reading small segments (like PT_DYNAMIC
and PT_NOTE),
we should ignore *buffer_available altogether.

Best,
Gavin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libdwfl: Read no more than required to parse dynamic sections
  2022-11-29 21:48   ` Gavin Li
@ 2022-11-30 23:14     ` Mark Wielaard
  2022-12-01 21:13       ` Gavin Li
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Wielaard @ 2022-11-30 23:14 UTC (permalink / raw)
  To: Gavin Li; +Cc: elfutils-devel

[-- Attachment #1: Type: text/plain, Size: 467 bytes --]

Hi Gavin,

On Tue, Nov 29, 2022 at 01:48:42PM -0800, Gavin Li wrote:
> I think for the purposes of reading small segments (like PT_DYNAMIC
> and PT_NOTE), we should ignore *buffer_available altogether.

Thanks for walking me through the code. I think you are right and none
of the buffer_available checks are necessary. So I removed them all. I
also adjusted the commit message a bit. Could you look at this patch
and let me know if this works for you?

Cheers,

Mark

[-- Attachment #2: 0001-libdwfl-Read-no-more-than-required-in-dwfl_segment_r.patch --]
[-- Type: text/x-diff, Size: 3866 bytes --]

From 59f1b49c1a1163ebde891196c1b24e3f4225915b Mon Sep 17 00:00:00 2001
From: Gavin Li <gavin@matician.com>
Date: Wed, 30 Nov 2022 18:26:19 +0100
Subject: [PATCH] libdwfl: Read no more than required in
 dwfl_segment_report_module

Since read_portion and the standard dwfl_elf_phdr_memory_callback
functions make sure to read at least minread bytes there is no need
for dwfl_segment_report_module to check and adjust the data to the
actual buffer size read. Reading beyond the end of the expected data
size (if the buffer read is much larger) actually causes issues when
passing the data to elfXX_xlatetom() because it is possible that
src->d_size is not a multiple of recsize (for ELF_T_DYN, recsize is 16
while the minimum required alignment is 8), causing elfXX_xlatetom()
to return ELF_E_INVALID_DATA.

Signed-off-by: Gavin Li <gavin@matician.com>
Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 libdwfl/ChangeLog                    |  6 ++++++
 libdwfl/dwfl_segment_report_module.c | 28 +++-------------------------
 2 files changed, 9 insertions(+), 25 deletions(-)

diff --git a/libdwfl/ChangeLog b/libdwfl/ChangeLog
index 6dd84a6f..68527327 100644
--- a/libdwfl/ChangeLog
+++ b/libdwfl/ChangeLog
@@ -1,3 +1,9 @@
+2022-11-28  Gavin Li  <gavin@matician.com>
+	    Mark Wielaard  <mark@klomp.org>
+
+	* dwfl_segment_report_module.c (dwfl_segment_report_module): Remove
+	data size check after read_portion memory_callback.
+
 2022-10-21  Yonggang Luo  <luoyonggang@gmail.com>
 
 	* argp-std.c: Don't include unistd.h.
diff --git a/libdwfl/dwfl_segment_report_module.c b/libdwfl/dwfl_segment_report_module.c
index 287fc002..6a7b77f0 100644
--- a/libdwfl/dwfl_segment_report_module.c
+++ b/libdwfl/dwfl_segment_report_module.c
@@ -441,18 +441,8 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
 		    start + phoff, xlatefrom.d_size))
     goto out;
 
-  /* ph_buffer_size will be zero if we got everything from the initial
-     buffer, otherwise it will be the size of the new buffer that
-     could be read.  */
-  if (ph_buffer_size != 0)
-    {
-      phnum = ph_buffer_size / phentsize;
-      if (phnum == 0)
-	goto out;
-      xlatefrom.d_size = ph_buffer_size;
-    }
-
   xlatefrom.d_buf = ph_buffer;
+  xlatefrom.d_size = xlatefrom.d_size;
 
   bool class32 = ei_class == ELFCLASS32;
   size_t phdr_size = class32 ? sizeof (Elf32_Phdr) : sizeof (Elf64_Phdr);
@@ -533,18 +523,12 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
               /* We calculate from the p_offset of the note segment,
                because we don't yet know the bias for its p_vaddr.  */
               const GElf_Addr note_vaddr = start + offset;
-              void *data;
-              size_t data_size;
+              void *data = NULL;
+              size_t data_size = 0;
               if (read_portion (&read_state, &data, &data_size,
 				start, segment, note_vaddr, filesz))
                 continue; /* Next header */
 
-              /* data_size will be zero if we got everything from the initial
-                 buffer, otherwise it will be the size of the new buffer that
-                 could be read.  */
-              if (data_size != 0)
-                filesz = data_size;
-
 	      if (filesz > SIZE_MAX / sizeof (Elf32_Nhdr))
 		continue;
 
@@ -821,12 +805,6 @@ dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
       && ! read_portion (&read_state, &dyn_data, &dyn_data_size,
 			 start, segment, dyn_vaddr, dyn_filesz))
     {
-      /* dyn_data_size will be zero if we got everything from the initial
-         buffer, otherwise it will be the size of the new buffer that
-         could be read.  */
-      if (dyn_data_size != 0)
-	dyn_filesz = dyn_data_size;
-
       if ((dyn_filesz / dyn_entsize) == 0
 	  || dyn_filesz > (SIZE_MAX / dyn_entsize))
 	goto out;
-- 
2.30.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libdwfl: Read no more than required to parse dynamic sections
  2022-11-30 23:14     ` Mark Wielaard
@ 2022-12-01 21:13       ` Gavin Li
  2022-12-13 16:49         ` Mark Wielaard
  0 siblings, 1 reply; 6+ messages in thread
From: Gavin Li @ 2022-12-01 21:13 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: elfutils-devel

Awesome, thanks for looking over this. I only have one comment:
there's an extra "xlatefrom.d_size = xlatefrom.d_size;" line that
should be removed.

dwfl_elf_phdr_memory_callback is called from dwfl_link_map_report
but if any issues arise, those could be addressed in a separate patch.

Best,
Gavin

On Wed, Nov 30, 2022 at 3:14 PM Mark Wielaard <mark@klomp.org> wrote:
>
> Hi Gavin,
>
> On Tue, Nov 29, 2022 at 01:48:42PM -0800, Gavin Li wrote:
> > I think for the purposes of reading small segments (like PT_DYNAMIC
> > and PT_NOTE), we should ignore *buffer_available altogether.
>
> Thanks for walking me through the code. I think you are right and none
> of the buffer_available checks are necessary. So I removed them all. I
> also adjusted the commit message a bit. Could you look at this patch
> and let me know if this works for you?
>
> Cheers,
>
> Mark

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] libdwfl: Read no more than required to parse dynamic sections
  2022-12-01 21:13       ` Gavin Li
@ 2022-12-13 16:49         ` Mark Wielaard
  0 siblings, 0 replies; 6+ messages in thread
From: Mark Wielaard @ 2022-12-13 16:49 UTC (permalink / raw)
  To: Gavin Li; +Cc: elfutils-devel

Hi Gavin,

On Thu, 2022-12-01 at 13:13 -0800, Gavin Li wrote:
> Awesome, thanks for looking over this. I only have one comment:
> there's an extra "xlatefrom.d_size = xlatefrom.d_size;" line that
> should be removed.

Thanks for spotting that. Odd the compiler didn't warn for that. There
is a xlatefrom.d_size = phnum * phentsize; just before this that does
the correct assignment.

Pushed with that line removed.

Cheers,

Mark

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-12-13 16:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-29  6:26 [PATCH] libdwfl: Read no more than required to parse dynamic sections gavin
2022-11-29 15:25 ` Mark Wielaard
2022-11-29 21:48   ` Gavin Li
2022-11-30 23:14     ` Mark Wielaard
2022-12-01 21:13       ` Gavin Li
2022-12-13 16:49         ` Mark Wielaard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).