public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v1] dwarf-reader: fix undefined behaviour in get_binary_load_address
@ 2019-01-01  0:00 Matthias Maennich via libabigail
  2019-01-01  0:00 ` Dodji Seketeli
  0 siblings, 1 reply; 4+ messages in thread
From: Matthias Maennich via libabigail @ 2019-01-01  0:00 UTC (permalink / raw)
  To: libabigail; +Cc: kernel-team, Matthias Maennich

Within the loop, the call `gelf_getphdr(elf_handle, i, &ph_mem)` is
returning a pointer to `ph_mem` that is only valid in this loop
iteration. The later assignment to *lowest_program_header and its
eventual use to assign load_address leads to undefined behaviour.

Refactor the method to just keep track of the minimal address without
doing so through a pointer to GElf_Phdr.

 * src/abg-dwarf-reader.cc: fix UB in get_binary_load_address

Signed-off-by: Matthias Maennich <maennich@google.com>
---
 src/abg-dwarf-reader.cc | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 7bde0f3c8738..322320cad208 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -1098,33 +1098,33 @@ find_data1_section(Elf* elf_handle)
 /// @return true if the function could get the binary load address
 /// and assign @p load_address to it.
 static bool
-get_binary_load_address(Elf *elf_handle,
-			GElf_Addr &load_address)
+get_binary_load_address(Elf *elf_handle, GElf_Addr &load_address)
 {
+  bool address_found = false;
+  GElf_Addr result = std::numeric_limits<GElf_Addr>::max();
+
   GElf_Ehdr eh_mem;
   GElf_Ehdr *elf_header = gelf_getehdr(elf_handle, &eh_mem);
   size_t num_segments = elf_header->e_phnum;
-  GElf_Phdr *lowest_program_header = 0;
 
   for (unsigned i = 0; i < num_segments; ++i)
     {
-      GElf_Phdr ph_mem;
-      GElf_Phdr *program_header = gelf_getphdr(elf_handle, i, &ph_mem);
-      if (program_header->p_type == PT_LOAD
-	  && (!lowest_program_header
-	      || program_header->p_vaddr < lowest_program_header->p_vaddr))
-	lowest_program_header = program_header;
+      GElf_Phdr program_header;
+      gelf_getphdr(elf_handle, i, &program_header);
+      if (program_header.p_type == PT_LOAD)
+	{
+	  // So far, this program header represents the segment containing the
+	  // first byte of this binary. We want to return the address
+	  // at which the segment is loaded in memory.
+	  result = std::min(result, program_header.p_vaddr);
+	  address_found = true;
+	}
     }
 
-  if (lowest_program_header)
-    {
-      // This program header represent the segment containing the
-      // first byte of this binary.  We want to return the address
-      // at which the segment is loaded in memory.
-      load_address = lowest_program_header->p_vaddr;
-      return true;
-    }
-  return false;
+  if (address_found)
+    load_address = result;
+
+  return address_found;
 }
 
 /// Find the file name of the alternate debug info file, as well as
-- 
2.21.0.392.gf8f6787159e-goog

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] dwarf-reader: fix undefined behaviour in get_binary_load_address
  2019-01-01  0:00 ` Dodji Seketeli
@ 2019-01-01  0:00   ` Matthias Maennich via libabigail
  2019-01-01  0:00     ` Dodji Seketeli
  0 siblings, 1 reply; 4+ messages in thread
From: Matthias Maennich via libabigail @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: libabigail, kernel-team

Hi Dodji!

On Thu, Apr 18, 2019 at 11:59:07AM +0200, Dodji Seketeli wrote:
> "Matthias Maennich via libabigail" <libabigail@sourceware.org> a écrit:
> 
> > Refactor the method to just keep track of the minimal address without
> > doing so through a pointer to GElf_Phdr.
> 
> The logic of the refactoring seems correct to me.
> 
> However, one thing to keep in mind is that unfortunately, the code of
> libabigail is still in C++ < 11 (effectively c++98) because at the time,
> it was meant to also run on distributions that didn't have C++11 yet,
> and some of these distributions are supported for a long time.  They
> thus expect libabigail to still be running there.  Time is coming
> though, for us to move to C++ >= 11, I think :-)  But we haven't flipped
> that switch yet.  Until then, I think we need to stick to C++ < 11.
> 
> Therefore ...
> 
> [...]
> 
> > +  bool address_found = false;
> > +  GElf_Addr result = std::numeric_limits<GElf_Addr>::max();
> 
> This change is not yet acceptable as std::numeric_limits doesn't exist
> pre C++11.
> 
Actually, (I had to look it up first) std::numeric_limits is already part of
C++98 ([lib.limits/18.2.1]). So, I guess we can use that then?

By the way, any concrete plans to introduce C++11 or later? Just curious.

> I thus went ahead and edited your patch to keep the minimal amount of
> change that would still fix the issue you are validly raising in the
> subject of the patch.
> 
Thanks for taking care of that. I have one more question about that. As far as
I understand, `ph_mem` gets updated unconditionally during every iteration via
the call to `gelf_getphdr`. Also, `program_header` will always be `&ph_mem` as
that is what `gelf_getphdr` returns. So, once we update
`lowest_program_header`, it will be `&ph_mem`. And if we update `ph_mem` in the
next iteration, we effectively update `*lowest_program_header`. Is that the
intention? My original patch was therefore keeping track of the minimum
`program_header.p_vaddr` value seen in case of `program_header.p_type ==
PT_LOAD`. Maybe I am wrong, but can you double check the logic?

-- 
Cheers,
Matthias

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] dwarf-reader: fix undefined behaviour in get_binary_load_address
  2019-01-01  0:00 [PATCH v1] dwarf-reader: fix undefined behaviour in get_binary_load_address Matthias Maennich via libabigail
@ 2019-01-01  0:00 ` Dodji Seketeli
  2019-01-01  0:00   ` Matthias Maennich via libabigail
  0 siblings, 1 reply; 4+ messages in thread
From: Dodji Seketeli @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Matthias Maennich via libabigail; +Cc: Matthias Maennich, kernel-team

[-- Attachment #1: Type: text/plain, Size: 1824 bytes --]

Hello Matthias,

"Matthias Maennich via libabigail" <libabigail@sourceware.org> a écrit:

> Within the loop, the call `gelf_getphdr(elf_handle, i, &ph_mem)` is
> returning a pointer to `ph_mem` that is only valid in this loop
> iteration. The later assignment to *lowest_program_header and its
> eventual use to assign load_address leads to undefined behaviour.

Right, good catch!  Thank you for looking into this!


> Refactor the method to just keep track of the minimal address without
> doing so through a pointer to GElf_Phdr.

The logic of the refactoring seems correct to me.

However, one thing to keep in mind is that unfortunately, the code of
libabigail is still in C++ < 11 (effectively c++98) because at the time,
it was meant to also run on distributions that didn't have C++11 yet,
and some of these distributions are supported for a long time.  They
thus expect libabigail to still be running there.  Time is coming
though, for us to move to C++ >= 11, I think :-)  But we haven't flipped
that switch yet.  Until then, I think we need to stick to C++ < 11.

Therefore ...

[...]

> +  bool address_found = false;
> +  GElf_Addr result = std::numeric_limits<GElf_Addr>::max();

This change is not yet acceptable as std::numeric_limits doesn't exist
pre C++11.

I thus went ahead and edited your patch to keep the minimal amount of
change that would still fix the issue you are validly raising in the
subject of the patch.

I have thus committed the patch attached below, which is a trimmed down
version of yours.  I also took the liberty to edit the commit log for
the changelog part, to make it comply with the commit log guidelines of
the project, as defined at https://sourceware.org/git/gitweb.cgi?p=libabigail.git;a=blob;f=COMMIT-LOG-GUIDELINES.

Thanks!


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: fix undefined behavious in get_binary_load_address --]
[-- Type: text/x-patch, Size: 1618 bytes --]

From 177f76ee3be44a88fced2788aabb835771c87c00 Mon Sep 17 00:00:00 2001
From: Matthias Maennich <maennich@google.com>
Date: Thu, 18 Apr 2019 11:33:34 +0200
Subject: [PATCH] dwarf-reader: fix undefined behaviour in
 get_binary_load_address

Within the loop, the call `gelf_getphdr(elf_handle, i, &ph_mem)` is
returning a pointer to `ph_mem` that is only valid in this loop
iteration. The later assignment to *lowest_program_header and its
eventual use to assign load_address leads to undefined behaviour.

	* src/abg-dwarf-reader.cc (get_binary_load_address): Move the
	ph_mem and program_header variables out of the inner for-loop.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
---
 src/abg-dwarf-reader.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index e0638c9d..1815034c 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -1104,12 +1104,12 @@ get_binary_load_address(Elf *elf_handle,
   GElf_Ehdr eh_mem;
   GElf_Ehdr *elf_header = gelf_getehdr(elf_handle, &eh_mem);
   size_t num_segments = elf_header->e_phnum;
-  GElf_Phdr *lowest_program_header = 0;
+  GElf_Phdr *lowest_program_header = 0, *program_header = 0;
+  GElf_Phdr ph_mem;
 
   for (unsigned i = 0; i < num_segments; ++i)
     {
-      GElf_Phdr ph_mem;
-      GElf_Phdr *program_header = gelf_getphdr(elf_handle, i, &ph_mem);
+      program_header = gelf_getphdr(elf_handle, i, &ph_mem);
       if (program_header->p_type == PT_LOAD
 	  && (!lowest_program_header
 	      || program_header->p_vaddr < lowest_program_header->p_vaddr))
-- 
2.21.0


[-- Attachment #3: Type: text/plain, Size: 14 bytes --]



-- 
		Dodji

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v1] dwarf-reader: fix undefined behaviour in get_binary_load_address
  2019-01-01  0:00   ` Matthias Maennich via libabigail
@ 2019-01-01  0:00     ` Dodji Seketeli
  0 siblings, 0 replies; 4+ messages in thread
From: Dodji Seketeli @ 2019-01-01  0:00 UTC (permalink / raw)
  To: Matthias Maennich; +Cc: libabigail, kernel-team

Matthias Maennich <maennich@google.com> a écrit:

> Actually, (I had to look it up first) std::numeric_limits is already part of
> C++98 ([lib.limits/18.2.1]). So, I guess we can use that then?

For a reason it wasn't building on my system (EL7) here.

> By the way, any concrete plans to introduce C++11 or later? Just
> curious.

Concrete plan, not yet.  The thing is that we still have to support long
term supported el6 (until ~ end of 2020, I believe) and in Fedora, the
package buildroot doesn't contain c++11.

That being said, if at some point before that date, we feel we have a
strong/stable enough libabigail I guess we'll stop updating it in el6.
This is just a matter of consensus, I guess.

[...]

> I have one more question about that. As far as
> I understand, `ph_mem` gets updated unconditionally during every iteration via
> the call to `gelf_getphdr`. Also, `program_header` will always be `&ph_mem` as
> that is what `gelf_getphdr` returns. So, once we update
> `lowest_program_header`, it will be `&ph_mem`. And if we update `ph_mem` in the
> next iteration, we effectively update `*lowest_program_header`. Is that the
> intention? My original patch was therefore keeping track of the minimum
> `program_header.p_vaddr` value seen in case of `program_header.p_type ==
> PT_LOAD`. Maybe I am wrong, but can you double check the logic?

No, I think you are right.  Thanks for spotting that.

I have updated that code accordingly and I have committed it at
https://sourceware.org/git/gitweb.cgi?p=libabigail.git;a=commit;h=d403118cd0ba5bd3d0633555e8619a49ddafc68c.

Thanks again!

-- 
		Dodji

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-05-10 12:26 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-01  0:00 [PATCH v1] dwarf-reader: fix undefined behaviour in get_binary_load_address Matthias Maennich via libabigail
2019-01-01  0:00 ` Dodji Seketeli
2019-01-01  0:00   ` Matthias Maennich via libabigail
2019-01-01  0:00     ` Dodji Seketeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).