[PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels

public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed

* [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels
@ 2019-01-01  0:00 maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 2/4] dwarf-reader: templatize read_int_from_array_of_bytes maennich.google.com via libabigail
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: maennich.google.com via libabigail @ 2019-01-01  0:00 UTC (permalink / raw)
  To: libabigail; +Cc: dodji, kernel-team, Matthias Maennich

Hi!

This series addresses an issue with reading the __ksymtab of v4.19+ aarch64
kernels that has been introduced post libabigail-1.6.

The minimal patch would be to cast the symbol addresses acquired by
read_int_from_array_of_bytes to int32_t. But treating that values as a
32bit value already when reading allows to remove the architecture specific
treatment and the need to deal with overflows.

Cheers,
Matthias

Matthias Maennich (4):
  dwarf-reader: Fix comments for try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format
  dwarf-reader: templatize read_int_from_array_of_bytes
  Bug 24431 Read 32bit values when testing for the v4.19 symbol table format
  Bug 24431 Treat __ksymtab as int32_t for v4.19+ kernels

 src/abg-dwarf-reader.cc | 75 ++++++++++++++---------------------------
 1 file changed, 26 insertions(+), 49 deletions(-)

-- 
2.21.0.1020.gf2820cf01a-goog

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/4] dwarf-reader: Fix comments for try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format
  2019-01-01  0:00 [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 2/4] dwarf-reader: templatize read_int_from_array_of_bytes maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 4/4] Bug 24431 Treat __ksymtab as int32_t for v4.19+ kernels maennich.google.com via libabigail
@ 2019-01-01  0:00 ` maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 3/4] Bug 24431 Read 32bit values when testing for the v4.19 symbol table format maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels Dodji Seketeli
  4 siblings, 0 replies; 6+ messages in thread
From: maennich.google.com via libabigail @ 2019-01-01  0:00 UTC (permalink / raw)
  To: libabigail; +Cc: dodji, kernel-team, Matthias Maennich

Swap the descriptive comments for the two functions.

	* src/abg-dwarf-reader.cc: swap the comments of
	try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format

Signed-off-by: Matthias Maennich <maennich@google.com>
---
 src/abg-dwarf-reader.cc | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index a6f51b9f93cd..3eb835908abd 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -7186,14 +7186,14 @@ public:
   }
 
   /// Try reading the first __ksymtab section entry as if it is in the
-  /// v4_19 format and lookup a symbol from the .symbol section to see
-  /// if that succeeds.  If it does, then we can assume the __ksymtab
-  /// section is in the v4_19 format.
+  /// pre-v4_19 format and lookup a symbol from the .symbol section to
+  /// see if that succeeds.  If it does, then we can assume the
+  /// __ksymtab section is in the pre-v4_19 format.
   ///
   /// @return the symbol resulting from the lookup of the symbol
   /// address we got from reading the first entry of the ksymtab
-  /// section assuming the v4.19 format.  If nil, it means the
-  /// __ksymtab section is not in the v4.19 format.
+  /// section assuming the pre-v4.19 format.  If nil, it means the
+  /// __ksymtab section is not in the pre-v4.19 format.
   elf_symbol_sptr
   try_reading_first_ksymtab_entry_using_pre_v4_19_format() const
   {
@@ -7215,14 +7215,14 @@ public:
   }
 
   /// Try reading the first __ksymtab section entry as if it is in the
-  /// pre-v4_19 format and lookup a symbol from the .symbol section to
-  /// see if that succeeds.  If it does, then we can assume the
-  /// __ksymtab section is in the pre-v4_19 format.
+  /// v4_19 format and lookup a symbol from the .symbol section to see
+  /// if that succeeds.  If it does, then we can assume the __ksymtab
+  /// section is in the v4_19 format.
   ///
   /// @return the symbol resulting from the lookup of the symbol
   /// address we got from reading the first entry of the ksymtab
-  /// section assuming the pre-v4.19 format.  If nil, it means the
-  /// __ksymtab section is not in the pre-v4.19 format.
+  /// section assuming the v4.19 format.  If nil, it means the
+  /// __ksymtab section is not in the v4.19 format.
   elf_symbol_sptr
   try_reading_first_ksymtab_entry_using_v4_19_format() const
   {
-- 
2.21.0.1020.gf2820cf01a-goog

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels
  2019-01-01  0:00 [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels maennich.google.com via libabigail
                   ` (3 preceding siblings ...)
  2019-01-01  0:00 ` [PATCH v2 3/4] Bug 24431 Read 32bit values when testing for the v4.19 symbol table format maennich.google.com via libabigail
@ 2019-01-01  0:00 ` Dodji Seketeli
  4 siblings, 0 replies; 6+ messages in thread
From: Dodji Seketeli @ 2019-01-01  0:00 UTC (permalink / raw)
  To: maennich; +Cc: libabigail, kernel-team

Hello Matthias,

maennich@google.com a écrit:

> This series addresses an issue with reading the __ksymtab of v4.19+ aarch64
> kernels that has been introduced post libabigail-1.6.
>
> The minimal patch would be to cast the symbol addresses acquired by
> read_int_from_array_of_bytes to int32_t. But treating that values as a
> 32bit value already when reading allows to remove the architecture specific
> treatment and the need to deal with overflows.

The patch series looks good to me, and I have checked it in as it seems
to pass on the the few kernel binaries I am watching.

As a side note, this makes me think that we need a regression test
harness for the kernel (and for all huge binaries), a bit like what we
have when we run make check today.  That new harness would have to be
separate from the libabigail tarball/repository though, as kernel
binaries are big :-) I am thinking maybe something that would be hosted
on github or somewhere, would either store (kernel) binaries or know how
to build them, and store the expected output of the various libabigail
tools on those binaries.  The harness would then run the tool on the
binary and compare the outcome with the expected one.  This is key to
ensure we keep functionalities working as we move forward.

Parties (like you) who have specific binaries/areas of interests, would
submit data to that repository to ensure that subsequent changes to the
libabigail doesn't break the particular functionality they are
interested in.

But that would be an effort for another time.  I just thought I'd
write this down for future reference :-)

Thanks!

-- 
		Dodji

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 2/4] dwarf-reader: templatize read_int_from_array_of_bytes
  2019-01-01  0:00 [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels maennich.google.com via libabigail
@ 2019-01-01  0:00 ` maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 4/4] Bug 24431 Treat __ksymtab as int32_t for v4.19+ kernels maennich.google.com via libabigail
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: maennich.google.com via libabigail @ 2019-01-01  0:00 UTC (permalink / raw)
  To: libabigail; +Cc: dodji, kernel-team, Matthias Maennich

Making the return value type a template type allows for signed types to
be passed and successfully interpreted.

	* src/abg-dwarf-reader.cc (read_int_from_array_of_bytes):
	templatize return type to allow passing of signed integer references

Signed-off-by: Matthias Maennich <maennich@google.com>
---
 src/abg-dwarf-reader.cc | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 3eb835908abd..1e7965292d58 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -6125,9 +6125,10 @@ public:
     return read_int_from_array_of_bytes(bytes, 8, is_big_endian, result);
   }
 
-  /// Read N bytes and convert their value into an uint64_t.
+  /// Read N bytes and convert their value into an integer type T.
   ///
-  /// Note that N cannot be bigger than 8 for now.
+  /// Note that N cannot be bigger than 8 for now. The type passed needs to be
+  /// at least of the size of number_of_bytes.
   ///
   /// @param bytes the array of bytes to read the next 8 bytes from.
   /// Note that this array must be at least 8 bytes long.
@@ -6138,22 +6139,24 @@ public:
   /// @param is_big_endian if true, read the 8 bytes in Big Endian
   /// mode, otherwise, read them in Little Endian.
   ///
-  /// @param result where to store the resuting uint64_t that was read.
+  /// @param result where to store the resuting integer that was read.
   ///
   ///
   /// @param true if the 8 bytes could be read, false otherwise.
+  template<typename T>
   bool
   read_int_from_array_of_bytes(const uint8_t	*bytes,
 			       unsigned char	number_of_bytes,
 			       bool		is_big_endian,
-			       uint64_t	&result) const
+			       T		&result) const
   {
     if (!bytes)
       return false;
 
     ABG_ASSERT(number_of_bytes <= 8);
+    ABG_ASSERT(number_of_bytes <= sizeof(T));
 
-    uint64_t res = 0;
+    T res = 0;
 
     const uint8_t *cur = bytes;
     if (is_big_endian)
@@ -6165,7 +6168,7 @@ public:
 
 	// Now read the remaining least significant bytes.
 	for (uint i = 1; i < number_of_bytes; ++i)
-	  res = (res << 8) | ((uint64_t)msb[i]);
+	  res = (res << 8) | ((T)msb[i]);
       }
     else
       {
@@ -6175,7 +6178,7 @@ public:
 	res = *lsb;
 	// Now read the remaining most significant bytes.
 	for (uint i = 1; i < number_of_bytes; ++i)
-	  res = res | (((uint64_t)lsb[i]) << i * 8);
+	  res = res | (((T)lsb[i]) << i * 8);
       }
 
     result = res;
-- 
2.21.0.1020.gf2820cf01a-goog

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 3/4] Bug 24431 Read 32bit values when testing for the v4.19 symbol table format
  2019-01-01  0:00 [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels maennich.google.com via libabigail
                   ` (2 preceding siblings ...)
  2019-01-01  0:00 ` [PATCH v2 1/4] dwarf-reader: Fix comments for try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format maennich.google.com via libabigail
@ 2019-01-01  0:00 ` maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels Dodji Seketeli
  4 siblings, 0 replies; 6+ messages in thread
From: maennich.google.com via libabigail @ 2019-01-01  0:00 UTC (permalink / raw)
  To: libabigail; +Cc: dodji, kernel-team, Matthias Maennich

Reading into uint64_t when reading the symbol table values drops the
sign and subsequently offset calculations will only be correct if either
the offset is positive or if the calculation overflows.

Read the relative value as signed int32_t (indepently of the target's
bitness) to allow negative offsets.  That also allows to drop the code
that formerly handled the overflow.

That change fixes an assertion raised when dealing with aarch64 kernel
binaries that have a __ksymtab with 32bit relocations. i.e. Bug #24431

	* src/abg-dwarf-reader.cc
	(try_reading_first_ksymtab_entry_using_v4_19_format): attempt to
	read first __ksymtab entry into int32_t to preserve sign

Signed-off-by: Matthias Maennich <maennich@google.com>
---
 src/abg-dwarf-reader.cc | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 1e7965292d58..107cd6d90008 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -7234,29 +7234,17 @@ public:
     uint8_t *bytes = reinterpret_cast<uint8_t*>(elf_data->d_buf);
     bool is_big_endian = elf_architecture_is_big_endian();
     elf_symbol_sptr symbol;
-    unsigned char symbol_value_size = 4;
-    unsigned char arch_word_size = architecture_word_size();
 
+    int32_t offset = 0;
+    const unsigned char symbol_value_size = sizeof(offset);
     GElf_Addr symbol_address = 0, adjusted_symbol_address = 0;
     ABG_ASSERT(read_int_from_array_of_bytes(bytes,
 					    symbol_value_size,
 					    is_big_endian,
-					    symbol_address));
+					    offset));
     GElf_Shdr mem;
     GElf_Shdr *section_header = gelf_getshdr(section, &mem);
-    if (arch_word_size == 4)
-      symbol_address = (uint32_t)(symbol_address + section_header->sh_addr);
-    else if (arch_word_size == 8)
-      {
-	symbol_address = symbol_address + section_header->sh_addr;
-	if (symbol_address < ((uint64_t)1 << 32))
-	  // The symbol address is expressed in 32 bits.  So let's
-	  // convert it to a 64 bits address with the 4 most
-	  // significant bytes set to ff each.
-	  symbol_address = ((uint64_t)0xffffffff << 32) | symbol_address;
-      }
-    else
-      ABG_ASSERT_NOT_REACHED;
+    symbol_address = offset + section_header->sh_addr;
 
     adjusted_symbol_address = maybe_adjust_fn_sym_address(symbol_address);
     symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
-- 
2.21.0.1020.gf2820cf01a-goog

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 4/4] Bug 24431 Treat __ksymtab as int32_t for v4.19+ kernels
  2019-01-01  0:00 [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 2/4] dwarf-reader: templatize read_int_from_array_of_bytes maennich.google.com via libabigail
@ 2019-01-01  0:00 ` maennich.google.com via libabigail
  2019-01-01  0:00 ` [PATCH v2 1/4] dwarf-reader: Fix comments for try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format maennich.google.com via libabigail
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: maennich.google.com via libabigail @ 2019-01-01  0:00 UTC (permalink / raw)
  To: libabigail; +Cc: dodji, kernel-team, Matthias Maennich

Calculating the relocation for values in __ksymtab with GElf_Addr (i.e.
uint64_t), makes the calculation rely on overflows for negative offsets.
Address that by treating these as 32bit signed values for the v4.19+
__ksymtabs and calculate the offset with them. This also allows, similar
an earlier commit, to drop the distinction between 64bit and 32bit
kernels.

	* src/abg-dwarf-reader.cc (maybe_adjust_sym_address_from_v4_19_ksymtab):
	treat passed addr as 32bit signed offset in case of v4.19+ __ksymtabs

Signed-off-by: Matthias Maennich <maennich@google.com>
---
 src/abg-dwarf-reader.cc | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 107cd6d90008..08840b577d65 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -7671,24 +7671,10 @@ public:
 
     if (get_ksymtab_format() == V4_19_KSYMTAB_FORMAT)
       {
+	int32_t offset = addr;
 	GElf_Shdr mem;
 	GElf_Shdr *section_header = gelf_getshdr(ksymtab_section, &mem);
-	if (architecture_word_size() == 4)
-	  result = (uint32_t)(addr + section_header->sh_addr + addr_offset);
-	else if (architecture_word_size() == 8)
-	  {
-	    result = addr + section_header->sh_addr + addr_offset;
-	    if (result < ((uint64_t)1 << 32))
-	      // The symbol address is expressed in 32 bits.  So let's
-	      // convert it to a 64 bits address with the 4 most
-	      // significant bytes set to ff each.  This is how 64
-	      // bits addresses of symbols are in the .symbol section,
-	      // so we need this address to be consistent with that
-	      // format.
-	      result = ((uint64_t)0xffffffff << 32) | result;
-	  }
-	else
-	  ABG_ASSERT_NOT_REACHED;
+	result = offset + section_header->sh_addr + addr_offset;
       }
 
     return result;
-- 
2.21.0.1020.gf2820cf01a-goog

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-05-10  5:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-01  0:00 [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels maennich.google.com via libabigail
2019-01-01  0:00 ` [PATCH v2 2/4] dwarf-reader: templatize read_int_from_array_of_bytes maennich.google.com via libabigail
2019-01-01  0:00 ` [PATCH v2 4/4] Bug 24431 Treat __ksymtab as int32_t for v4.19+ kernels maennich.google.com via libabigail
2019-01-01  0:00 ` [PATCH v2 1/4] dwarf-reader: Fix comments for try_reading_first_ksymtab_entry_using_{pre_,}v4_19_format maennich.google.com via libabigail
2019-01-01  0:00 ` [PATCH v2 3/4] Bug 24431 Read 32bit values when testing for the v4.19 symbol table format maennich.google.com via libabigail
2019-01-01  0:00 ` [PATCH v2 0/4] Bug 24431 Fix analysis of v4.19 aarch64 kernels Dodji Seketeli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).