public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed
From: Jessica Yu <jeyu@kernel.org>
To: Matthias Maennich <maennich@google.com>
Cc: libabigail@sourceware.org, Dodji Seketeli <dodji@redhat.com>,
	kernel-team@android.com
Subject: Re: [RFC PATCH] Support pre and post v4.19 ksymtabs for Linux kernel modules
Date: Tue, 01 Jan 2019 00:00:00 -0000	[thread overview]
Message-ID: <20190821123436.GB8668@linux-8ccs> (raw)
In-Reply-To: <20190819080508.GA189145@google.com>

+++ Matthias Maennich [19/08/19 09:05 +0100]:
>Hi Jessica!
>
>On Fri, 09 Aug, 17:49, Jessica Yu wrote:
>>As described in commit ad8c2531fb9, the format of the Linux kernel
>>ksymtab changed in v4.19 to use relative references instead of absolute
>>references. This changes the type of relocations emitted for ksymtab
>>sections to be place-relative 32-bit relocations instead of absolute
>>relocations. One side-effect of this is that libdwfl will not relocate
>>the ksymtab sections due to the PC-relative relocations. This breaks
>>load_kernel_symbol_table() for kernel modules because it only reads in
>>zeros from the unrelocated ksymtab section and is subsequently unable to
>>determine what exported symbols it refers to. Since a vmlinux binary is
>>already fully linked and relocated (and therefore we can read its
>>ksymtab section just fine), this problem is only relevant to Linux
>>kernel modules.
>>
>>To work around this, we utilize the ksymtab relocation sections to
>>determine which symbols the ksymtab entries refer to. We do this by
>>inspecting each relocation's r_info field for the symbol table index and
>>from there we are able to read each symbol's value and subsequently add
>>that to the set of exported symbols.
>>
>>In addition, for Linux kernel modules, we can utilize relocation types
>>to implement a new heuristic to determine the ksymtab format we have.
>>The presence of PC-relative relocations suggest the new v4.19 format,
>>and absolute relocation types suggest the old pre v4.19 format.
>>
>>        * include/abg-ir.h (elf_symbol::{elf_symbol, create}): Take new
>>        symbol value and shndx parameters.
>>        (elf_symbol::{get_value, get_shndx}): Declare new accessors.
>>        * src/abg-ir.cc (elf_symbol::priv::{value_, shndx_}): New data members.
>>        (elf_symbol::priv::priv): Adjust.
>>        (elf_symbol::elf_symbol): Take new value and shndx parameters.
>>        (elf_symbol::create): Likewise.
>>        (elf_symbol::{get_value, get_shndx}): Define new accessors.
>>        * src/abg-reader.cc (build_elf_symbol): Adjust.
>>        * src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab)
>>        (lookup_symbol_from_gnu_hash_tab)
>>        (lookup_symbol_from_symtab): Adjust.
>>        (read_context::{ksymtab_reloc_section_,
>>        ksymtab_gpl_reloc_section_,
>>        ksymtab_strings_section_}): New data members.
>>        (read_context::read_context): Initialize ksymtab_reloc_section_,
>>        ksymtab_gpl_reloc_section_, ksymtab_strings_section_.
>>        (read_context::{find_ksymtab_reloc_section,
>>        find_ksymtab_gpl_reloc_section, find_ksymtab_strings_section,
>>        find_any_ksymtab_reloc_section, get_ksymtab_format_module,
>>        populate_symbol_map_from_ksymtab,
>>        populate_symbol_map_from_ksymtab_reloc,
>>        is_linux_kernel_module}): New member functions.
>>        (read_context::load_kernel_symbol_table): Adjust to call either
>>        populate_symbol_map_from_ksymtab{_reloc,} depending on ksymtab
>>        format.
>>        (read_context::get_ksymtab_format): Adjust to call
>>        get_ksymtab_format_module for linux kernel modules.
>>        (read_context::lookup_elf_symbol_from_index): Adjust.
>>        (create_default_var_sym, create_default_fn_sym): Adjust.
>>
>>Signed-off-by: Jessica Yu <jeyu@kernel.org>
>>---
>>include/abg-ir.h        |  10 ++
>>src/abg-dwarf-reader.cc | 391 +++++++++++++++++++++++++++++++++++++++++-------
>>src/abg-ir.cc           |  38 ++++-
>>src/abg-reader.cc       |   2 +
>>4 files changed, 388 insertions(+), 53 deletions(-)
>>
>>diff --git a/include/abg-ir.h b/include/abg-ir.h
>>index 170a844..d26f472 100644
>>--- a/include/abg-ir.h
>>+++ b/include/abg-ir.h
>>@@ -796,6 +796,8 @@ private:
>>  elf_symbol(const environment* e,
>>	     size_t		i,
>>	     size_t		s,
>>+	     uint64_t		v,
>>+	     uint32_t		x,
>
>I would like to see some comments here to describe the purpose of the
>struct's members. I know there weren't any before, but maybe that is a
>good chance.
>
>Also, I am not entirely sure why these are all one-character names, but
>it makes it hard to come from x to "section header index". Maybe we
>should give them more meaningful names.
>
>>	     const string&	n,
>>	     type		t,
>>	     binding		b,
>>@@ -818,6 +820,8 @@ public:
>>  create(const environment*	e,
>>	 size_t		i,
>>	 size_t		s,
>>+	 uint64_t	v,
>>+	 uint32_t	x,
>>	 const string&		n,
>>	 type			t,
>>	 binding		b,
>>@@ -838,6 +842,12 @@ public:
>>  void
>>  set_index(size_t);
>>
>>+  uint64_t
>>+  get_value() const;
>>+
>>+  uint32_t
>>+  get_shndx() const;
>>+
>>  const string&
>>  get_name() const;
>>
>>diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
>>index 47af7e3..835bf3d 100644
>>--- a/src/abg-dwarf-reader.cc
>>+++ b/src/abg-dwarf-reader.cc
>>@@ -1724,6 +1724,8 @@ lookup_symbol_from_sysv_hash_tab(const environment*		env,
>>  GElf_Sym symbol;
>>  const char* sym_name_str;
>>  size_t sym_size;
>>+  uint64_t sym_value;
>>+  uint32_t sym_shndx;
>>  elf_symbol::type sym_type;
>>  elf_symbol::binding sym_binding;
>>  elf_symbol::visibility sym_visibility;
>>@@ -1743,6 +1745,8 @@ lookup_symbol_from_sysv_hash_tab(const environment*		env,
>>	  sym_visibility =
>>	    stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(symbol.st_other));
>>	  sym_size = symbol.st_size;
>>+	  sym_value = symbol.st_value;
>>+	  sym_shndx = symbol.st_shndx;
>>	  elf_symbol::version ver;
>>	  if (get_version_for_symbol(elf_handle, symbol_index,
>>				     /*get_def_version=*/true, ver))
>>@@ -1751,6 +1755,8 @@ lookup_symbol_from_sysv_hash_tab(const environment*		env,
>>	    elf_symbol::create(env,
>>			       symbol_index,
>>			       sym_size,
>>+			       sym_value,
>>+			       sym_shndx,
>>			       sym_name_str,
>>			       sym_type,
>>			       sym_binding,
>>@@ -2028,7 +2034,8 @@ lookup_symbol_from_gnu_hash_tab(const environment*		env,
>>	    ABG_ASSERT(!ver.str().empty());
>>
>>	  elf_symbol_sptr symbol_found =
>>-	    elf_symbol::create(env, i, symbol.st_size, sym_name_str,
>>+	    elf_symbol::create(env, i, symbol.st_size, symbol.st_value,
>>+			       symbol.st_shndx, sym_name_str,
>>			       sym_type, sym_binding,
>>			       symbol.st_shndx != SHN_UNDEF,
>>			       symbol.st_shndx == SHN_COMMON,
>>@@ -2181,6 +2188,7 @@ lookup_symbol_from_symtab(const environment*		env,
>>	    ABG_ASSERT(!ver.str().empty());
>>	  elf_symbol_sptr symbol_found =
>>	    elf_symbol::create(env, i, sym->st_size,
>>+			       sym->st_value, sym->st_shndx,
>>			       name_str, sym_type,
>>			       sym_binding, sym_is_defined,
>>			       sym_is_common, ver, sym_visibility);
>>@@ -3081,7 +3089,10 @@ public:
>>  /// to symbols exported using the EXPORT_SYMBOL_GPL macro from the
>>  /// linux kernel.
>>  Elf_Scn*			ksymtab_section_;
>>+  Elf_Scn*			ksymtab_reloc_section_;
>>  Elf_Scn*			ksymtab_gpl_section_;
>>+  Elf_Scn*			ksymtab_gpl_reloc_section_;
>>+  Elf_Scn*			ksymtab_strings_section_;
>>  Elf_Scn*			versym_section_;
>>  Elf_Scn*			verdef_section_;
>>  Elf_Scn*			verneed_section_;
>>@@ -3277,7 +3288,10 @@ public:
>>    nb_ksymtab_entries_ = 0;
>>    nb_ksymtab_gpl_entries_ = 0;
>>    ksymtab_section_ = 0;
>>+    ksymtab_reloc_section_ = 0;
>>    ksymtab_gpl_section_ = 0;
>>+    ksymtab_gpl_reloc_section_ = 0;
>>+    ksymtab_strings_section_ = 0;
>>    versym_section_ = 0;
>>    verdef_section_ = 0;
>>    verneed_section_ = 0;
>>@@ -6079,6 +6093,23 @@ public:
>>    return ksymtab_section_;
>>  }
>>
>>+  /// Return the .rel{a,}__ksymtab section of a linux kernel ELF file (either
>>+  /// a vmlinux binary or a kernel module).
>>+  ///
>>+  /// @return the .rel{a,}__ksymtab section if found, nil otherwise.
>>+  Elf_Scn*
>>+  find_ksymtab_reloc_section() const
>>+  {
>>+    if (!ksymtab_reloc_section_)
>>+      {
>>+        Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab", SHT_RELA);
>>+        if (!sec)
>>+          sec = find_section(elf_handle(), ".rel__ksymtab", SHT_REL);
>>+        const_cast<read_context*>(this)->ksymtab_reloc_section_ = sec;
>>+      }
>>+    return ksymtab_reloc_section_;
>>+  }
>>+
>>  /// Return the __ksymtab_gpl section of a linux kernel ELF file
>>  /// (either a vmlinux binary or a kernel module).
>>  ///
>>@@ -6092,6 +6123,36 @@ public:
>>    return ksymtab_gpl_section_;
>>  }
>>
>>+  /// Return the .rel{a,}__ksymtab_gpl section of a linux kernel ELF file
>>+  /// (either a vmlinux binary or a kernel module).
>>+  ///
>>+  /// @return the .rel{a,}__ksymtab_gpl section if found, nil otherwise.
>>+  Elf_Scn*
>>+  find_ksymtab_gpl_reloc_section() const
>>+  {
>>+    if (!ksymtab_gpl_reloc_section_)
>>+      {
>>+        Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab_gpl", SHT_RELA);
>>+        if (!sec)
>>+          sec = find_section(elf_handle(), ".rel__ksymtab_gpl", SHT_REL);
>>+        const_cast<read_context*>(this)->ksymtab_gpl_reloc_section_ = sec;
>>+      }
>>+    return ksymtab_gpl_reloc_section_;
>>+  }
>>+
>>+  /// Return the __ksymtab_strings section of a linux kernel ELF file
>>+  /// (either a vmlinux binary or a kernel module).
>>+  ///
>>+  /// @return the __ksymtab_strings section if found, nil otherwise.
>>+  Elf_Scn*
>>+  find_ksymtab_strings_section() const
>>+  {
>>+    if (!ksymtab_strings_section_)
>>+      const_cast<read_context*>(this)->ksymtab_strings_section_ =
>>+        find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS);
>>+    return ksymtab_strings_section_;
>>+  }
>>+
>
>We should find a way to deduplicate these functions. Not necessarily in
>this patch, but in a followup. Most of them only differ in the section
>name and whether they check for an alternative. Something generic like
>
> struct SectionQuery{ string name, int type };
> Elf_Scn* find_section(const vector<SectionQuery>&);
>
>should be sufficient to forward from all the other variants to one
>implementation.
>
>>  /// Return either a __ksymtab or a __ksymtab_gpl section, in case
>>  /// only the __ksymtab_gpl exists.
>>  ///
>>@@ -6106,6 +6167,19 @@ public:
>>    return result;
>>  }
>>
>>+  /// Return either a .rel{a,}__ksymtab or a .rel{a,}__ksymtab_gpl section
>>+  ///
>>+  /// @return the .rel{a,}__ksymtab section if it exists, or the
>>+  /// .rel{a,}__ksymtab_gpl; or NULL if neither is found.
>>+  Elf_Scn*
>>+  find_any_ksymtab_reloc_section() const
>>+  {
>>+    Elf_Scn *result = find_ksymtab_reloc_section();
>>+    if (!result)
>>+      result = find_ksymtab_gpl_reloc_section();
>>+    return result;
>>+  }
>>+
>>  /// Return the SHT_GNU_versym, SHT_GNU_verdef and SHT_GNU_verneed
>>  /// sections that are involved in symbol versionning.
>>  ///
>>@@ -6283,7 +6357,8 @@ public:
>>      stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(s->st_other));
>>
>>    elf_symbol_sptr sym =
>>-      elf_symbol::create(env(), symbol_index, s->st_size, name_str,
>>+      elf_symbol::create(env(), symbol_index, s->st_size,
>>+			 s->st_value, s->st_shndx, name_str,
>>			 stt_to_elf_symbol_type(GELF_ST_TYPE(s->st_info)),
>>			 stb_to_elf_symbol_binding(GELF_ST_BIND(s->st_info)),
>>			 sym_is_defined, sym_is_common, ver, vis);
>>@@ -7435,6 +7510,64 @@ public:
>>    return symbol;
>>  }
>>
>>+  /// Determine the format of the __ksymtab and __ksymtab_gpl
>>+  /// sections of Linux kernel modules.
>>+  ///
>>+  /// This is important because we need the know the format of these
>
>s/need the/need to/                       ^^^
>
>>+  /// sections to be able to read from them.
>>+  ///
>>+  /// @return the format the __ksymtab[_gpl] sections.
>>+  enum ksymtab_format
>>+  get_ksymtab_format_module() const
>>+  {
>>+    Elf_Scn *section = find_any_ksymtab_reloc_section();
>>+
>>+    if (!section)
>>+      return UNDEFINED_KSYMTAB_FORMAT;
>>+
>>+    // Libdwfl has a weird quirk where, in the process of obtaining an Elf
>>+    // descriptor via dwfl_module_getelf(), it will apply all relocations it
>>+    // knows how to and it will zero a relocation after applying it. If the
>
>will zero the relocation info after
>
>>+    // .rela__ksymtab* section contained only simple (absolute) relocations,
>>+    // they will have been all applied and sh_size will be 0. For arches that
>>+    // support relative ksymtabs, simple relocations only appear in pre-4.19
>>+    // kernel modules.
>>+    GElf_Shdr section_mem;
>>+    GElf_Shdr *section_shdr = gelf_getshdr(section, &section_mem);
>>+    if (section_shdr->sh_size == 0)
>>+      return PRE_V4_19_KSYMTAB_FORMAT;
>>+    /* SHT_REL sections not supported yet */
>>+    if (section_shdr->sh_type == SHT_REL)
>>+      return UNDEFINED_KSYMTAB_FORMAT;
>>+
>>+    // If we still have a normal non-zeroed relocation section, we can guess
>>+    // what format the ksymtab is in depending on what types of relocs it
>>+    // contains.
>>+    GElf_Rela rela;
>>+    Elf_Data *rela_section_data = elf_rawdata(section, 0);
>>+    gelf_getrela(rela_section_data, 0, &rela);
>>+
>>+    // Sigh, I dislike the arch-dependent code here, but this seems to be a
>>+    // reliable heuristic for kernel modules for now. Relative ksymtabs only
>>+    // supported on x86 and arm64 as of v4.19.
>>+    ksymtab_format format;
>>+    switch (GELF_R_TYPE(rela.r_info))
>>+      {
>>+      case R_X86_64_64: // Same as R_386_32
>
>add 'fallthrough' to the comments, else younger compilers will warn
>about the potentially unintentional fallthrough.
>
>>+      case R_AARCH64_ABS64:
>>+        format = PRE_V4_19_KSYMTAB_FORMAT;
>>+	break;
>>+      case R_X86_64_PC32: // Same as R_386_PC32
>>+      case R_AARCH64_PREL32:
>>+	format = V4_19_KSYMTAB_FORMAT;
>>+	break;
>>+      default:
>>+        format = UNDEFINED_KSYMTAB_FORMAT;
>
>We should probably emit a warning in this case. Are there more
>architectures that have PREL32 enabled as of now that are not yet
>covered here?
>
>>+	break;
>>+      }
>>+    return format;
>>+  }
>>+
>>  /// Determine the format of the __ksymtab and __ksymtab_gpl
>>  /// sections.
>>  ///
>>@@ -7451,6 +7584,18 @@ public:
>>      {
>>	if (ksymtab_format_ == UNDEFINED_KSYMTAB_FORMAT)
>>	  {
>>+            // Since Linux kernel modules are relocatable, we can first try
>>+            // using a heuristic based on relocations to guess the ksymtab format.
>>+            if (is_linux_kernel_module())
>>+              {
>>+                ksymtab_format_ = get_ksymtab_format_module();
>>+                if (ksymtab_format_ != UNDEFINED_KSYMTAB_FORMAT)
>>+                  return ksymtab_format_;
>>+              }
>
>nit: use tabs
>
>>+
>>+	    // If it's not a kernel module or we couldn't determine its format
>>+	    // with relocations, fall back to the heuristic below.
>>+
>>	    // OK this is a dirty little heuristic to determine the
>>	    // format of the ksymtab section.
>>	    //
>>@@ -7553,48 +7698,24 @@ public:
>>    return nb_ksymtab_gpl_entries_;
>>  }
>>
>>-  /// Load a given kernel symbol table.
>>+  /// Populate the symbol map by reading exported symbols from the
>>+  /// ksymtab directly.
>>  ///
>>-  /// One can thus retrieve the resulting symbols by using the
>>-  /// accessors read_context::linux_exported_fn_syms(),
>>-  /// read_context::linux_exported_var_syms(),
>>-  /// read_context::linux_exported_gpl_fn_syms(), or
>>-  /// read_context::linux_exported_gpl_var_syms().
>>+  /// @param section the ksymtab section to read from
>>  ///
>>-  /// @param kind the kind of kernel symbol table to load.
>>+  /// @param exported_fns_set the set of exported functions
>>+  ///
>>+  /// @param exported_vars_set the set of exported variables
>>+  ///
>>+  /// @param nb_entries the number of ksymtab entries to read
>>  ///
>>  /// @return true upon successful completion, false otherwise.
>>  bool
>>-  load_kernel_symbol_table(kernel_symbol_table_kind kind)
>>+  populate_symbol_map_from_ksymtab(Elf_Scn *section,
>>+                                   address_set_sptr exported_fns_set,
>>+                                   address_set_sptr exported_vars_set,
>>+                                   size_t nb_entries)
>>  {
>>-    size_t nb_entries = 0;
>>-    Elf_Scn *section = 0;
>>-    address_set_sptr linux_exported_fns_set, linux_exported_vars_set;
>>-
>>-    switch (kind)
>>-      {
>>-      case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED:
>>-	break;
>>-      case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB:
>>-	section = find_ksymtab_section();
>>-	nb_entries = get_nb_ksymtab_entries();
>>-	linux_exported_fns_set = create_or_get_linux_exported_fn_syms();
>>-	linux_exported_vars_set = create_or_get_linux_exported_var_syms();
>>-	break;
>>-      case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL:
>>-	section = find_ksymtab_gpl_section();
>>-	nb_entries = get_nb_ksymtab_gpl_entries();
>>-	linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms();
>>-	linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms();
>>-	break;
>>-      }
>>-
>>-    if (!linux_exported_vars_set
>>-	|| !linux_exported_fns_set
>>-	|| !section
>>-	|| !nb_entries)
>>-      return false;
>>-
>>    // The data of the section.
>>    Elf_Data *elf_data = elf_rawdata(section, 0);
>>
>>@@ -7654,28 +7775,30 @@ public:
>>	// (aka ELF symbol table).
>>	symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
>>	if (!symbol)
>>-	  {
>>-	    adjusted_symbol_address =
>>-	      maybe_adjust_var_sym_address(symbol_address);
>>-	    symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
>>-	    if (!symbol)
>>-	      // This must be a symbol that is of type neither FUNC
>>-	      // (function) nor OBJECT (variable).  There are for intance,
>>-	      // symbols of type 'NOTYPE' in the ksymtab symbol table.  I
>>-	      // am not sure what those are.
>>-	      continue;
>>-	  }
>>+          {
>>+            adjusted_symbol_address =
>>+              maybe_adjust_var_sym_address(symbol_address);
>>+            symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
>>+            if (!symbol)
>>+              {
>>+	        // This must be a symbol that is of type neither FUNC
>>+	        // (function) nor OBJECT (variable).  There are for intance,
>>+	        // symbols of type 'NOTYPE' in the ksymtab symbol table.  I
>>+	        // am not sure what those are.
>>+	        continue;
>>+	      }
>>+          }
>
>This part of the diff seems to a change purely of the indentation/tabs.
>Maybe you can revert this part.
>
>>
>>	address_set_sptr set;
>>	if (symbol->is_function())
>>	  {
>>	    ABG_ASSERT(lookup_elf_fn_symbol_from_address(adjusted_symbol_address));
>>-	    set = linux_exported_fns_set;
>>+	    set = exported_fns_set;
>>	  }
>>	else if (symbol->is_variable())
>>	  {
>>	    ABG_ASSERT(lookup_elf_var_symbol_from_address(adjusted_symbol_address));
>>-	    set = linux_exported_vars_set;
>>+	    set = exported_vars_set;
>>	  }
>>	else
>>	  ABG_ASSERT_NOT_REACHED;
>>@@ -7684,6 +7807,157 @@ public:
>>    return true;
>>  }
>>
>>+  /// Populate the symbol map by extracting the exported symbols from a
>>+  /// ksymtab rela section.
>>+  ///
>>+  /// @param section the ksymtab section to read from
>>+  ///
>>+  /// @param strings_section the __ksymtab_strings section
>>+  ///
>>+  /// @param exported_fns_set the set of exported functions
>>+  ///
>>+  /// @param exported_vars_set the set of exported variables
>>+  ///
>>+  /// @return true upon successful completion, false otherwise.
>>+  bool
>>+  populate_symbol_map_from_ksymtab_reloc(Elf_Scn *reloc_section, Elf_Scn *strings_section,
>>+                                         address_set_sptr exported_fns_set,
>>+                                         address_set_sptr exported_vars_set)
>>+  {
>>+    GElf_Shdr reloc_section_mem;
>>+    GElf_Shdr *reloc_section_shdr = gelf_getshdr(reloc_section, &reloc_section_mem);
>
>gelf_getshdr returns &reloc_section_mem. So you might want to use
>reloc_section_mem directly rather than introducing reloc_section_mem.
>
>>+    size_t reloc_count = reloc_section_shdr->sh_size / reloc_section_shdr->sh_entsize;
>>+
>>+    Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0);
>>+    size_t strings_ndx = elf_ndxscn(strings_section);
>>+
>>+    GElf_Rela rela;
>>+    elf_symbol_sptr symbol;
>>+    for (unsigned int i = 0; i < reloc_count; i++)
>>+      {
>>+        gelf_getrela(reloc_section_data, i, &rela);
>>+        symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info));
>>+
>>+        // This shouldn't happen..
>>+        if (!symbol)
>
>Maybe ABG_ASSERT then?
>
>>+          return false;
>>+
>>+        // If the symbol is from __ksymtab_strings, ignore it.
>>+        if (symbol->get_shndx() == strings_ndx)
>>+          continue;
>>+
>>+	if (!symbol->is_function() && !symbol->is_variable())
>>+          {
>>+            if (do_log())
>>+              {
>>+                if (symbol->get_type() == elf_symbol::NOTYPE_TYPE)
>>+                  cerr << "skipping NOTYPE symbol "
>>+                       << symbol->get_name()
>>+                       << " shndx: "
>>+                       << symbol->get_index()
>>+                       << " @"
>>+                       << elf_path()
>>+                       << "\n";
>>+                else if (symbol->get_type() == elf_symbol::SECTION_TYPE)
>>+                  cerr << "skipping SECTION symbol "
>>+                       << "shndx: "
>>+                       << symbol->get_index()
>>+                       << " @"
>>+                       << elf_path()
>>+                       << "\n";
>>+               }
>>+            continue;
>>+	  }
>>+
>>+	address_set_sptr set;
>>+	if (symbol->is_function())
>>+	  {
>>+	    ABG_ASSERT(lookup_elf_fn_symbol_from_address(symbol->get_value()));
>>+	    set = exported_fns_set;
>>+	  }
>>+	else if (symbol->is_variable())
>>+	  {
>>+	    ABG_ASSERT(lookup_elf_var_symbol_from_address(symbol->get_value()));
>>+	    set = exported_vars_set;
>>+	  }
>>+	else
>>+	  ABG_ASSERT_NOT_REACHED;
>>+	set->insert(symbol->get_value());
>>+      }
>>+    return true;
>>+  }
>>+
>>+  /// Load a given kernel symbol table.
>>+  ///
>>+  /// One can thus retrieve the resulting symbols by using the
>>+  /// accessors read_context::linux_exported_fn_syms(),
>>+  /// read_context::linux_exported_var_syms(),
>>+  /// read_context::linux_exported_gpl_fn_syms(), or
>>+  /// read_context::linux_exported_gpl_var_syms().
>>+  ///
>>+  /// @param kind the kind of kernel symbol table to load.
>>+  ///
>>+  /// @return true upon successful completion, false otherwise.
>>+  bool
>>+  load_kernel_symbol_table(kernel_symbol_table_kind kind)
>>+  {
>>+    size_t nb_entries = 0;
>>+    Elf_Scn *section = 0, *reloc_section = 0, *strings_section = 0;
>>+    address_set_sptr linux_exported_fns_set, linux_exported_vars_set;
>>+
>>+    switch (kind)
>>+      {
>>+      case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED:
>>+	break;
>>+      case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB:
>>+	section = find_ksymtab_section();
>>+	reloc_section = find_ksymtab_reloc_section();
>>+	nb_entries = get_nb_ksymtab_entries();
>>+	linux_exported_fns_set = create_or_get_linux_exported_fn_syms();
>>+	linux_exported_vars_set = create_or_get_linux_exported_var_syms();
>>+	break;
>>+      case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL:
>>+	section = find_ksymtab_gpl_section();
>>+	reloc_section = find_ksymtab_gpl_reloc_section();
>>+	nb_entries = get_nb_ksymtab_gpl_entries();
>>+	linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms();
>>+	linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms();
>>+	break;
>>+      }
>>+
>>+    strings_section = find_ksymtab_strings_section();
>>+
>>+    if (!linux_exported_vars_set
>>+	|| !linux_exported_fns_set
>>+	|| !section
>>+	|| !strings_section
>>+	|| !nb_entries)
>>+      return false;
>>+
>>+    GElf_Ehdr eh_mem;
>>+    GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem);
>>+    ksymtab_format format = get_ksymtab_format();
>>+
>>+    // Although pre-v4.19 kernel modules can have a relocation section for the
>>+    // __ksymtab section, libdwfl zeroes the rela section after applying
>>+    // "simple" absolute relocations via dwfl_module_getelf(). For v4.19 and
>>+    // above, we get PC-relative relocations so dwfl_module_getelf() doesn't
>>+    // apply those relocations and we're safe to read the relocation section to
>>+    // determine which exported symbols are in the ksymtab.
>>+    //
>>+    // Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is
>>+    // supported as of now.
>>+    if (!reloc_section
>>+	|| format == PRE_V4_19_KSYMTAB_FORMAT
>>+	|| elf_header->e_ident[EI_CLASS] == ELFCLASS32)
>>+      return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set,
>>+                                              linux_exported_vars_set, nb_entries);
>>+    else
>>+      return populate_symbol_map_from_ksymtab_reloc(reloc_section, strings_section,
>>+                                                    linux_exported_fns_set,
>>+                                                    linux_exported_vars_set);
>>+  }
>>+
>>  /// Load the special __ksymtab section. This is for linux kernel
>>  /// (module) files.
>>  ///
>>@@ -8352,6 +8626,17 @@ public:
>>  is_linux_kernel_binary() const
>>  {return find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS);}
>>
>>+  /// Guess if the current binary is a Linux Kernel module.
>>+  ///
>>+  /// To guess that, the function looks for the presence of the
>>+  /// special ".modinfo" section in the binary.
>>+  ///
>>+  bool
>>+  is_linux_kernel_module() const
>>+  {
>>+    return find_section(elf_handle(), ".modinfo", SHT_PROGBITS);
>>+  }
>>+
>
>Also refer to this proposal to restrict checking for a module a bit
>more: https://sourceware.org/ml/libabigail/2019-q3/msg00022.html
>
>>  /// Getter of the "show_stats" flag.
>>  ///
>>  /// This flag tells if we should emit statistics about various
>>@@ -15734,6 +16019,8 @@ create_default_var_sym(const string& sym_name, const environment *env)
>>    elf_symbol::create(env,
>>		       /*symbol index=*/ 0,
>>		       /*symbol size=*/ 0,
>>+		       /*symbol value=*/ 0,
>>+		       /*symbol shndx=*/ 0,
>>		       sym_name,
>>		       /*symbol type=*/ elf_symbol::OBJECT_TYPE,
>>		       /*symbol binding=*/ elf_symbol::GLOBAL_BINDING,
>>@@ -16173,6 +16460,8 @@ create_default_fn_sym(const string& sym_name, const environment *env)
>>    elf_symbol::create(env,
>>		       /*symbol index=*/ 0,
>>		       /*symbol size=*/ 0,
>>+		       /*symbol value=*/ 0,
>>+		       /*symbol shndx=*/ 0,
>>		       sym_name,
>>		       /*symbol type=*/ elf_symbol::FUNC_TYPE,
>>		       /*symbol binding=*/ elf_symbol::GLOBAL_BINDING,
>>diff --git a/src/abg-ir.cc b/src/abg-ir.cc
>>index a5243c3..ad1c261 100644
>>--- a/src/abg-ir.cc
>>+++ b/src/abg-ir.cc
>>@@ -1169,6 +1169,8 @@ struct elf_symbol::priv
>>  const environment*	env_;
>>  size_t		index_;
>>  size_t		size_;
>>+  uint64_t		value_;
>>+  uint32_t		shndx_;
>>  string		name_;
>>  elf_symbol::type	type_;
>>  elf_symbol::binding	binding_;
>>@@ -1213,6 +1215,8 @@ struct elf_symbol::priv
>>    : env_(),
>>      index_(),
>>      size_(),
>>+      value_(),
>>+      shndx_(),
>>      type_(elf_symbol::NOTYPE_TYPE),
>>      binding_(elf_symbol::GLOBAL_BINDING),
>>      visibility_(elf_symbol::DEFAULT_VISIBILITY),
>>@@ -1223,6 +1227,8 @@ struct elf_symbol::priv
>>  priv(const environment*		e,
>>       size_t				i,
>>       size_t				s,
>>+       uint64_t				v,
>>+       uint32_t				x,
>>       const string&			n,
>>       elf_symbol::type		t,
>>       elf_symbol::binding		b,
>>@@ -1233,6 +1239,8 @@ struct elf_symbol::priv
>>    : env_(e),
>>      index_(i),
>>      size_(s),
>>+      value_(v),
>>+      shndx_(x),
>>      name_(n),
>>      type_(t),
>>      binding_(b),
>>@@ -1269,6 +1277,10 @@ elf_symbol::elf_symbol()
>>///
>>/// @param s the size of the symbol.
>>///
>>+/// @param v the value of the symbol.
>>+///
>>+/// @param x the section header index (shndx) of the symbol.
>>+///
>>/// @param n the name of the symbol.
>>///
>>/// @param t the type of the symbol.
>>@@ -1285,6 +1297,8 @@ elf_symbol::elf_symbol()
>>elf_symbol::elf_symbol(const environment*	e,
>>		       size_t			i,
>>		       size_t			s,
>>+		       uint64_t			v,
>>+		       uint32_t			x,
>>		       const string&		n,
>>		       type			t,
>>		       binding			b,
>>@@ -1292,7 +1306,7 @@ elf_symbol::elf_symbol(const environment*	e,
>>		       bool			c,
>>		       const version&		ve,
>>		       visibility		vi)
>>-  : priv_(new priv(e, i, s, n, t, b, d, c, ve, vi))
>>+  : priv_(new priv(e, i, s, v, x, n, t, b, d, c, ve, vi))
>>{}
>>
>>/// Factory of instances of @ref elf_symbol.
>>@@ -1319,6 +1333,10 @@ elf_symbol::create()
>>///
>>/// @param s the size of the symbol.
>>///
>>+/// @param v the value of the symbol.
>>+///
>>+/// @param x the section header index (shndx) of the symbol.
>>+///
>>/// @param n the name of the symbol.
>>///
>>/// @param t the type of the symbol.
>>@@ -1339,6 +1357,8 @@ elf_symbol_sptr
>>elf_symbol::create(const environment*	e,
>>		   size_t		i,
>>		   size_t		s,
>>+		   uint64_t		v,
>>+		   uint32_t		x,
>>		   const string&	n,
>>		   type		t,
>>		   binding		b,
>>@@ -1347,7 +1367,7 @@ elf_symbol::create(const environment*	e,
>>		   const version&	ve,
>>		   visibility		vi)
>>{
>>-  elf_symbol_sptr sym(new elf_symbol(e, i, s, n, t, b, d, c, ve, vi));
>>+  elf_symbol_sptr sym(new elf_symbol(e, i, s, v, x, n, t, b, d, c, ve, vi));
>>  sym->priv_->main_symbol_ = sym;
>>  return sym;
>>}
>>@@ -1411,6 +1431,20 @@ void
>>elf_symbol::set_index(size_t s)
>>{priv_->index_ = s;}
>>
>>+/// Getter for the symbol value.
>>+///
>>+/// @return the value of the symbol.
>>+uint64_t
>>+elf_symbol::get_value() const
>>+{return priv_->value_;}
>>+
>>+/// Getter for the symbol section header index.
>>+///
>>+/// @return the section header index of the symbol.
>>+uint32_t
>>+elf_symbol::get_shndx() const
>>+{return priv_->shndx_;}
>>+
>>/// Getter for the name of the @ref elf_symbol.
>>///
>>/// @return a reference to the name of the @ref symbol.
>>diff --git a/src/abg-reader.cc b/src/abg-reader.cc
>>index 810a65f..3c481ad 100644
>>--- a/src/abg-reader.cc
>>+++ b/src/abg-reader.cc
>>@@ -2659,6 +2659,8 @@ build_elf_symbol(read_context& ctxt, const xmlNodePtr node)
>>
>>  const environment* env = ctxt.get_environment();
>>  elf_symbol_sptr e = elf_symbol::create(env, /*index=*/0, size,
>>+					 /*value=*/0,
>>+					 /*shndx=*/0,
>>					 name, type, binding,
>>					 is_defined, is_common,
>>					 version, visibility);
>>-- 
>>2.16.4
>>
>
>Nice work! Thank you for that!
>The patch looks in general good to me. I am a bit concerned about the
>platform support. There are some corner cases that need to be covered.
>See the testing below.
>
>You might find the .clang-format file useful to match the coding style
>at some points (especially tabs vs spaces). Although when reviewing the
>patch, I had a bit trouble with unchanged but reformatted lines as they
>were confusing my diff. Maybe you can revert the changes that were
>purely for formatting and we take care of formatting in a later patch.
>
>I conducted some testing. I choose a minimal configuration with modules:
>
> $ export ARCH=<arch>
> $ make allnoconfig
> $ ./scripts/config   \
>     -e DEBUG_INFO    \  # dwarf info
>     -e MODULES       \  # module support
>     -e USB_SUPPORT   \
>     -m USB           \
>     -m USB_DUMMY_HCD \  # dummy_hcd.ko is a module without exports
>     -m USB_GADGET       # depends on usb-common.ko, which I test with
>
> $ make olddefconfig
> $ make
> $ abidw ./drivers/usb/common/usb-common.ko | less
> # then check for the existence of the elf-function-symbols section
>
>                    master         patched
>Linux v4.18
>
> x86_64             works            works
> x86                works            works
> arm64              works            works
>
>Linux v4.19
>
> x86_64             broken           works
> x86                broken           STILL BROKEN
> arm64              broken           works
>
>So, for x86_64 and arm64 the situation got better and elf symbols are
>now identified. For x86, this patch did not improve the situation. I did
>not check that the list of symbols are entirely correct and that they
>work downstream with abidiff etc. Just the pure existence is a very good
>indicator for this patch to work.

Hi Matthias! Thanks a bunch for your review! :-)

I'm aware of the x86 issue and I think the diff below should fix it.
Could you please let me know if it passes your x86 testing? (note, I have
yet to address your other comments, so this is to be continued!)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 835bf3d..ed8a8db 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -7536,22 +7536,33 @@ public:
     GElf_Shdr *section_shdr = gelf_getshdr(section, &section_mem);
     if (section_shdr->sh_size == 0)
       return PRE_V4_19_KSYMTAB_FORMAT;
-    /* SHT_REL sections not supported yet */
-    if (section_shdr->sh_type == SHT_REL)
-      return UNDEFINED_KSYMTAB_FORMAT;
+
+    bool is_relasec = (section_shdr->sh_type == SHT_RELA);
 
     // If we still have a normal non-zeroed relocation section, we can guess
     // what format the ksymtab is in depending on what types of relocs it
     // contains.
-    GElf_Rela rela;
-    Elf_Data *rela_section_data = elf_rawdata(section, 0);
-    gelf_getrela(rela_section_data, 0, &rela);
+
+    int type;
+    Elf_Data *section_data = elf_rawdata(section, 0);
+    if (is_relasec)
+      {
+        GElf_Rela rela;
+        gelf_getrela(section_data, 0, &rela);
+        type = GELF_R_TYPE(rela.r_info);
+      }
+    else
+      {
+        GElf_Rel rel;
+        gelf_getrel(section_data, 0, &rel);
+        type = GELF_R_TYPE(rel.r_info);
+      }
 
     // Sigh, I dislike the arch-dependent code here, but this seems to be a
     // reliable heuristic for kernel modules for now. Relative ksymtabs only
     // supported on x86 and arm64 as of v4.19.
     ksymtab_format format;
-    switch (GELF_R_TYPE(rela.r_info))
+    switch (type)
       {
       case R_X86_64_64: // Same as R_386_32
       case R_AARCH64_ABS64:
@@ -7831,12 +7842,22 @@ public:
     Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0);
     size_t strings_ndx = elf_ndxscn(strings_section);
 
-    GElf_Rela rela;
+    bool is_relasec = (reloc_section_shdr->sh_type == SHT_RELA);
     elf_symbol_sptr symbol;
     for (unsigned int i = 0; i < reloc_count; i++)
       {
-        gelf_getrela(reloc_section_data, i, &rela);
-        symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info));
+        if (is_relasec)
+          {
+            GElf_Rela rela;
+            gelf_getrela(reloc_section_data, i, &rela);
+            symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info));
+          }
+        else
+          {
+            GElf_Rel rel;
+            gelf_getrel(reloc_section_data, i, &rel);
+            symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rel.r_info));
+          }
 
         // This shouldn't happen..
         if (!symbol)
@@ -7934,8 +7955,6 @@ public:
 	|| !nb_entries)
       return false;
 
-    GElf_Ehdr eh_mem;
-    GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem);
     ksymtab_format format = get_ksymtab_format();
 
     // Although pre-v4.19 kernel modules can have a relocation section for the
@@ -7947,9 +7966,7 @@ public:
     //
     // Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is
     // supported as of now.
-    if (!reloc_section
-	|| format == PRE_V4_19_KSYMTAB_FORMAT
-	|| elf_header->e_ident[EI_CLASS] == ELFCLASS32)
+    if (!reloc_section || format == PRE_V4_19_KSYMTAB_FORMAT)
       return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set,
                                               linux_exported_vars_set, nb_entries);
     else

  parent reply	other threads:[~2019-08-21 12:34 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-01  0:00 Jessica Yu
2019-01-01  0:00 ` Matthias Maennich via libabigail
2019-01-01  0:00   ` Jessica Yu
2019-01-01  0:00     ` Matthias Maennich via libabigail
2019-01-01  0:00   ` Jessica Yu [this message]
2019-01-01  0:00     ` Matthias Maennich via libabigail
2019-01-01  0:00       ` Jessica Yu
2019-01-01  0:00         ` Matthias Maennich via libabigail
2019-01-01  0:00   ` Jessica Yu
2019-01-01  0:00     ` Matthias Maennich via libabigail
2019-01-01  0:00       ` Jessica Yu
2019-01-01  0:00 ` Jessica Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190821123436.GB8668@linux-8ccs \
    --to=jeyu@kernel.org \
    --cc=dodji@redhat.com \
    --cc=kernel-team@android.com \
    --cc=libabigail@sourceware.org \
    --cc=maennich@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).