From: Jessica Yu <jeyu@kernel.org>
To: Matthias Maennich <maennich@google.com>
Cc: libabigail@sourceware.org, Dodji Seketeli <dodji@redhat.com>,
kernel-team@android.com
Subject: Re: [RFC PATCH] Support pre and post v4.19 ksymtabs for Linux kernel modules
Date: Tue, 01 Jan 2019 00:00:00 -0000 [thread overview]
Message-ID: <20190821123436.GB8668@linux-8ccs> (raw)
In-Reply-To: <20190819080508.GA189145@google.com>
+++ Matthias Maennich [19/08/19 09:05 +0100]:
>Hi Jessica!
>
>On Fri, 09 Aug, 17:49, Jessica Yu wrote:
>>As described in commit ad8c2531fb9, the format of the Linux kernel
>>ksymtab changed in v4.19 to use relative references instead of absolute
>>references. This changes the type of relocations emitted for ksymtab
>>sections to be place-relative 32-bit relocations instead of absolute
>>relocations. One side-effect of this is that libdwfl will not relocate
>>the ksymtab sections due to the PC-relative relocations. This breaks
>>load_kernel_symbol_table() for kernel modules because it only reads in
>>zeros from the unrelocated ksymtab section and is subsequently unable to
>>determine what exported symbols it refers to. Since a vmlinux binary is
>>already fully linked and relocated (and therefore we can read its
>>ksymtab section just fine), this problem is only relevant to Linux
>>kernel modules.
>>
>>To work around this, we utilize the ksymtab relocation sections to
>>determine which symbols the ksymtab entries refer to. We do this by
>>inspecting each relocation's r_info field for the symbol table index and
>>from there we are able to read each symbol's value and subsequently add
>>that to the set of exported symbols.
>>
>>In addition, for Linux kernel modules, we can utilize relocation types
>>to implement a new heuristic to determine the ksymtab format we have.
>>The presence of PC-relative relocations suggest the new v4.19 format,
>>and absolute relocation types suggest the old pre v4.19 format.
>>
>> * include/abg-ir.h (elf_symbol::{elf_symbol, create}): Take new
>> symbol value and shndx parameters.
>> (elf_symbol::{get_value, get_shndx}): Declare new accessors.
>> * src/abg-ir.cc (elf_symbol::priv::{value_, shndx_}): New data members.
>> (elf_symbol::priv::priv): Adjust.
>> (elf_symbol::elf_symbol): Take new value and shndx parameters.
>> (elf_symbol::create): Likewise.
>> (elf_symbol::{get_value, get_shndx}): Define new accessors.
>> * src/abg-reader.cc (build_elf_symbol): Adjust.
>> * src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab)
>> (lookup_symbol_from_gnu_hash_tab)
>> (lookup_symbol_from_symtab): Adjust.
>> (read_context::{ksymtab_reloc_section_,
>> ksymtab_gpl_reloc_section_,
>> ksymtab_strings_section_}): New data members.
>> (read_context::read_context): Initialize ksymtab_reloc_section_,
>> ksymtab_gpl_reloc_section_, ksymtab_strings_section_.
>> (read_context::{find_ksymtab_reloc_section,
>> find_ksymtab_gpl_reloc_section, find_ksymtab_strings_section,
>> find_any_ksymtab_reloc_section, get_ksymtab_format_module,
>> populate_symbol_map_from_ksymtab,
>> populate_symbol_map_from_ksymtab_reloc,
>> is_linux_kernel_module}): New member functions.
>> (read_context::load_kernel_symbol_table): Adjust to call either
>> populate_symbol_map_from_ksymtab{_reloc,} depending on ksymtab
>> format.
>> (read_context::get_ksymtab_format): Adjust to call
>> get_ksymtab_format_module for linux kernel modules.
>> (read_context::lookup_elf_symbol_from_index): Adjust.
>> (create_default_var_sym, create_default_fn_sym): Adjust.
>>
>>Signed-off-by: Jessica Yu <jeyu@kernel.org>
>>---
>>include/abg-ir.h | 10 ++
>>src/abg-dwarf-reader.cc | 391 +++++++++++++++++++++++++++++++++++++++++-------
>>src/abg-ir.cc | 38 ++++-
>>src/abg-reader.cc | 2 +
>>4 files changed, 388 insertions(+), 53 deletions(-)
>>
>>diff --git a/include/abg-ir.h b/include/abg-ir.h
>>index 170a844..d26f472 100644
>>--- a/include/abg-ir.h
>>+++ b/include/abg-ir.h
>>@@ -796,6 +796,8 @@ private:
>> elf_symbol(const environment* e,
>> size_t i,
>> size_t s,
>>+ uint64_t v,
>>+ uint32_t x,
>
>I would like to see some comments here to describe the purpose of the
>struct's members. I know there weren't any before, but maybe that is a
>good chance.
>
>Also, I am not entirely sure why these are all one-character names, but
>it makes it hard to come from x to "section header index". Maybe we
>should give them more meaningful names.
>
>> const string& n,
>> type t,
>> binding b,
>>@@ -818,6 +820,8 @@ public:
>> create(const environment* e,
>> size_t i,
>> size_t s,
>>+ uint64_t v,
>>+ uint32_t x,
>> const string& n,
>> type t,
>> binding b,
>>@@ -838,6 +842,12 @@ public:
>> void
>> set_index(size_t);
>>
>>+ uint64_t
>>+ get_value() const;
>>+
>>+ uint32_t
>>+ get_shndx() const;
>>+
>> const string&
>> get_name() const;
>>
>>diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
>>index 47af7e3..835bf3d 100644
>>--- a/src/abg-dwarf-reader.cc
>>+++ b/src/abg-dwarf-reader.cc
>>@@ -1724,6 +1724,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env,
>> GElf_Sym symbol;
>> const char* sym_name_str;
>> size_t sym_size;
>>+ uint64_t sym_value;
>>+ uint32_t sym_shndx;
>> elf_symbol::type sym_type;
>> elf_symbol::binding sym_binding;
>> elf_symbol::visibility sym_visibility;
>>@@ -1743,6 +1745,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env,
>> sym_visibility =
>> stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(symbol.st_other));
>> sym_size = symbol.st_size;
>>+ sym_value = symbol.st_value;
>>+ sym_shndx = symbol.st_shndx;
>> elf_symbol::version ver;
>> if (get_version_for_symbol(elf_handle, symbol_index,
>> /*get_def_version=*/true, ver))
>>@@ -1751,6 +1755,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env,
>> elf_symbol::create(env,
>> symbol_index,
>> sym_size,
>>+ sym_value,
>>+ sym_shndx,
>> sym_name_str,
>> sym_type,
>> sym_binding,
>>@@ -2028,7 +2034,8 @@ lookup_symbol_from_gnu_hash_tab(const environment* env,
>> ABG_ASSERT(!ver.str().empty());
>>
>> elf_symbol_sptr symbol_found =
>>- elf_symbol::create(env, i, symbol.st_size, sym_name_str,
>>+ elf_symbol::create(env, i, symbol.st_size, symbol.st_value,
>>+ symbol.st_shndx, sym_name_str,
>> sym_type, sym_binding,
>> symbol.st_shndx != SHN_UNDEF,
>> symbol.st_shndx == SHN_COMMON,
>>@@ -2181,6 +2188,7 @@ lookup_symbol_from_symtab(const environment* env,
>> ABG_ASSERT(!ver.str().empty());
>> elf_symbol_sptr symbol_found =
>> elf_symbol::create(env, i, sym->st_size,
>>+ sym->st_value, sym->st_shndx,
>> name_str, sym_type,
>> sym_binding, sym_is_defined,
>> sym_is_common, ver, sym_visibility);
>>@@ -3081,7 +3089,10 @@ public:
>> /// to symbols exported using the EXPORT_SYMBOL_GPL macro from the
>> /// linux kernel.
>> Elf_Scn* ksymtab_section_;
>>+ Elf_Scn* ksymtab_reloc_section_;
>> Elf_Scn* ksymtab_gpl_section_;
>>+ Elf_Scn* ksymtab_gpl_reloc_section_;
>>+ Elf_Scn* ksymtab_strings_section_;
>> Elf_Scn* versym_section_;
>> Elf_Scn* verdef_section_;
>> Elf_Scn* verneed_section_;
>>@@ -3277,7 +3288,10 @@ public:
>> nb_ksymtab_entries_ = 0;
>> nb_ksymtab_gpl_entries_ = 0;
>> ksymtab_section_ = 0;
>>+ ksymtab_reloc_section_ = 0;
>> ksymtab_gpl_section_ = 0;
>>+ ksymtab_gpl_reloc_section_ = 0;
>>+ ksymtab_strings_section_ = 0;
>> versym_section_ = 0;
>> verdef_section_ = 0;
>> verneed_section_ = 0;
>>@@ -6079,6 +6093,23 @@ public:
>> return ksymtab_section_;
>> }
>>
>>+ /// Return the .rel{a,}__ksymtab section of a linux kernel ELF file (either
>>+ /// a vmlinux binary or a kernel module).
>>+ ///
>>+ /// @return the .rel{a,}__ksymtab section if found, nil otherwise.
>>+ Elf_Scn*
>>+ find_ksymtab_reloc_section() const
>>+ {
>>+ if (!ksymtab_reloc_section_)
>>+ {
>>+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab", SHT_RELA);
>>+ if (!sec)
>>+ sec = find_section(elf_handle(), ".rel__ksymtab", SHT_REL);
>>+ const_cast<read_context*>(this)->ksymtab_reloc_section_ = sec;
>>+ }
>>+ return ksymtab_reloc_section_;
>>+ }
>>+
>> /// Return the __ksymtab_gpl section of a linux kernel ELF file
>> /// (either a vmlinux binary or a kernel module).
>> ///
>>@@ -6092,6 +6123,36 @@ public:
>> return ksymtab_gpl_section_;
>> }
>>
>>+ /// Return the .rel{a,}__ksymtab_gpl section of a linux kernel ELF file
>>+ /// (either a vmlinux binary or a kernel module).
>>+ ///
>>+ /// @return the .rel{a,}__ksymtab_gpl section if found, nil otherwise.
>>+ Elf_Scn*
>>+ find_ksymtab_gpl_reloc_section() const
>>+ {
>>+ if (!ksymtab_gpl_reloc_section_)
>>+ {
>>+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab_gpl", SHT_RELA);
>>+ if (!sec)
>>+ sec = find_section(elf_handle(), ".rel__ksymtab_gpl", SHT_REL);
>>+ const_cast<read_context*>(this)->ksymtab_gpl_reloc_section_ = sec;
>>+ }
>>+ return ksymtab_gpl_reloc_section_;
>>+ }
>>+
>>+ /// Return the __ksymtab_strings section of a linux kernel ELF file
>>+ /// (either a vmlinux binary or a kernel module).
>>+ ///
>>+ /// @return the __ksymtab_strings section if found, nil otherwise.
>>+ Elf_Scn*
>>+ find_ksymtab_strings_section() const
>>+ {
>>+ if (!ksymtab_strings_section_)
>>+ const_cast<read_context*>(this)->ksymtab_strings_section_ =
>>+ find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS);
>>+ return ksymtab_strings_section_;
>>+ }
>>+
>
>We should find a way to deduplicate these functions. Not necessarily in
>this patch, but in a followup. Most of them only differ in the section
>name and whether they check for an alternative. Something generic like
>
> struct SectionQuery{ string name, int type };
> Elf_Scn* find_section(const vector<SectionQuery>&);
>
>should be sufficient to forward from all the other variants to one
>implementation.
>
>> /// Return either a __ksymtab or a __ksymtab_gpl section, in case
>> /// only the __ksymtab_gpl exists.
>> ///
>>@@ -6106,6 +6167,19 @@ public:
>> return result;
>> }
>>
>>+ /// Return either a .rel{a,}__ksymtab or a .rel{a,}__ksymtab_gpl section
>>+ ///
>>+ /// @return the .rel{a,}__ksymtab section if it exists, or the
>>+ /// .rel{a,}__ksymtab_gpl; or NULL if neither is found.
>>+ Elf_Scn*
>>+ find_any_ksymtab_reloc_section() const
>>+ {
>>+ Elf_Scn *result = find_ksymtab_reloc_section();
>>+ if (!result)
>>+ result = find_ksymtab_gpl_reloc_section();
>>+ return result;
>>+ }
>>+
>> /// Return the SHT_GNU_versym, SHT_GNU_verdef and SHT_GNU_verneed
>> /// sections that are involved in symbol versionning.
>> ///
>>@@ -6283,7 +6357,8 @@ public:
>> stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(s->st_other));
>>
>> elf_symbol_sptr sym =
>>- elf_symbol::create(env(), symbol_index, s->st_size, name_str,
>>+ elf_symbol::create(env(), symbol_index, s->st_size,
>>+ s->st_value, s->st_shndx, name_str,
>> stt_to_elf_symbol_type(GELF_ST_TYPE(s->st_info)),
>> stb_to_elf_symbol_binding(GELF_ST_BIND(s->st_info)),
>> sym_is_defined, sym_is_common, ver, vis);
>>@@ -7435,6 +7510,64 @@ public:
>> return symbol;
>> }
>>
>>+ /// Determine the format of the __ksymtab and __ksymtab_gpl
>>+ /// sections of Linux kernel modules.
>>+ ///
>>+ /// This is important because we need the know the format of these
>
>s/need the/need to/ ^^^
>
>>+ /// sections to be able to read from them.
>>+ ///
>>+ /// @return the format the __ksymtab[_gpl] sections.
>>+ enum ksymtab_format
>>+ get_ksymtab_format_module() const
>>+ {
>>+ Elf_Scn *section = find_any_ksymtab_reloc_section();
>>+
>>+ if (!section)
>>+ return UNDEFINED_KSYMTAB_FORMAT;
>>+
>>+ // Libdwfl has a weird quirk where, in the process of obtaining an Elf
>>+ // descriptor via dwfl_module_getelf(), it will apply all relocations it
>>+ // knows how to and it will zero a relocation after applying it. If the
>
>will zero the relocation info after
>
>>+ // .rela__ksymtab* section contained only simple (absolute) relocations,
>>+ // they will have been all applied and sh_size will be 0. For arches that
>>+ // support relative ksymtabs, simple relocations only appear in pre-4.19
>>+ // kernel modules.
>>+ GElf_Shdr section_mem;
>>+ GElf_Shdr *section_shdr = gelf_getshdr(section, §ion_mem);
>>+ if (section_shdr->sh_size == 0)
>>+ return PRE_V4_19_KSYMTAB_FORMAT;
>>+ /* SHT_REL sections not supported yet */
>>+ if (section_shdr->sh_type == SHT_REL)
>>+ return UNDEFINED_KSYMTAB_FORMAT;
>>+
>>+ // If we still have a normal non-zeroed relocation section, we can guess
>>+ // what format the ksymtab is in depending on what types of relocs it
>>+ // contains.
>>+ GElf_Rela rela;
>>+ Elf_Data *rela_section_data = elf_rawdata(section, 0);
>>+ gelf_getrela(rela_section_data, 0, &rela);
>>+
>>+ // Sigh, I dislike the arch-dependent code here, but this seems to be a
>>+ // reliable heuristic for kernel modules for now. Relative ksymtabs only
>>+ // supported on x86 and arm64 as of v4.19.
>>+ ksymtab_format format;
>>+ switch (GELF_R_TYPE(rela.r_info))
>>+ {
>>+ case R_X86_64_64: // Same as R_386_32
>
>add 'fallthrough' to the comments, else younger compilers will warn
>about the potentially unintentional fallthrough.
>
>>+ case R_AARCH64_ABS64:
>>+ format = PRE_V4_19_KSYMTAB_FORMAT;
>>+ break;
>>+ case R_X86_64_PC32: // Same as R_386_PC32
>>+ case R_AARCH64_PREL32:
>>+ format = V4_19_KSYMTAB_FORMAT;
>>+ break;
>>+ default:
>>+ format = UNDEFINED_KSYMTAB_FORMAT;
>
>We should probably emit a warning in this case. Are there more
>architectures that have PREL32 enabled as of now that are not yet
>covered here?
>
>>+ break;
>>+ }
>>+ return format;
>>+ }
>>+
>> /// Determine the format of the __ksymtab and __ksymtab_gpl
>> /// sections.
>> ///
>>@@ -7451,6 +7584,18 @@ public:
>> {
>> if (ksymtab_format_ == UNDEFINED_KSYMTAB_FORMAT)
>> {
>>+ // Since Linux kernel modules are relocatable, we can first try
>>+ // using a heuristic based on relocations to guess the ksymtab format.
>>+ if (is_linux_kernel_module())
>>+ {
>>+ ksymtab_format_ = get_ksymtab_format_module();
>>+ if (ksymtab_format_ != UNDEFINED_KSYMTAB_FORMAT)
>>+ return ksymtab_format_;
>>+ }
>
>nit: use tabs
>
>>+
>>+ // If it's not a kernel module or we couldn't determine its format
>>+ // with relocations, fall back to the heuristic below.
>>+
>> // OK this is a dirty little heuristic to determine the
>> // format of the ksymtab section.
>> //
>>@@ -7553,48 +7698,24 @@ public:
>> return nb_ksymtab_gpl_entries_;
>> }
>>
>>- /// Load a given kernel symbol table.
>>+ /// Populate the symbol map by reading exported symbols from the
>>+ /// ksymtab directly.
>> ///
>>- /// One can thus retrieve the resulting symbols by using the
>>- /// accessors read_context::linux_exported_fn_syms(),
>>- /// read_context::linux_exported_var_syms(),
>>- /// read_context::linux_exported_gpl_fn_syms(), or
>>- /// read_context::linux_exported_gpl_var_syms().
>>+ /// @param section the ksymtab section to read from
>> ///
>>- /// @param kind the kind of kernel symbol table to load.
>>+ /// @param exported_fns_set the set of exported functions
>>+ ///
>>+ /// @param exported_vars_set the set of exported variables
>>+ ///
>>+ /// @param nb_entries the number of ksymtab entries to read
>> ///
>> /// @return true upon successful completion, false otherwise.
>> bool
>>- load_kernel_symbol_table(kernel_symbol_table_kind kind)
>>+ populate_symbol_map_from_ksymtab(Elf_Scn *section,
>>+ address_set_sptr exported_fns_set,
>>+ address_set_sptr exported_vars_set,
>>+ size_t nb_entries)
>> {
>>- size_t nb_entries = 0;
>>- Elf_Scn *section = 0;
>>- address_set_sptr linux_exported_fns_set, linux_exported_vars_set;
>>-
>>- switch (kind)
>>- {
>>- case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED:
>>- break;
>>- case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB:
>>- section = find_ksymtab_section();
>>- nb_entries = get_nb_ksymtab_entries();
>>- linux_exported_fns_set = create_or_get_linux_exported_fn_syms();
>>- linux_exported_vars_set = create_or_get_linux_exported_var_syms();
>>- break;
>>- case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL:
>>- section = find_ksymtab_gpl_section();
>>- nb_entries = get_nb_ksymtab_gpl_entries();
>>- linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms();
>>- linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms();
>>- break;
>>- }
>>-
>>- if (!linux_exported_vars_set
>>- || !linux_exported_fns_set
>>- || !section
>>- || !nb_entries)
>>- return false;
>>-
>> // The data of the section.
>> Elf_Data *elf_data = elf_rawdata(section, 0);
>>
>>@@ -7654,28 +7775,30 @@ public:
>> // (aka ELF symbol table).
>> symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
>> if (!symbol)
>>- {
>>- adjusted_symbol_address =
>>- maybe_adjust_var_sym_address(symbol_address);
>>- symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
>>- if (!symbol)
>>- // This must be a symbol that is of type neither FUNC
>>- // (function) nor OBJECT (variable). There are for intance,
>>- // symbols of type 'NOTYPE' in the ksymtab symbol table. I
>>- // am not sure what those are.
>>- continue;
>>- }
>>+ {
>>+ adjusted_symbol_address =
>>+ maybe_adjust_var_sym_address(symbol_address);
>>+ symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
>>+ if (!symbol)
>>+ {
>>+ // This must be a symbol that is of type neither FUNC
>>+ // (function) nor OBJECT (variable). There are for intance,
>>+ // symbols of type 'NOTYPE' in the ksymtab symbol table. I
>>+ // am not sure what those are.
>>+ continue;
>>+ }
>>+ }
>
>This part of the diff seems to a change purely of the indentation/tabs.
>Maybe you can revert this part.
>
>>
>> address_set_sptr set;
>> if (symbol->is_function())
>> {
>> ABG_ASSERT(lookup_elf_fn_symbol_from_address(adjusted_symbol_address));
>>- set = linux_exported_fns_set;
>>+ set = exported_fns_set;
>> }
>> else if (symbol->is_variable())
>> {
>> ABG_ASSERT(lookup_elf_var_symbol_from_address(adjusted_symbol_address));
>>- set = linux_exported_vars_set;
>>+ set = exported_vars_set;
>> }
>> else
>> ABG_ASSERT_NOT_REACHED;
>>@@ -7684,6 +7807,157 @@ public:
>> return true;
>> }
>>
>>+ /// Populate the symbol map by extracting the exported symbols from a
>>+ /// ksymtab rela section.
>>+ ///
>>+ /// @param section the ksymtab section to read from
>>+ ///
>>+ /// @param strings_section the __ksymtab_strings section
>>+ ///
>>+ /// @param exported_fns_set the set of exported functions
>>+ ///
>>+ /// @param exported_vars_set the set of exported variables
>>+ ///
>>+ /// @return true upon successful completion, false otherwise.
>>+ bool
>>+ populate_symbol_map_from_ksymtab_reloc(Elf_Scn *reloc_section, Elf_Scn *strings_section,
>>+ address_set_sptr exported_fns_set,
>>+ address_set_sptr exported_vars_set)
>>+ {
>>+ GElf_Shdr reloc_section_mem;
>>+ GElf_Shdr *reloc_section_shdr = gelf_getshdr(reloc_section, &reloc_section_mem);
>
>gelf_getshdr returns &reloc_section_mem. So you might want to use
>reloc_section_mem directly rather than introducing reloc_section_mem.
>
>>+ size_t reloc_count = reloc_section_shdr->sh_size / reloc_section_shdr->sh_entsize;
>>+
>>+ Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0);
>>+ size_t strings_ndx = elf_ndxscn(strings_section);
>>+
>>+ GElf_Rela rela;
>>+ elf_symbol_sptr symbol;
>>+ for (unsigned int i = 0; i < reloc_count; i++)
>>+ {
>>+ gelf_getrela(reloc_section_data, i, &rela);
>>+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info));
>>+
>>+ // This shouldn't happen..
>>+ if (!symbol)
>
>Maybe ABG_ASSERT then?
>
>>+ return false;
>>+
>>+ // If the symbol is from __ksymtab_strings, ignore it.
>>+ if (symbol->get_shndx() == strings_ndx)
>>+ continue;
>>+
>>+ if (!symbol->is_function() && !symbol->is_variable())
>>+ {
>>+ if (do_log())
>>+ {
>>+ if (symbol->get_type() == elf_symbol::NOTYPE_TYPE)
>>+ cerr << "skipping NOTYPE symbol "
>>+ << symbol->get_name()
>>+ << " shndx: "
>>+ << symbol->get_index()
>>+ << " @"
>>+ << elf_path()
>>+ << "\n";
>>+ else if (symbol->get_type() == elf_symbol::SECTION_TYPE)
>>+ cerr << "skipping SECTION symbol "
>>+ << "shndx: "
>>+ << symbol->get_index()
>>+ << " @"
>>+ << elf_path()
>>+ << "\n";
>>+ }
>>+ continue;
>>+ }
>>+
>>+ address_set_sptr set;
>>+ if (symbol->is_function())
>>+ {
>>+ ABG_ASSERT(lookup_elf_fn_symbol_from_address(symbol->get_value()));
>>+ set = exported_fns_set;
>>+ }
>>+ else if (symbol->is_variable())
>>+ {
>>+ ABG_ASSERT(lookup_elf_var_symbol_from_address(symbol->get_value()));
>>+ set = exported_vars_set;
>>+ }
>>+ else
>>+ ABG_ASSERT_NOT_REACHED;
>>+ set->insert(symbol->get_value());
>>+ }
>>+ return true;
>>+ }
>>+
>>+ /// Load a given kernel symbol table.
>>+ ///
>>+ /// One can thus retrieve the resulting symbols by using the
>>+ /// accessors read_context::linux_exported_fn_syms(),
>>+ /// read_context::linux_exported_var_syms(),
>>+ /// read_context::linux_exported_gpl_fn_syms(), or
>>+ /// read_context::linux_exported_gpl_var_syms().
>>+ ///
>>+ /// @param kind the kind of kernel symbol table to load.
>>+ ///
>>+ /// @return true upon successful completion, false otherwise.
>>+ bool
>>+ load_kernel_symbol_table(kernel_symbol_table_kind kind)
>>+ {
>>+ size_t nb_entries = 0;
>>+ Elf_Scn *section = 0, *reloc_section = 0, *strings_section = 0;
>>+ address_set_sptr linux_exported_fns_set, linux_exported_vars_set;
>>+
>>+ switch (kind)
>>+ {
>>+ case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED:
>>+ break;
>>+ case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB:
>>+ section = find_ksymtab_section();
>>+ reloc_section = find_ksymtab_reloc_section();
>>+ nb_entries = get_nb_ksymtab_entries();
>>+ linux_exported_fns_set = create_or_get_linux_exported_fn_syms();
>>+ linux_exported_vars_set = create_or_get_linux_exported_var_syms();
>>+ break;
>>+ case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL:
>>+ section = find_ksymtab_gpl_section();
>>+ reloc_section = find_ksymtab_gpl_reloc_section();
>>+ nb_entries = get_nb_ksymtab_gpl_entries();
>>+ linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms();
>>+ linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms();
>>+ break;
>>+ }
>>+
>>+ strings_section = find_ksymtab_strings_section();
>>+
>>+ if (!linux_exported_vars_set
>>+ || !linux_exported_fns_set
>>+ || !section
>>+ || !strings_section
>>+ || !nb_entries)
>>+ return false;
>>+
>>+ GElf_Ehdr eh_mem;
>>+ GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem);
>>+ ksymtab_format format = get_ksymtab_format();
>>+
>>+ // Although pre-v4.19 kernel modules can have a relocation section for the
>>+ // __ksymtab section, libdwfl zeroes the rela section after applying
>>+ // "simple" absolute relocations via dwfl_module_getelf(). For v4.19 and
>>+ // above, we get PC-relative relocations so dwfl_module_getelf() doesn't
>>+ // apply those relocations and we're safe to read the relocation section to
>>+ // determine which exported symbols are in the ksymtab.
>>+ //
>>+ // Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is
>>+ // supported as of now.
>>+ if (!reloc_section
>>+ || format == PRE_V4_19_KSYMTAB_FORMAT
>>+ || elf_header->e_ident[EI_CLASS] == ELFCLASS32)
>>+ return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set,
>>+ linux_exported_vars_set, nb_entries);
>>+ else
>>+ return populate_symbol_map_from_ksymtab_reloc(reloc_section, strings_section,
>>+ linux_exported_fns_set,
>>+ linux_exported_vars_set);
>>+ }
>>+
>> /// Load the special __ksymtab section. This is for linux kernel
>> /// (module) files.
>> ///
>>@@ -8352,6 +8626,17 @@ public:
>> is_linux_kernel_binary() const
>> {return find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS);}
>>
>>+ /// Guess if the current binary is a Linux Kernel module.
>>+ ///
>>+ /// To guess that, the function looks for the presence of the
>>+ /// special ".modinfo" section in the binary.
>>+ ///
>>+ bool
>>+ is_linux_kernel_module() const
>>+ {
>>+ return find_section(elf_handle(), ".modinfo", SHT_PROGBITS);
>>+ }
>>+
>
>Also refer to this proposal to restrict checking for a module a bit
>more: https://sourceware.org/ml/libabigail/2019-q3/msg00022.html
>
>> /// Getter of the "show_stats" flag.
>> ///
>> /// This flag tells if we should emit statistics about various
>>@@ -15734,6 +16019,8 @@ create_default_var_sym(const string& sym_name, const environment *env)
>> elf_symbol::create(env,
>> /*symbol index=*/ 0,
>> /*symbol size=*/ 0,
>>+ /*symbol value=*/ 0,
>>+ /*symbol shndx=*/ 0,
>> sym_name,
>> /*symbol type=*/ elf_symbol::OBJECT_TYPE,
>> /*symbol binding=*/ elf_symbol::GLOBAL_BINDING,
>>@@ -16173,6 +16460,8 @@ create_default_fn_sym(const string& sym_name, const environment *env)
>> elf_symbol::create(env,
>> /*symbol index=*/ 0,
>> /*symbol size=*/ 0,
>>+ /*symbol value=*/ 0,
>>+ /*symbol shndx=*/ 0,
>> sym_name,
>> /*symbol type=*/ elf_symbol::FUNC_TYPE,
>> /*symbol binding=*/ elf_symbol::GLOBAL_BINDING,
>>diff --git a/src/abg-ir.cc b/src/abg-ir.cc
>>index a5243c3..ad1c261 100644
>>--- a/src/abg-ir.cc
>>+++ b/src/abg-ir.cc
>>@@ -1169,6 +1169,8 @@ struct elf_symbol::priv
>> const environment* env_;
>> size_t index_;
>> size_t size_;
>>+ uint64_t value_;
>>+ uint32_t shndx_;
>> string name_;
>> elf_symbol::type type_;
>> elf_symbol::binding binding_;
>>@@ -1213,6 +1215,8 @@ struct elf_symbol::priv
>> : env_(),
>> index_(),
>> size_(),
>>+ value_(),
>>+ shndx_(),
>> type_(elf_symbol::NOTYPE_TYPE),
>> binding_(elf_symbol::GLOBAL_BINDING),
>> visibility_(elf_symbol::DEFAULT_VISIBILITY),
>>@@ -1223,6 +1227,8 @@ struct elf_symbol::priv
>> priv(const environment* e,
>> size_t i,
>> size_t s,
>>+ uint64_t v,
>>+ uint32_t x,
>> const string& n,
>> elf_symbol::type t,
>> elf_symbol::binding b,
>>@@ -1233,6 +1239,8 @@ struct elf_symbol::priv
>> : env_(e),
>> index_(i),
>> size_(s),
>>+ value_(v),
>>+ shndx_(x),
>> name_(n),
>> type_(t),
>> binding_(b),
>>@@ -1269,6 +1277,10 @@ elf_symbol::elf_symbol()
>>///
>>/// @param s the size of the symbol.
>>///
>>+/// @param v the value of the symbol.
>>+///
>>+/// @param x the section header index (shndx) of the symbol.
>>+///
>>/// @param n the name of the symbol.
>>///
>>/// @param t the type of the symbol.
>>@@ -1285,6 +1297,8 @@ elf_symbol::elf_symbol()
>>elf_symbol::elf_symbol(const environment* e,
>> size_t i,
>> size_t s,
>>+ uint64_t v,
>>+ uint32_t x,
>> const string& n,
>> type t,
>> binding b,
>>@@ -1292,7 +1306,7 @@ elf_symbol::elf_symbol(const environment* e,
>> bool c,
>> const version& ve,
>> visibility vi)
>>- : priv_(new priv(e, i, s, n, t, b, d, c, ve, vi))
>>+ : priv_(new priv(e, i, s, v, x, n, t, b, d, c, ve, vi))
>>{}
>>
>>/// Factory of instances of @ref elf_symbol.
>>@@ -1319,6 +1333,10 @@ elf_symbol::create()
>>///
>>/// @param s the size of the symbol.
>>///
>>+/// @param v the value of the symbol.
>>+///
>>+/// @param x the section header index (shndx) of the symbol.
>>+///
>>/// @param n the name of the symbol.
>>///
>>/// @param t the type of the symbol.
>>@@ -1339,6 +1357,8 @@ elf_symbol_sptr
>>elf_symbol::create(const environment* e,
>> size_t i,
>> size_t s,
>>+ uint64_t v,
>>+ uint32_t x,
>> const string& n,
>> type t,
>> binding b,
>>@@ -1347,7 +1367,7 @@ elf_symbol::create(const environment* e,
>> const version& ve,
>> visibility vi)
>>{
>>- elf_symbol_sptr sym(new elf_symbol(e, i, s, n, t, b, d, c, ve, vi));
>>+ elf_symbol_sptr sym(new elf_symbol(e, i, s, v, x, n, t, b, d, c, ve, vi));
>> sym->priv_->main_symbol_ = sym;
>> return sym;
>>}
>>@@ -1411,6 +1431,20 @@ void
>>elf_symbol::set_index(size_t s)
>>{priv_->index_ = s;}
>>
>>+/// Getter for the symbol value.
>>+///
>>+/// @return the value of the symbol.
>>+uint64_t
>>+elf_symbol::get_value() const
>>+{return priv_->value_;}
>>+
>>+/// Getter for the symbol section header index.
>>+///
>>+/// @return the section header index of the symbol.
>>+uint32_t
>>+elf_symbol::get_shndx() const
>>+{return priv_->shndx_;}
>>+
>>/// Getter for the name of the @ref elf_symbol.
>>///
>>/// @return a reference to the name of the @ref symbol.
>>diff --git a/src/abg-reader.cc b/src/abg-reader.cc
>>index 810a65f..3c481ad 100644
>>--- a/src/abg-reader.cc
>>+++ b/src/abg-reader.cc
>>@@ -2659,6 +2659,8 @@ build_elf_symbol(read_context& ctxt, const xmlNodePtr node)
>>
>> const environment* env = ctxt.get_environment();
>> elf_symbol_sptr e = elf_symbol::create(env, /*index=*/0, size,
>>+ /*value=*/0,
>>+ /*shndx=*/0,
>> name, type, binding,
>> is_defined, is_common,
>> version, visibility);
>>--
>>2.16.4
>>
>
>Nice work! Thank you for that!
>The patch looks in general good to me. I am a bit concerned about the
>platform support. There are some corner cases that need to be covered.
>See the testing below.
>
>You might find the .clang-format file useful to match the coding style
>at some points (especially tabs vs spaces). Although when reviewing the
>patch, I had a bit trouble with unchanged but reformatted lines as they
>were confusing my diff. Maybe you can revert the changes that were
>purely for formatting and we take care of formatting in a later patch.
>
>I conducted some testing. I choose a minimal configuration with modules:
>
> $ export ARCH=<arch>
> $ make allnoconfig
> $ ./scripts/config \
> -e DEBUG_INFO \ # dwarf info
> -e MODULES \ # module support
> -e USB_SUPPORT \
> -m USB \
> -m USB_DUMMY_HCD \ # dummy_hcd.ko is a module without exports
> -m USB_GADGET # depends on usb-common.ko, which I test with
>
> $ make olddefconfig
> $ make
> $ abidw ./drivers/usb/common/usb-common.ko | less
> # then check for the existence of the elf-function-symbols section
>
> master patched
>Linux v4.18
>
> x86_64 works works
> x86 works works
> arm64 works works
>
>Linux v4.19
>
> x86_64 broken works
> x86 broken STILL BROKEN
> arm64 broken works
>
>So, for x86_64 and arm64 the situation got better and elf symbols are
>now identified. For x86, this patch did not improve the situation. I did
>not check that the list of symbols are entirely correct and that they
>work downstream with abidiff etc. Just the pure existence is a very good
>indicator for this patch to work.
Hi Matthias! Thanks a bunch for your review! :-)
I'm aware of the x86 issue and I think the diff below should fix it.
Could you please let me know if it passes your x86 testing? (note, I have
yet to address your other comments, so this is to be continued!)
diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 835bf3d..ed8a8db 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -7536,22 +7536,33 @@ public:
GElf_Shdr *section_shdr = gelf_getshdr(section, §ion_mem);
if (section_shdr->sh_size == 0)
return PRE_V4_19_KSYMTAB_FORMAT;
- /* SHT_REL sections not supported yet */
- if (section_shdr->sh_type == SHT_REL)
- return UNDEFINED_KSYMTAB_FORMAT;
+
+ bool is_relasec = (section_shdr->sh_type == SHT_RELA);
// If we still have a normal non-zeroed relocation section, we can guess
// what format the ksymtab is in depending on what types of relocs it
// contains.
- GElf_Rela rela;
- Elf_Data *rela_section_data = elf_rawdata(section, 0);
- gelf_getrela(rela_section_data, 0, &rela);
+
+ int type;
+ Elf_Data *section_data = elf_rawdata(section, 0);
+ if (is_relasec)
+ {
+ GElf_Rela rela;
+ gelf_getrela(section_data, 0, &rela);
+ type = GELF_R_TYPE(rela.r_info);
+ }
+ else
+ {
+ GElf_Rel rel;
+ gelf_getrel(section_data, 0, &rel);
+ type = GELF_R_TYPE(rel.r_info);
+ }
// Sigh, I dislike the arch-dependent code here, but this seems to be a
// reliable heuristic for kernel modules for now. Relative ksymtabs only
// supported on x86 and arm64 as of v4.19.
ksymtab_format format;
- switch (GELF_R_TYPE(rela.r_info))
+ switch (type)
{
case R_X86_64_64: // Same as R_386_32
case R_AARCH64_ABS64:
@@ -7831,12 +7842,22 @@ public:
Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0);
size_t strings_ndx = elf_ndxscn(strings_section);
- GElf_Rela rela;
+ bool is_relasec = (reloc_section_shdr->sh_type == SHT_RELA);
elf_symbol_sptr symbol;
for (unsigned int i = 0; i < reloc_count; i++)
{
- gelf_getrela(reloc_section_data, i, &rela);
- symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info));
+ if (is_relasec)
+ {
+ GElf_Rela rela;
+ gelf_getrela(reloc_section_data, i, &rela);
+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info));
+ }
+ else
+ {
+ GElf_Rel rel;
+ gelf_getrel(reloc_section_data, i, &rel);
+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rel.r_info));
+ }
// This shouldn't happen..
if (!symbol)
@@ -7934,8 +7955,6 @@ public:
|| !nb_entries)
return false;
- GElf_Ehdr eh_mem;
- GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem);
ksymtab_format format = get_ksymtab_format();
// Although pre-v4.19 kernel modules can have a relocation section for the
@@ -7947,9 +7966,7 @@ public:
//
// Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is
// supported as of now.
- if (!reloc_section
- || format == PRE_V4_19_KSYMTAB_FORMAT
- || elf_header->e_ident[EI_CLASS] == ELFCLASS32)
+ if (!reloc_section || format == PRE_V4_19_KSYMTAB_FORMAT)
return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set,
linux_exported_vars_set, nb_entries);
else
next prev parent reply other threads:[~2019-08-21 12:34 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-01 0:00 Jessica Yu
2019-01-01 0:00 ` Matthias Maennich via libabigail
2019-01-01 0:00 ` Jessica Yu
2019-01-01 0:00 ` Matthias Maennich via libabigail
2019-01-01 0:00 ` Jessica Yu [this message]
2019-01-01 0:00 ` Matthias Maennich via libabigail
2019-01-01 0:00 ` Jessica Yu
2019-01-01 0:00 ` Matthias Maennich via libabigail
2019-01-01 0:00 ` Jessica Yu
2019-01-01 0:00 ` Matthias Maennich via libabigail
2019-01-01 0:00 ` Jessica Yu
2019-01-01 0:00 ` Jessica Yu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190821123436.GB8668@linux-8ccs \
--to=jeyu@kernel.org \
--cc=dodji@redhat.com \
--cc=kernel-team@android.com \
--cc=libabigail@sourceware.org \
--cc=maennich@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).