From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6233 invoked by alias); 2 Sep 2019 16:49:38 -0000 Mailing-List: contact libabigail-help@sourceware.org; run by ezmlm Precedence: bulk List-Post: List-Help: List-Id: List-Subscribe: Sender: libabigail-owner@sourceware.org Received: (qmail 6223 invoked by uid 89); 2 Sep 2019 16:49:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Checked: by ClamAV 0.100.3 on sourceware.org X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.4 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_STOCKGEN,SPF_PASS autolearn=ham version=3.3.1 spammy=lined, continued X-Spam-Status: No, score=-25.4 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_STOCKGEN,SPF_PASS autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on sourceware.org X-Spam-Level: X-HELO: mail.kernel.org Received: from mail.kernel.org (HELO mail.kernel.org) (198.145.29.99) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 02 Sep 2019 16:49:33 +0000 Received: from linux-8ccs.fritz.box (unknown [92.117.146.186]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 3F3F221744; Mon, 2 Sep 2019 16:49:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1567442971; bh=HPgr4SwNdrvjDZsTd5g7AXxCzaWYi54fqKMI0aGndHE=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=ENUmkiC9Xwo6ZFpiiCrEL5fnoVbCN+DlW7l9VyrThfNQSbOi1GzM0QPQzuQqYs0wP o8EVnicsL+JAXmIIfMnxjK6WSfRpzOCFK3hgXAo5i5fUpt9/XMF1Jmj0E0MGsJHy5O p0VTwjcj55fYibrZBjjFK0kvp43yG2w7Ysbfzp5M= Date: Tue, 01 Jan 2019 00:00:00 -0000 From: Jessica Yu To: Matthias Maennich Cc: libabigail@sourceware.org, Dodji Seketeli , kernel-team@android.com Subject: Re: [RFC PATCH] Support pre and post v4.19 ksymtabs for Linux kernel modules Message-ID: <20190902164927.GA5974@linux-8ccs.fritz.box> References: <20190809154900.13751-1-jeyu@kernel.org> <20190819080508.GA189145@google.com> <20190821123436.GB8668@linux-8ccs> <20190822180418.GA233853@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <20190822180418.GA233853@google.com> X-OS: Linux linux-8ccs 4.12.14-lp150.12.61-default x86_64 User-Agent: Mutt/1.10.1 (2018-07-13) X-IsSubscribed: yes X-SW-Source: 2019-q3/txt/msg00029.txt.bz2 +++ Matthias Maennich [22/08/19 19:04 +0100]: >On Wed, Aug 21, 2019 at 02:34:37PM +0200, Jessica Yu wrote: >>+++ Matthias Maennich [19/08/19 09:05 +0100]: >>>Hi Jessica! >>> >>>On Fri, 09 Aug, 17:49, Jessica Yu wrote: >>>>As described in commit ad8c2531fb9, the format of the Linux kernel >>>>ksymtab changed in v4.19 to use relative references instead of absolute >>>>references. This changes the type of relocations emitted for ksymtab >>>>sections to be place-relative 32-bit relocations instead of absolute >>>>relocations. One side-effect of this is that libdwfl will not relocate >>>>the ksymtab sections due to the PC-relative relocations. This breaks >>>>load_kernel_symbol_table() for kernel modules because it only reads in >>>>zeros from the unrelocated ksymtab section and is subsequently unable to >>>>determine what exported symbols it refers to. Since a vmlinux binary is >>>>already fully linked and relocated (and therefore we can read its >>>>ksymtab section just fine), this problem is only relevant to Linux >>>>kernel modules. >>>> >>>>To work around this, we utilize the ksymtab relocation sections to >>>>determine which symbols the ksymtab entries refer to. We do this by >>>>inspecting each relocation's r_info field for the symbol table index and >>>>from there we are able to read each symbol's value and subsequently add >>>>that to the set of exported symbols. >>>> >>>>In addition, for Linux kernel modules, we can utilize relocation types >>>>to implement a new heuristic to determine the ksymtab format we have. >>>>The presence of PC-relative relocations suggest the new v4.19 format, >>>>and absolute relocation types suggest the old pre v4.19 format. >>>> >>>> * include/abg-ir.h (elf_symbol::{elf_symbol, create}): Take new >>>> symbol value and shndx parameters. >>>> (elf_symbol::{get_value, get_shndx}): Declare new accessors. >>>> * src/abg-ir.cc (elf_symbol::priv::{value_, shndx_}): New data members. >>>> (elf_symbol::priv::priv): Adjust. >>>> (elf_symbol::elf_symbol): Take new value and shndx parameters. >>>> (elf_symbol::create): Likewise. >>>> (elf_symbol::{get_value, get_shndx}): Define new accessors. >>>> * src/abg-reader.cc (build_elf_symbol): Adjust. >>>> * src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab) >>>> (lookup_symbol_from_gnu_hash_tab) >>>> (lookup_symbol_from_symtab): Adjust. >>>> (read_context::{ksymtab_reloc_section_, >>>> ksymtab_gpl_reloc_section_, >>>> ksymtab_strings_section_}): New data members. >>>> (read_context::read_context): Initialize ksymtab_reloc_section_, >>>> ksymtab_gpl_reloc_section_, ksymtab_strings_section_. >>>> (read_context::{find_ksymtab_reloc_section, >>>> find_ksymtab_gpl_reloc_section, find_ksymtab_strings_section, >>>> find_any_ksymtab_reloc_section, get_ksymtab_format_module, >>>> populate_symbol_map_from_ksymtab, >>>> populate_symbol_map_from_ksymtab_reloc, >>>> is_linux_kernel_module}): New member functions. >>>> (read_context::load_kernel_symbol_table): Adjust to call either >>>> populate_symbol_map_from_ksymtab{_reloc,} depending on ksymtab >>>> format. >>>> (read_context::get_ksymtab_format): Adjust to call >>>> get_ksymtab_format_module for linux kernel modules. >>>> (read_context::lookup_elf_symbol_from_index): Adjust. >>>> (create_default_var_sym, create_default_fn_sym): Adjust. >>>> >>>>Signed-off-by: Jessica Yu >>>>--- >>>>include/abg-ir.h | 10 ++ >>>>src/abg-dwarf-reader.cc | 391 +++++++++++++++++++++++++++++++++++++++++------- >>>>src/abg-ir.cc | 38 ++++- >>>>src/abg-reader.cc | 2 + >>>>4 files changed, 388 insertions(+), 53 deletions(-) >>>> >>>>diff --git a/include/abg-ir.h b/include/abg-ir.h >>>>index 170a844..d26f472 100644 >>>>--- a/include/abg-ir.h >>>>+++ b/include/abg-ir.h >>>>@@ -796,6 +796,8 @@ private: >>>>elf_symbol(const environment* e, >>>> size_t i, >>>> size_t s, >>>>+ uint64_t v, >>>>+ uint32_t x, >>> >>>I would like to see some comments here to describe the purpose of the >>>struct's members. I know there weren't any before, but maybe that is a >>>good chance. >>> >>>Also, I am not entirely sure why these are all one-character names, but >>>it makes it hard to come from x to "section header index". Maybe we >>>should give them more meaningful names. >>> >>>> const string& n, >>>> type t, >>>> binding b, >>>>@@ -818,6 +820,8 @@ public: >>>>create(const environment* e, >>>> size_t i, >>>> size_t s, >>>>+ uint64_t v, >>>>+ uint32_t x, >>>> const string& n, >>>> type t, >>>> binding b, >>>>@@ -838,6 +842,12 @@ public: >>>>void >>>>set_index(size_t); >>>> >>>>+ uint64_t >>>>+ get_value() const; >>>>+ >>>>+ uint32_t >>>>+ get_shndx() const; >>>>+ >>>>const string& >>>>get_name() const; >>>> >>>>diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc >>>>index 47af7e3..835bf3d 100644 >>>>--- a/src/abg-dwarf-reader.cc >>>>+++ b/src/abg-dwarf-reader.cc >>>>@@ -1724,6 +1724,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env, >>>>GElf_Sym symbol; >>>>const char* sym_name_str; >>>>size_t sym_size; >>>>+ uint64_t sym_value; >>>>+ uint32_t sym_shndx; >>>>elf_symbol::type sym_type; >>>>elf_symbol::binding sym_binding; >>>>elf_symbol::visibility sym_visibility; >>>>@@ -1743,6 +1745,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env, >>>> sym_visibility = >>>> stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(symbol.st_other)); >>>> sym_size = symbol.st_size; >>>>+ sym_value = symbol.st_value; >>>>+ sym_shndx = symbol.st_shndx; >>>> elf_symbol::version ver; >>>> if (get_version_for_symbol(elf_handle, symbol_index, >>>> /*get_def_version=*/true, ver)) >>>>@@ -1751,6 +1755,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env, >>>> elf_symbol::create(env, >>>> symbol_index, >>>> sym_size, >>>>+ sym_value, >>>>+ sym_shndx, >>>> sym_name_str, >>>> sym_type, >>>> sym_binding, >>>>@@ -2028,7 +2034,8 @@ lookup_symbol_from_gnu_hash_tab(const environment* env, >>>> ABG_ASSERT(!ver.str().empty()); >>>> >>>> elf_symbol_sptr symbol_found = >>>>- elf_symbol::create(env, i, symbol.st_size, sym_name_str, >>>>+ elf_symbol::create(env, i, symbol.st_size, symbol.st_value, >>>>+ symbol.st_shndx, sym_name_str, >>>> sym_type, sym_binding, >>>> symbol.st_shndx != SHN_UNDEF, >>>> symbol.st_shndx == SHN_COMMON, >>>>@@ -2181,6 +2188,7 @@ lookup_symbol_from_symtab(const environment* env, >>>> ABG_ASSERT(!ver.str().empty()); >>>> elf_symbol_sptr symbol_found = >>>> elf_symbol::create(env, i, sym->st_size, >>>>+ sym->st_value, sym->st_shndx, >>>> name_str, sym_type, >>>> sym_binding, sym_is_defined, >>>> sym_is_common, ver, sym_visibility); >>>>@@ -3081,7 +3089,10 @@ public: >>>>/// to symbols exported using the EXPORT_SYMBOL_GPL macro from the >>>>/// linux kernel. >>>>Elf_Scn* ksymtab_section_; >>>>+ Elf_Scn* ksymtab_reloc_section_; >>>>Elf_Scn* ksymtab_gpl_section_; >>>>+ Elf_Scn* ksymtab_gpl_reloc_section_; >>>>+ Elf_Scn* ksymtab_strings_section_; >>>>Elf_Scn* versym_section_; >>>>Elf_Scn* verdef_section_; >>>>Elf_Scn* verneed_section_; >>>>@@ -3277,7 +3288,10 @@ public: >>>> nb_ksymtab_entries_ = 0; >>>> nb_ksymtab_gpl_entries_ = 0; >>>> ksymtab_section_ = 0; >>>>+ ksymtab_reloc_section_ = 0; >>>> ksymtab_gpl_section_ = 0; >>>>+ ksymtab_gpl_reloc_section_ = 0; >>>>+ ksymtab_strings_section_ = 0; >>>> versym_section_ = 0; >>>> verdef_section_ = 0; >>>> verneed_section_ = 0; >>>>@@ -6079,6 +6093,23 @@ public: >>>> return ksymtab_section_; >>>>} >>>> >>>>+ /// Return the .rel{a,}__ksymtab section of a linux kernel ELF file (either >>>>+ /// a vmlinux binary or a kernel module). >>>>+ /// >>>>+ /// @return the .rel{a,}__ksymtab section if found, nil otherwise. >>>>+ Elf_Scn* >>>>+ find_ksymtab_reloc_section() const >>>>+ { >>>>+ if (!ksymtab_reloc_section_) >>>>+ { >>>>+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab", SHT_RELA); >>>>+ if (!sec) >>>>+ sec = find_section(elf_handle(), ".rel__ksymtab", SHT_REL); >>>>+ const_cast(this)->ksymtab_reloc_section_ = sec; >>>>+ } >>>>+ return ksymtab_reloc_section_; >>>>+ } >>>>+ >>>>/// Return the __ksymtab_gpl section of a linux kernel ELF file >>>>/// (either a vmlinux binary or a kernel module). >>>>/// >>>>@@ -6092,6 +6123,36 @@ public: >>>> return ksymtab_gpl_section_; >>>>} >>>> >>>>+ /// Return the .rel{a,}__ksymtab_gpl section of a linux kernel ELF file >>>>+ /// (either a vmlinux binary or a kernel module). >>>>+ /// >>>>+ /// @return the .rel{a,}__ksymtab_gpl section if found, nil otherwise. >>>>+ Elf_Scn* >>>>+ find_ksymtab_gpl_reloc_section() const >>>>+ { >>>>+ if (!ksymtab_gpl_reloc_section_) >>>>+ { >>>>+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab_gpl", SHT_RELA); >>>>+ if (!sec) >>>>+ sec = find_section(elf_handle(), ".rel__ksymtab_gpl", SHT_REL); >>>>+ const_cast(this)->ksymtab_gpl_reloc_section_ = sec; >>>>+ } >>>>+ return ksymtab_gpl_reloc_section_; >>>>+ } >>>>+ >>>>+ /// Return the __ksymtab_strings section of a linux kernel ELF file >>>>+ /// (either a vmlinux binary or a kernel module). >>>>+ /// >>>>+ /// @return the __ksymtab_strings section if found, nil otherwise. >>>>+ Elf_Scn* >>>>+ find_ksymtab_strings_section() const >>>>+ { >>>>+ if (!ksymtab_strings_section_) >>>>+ const_cast(this)->ksymtab_strings_section_ = >>>>+ find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS); >>>>+ return ksymtab_strings_section_; >>>>+ } >>>>+ >>> >>>We should find a way to deduplicate these functions. Not necessarily in >>>this patch, but in a followup. Most of them only differ in the section >>>name and whether they check for an alternative. Something generic like >>> >>>struct SectionQuery{ string name, int type }; >>>Elf_Scn* find_section(const vector&); >>> >>>should be sufficient to forward from all the other variants to one >>>implementation. >>> >>>>/// Return either a __ksymtab or a __ksymtab_gpl section, in case >>>>/// only the __ksymtab_gpl exists. >>>>/// >>>>@@ -6106,6 +6167,19 @@ public: >>>> return result; >>>>} >>>> >>>>+ /// Return either a .rel{a,}__ksymtab or a .rel{a,}__ksymtab_gpl section >>>>+ /// >>>>+ /// @return the .rel{a,}__ksymtab section if it exists, or the >>>>+ /// .rel{a,}__ksymtab_gpl; or NULL if neither is found. >>>>+ Elf_Scn* >>>>+ find_any_ksymtab_reloc_section() const >>>>+ { >>>>+ Elf_Scn *result = find_ksymtab_reloc_section(); >>>>+ if (!result) >>>>+ result = find_ksymtab_gpl_reloc_section(); >>>>+ return result; >>>>+ } >>>>+ >>>>/// Return the SHT_GNU_versym, SHT_GNU_verdef and SHT_GNU_verneed >>>>/// sections that are involved in symbol versionning. >>>>/// >>>>@@ -6283,7 +6357,8 @@ public: >>>> stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(s->st_other)); >>>> >>>> elf_symbol_sptr sym = >>>>- elf_symbol::create(env(), symbol_index, s->st_size, name_str, >>>>+ elf_symbol::create(env(), symbol_index, s->st_size, >>>>+ s->st_value, s->st_shndx, name_str, >>>> stt_to_elf_symbol_type(GELF_ST_TYPE(s->st_info)), >>>> stb_to_elf_symbol_binding(GELF_ST_BIND(s->st_info)), >>>> sym_is_defined, sym_is_common, ver, vis); >>>>@@ -7435,6 +7510,64 @@ public: >>>> return symbol; >>>>} >>>> >>>>+ /// Determine the format of the __ksymtab and __ksymtab_gpl >>>>+ /// sections of Linux kernel modules. >>>>+ /// >>>>+ /// This is important because we need the know the format of these >>> >>>s/need the/need to/ ^^^ >>> >>>>+ /// sections to be able to read from them. >>>>+ /// >>>>+ /// @return the format the __ksymtab[_gpl] sections. >>>>+ enum ksymtab_format >>>>+ get_ksymtab_format_module() const >>>>+ { >>>>+ Elf_Scn *section = find_any_ksymtab_reloc_section(); >>>>+ >>>>+ if (!section) >>>>+ return UNDEFINED_KSYMTAB_FORMAT; >>>>+ >>>>+ // Libdwfl has a weird quirk where, in the process of obtaining an Elf >>>>+ // descriptor via dwfl_module_getelf(), it will apply all relocations it >>>>+ // knows how to and it will zero a relocation after applying it. If the >>> >>>will zero the relocation info after >>> >>>>+ // .rela__ksymtab* section contained only simple (absolute) relocations, >>>>+ // they will have been all applied and sh_size will be 0. For arches that >>>>+ // support relative ksymtabs, simple relocations only appear in pre-4.19 >>>>+ // kernel modules. >>>>+ GElf_Shdr section_mem; >>>>+ GElf_Shdr *section_shdr = gelf_getshdr(section, §ion_mem); >>>>+ if (section_shdr->sh_size == 0) >>>>+ return PRE_V4_19_KSYMTAB_FORMAT; >>>>+ /* SHT_REL sections not supported yet */ >>>>+ if (section_shdr->sh_type == SHT_REL) >>>>+ return UNDEFINED_KSYMTAB_FORMAT; >>>>+ >>>>+ // If we still have a normal non-zeroed relocation section, we can guess >>>>+ // what format the ksymtab is in depending on what types of relocs it >>>>+ // contains. >>>>+ GElf_Rela rela; >>>>+ Elf_Data *rela_section_data = elf_rawdata(section, 0); >>>>+ gelf_getrela(rela_section_data, 0, &rela); >>>>+ >>>>+ // Sigh, I dislike the arch-dependent code here, but this seems to be a >>>>+ // reliable heuristic for kernel modules for now. Relative ksymtabs only >>>>+ // supported on x86 and arm64 as of v4.19. >>>>+ ksymtab_format format; >>>>+ switch (GELF_R_TYPE(rela.r_info)) >>>>+ { >>>>+ case R_X86_64_64: // Same as R_386_32 >>> >>>add 'fallthrough' to the comments, else younger compilers will warn >>>about the potentially unintentional fallthrough. >>> >>>>+ case R_AARCH64_ABS64: >>>>+ format = PRE_V4_19_KSYMTAB_FORMAT; >>>>+ break; >>>>+ case R_X86_64_PC32: // Same as R_386_PC32 >>>>+ case R_AARCH64_PREL32: >>>>+ format = V4_19_KSYMTAB_FORMAT; >>>>+ break; >>>>+ default: >>>>+ format = UNDEFINED_KSYMTAB_FORMAT; >>> >>>We should probably emit a warning in this case. Are there more >>>architectures that have PREL32 enabled as of now that are not yet >>>covered here? >>> >>>>+ break; >>>>+ } >>>>+ return format; >>>>+ } >>>>+ >>>>/// Determine the format of the __ksymtab and __ksymtab_gpl >>>>/// sections. >>>>/// >>>>@@ -7451,6 +7584,18 @@ public: >>>> { >>>> if (ksymtab_format_ == UNDEFINED_KSYMTAB_FORMAT) >>>> { >>>>+ // Since Linux kernel modules are relocatable, we can first try >>>>+ // using a heuristic based on relocations to guess the ksymtab format. >>>>+ if (is_linux_kernel_module()) >>>>+ { >>>>+ ksymtab_format_ = get_ksymtab_format_module(); >>>>+ if (ksymtab_format_ != UNDEFINED_KSYMTAB_FORMAT) >>>>+ return ksymtab_format_; >>>>+ } >>> >>>nit: use tabs >>> >>>>+ >>>>+ // If it's not a kernel module or we couldn't determine its format >>>>+ // with relocations, fall back to the heuristic below. >>>>+ >>>> // OK this is a dirty little heuristic to determine the >>>> // format of the ksymtab section. >>>> // >>>>@@ -7553,48 +7698,24 @@ public: >>>> return nb_ksymtab_gpl_entries_; >>>>} >>>> >>>>- /// Load a given kernel symbol table. >>>>+ /// Populate the symbol map by reading exported symbols from the >>>>+ /// ksymtab directly. >>>>/// >>>>- /// One can thus retrieve the resulting symbols by using the >>>>- /// accessors read_context::linux_exported_fn_syms(), >>>>- /// read_context::linux_exported_var_syms(), >>>>- /// read_context::linux_exported_gpl_fn_syms(), or >>>>- /// read_context::linux_exported_gpl_var_syms(). >>>>+ /// @param section the ksymtab section to read from >>>>/// >>>>- /// @param kind the kind of kernel symbol table to load. >>>>+ /// @param exported_fns_set the set of exported functions >>>>+ /// >>>>+ /// @param exported_vars_set the set of exported variables >>>>+ /// >>>>+ /// @param nb_entries the number of ksymtab entries to read >>>>/// >>>>/// @return true upon successful completion, false otherwise. >>>>bool >>>>- load_kernel_symbol_table(kernel_symbol_table_kind kind) >>>>+ populate_symbol_map_from_ksymtab(Elf_Scn *section, >>>>+ address_set_sptr exported_fns_set, >>>>+ address_set_sptr exported_vars_set, >>>>+ size_t nb_entries) >>>>{ >>>>- size_t nb_entries = 0; >>>>- Elf_Scn *section = 0; >>>>- address_set_sptr linux_exported_fns_set, linux_exported_vars_set; >>>>- >>>>- switch (kind) >>>>- { >>>>- case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED: >>>>- break; >>>>- case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB: >>>>- section = find_ksymtab_section(); >>>>- nb_entries = get_nb_ksymtab_entries(); >>>>- linux_exported_fns_set = create_or_get_linux_exported_fn_syms(); >>>>- linux_exported_vars_set = create_or_get_linux_exported_var_syms(); >>>>- break; >>>>- case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL: >>>>- section = find_ksymtab_gpl_section(); >>>>- nb_entries = get_nb_ksymtab_gpl_entries(); >>>>- linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms(); >>>>- linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms(); >>>>- break; >>>>- } >>>>- >>>>- if (!linux_exported_vars_set >>>>- || !linux_exported_fns_set >>>>- || !section >>>>- || !nb_entries) >>>>- return false; >>>>- >>>> // The data of the section. >>>> Elf_Data *elf_data = elf_rawdata(section, 0); >>>> >>>>@@ -7654,28 +7775,30 @@ public: >>>> // (aka ELF symbol table). >>>> symbol = lookup_elf_symbol_from_address(adjusted_symbol_address); >>>> if (!symbol) >>>>- { >>>>- adjusted_symbol_address = >>>>- maybe_adjust_var_sym_address(symbol_address); >>>>- symbol = lookup_elf_symbol_from_address(adjusted_symbol_address); >>>>- if (!symbol) >>>>- // This must be a symbol that is of type neither FUNC >>>>- // (function) nor OBJECT (variable). There are for intance, >>>>- // symbols of type 'NOTYPE' in the ksymtab symbol table. I >>>>- // am not sure what those are. >>>>- continue; >>>>- } >>>>+ { >>>>+ adjusted_symbol_address = >>>>+ maybe_adjust_var_sym_address(symbol_address); >>>>+ symbol = lookup_elf_symbol_from_address(adjusted_symbol_address); >>>>+ if (!symbol) >>>>+ { >>>>+ // This must be a symbol that is of type neither FUNC >>>>+ // (function) nor OBJECT (variable). There are for intance, >>>>+ // symbols of type 'NOTYPE' in the ksymtab symbol table. I >>>>+ // am not sure what those are. >>>>+ continue; >>>>+ } >>>>+ } >>> >>>This part of the diff seems to a change purely of the indentation/tabs. >>>Maybe you can revert this part. >>> >>>> >>>> address_set_sptr set; >>>> if (symbol->is_function()) >>>> { >>>> ABG_ASSERT(lookup_elf_fn_symbol_from_address(adjusted_symbol_address)); >>>>- set = linux_exported_fns_set; >>>>+ set = exported_fns_set; >>>> } >>>> else if (symbol->is_variable()) >>>> { >>>> ABG_ASSERT(lookup_elf_var_symbol_from_address(adjusted_symbol_address)); >>>>- set = linux_exported_vars_set; >>>>+ set = exported_vars_set; >>>> } >>>> else >>>> ABG_ASSERT_NOT_REACHED; >>>>@@ -7684,6 +7807,157 @@ public: >>>> return true; >>>>} >>>> >>>>+ /// Populate the symbol map by extracting the exported symbols from a >>>>+ /// ksymtab rela section. >>>>+ /// >>>>+ /// @param section the ksymtab section to read from >>>>+ /// >>>>+ /// @param strings_section the __ksymtab_strings section >>>>+ /// >>>>+ /// @param exported_fns_set the set of exported functions >>>>+ /// >>>>+ /// @param exported_vars_set the set of exported variables >>>>+ /// >>>>+ /// @return true upon successful completion, false otherwise. >>>>+ bool >>>>+ populate_symbol_map_from_ksymtab_reloc(Elf_Scn *reloc_section, Elf_Scn *strings_section, >>>>+ address_set_sptr exported_fns_set, >>>>+ address_set_sptr exported_vars_set) >>>>+ { >>>>+ GElf_Shdr reloc_section_mem; >>>>+ GElf_Shdr *reloc_section_shdr = gelf_getshdr(reloc_section, &reloc_section_mem); >>> >>>gelf_getshdr returns &reloc_section_mem. So you might want to use >>>reloc_section_mem directly rather than introducing reloc_section_mem. >>> >>>>+ size_t reloc_count = reloc_section_shdr->sh_size / reloc_section_shdr->sh_entsize; >>>>+ >>>>+ Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0); >>>>+ size_t strings_ndx = elf_ndxscn(strings_section); >>>>+ >>>>+ GElf_Rela rela; >>>>+ elf_symbol_sptr symbol; >>>>+ for (unsigned int i = 0; i < reloc_count; i++) >>>>+ { >>>>+ gelf_getrela(reloc_section_data, i, &rela); >>>>+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info)); >>>>+ >>>>+ // This shouldn't happen.. >>>>+ if (!symbol) >>> >>>Maybe ABG_ASSERT then? >>> >>>>+ return false; >>>>+ >>>>+ // If the symbol is from __ksymtab_strings, ignore it. >>>>+ if (symbol->get_shndx() == strings_ndx) >>>>+ continue; >>>>+ >>>>+ if (!symbol->is_function() && !symbol->is_variable()) >>>>+ { >>>>+ if (do_log()) >>>>+ { >>>>+ if (symbol->get_type() == elf_symbol::NOTYPE_TYPE) >>>>+ cerr << "skipping NOTYPE symbol " >>>>+ << symbol->get_name() >>>>+ << " shndx: " >>>>+ << symbol->get_index() >>>>+ << " @" >>>>+ << elf_path() >>>>+ << "\n"; >>>>+ else if (symbol->get_type() == elf_symbol::SECTION_TYPE) >>>>+ cerr << "skipping SECTION symbol " >>>>+ << "shndx: " >>>>+ << symbol->get_index() >>>>+ << " @" >>>>+ << elf_path() >>>>+ << "\n"; >>>>+ } >>>>+ continue; >>>>+ } >>>>+ >>>>+ address_set_sptr set; >>>>+ if (symbol->is_function()) >>>>+ { >>>>+ ABG_ASSERT(lookup_elf_fn_symbol_from_address(symbol->get_value())); >>>>+ set = exported_fns_set; >>>>+ } >>>>+ else if (symbol->is_variable()) >>>>+ { >>>>+ ABG_ASSERT(lookup_elf_var_symbol_from_address(symbol->get_value())); >>>>+ set = exported_vars_set; >>>>+ } >>>>+ else >>>>+ ABG_ASSERT_NOT_REACHED; >>>>+ set->insert(symbol->get_value()); >>>>+ } >>>>+ return true; >>>>+ } >>>>+ >>>>+ /// Load a given kernel symbol table. >>>>+ /// >>>>+ /// One can thus retrieve the resulting symbols by using the >>>>+ /// accessors read_context::linux_exported_fn_syms(), >>>>+ /// read_context::linux_exported_var_syms(), >>>>+ /// read_context::linux_exported_gpl_fn_syms(), or >>>>+ /// read_context::linux_exported_gpl_var_syms(). >>>>+ /// >>>>+ /// @param kind the kind of kernel symbol table to load. >>>>+ /// >>>>+ /// @return true upon successful completion, false otherwise. >>>>+ bool >>>>+ load_kernel_symbol_table(kernel_symbol_table_kind kind) >>>>+ { >>>>+ size_t nb_entries = 0; >>>>+ Elf_Scn *section = 0, *reloc_section = 0, *strings_section = 0; >>>>+ address_set_sptr linux_exported_fns_set, linux_exported_vars_set; >>>>+ >>>>+ switch (kind) >>>>+ { >>>>+ case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED: >>>>+ break; >>>>+ case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB: >>>>+ section = find_ksymtab_section(); >>>>+ reloc_section = find_ksymtab_reloc_section(); >>>>+ nb_entries = get_nb_ksymtab_entries(); >>>>+ linux_exported_fns_set = create_or_get_linux_exported_fn_syms(); >>>>+ linux_exported_vars_set = create_or_get_linux_exported_var_syms(); >>>>+ break; >>>>+ case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL: >>>>+ section = find_ksymtab_gpl_section(); >>>>+ reloc_section = find_ksymtab_gpl_reloc_section(); >>>>+ nb_entries = get_nb_ksymtab_gpl_entries(); >>>>+ linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms(); >>>>+ linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms(); >>>>+ break; >>>>+ } >>>>+ >>>>+ strings_section = find_ksymtab_strings_section(); >>>>+ >>>>+ if (!linux_exported_vars_set >>>>+ || !linux_exported_fns_set >>>>+ || !section >>>>+ || !strings_section >>>>+ || !nb_entries) >>>>+ return false; >>>>+ >>>>+ GElf_Ehdr eh_mem; >>>>+ GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem); >>>>+ ksymtab_format format = get_ksymtab_format(); >>>>+ >>>>+ // Although pre-v4.19 kernel modules can have a relocation section for the >>>>+ // __ksymtab section, libdwfl zeroes the rela section after applying >>>>+ // "simple" absolute relocations via dwfl_module_getelf(). For v4.19 and >>>>+ // above, we get PC-relative relocations so dwfl_module_getelf() doesn't >>>>+ // apply those relocations and we're safe to read the relocation section to >>>>+ // determine which exported symbols are in the ksymtab. >>>>+ // >>>>+ // Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is >>>>+ // supported as of now. >>>>+ if (!reloc_section >>>>+ || format == PRE_V4_19_KSYMTAB_FORMAT >>>>+ || elf_header->e_ident[EI_CLASS] == ELFCLASS32) >>>>+ return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set, >>>>+ linux_exported_vars_set, nb_entries); >>>>+ else >>>>+ return populate_symbol_map_from_ksymtab_reloc(reloc_section, strings_section, >>>>+ linux_exported_fns_set, >>>>+ linux_exported_vars_set); >>>>+ } >>>>+ >>>>/// Load the special __ksymtab section. This is for linux kernel >>>>/// (module) files. >>>>/// >>>>@@ -8352,6 +8626,17 @@ public: >>>>is_linux_kernel_binary() const >>>>{return find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS);} >>>> >>>>+ /// Guess if the current binary is a Linux Kernel module. >>>>+ /// >>>>+ /// To guess that, the function looks for the presence of the >>>>+ /// special ".modinfo" section in the binary. >>>>+ /// >>>>+ bool >>>>+ is_linux_kernel_module() const >>>>+ { >>>>+ return find_section(elf_handle(), ".modinfo", SHT_PROGBITS); >>>>+ } >>>>+ >>> >>>Also refer to this proposal to restrict checking for a module a bit >>>more: https://sourceware.org/ml/libabigail/2019-q3/msg00022.html >>> >>>>/// Getter of the "show_stats" flag. >>>>/// >>>>/// This flag tells if we should emit statistics about various >>>>@@ -15734,6 +16019,8 @@ create_default_var_sym(const string& sym_name, const environment *env) >>>> elf_symbol::create(env, >>>> /*symbol index=*/ 0, >>>> /*symbol size=*/ 0, >>>>+ /*symbol value=*/ 0, >>>>+ /*symbol shndx=*/ 0, >>>> sym_name, >>>> /*symbol type=*/ elf_symbol::OBJECT_TYPE, >>>> /*symbol binding=*/ elf_symbol::GLOBAL_BINDING, >>>>@@ -16173,6 +16460,8 @@ create_default_fn_sym(const string& sym_name, const environment *env) >>>> elf_symbol::create(env, >>>> /*symbol index=*/ 0, >>>> /*symbol size=*/ 0, >>>>+ /*symbol value=*/ 0, >>>>+ /*symbol shndx=*/ 0, >>>> sym_name, >>>> /*symbol type=*/ elf_symbol::FUNC_TYPE, >>>> /*symbol binding=*/ elf_symbol::GLOBAL_BINDING, >>>>diff --git a/src/abg-ir.cc b/src/abg-ir.cc >>>>index a5243c3..ad1c261 100644 >>>>--- a/src/abg-ir.cc >>>>+++ b/src/abg-ir.cc >>>>@@ -1169,6 +1169,8 @@ struct elf_symbol::priv >>>>const environment* env_; >>>>size_t index_; >>>>size_t size_; >>>>+ uint64_t value_; >>>>+ uint32_t shndx_; >>>>string name_; >>>>elf_symbol::type type_; >>>>elf_symbol::binding binding_; >>>>@@ -1213,6 +1215,8 @@ struct elf_symbol::priv >>>> : env_(), >>>> index_(), >>>> size_(), >>>>+ value_(), >>>>+ shndx_(), >>>> type_(elf_symbol::NOTYPE_TYPE), >>>> binding_(elf_symbol::GLOBAL_BINDING), >>>> visibility_(elf_symbol::DEFAULT_VISIBILITY), >>>>@@ -1223,6 +1227,8 @@ struct elf_symbol::priv >>>>priv(const environment* e, >>>> size_t i, >>>> size_t s, >>>>+ uint64_t v, >>>>+ uint32_t x, >>>> const string& n, >>>> elf_symbol::type t, >>>> elf_symbol::binding b, >>>>@@ -1233,6 +1239,8 @@ struct elf_symbol::priv >>>> : env_(e), >>>> index_(i), >>>> size_(s), >>>>+ value_(v), >>>>+ shndx_(x), >>>> name_(n), >>>> type_(t), >>>> binding_(b), >>>>@@ -1269,6 +1277,10 @@ elf_symbol::elf_symbol() >>>>/// >>>>/// @param s the size of the symbol. >>>>/// >>>>+/// @param v the value of the symbol. >>>>+/// >>>>+/// @param x the section header index (shndx) of the symbol. >>>>+/// >>>>/// @param n the name of the symbol. >>>>/// >>>>/// @param t the type of the symbol. >>>>@@ -1285,6 +1297,8 @@ elf_symbol::elf_symbol() >>>>elf_symbol::elf_symbol(const environment* e, >>>> size_t i, >>>> size_t s, >>>>+ uint64_t v, >>>>+ uint32_t x, >>>> const string& n, >>>> type t, >>>> binding b, >>>>@@ -1292,7 +1306,7 @@ elf_symbol::elf_symbol(const environment* e, >>>> bool c, >>>> const version& ve, >>>> visibility vi) >>>>- : priv_(new priv(e, i, s, n, t, b, d, c, ve, vi)) >>>>+ : priv_(new priv(e, i, s, v, x, n, t, b, d, c, ve, vi)) >>>>{} >>>> >>>>/// Factory of instances of @ref elf_symbol. >>>>@@ -1319,6 +1333,10 @@ elf_symbol::create() >>>>/// >>>>/// @param s the size of the symbol. >>>>/// >>>>+/// @param v the value of the symbol. >>>>+/// >>>>+/// @param x the section header index (shndx) of the symbol. >>>>+/// >>>>/// @param n the name of the symbol. >>>>/// >>>>/// @param t the type of the symbol. >>>>@@ -1339,6 +1357,8 @@ elf_symbol_sptr >>>>elf_symbol::create(const environment* e, >>>> size_t i, >>>> size_t s, >>>>+ uint64_t v, >>>>+ uint32_t x, >>>> const string& n, >>>> type t, >>>> binding b, >>>>@@ -1347,7 +1367,7 @@ elf_symbol::create(const environment* e, >>>> const version& ve, >>>> visibility vi) >>>>{ >>>>- elf_symbol_sptr sym(new elf_symbol(e, i, s, n, t, b, d, c, ve, vi)); >>>>+ elf_symbol_sptr sym(new elf_symbol(e, i, s, v, x, n, t, b, d, c, ve, vi)); >>>>sym->priv_->main_symbol_ = sym; >>>>return sym; >>>>} >>>>@@ -1411,6 +1431,20 @@ void >>>>elf_symbol::set_index(size_t s) >>>>{priv_->index_ = s;} >>>> >>>>+/// Getter for the symbol value. >>>>+/// >>>>+/// @return the value of the symbol. >>>>+uint64_t >>>>+elf_symbol::get_value() const >>>>+{return priv_->value_;} >>>>+ >>>>+/// Getter for the symbol section header index. >>>>+/// >>>>+/// @return the section header index of the symbol. >>>>+uint32_t >>>>+elf_symbol::get_shndx() const >>>>+{return priv_->shndx_;} >>>>+ >>>>/// Getter for the name of the @ref elf_symbol. >>>>/// >>>>/// @return a reference to the name of the @ref symbol. >>>>diff --git a/src/abg-reader.cc b/src/abg-reader.cc >>>>index 810a65f..3c481ad 100644 >>>>--- a/src/abg-reader.cc >>>>+++ b/src/abg-reader.cc >>>>@@ -2659,6 +2659,8 @@ build_elf_symbol(read_context& ctxt, const xmlNodePtr node) >>>> >>>>const environment* env = ctxt.get_environment(); >>>>elf_symbol_sptr e = elf_symbol::create(env, /*index=*/0, size, >>>>+ /*value=*/0, >>>>+ /*shndx=*/0, >>>> name, type, binding, >>>> is_defined, is_common, >>>> version, visibility); >>>>-- >>>>2.16.4 >>>> >>> >>>Nice work! Thank you for that! >>>The patch looks in general good to me. I am a bit concerned about the >>>platform support. There are some corner cases that need to be covered. >>>See the testing below. >>> >>>You might find the .clang-format file useful to match the coding style >>>at some points (especially tabs vs spaces). Although when reviewing the >>>patch, I had a bit trouble with unchanged but reformatted lines as they >>>were confusing my diff. Maybe you can revert the changes that were >>>purely for formatting and we take care of formatting in a later patch. >>> >>>I conducted some testing. I choose a minimal configuration with modules: >>> >>>$ export ARCH= >>>$ make allnoconfig >>>$ ./scripts/config \ >>> -e DEBUG_INFO \ # dwarf info >>> -e MODULES \ # module support >>> -e USB_SUPPORT \ >>> -m USB \ >>> -m USB_DUMMY_HCD \ # dummy_hcd.ko is a module without exports >>> -m USB_GADGET # depends on usb-common.ko, which I test with >>> >>>$ make olddefconfig >>>$ make >>>$ abidw ./drivers/usb/common/usb-common.ko | less >>># then check for the existence of the elf-function-symbols section >>> >>> master patched >>>Linux v4.18 >>> >>>x86_64 works works >>>x86 works works >>>arm64 works works >>> >>>Linux v4.19 >>> >>>x86_64 broken works >>>x86 broken STILL BROKEN >>>arm64 broken works >>> >>>So, for x86_64 and arm64 the situation got better and elf symbols are >>>now identified. For x86, this patch did not improve the situation. I did >>>not check that the list of symbols are entirely correct and that they >>>work downstream with abidiff etc. Just the pure existence is a very good >>>indicator for this patch to work. >> >>Hi Matthias! Thanks a bunch for your review! :-) >> >>I'm aware of the x86 issue and I think the diff below should fix it. >>Could you please let me know if it passes your x86 testing? (note, I have >>yet to address your other comments, so this is to be continued!) > >Hi Jessica! > >I confirm this patch is fixing the x86 testing as lined out above. I >also retested the other platforms successfully. > >Thanks! Great work! > >Cheers, >Matthias > >> >>diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc >>index 835bf3d..ed8a8db 100644 >>--- a/src/abg-dwarf-reader.cc >>+++ b/src/abg-dwarf-reader.cc >>@@ -7536,22 +7536,33 @@ public: >> GElf_Shdr *section_shdr = gelf_getshdr(section, §ion_mem); >> if (section_shdr->sh_size == 0) >> return PRE_V4_19_KSYMTAB_FORMAT; >>- /* SHT_REL sections not supported yet */ >>- if (section_shdr->sh_type == SHT_REL) >>- return UNDEFINED_KSYMTAB_FORMAT; >>+ >>+ bool is_relasec = (section_shdr->sh_type == SHT_RELA); >> >> // If we still have a normal non-zeroed relocation section, we can guess >> // what format the ksymtab is in depending on what types of relocs it >> // contains. >>- GElf_Rela rela; >>- Elf_Data *rela_section_data = elf_rawdata(section, 0); >>- gelf_getrela(rela_section_data, 0, &rela); >>+ >>+ int type; >>+ Elf_Data *section_data = elf_rawdata(section, 0); >>+ if (is_relasec) >>+ { >>+ GElf_Rela rela; >>+ gelf_getrela(section_data, 0, &rela); >>+ type = GELF_R_TYPE(rela.r_info); >>+ } >>+ else >>+ { >>+ GElf_Rel rel; >>+ gelf_getrel(section_data, 0, &rel); >>+ type = GELF_R_TYPE(rel.r_info); >>+ } > >Maybe we can deduplicate this a bit. Also further down. >Missing out that additional 'a' might be a bug source at a later time. Yeah, I wasn't sure how to deduplicate this since rels and relas are different data types and we have to use gelf_getrel+GElf_Rel and gelf_getrela+GElf_Rela on rels and relas respectively.. >> // Sigh, I dislike the arch-dependent code here, but this seems to be a >> // reliable heuristic for kernel modules for now. Relative ksymtabs only >> // supported on x86 and arm64 as of v4.19. >> ksymtab_format format; >>- switch (GELF_R_TYPE(rela.r_info)) >>+ switch (type) >> { >> case R_X86_64_64: // Same as R_386_32 >> case R_AARCH64_ABS64: >>@@ -7831,12 +7842,22 @@ public: >> Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0); >> size_t strings_ndx = elf_ndxscn(strings_section); >> >>- GElf_Rela rela; >>+ bool is_relasec = (reloc_section_shdr->sh_type == SHT_RELA); >> elf_symbol_sptr symbol; >> for (unsigned int i = 0; i < reloc_count; i++) >> { >>- gelf_getrela(reloc_section_data, i, &rela); >>- symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info)); >>+ if (is_relasec) >>+ { >>+ GElf_Rela rela; >>+ gelf_getrela(reloc_section_data, i, &rela); >>+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info)); >>+ } >>+ else >>+ { >>+ GElf_Rel rel; >>+ gelf_getrel(reloc_section_data, i, &rel); >>+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rel.r_info)); >>+ } >> >> // This shouldn't happen.. >> if (!symbol) >>@@ -7934,8 +7955,6 @@ public: >> || !nb_entries) >> return false; >> >>- GElf_Ehdr eh_mem; >>- GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem); >> ksymtab_format format = get_ksymtab_format(); >> >> // Although pre-v4.19 kernel modules can have a relocation section for the >>@@ -7947,9 +7966,7 @@ public: >> // >> // Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is >> // supported as of now. > >The comment might need updating. Thanks for catching that!