From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 110095 invoked by alias); 22 Aug 2019 18:04:31 -0000 Mailing-List: contact libabigail-help@sourceware.org; run by ezmlm Precedence: bulk List-Post: List-Help: List-Id: List-Subscribe: Sender: libabigail-owner@sourceware.org Received: (qmail 110081 invoked by uid 89); 22 Aug 2019 18:04:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Checked: by ClamAV 0.100.3 on sourceware.org X-Virus-Found: No X-Spam-SWARE-Status: No, score=-30.2 required=5.0 tests=BAYES_00,ENV_AND_HDR_SPF_MATCH,FSL_HELO_FAKE,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_STOCKGEN,RCVD_IN_DNSWL_NONE,SPF_PASS,USER_IN_DEF_SPF_WL autolearn=ham version=3.3.1 spammy=USB, usb, 37pm, Great X-Spam-Status: No, score=-30.2 required=5.0 tests=BAYES_00,ENV_AND_HDR_SPF_MATCH,FSL_HELO_FAKE,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_STOCKGEN,RCVD_IN_DNSWL_NONE,SPF_PASS,USER_IN_DEF_SPF_WL autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on sourceware.org X-Spam-Level: X-HELO: mail-wr1-f68.google.com Received: from mail-wr1-f68.google.com (HELO mail-wr1-f68.google.com) (209.85.221.68) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 22 Aug 2019 18:04:25 +0000 Received: by mail-wr1-f68.google.com with SMTP id y8so6249902wrn.10 for ; Thu, 22 Aug 2019 11:04:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=H/EBDYgFwcPaABjS/AvGNaNy9L4MCnVqwwTK8bMgis0=; b=oRJ25QiKD9fXykZaUeMFr8g/bmSls2gSLR33J7lU+nUH+NzZBkbhSNB1EJfbCWY99f U19as9n5PigspPgt5pjMk7mefRUJbciUTd4NJtzuGUim+p6SCNAgdYBl5RozlVvVO8n6 OteaE4GhD1EM5vRaPQOtckakah3vHR6FHjeVW0I6bhXw91s5V0RJ/32qrPydpbYQCXeC Xmi+7/f33Im2JLy6z1mXcKA6oqH52xQ4+cmie9WAxQN/ICJknSN2SGn7HeeWuMbHPxfc 3pOKJBmsGAwQPOs780ulkSuSLmN/1LV0OJobfCcFhSrYyRaZQoFwRHiU0YIfyz31aA36 TgCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=H/EBDYgFwcPaABjS/AvGNaNy9L4MCnVqwwTK8bMgis0=; b=D8szIfPiyqYCP8OOMlGCRrwy2omBft0qn1FyTbpXGtD+aYMGNd9nVsPI+xNRmG252B Xzu5sa+bjdx0lbxTiXiau8tNADaf18XCzJX+nGv1pZ4nPCdgnqCOJAcirpWWNn+FoyZy Q0Vl/+mmtVwCy8dSGyJcGSsGANucZp3GmiYLcBqGboq4owYdcWwDyGRkmgbRiybBEY7t B6QOk1Ica+3xepYX0yYJ8oKI46SwXsz2Z2m7VE+O+/lxxdpUh9jxnrD3qeUoh4CPwNFG kVI2tuqW2E9K2ou0MjRrdpzkdR8+ZrgghokH93Wm1EntpwRkAk1y6qAcn58+1lauS3dM D7xg== X-Gm-Message-State: APjAAAXqJrOxekoZoNNxgHVfOk9hnOaH/HutpLHJKXRwbuYYEP49zH0P sDwHWthW+HnwP1HwYUpPna+p6g== X-Google-Smtp-Source: APXvYqyVN/fRxNKvAEK8OjdjT9NWTUhDdgPqkccRjuxKDwmRZqiheRSR2HoF96u3AWdV254sme55OA== X-Received: by 2002:a05:6000:110f:: with SMTP id z15mr233690wrw.162.1566497062194; Thu, 22 Aug 2019 11:04:22 -0700 (PDT) Received: from google.com ([2a00:79e0:d:210:e8f7:125b:61e9:733d]) by smtp.gmail.com with ESMTPSA id j9sm238388wrx.66.2019.08.22.11.04.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Aug 2019 11:04:21 -0700 (PDT) Date: Tue, 01 Jan 2019 00:00:00 -0000 From: "Matthias Maennich via libabigail" Reply-To: Matthias Maennich To: Jessica Yu Cc: libabigail@sourceware.org, Dodji Seketeli , kernel-team@android.com Subject: Re: [RFC PATCH] Support pre and post v4.19 ksymtabs for Linux kernel modules Message-ID: <20190822180418.GA233853@google.com> References: <20190809154900.13751-1-jeyu@kernel.org> <20190819080508.GA189145@google.com> <20190821123436.GB8668@linux-8ccs> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <20190821123436.GB8668@linux-8ccs> User-Agent: Mutt/1.10.1 (2018-07-13) X-SW-Source: 2019-q3/txt/msg00027.txt.bz2 On Wed, Aug 21, 2019 at 02:34:37PM +0200, Jessica Yu wrote: >+++ Matthias Maennich [19/08/19 09:05 +0100]: >>Hi Jessica! >> >>On Fri, 09 Aug, 17:49, Jessica Yu wrote: >>>As described in commit ad8c2531fb9, the format of the Linux kernel >>>ksymtab changed in v4.19 to use relative references instead of absolute >>>references. This changes the type of relocations emitted for ksymtab >>>sections to be place-relative 32-bit relocations instead of absolute >>>relocations. One side-effect of this is that libdwfl will not relocate >>>the ksymtab sections due to the PC-relative relocations. This breaks >>>load_kernel_symbol_table() for kernel modules because it only reads in >>>zeros from the unrelocated ksymtab section and is subsequently unable to >>>determine what exported symbols it refers to. Since a vmlinux binary is >>>already fully linked and relocated (and therefore we can read its >>>ksymtab section just fine), this problem is only relevant to Linux >>>kernel modules. >>> >>>To work around this, we utilize the ksymtab relocation sections to >>>determine which symbols the ksymtab entries refer to. We do this by >>>inspecting each relocation's r_info field for the symbol table index and >>>from there we are able to read each symbol's value and subsequently add >>>that to the set of exported symbols. >>> >>>In addition, for Linux kernel modules, we can utilize relocation types >>>to implement a new heuristic to determine the ksymtab format we have. >>>The presence of PC-relative relocations suggest the new v4.19 format, >>>and absolute relocation types suggest the old pre v4.19 format. >>> >>> * include/abg-ir.h (elf_symbol::{elf_symbol, create}): Take new >>> symbol value and shndx parameters. >>> (elf_symbol::{get_value, get_shndx}): Declare new accessors. >>> * src/abg-ir.cc (elf_symbol::priv::{value_, shndx_}): New data members. >>> (elf_symbol::priv::priv): Adjust. >>> (elf_symbol::elf_symbol): Take new value and shndx parameters. >>> (elf_symbol::create): Likewise. >>> (elf_symbol::{get_value, get_shndx}): Define new accessors. >>> * src/abg-reader.cc (build_elf_symbol): Adjust. >>> * src/abg-dwarf-reader.cc (lookup_symbol_from_sysv_hash_tab) >>> (lookup_symbol_from_gnu_hash_tab) >>> (lookup_symbol_from_symtab): Adjust. >>> (read_context::{ksymtab_reloc_section_, >>> ksymtab_gpl_reloc_section_, >>> ksymtab_strings_section_}): New data members. >>> (read_context::read_context): Initialize ksymtab_reloc_section_, >>> ksymtab_gpl_reloc_section_, ksymtab_strings_section_. >>> (read_context::{find_ksymtab_reloc_section, >>> find_ksymtab_gpl_reloc_section, find_ksymtab_strings_section, >>> find_any_ksymtab_reloc_section, get_ksymtab_format_module, >>> populate_symbol_map_from_ksymtab, >>> populate_symbol_map_from_ksymtab_reloc, >>> is_linux_kernel_module}): New member functions. >>> (read_context::load_kernel_symbol_table): Adjust to call either >>> populate_symbol_map_from_ksymtab{_reloc,} depending on ksymtab >>> format. >>> (read_context::get_ksymtab_format): Adjust to call >>> get_ksymtab_format_module for linux kernel modules. >>> (read_context::lookup_elf_symbol_from_index): Adjust. >>> (create_default_var_sym, create_default_fn_sym): Adjust. >>> >>>Signed-off-by: Jessica Yu >>>--- >>>include/abg-ir.h | 10 ++ >>>src/abg-dwarf-reader.cc | 391 +++++++++++++++++++++++++++++++++++++++++------- >>>src/abg-ir.cc | 38 ++++- >>>src/abg-reader.cc | 2 + >>>4 files changed, 388 insertions(+), 53 deletions(-) >>> >>>diff --git a/include/abg-ir.h b/include/abg-ir.h >>>index 170a844..d26f472 100644 >>>--- a/include/abg-ir.h >>>+++ b/include/abg-ir.h >>>@@ -796,6 +796,8 @@ private: >>> elf_symbol(const environment* e, >>> size_t i, >>> size_t s, >>>+ uint64_t v, >>>+ uint32_t x, >> >>I would like to see some comments here to describe the purpose of the >>struct's members. I know there weren't any before, but maybe that is a >>good chance. >> >>Also, I am not entirely sure why these are all one-character names, but >>it makes it hard to come from x to "section header index". Maybe we >>should give them more meaningful names. >> >>> const string& n, >>> type t, >>> binding b, >>>@@ -818,6 +820,8 @@ public: >>> create(const environment* e, >>> size_t i, >>> size_t s, >>>+ uint64_t v, >>>+ uint32_t x, >>> const string& n, >>> type t, >>> binding b, >>>@@ -838,6 +842,12 @@ public: >>> void >>> set_index(size_t); >>> >>>+ uint64_t >>>+ get_value() const; >>>+ >>>+ uint32_t >>>+ get_shndx() const; >>>+ >>> const string& >>> get_name() const; >>> >>>diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc >>>index 47af7e3..835bf3d 100644 >>>--- a/src/abg-dwarf-reader.cc >>>+++ b/src/abg-dwarf-reader.cc >>>@@ -1724,6 +1724,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env, >>> GElf_Sym symbol; >>> const char* sym_name_str; >>> size_t sym_size; >>>+ uint64_t sym_value; >>>+ uint32_t sym_shndx; >>> elf_symbol::type sym_type; >>> elf_symbol::binding sym_binding; >>> elf_symbol::visibility sym_visibility; >>>@@ -1743,6 +1745,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env, >>> sym_visibility = >>> stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(symbol.st_other)); >>> sym_size = symbol.st_size; >>>+ sym_value = symbol.st_value; >>>+ sym_shndx = symbol.st_shndx; >>> elf_symbol::version ver; >>> if (get_version_for_symbol(elf_handle, symbol_index, >>> /*get_def_version=*/true, ver)) >>>@@ -1751,6 +1755,8 @@ lookup_symbol_from_sysv_hash_tab(const environment* env, >>> elf_symbol::create(env, >>> symbol_index, >>> sym_size, >>>+ sym_value, >>>+ sym_shndx, >>> sym_name_str, >>> sym_type, >>> sym_binding, >>>@@ -2028,7 +2034,8 @@ lookup_symbol_from_gnu_hash_tab(const environment* env, >>> ABG_ASSERT(!ver.str().empty()); >>> >>> elf_symbol_sptr symbol_found = >>>- elf_symbol::create(env, i, symbol.st_size, sym_name_str, >>>+ elf_symbol::create(env, i, symbol.st_size, symbol.st_value, >>>+ symbol.st_shndx, sym_name_str, >>> sym_type, sym_binding, >>> symbol.st_shndx != SHN_UNDEF, >>> symbol.st_shndx == SHN_COMMON, >>>@@ -2181,6 +2188,7 @@ lookup_symbol_from_symtab(const environment* env, >>> ABG_ASSERT(!ver.str().empty()); >>> elf_symbol_sptr symbol_found = >>> elf_symbol::create(env, i, sym->st_size, >>>+ sym->st_value, sym->st_shndx, >>> name_str, sym_type, >>> sym_binding, sym_is_defined, >>> sym_is_common, ver, sym_visibility); >>>@@ -3081,7 +3089,10 @@ public: >>> /// to symbols exported using the EXPORT_SYMBOL_GPL macro from the >>> /// linux kernel. >>> Elf_Scn* ksymtab_section_; >>>+ Elf_Scn* ksymtab_reloc_section_; >>> Elf_Scn* ksymtab_gpl_section_; >>>+ Elf_Scn* ksymtab_gpl_reloc_section_; >>>+ Elf_Scn* ksymtab_strings_section_; >>> Elf_Scn* versym_section_; >>> Elf_Scn* verdef_section_; >>> Elf_Scn* verneed_section_; >>>@@ -3277,7 +3288,10 @@ public: >>> nb_ksymtab_entries_ = 0; >>> nb_ksymtab_gpl_entries_ = 0; >>> ksymtab_section_ = 0; >>>+ ksymtab_reloc_section_ = 0; >>> ksymtab_gpl_section_ = 0; >>>+ ksymtab_gpl_reloc_section_ = 0; >>>+ ksymtab_strings_section_ = 0; >>> versym_section_ = 0; >>> verdef_section_ = 0; >>> verneed_section_ = 0; >>>@@ -6079,6 +6093,23 @@ public: >>> return ksymtab_section_; >>> } >>> >>>+ /// Return the .rel{a,}__ksymtab section of a linux kernel ELF file (either >>>+ /// a vmlinux binary or a kernel module). >>>+ /// >>>+ /// @return the .rel{a,}__ksymtab section if found, nil otherwise. >>>+ Elf_Scn* >>>+ find_ksymtab_reloc_section() const >>>+ { >>>+ if (!ksymtab_reloc_section_) >>>+ { >>>+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab", SHT_RELA); >>>+ if (!sec) >>>+ sec = find_section(elf_handle(), ".rel__ksymtab", SHT_REL); >>>+ const_cast(this)->ksymtab_reloc_section_ = sec; >>>+ } >>>+ return ksymtab_reloc_section_; >>>+ } >>>+ >>> /// Return the __ksymtab_gpl section of a linux kernel ELF file >>> /// (either a vmlinux binary or a kernel module). >>> /// >>>@@ -6092,6 +6123,36 @@ public: >>> return ksymtab_gpl_section_; >>> } >>> >>>+ /// Return the .rel{a,}__ksymtab_gpl section of a linux kernel ELF file >>>+ /// (either a vmlinux binary or a kernel module). >>>+ /// >>>+ /// @return the .rel{a,}__ksymtab_gpl section if found, nil otherwise. >>>+ Elf_Scn* >>>+ find_ksymtab_gpl_reloc_section() const >>>+ { >>>+ if (!ksymtab_gpl_reloc_section_) >>>+ { >>>+ Elf_Scn *sec = find_section(elf_handle(), ".rela__ksymtab_gpl", SHT_RELA); >>>+ if (!sec) >>>+ sec = find_section(elf_handle(), ".rel__ksymtab_gpl", SHT_REL); >>>+ const_cast(this)->ksymtab_gpl_reloc_section_ = sec; >>>+ } >>>+ return ksymtab_gpl_reloc_section_; >>>+ } >>>+ >>>+ /// Return the __ksymtab_strings section of a linux kernel ELF file >>>+ /// (either a vmlinux binary or a kernel module). >>>+ /// >>>+ /// @return the __ksymtab_strings section if found, nil otherwise. >>>+ Elf_Scn* >>>+ find_ksymtab_strings_section() const >>>+ { >>>+ if (!ksymtab_strings_section_) >>>+ const_cast(this)->ksymtab_strings_section_ = >>>+ find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS); >>>+ return ksymtab_strings_section_; >>>+ } >>>+ >> >>We should find a way to deduplicate these functions. Not necessarily in >>this patch, but in a followup. Most of them only differ in the section >>name and whether they check for an alternative. Something generic like >> >>struct SectionQuery{ string name, int type }; >>Elf_Scn* find_section(const vector&); >> >>should be sufficient to forward from all the other variants to one >>implementation. >> >>> /// Return either a __ksymtab or a __ksymtab_gpl section, in case >>> /// only the __ksymtab_gpl exists. >>> /// >>>@@ -6106,6 +6167,19 @@ public: >>> return result; >>> } >>> >>>+ /// Return either a .rel{a,}__ksymtab or a .rel{a,}__ksymtab_gpl section >>>+ /// >>>+ /// @return the .rel{a,}__ksymtab section if it exists, or the >>>+ /// .rel{a,}__ksymtab_gpl; or NULL if neither is found. >>>+ Elf_Scn* >>>+ find_any_ksymtab_reloc_section() const >>>+ { >>>+ Elf_Scn *result = find_ksymtab_reloc_section(); >>>+ if (!result) >>>+ result = find_ksymtab_gpl_reloc_section(); >>>+ return result; >>>+ } >>>+ >>> /// Return the SHT_GNU_versym, SHT_GNU_verdef and SHT_GNU_verneed >>> /// sections that are involved in symbol versionning. >>> /// >>>@@ -6283,7 +6357,8 @@ public: >>> stv_to_elf_symbol_visibility(GELF_ST_VISIBILITY(s->st_other)); >>> >>> elf_symbol_sptr sym = >>>- elf_symbol::create(env(), symbol_index, s->st_size, name_str, >>>+ elf_symbol::create(env(), symbol_index, s->st_size, >>>+ s->st_value, s->st_shndx, name_str, >>> stt_to_elf_symbol_type(GELF_ST_TYPE(s->st_info)), >>> stb_to_elf_symbol_binding(GELF_ST_BIND(s->st_info)), >>> sym_is_defined, sym_is_common, ver, vis); >>>@@ -7435,6 +7510,64 @@ public: >>> return symbol; >>> } >>> >>>+ /// Determine the format of the __ksymtab and __ksymtab_gpl >>>+ /// sections of Linux kernel modules. >>>+ /// >>>+ /// This is important because we need the know the format of these >> >>s/need the/need to/ ^^^ >> >>>+ /// sections to be able to read from them. >>>+ /// >>>+ /// @return the format the __ksymtab[_gpl] sections. >>>+ enum ksymtab_format >>>+ get_ksymtab_format_module() const >>>+ { >>>+ Elf_Scn *section = find_any_ksymtab_reloc_section(); >>>+ >>>+ if (!section) >>>+ return UNDEFINED_KSYMTAB_FORMAT; >>>+ >>>+ // Libdwfl has a weird quirk where, in the process of obtaining an Elf >>>+ // descriptor via dwfl_module_getelf(), it will apply all relocations it >>>+ // knows how to and it will zero a relocation after applying it. If the >> >>will zero the relocation info after >> >>>+ // .rela__ksymtab* section contained only simple (absolute) relocations, >>>+ // they will have been all applied and sh_size will be 0. For arches that >>>+ // support relative ksymtabs, simple relocations only appear in pre-4.19 >>>+ // kernel modules. >>>+ GElf_Shdr section_mem; >>>+ GElf_Shdr *section_shdr = gelf_getshdr(section, §ion_mem); >>>+ if (section_shdr->sh_size == 0) >>>+ return PRE_V4_19_KSYMTAB_FORMAT; >>>+ /* SHT_REL sections not supported yet */ >>>+ if (section_shdr->sh_type == SHT_REL) >>>+ return UNDEFINED_KSYMTAB_FORMAT; >>>+ >>>+ // If we still have a normal non-zeroed relocation section, we can guess >>>+ // what format the ksymtab is in depending on what types of relocs it >>>+ // contains. >>>+ GElf_Rela rela; >>>+ Elf_Data *rela_section_data = elf_rawdata(section, 0); >>>+ gelf_getrela(rela_section_data, 0, &rela); >>>+ >>>+ // Sigh, I dislike the arch-dependent code here, but this seems to be a >>>+ // reliable heuristic for kernel modules for now. Relative ksymtabs only >>>+ // supported on x86 and arm64 as of v4.19. >>>+ ksymtab_format format; >>>+ switch (GELF_R_TYPE(rela.r_info)) >>>+ { >>>+ case R_X86_64_64: // Same as R_386_32 >> >>add 'fallthrough' to the comments, else younger compilers will warn >>about the potentially unintentional fallthrough. >> >>>+ case R_AARCH64_ABS64: >>>+ format = PRE_V4_19_KSYMTAB_FORMAT; >>>+ break; >>>+ case R_X86_64_PC32: // Same as R_386_PC32 >>>+ case R_AARCH64_PREL32: >>>+ format = V4_19_KSYMTAB_FORMAT; >>>+ break; >>>+ default: >>>+ format = UNDEFINED_KSYMTAB_FORMAT; >> >>We should probably emit a warning in this case. Are there more >>architectures that have PREL32 enabled as of now that are not yet >>covered here? >> >>>+ break; >>>+ } >>>+ return format; >>>+ } >>>+ >>> /// Determine the format of the __ksymtab and __ksymtab_gpl >>> /// sections. >>> /// >>>@@ -7451,6 +7584,18 @@ public: >>> { >>> if (ksymtab_format_ == UNDEFINED_KSYMTAB_FORMAT) >>> { >>>+ // Since Linux kernel modules are relocatable, we can first try >>>+ // using a heuristic based on relocations to guess the ksymtab format. >>>+ if (is_linux_kernel_module()) >>>+ { >>>+ ksymtab_format_ = get_ksymtab_format_module(); >>>+ if (ksymtab_format_ != UNDEFINED_KSYMTAB_FORMAT) >>>+ return ksymtab_format_; >>>+ } >> >>nit: use tabs >> >>>+ >>>+ // If it's not a kernel module or we couldn't determine its format >>>+ // with relocations, fall back to the heuristic below. >>>+ >>> // OK this is a dirty little heuristic to determine the >>> // format of the ksymtab section. >>> // >>>@@ -7553,48 +7698,24 @@ public: >>> return nb_ksymtab_gpl_entries_; >>> } >>> >>>- /// Load a given kernel symbol table. >>>+ /// Populate the symbol map by reading exported symbols from the >>>+ /// ksymtab directly. >>> /// >>>- /// One can thus retrieve the resulting symbols by using the >>>- /// accessors read_context::linux_exported_fn_syms(), >>>- /// read_context::linux_exported_var_syms(), >>>- /// read_context::linux_exported_gpl_fn_syms(), or >>>- /// read_context::linux_exported_gpl_var_syms(). >>>+ /// @param section the ksymtab section to read from >>> /// >>>- /// @param kind the kind of kernel symbol table to load. >>>+ /// @param exported_fns_set the set of exported functions >>>+ /// >>>+ /// @param exported_vars_set the set of exported variables >>>+ /// >>>+ /// @param nb_entries the number of ksymtab entries to read >>> /// >>> /// @return true upon successful completion, false otherwise. >>> bool >>>- load_kernel_symbol_table(kernel_symbol_table_kind kind) >>>+ populate_symbol_map_from_ksymtab(Elf_Scn *section, >>>+ address_set_sptr exported_fns_set, >>>+ address_set_sptr exported_vars_set, >>>+ size_t nb_entries) >>> { >>>- size_t nb_entries = 0; >>>- Elf_Scn *section = 0; >>>- address_set_sptr linux_exported_fns_set, linux_exported_vars_set; >>>- >>>- switch (kind) >>>- { >>>- case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED: >>>- break; >>>- case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB: >>>- section = find_ksymtab_section(); >>>- nb_entries = get_nb_ksymtab_entries(); >>>- linux_exported_fns_set = create_or_get_linux_exported_fn_syms(); >>>- linux_exported_vars_set = create_or_get_linux_exported_var_syms(); >>>- break; >>>- case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL: >>>- section = find_ksymtab_gpl_section(); >>>- nb_entries = get_nb_ksymtab_gpl_entries(); >>>- linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms(); >>>- linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms(); >>>- break; >>>- } >>>- >>>- if (!linux_exported_vars_set >>>- || !linux_exported_fns_set >>>- || !section >>>- || !nb_entries) >>>- return false; >>>- >>> // The data of the section. >>> Elf_Data *elf_data = elf_rawdata(section, 0); >>> >>>@@ -7654,28 +7775,30 @@ public: >>> // (aka ELF symbol table). >>> symbol = lookup_elf_symbol_from_address(adjusted_symbol_address); >>> if (!symbol) >>>- { >>>- adjusted_symbol_address = >>>- maybe_adjust_var_sym_address(symbol_address); >>>- symbol = lookup_elf_symbol_from_address(adjusted_symbol_address); >>>- if (!symbol) >>>- // This must be a symbol that is of type neither FUNC >>>- // (function) nor OBJECT (variable). There are for intance, >>>- // symbols of type 'NOTYPE' in the ksymtab symbol table. I >>>- // am not sure what those are. >>>- continue; >>>- } >>>+ { >>>+ adjusted_symbol_address = >>>+ maybe_adjust_var_sym_address(symbol_address); >>>+ symbol = lookup_elf_symbol_from_address(adjusted_symbol_address); >>>+ if (!symbol) >>>+ { >>>+ // This must be a symbol that is of type neither FUNC >>>+ // (function) nor OBJECT (variable). There are for intance, >>>+ // symbols of type 'NOTYPE' in the ksymtab symbol table. I >>>+ // am not sure what those are. >>>+ continue; >>>+ } >>>+ } >> >>This part of the diff seems to a change purely of the indentation/tabs. >>Maybe you can revert this part. >> >>> >>> address_set_sptr set; >>> if (symbol->is_function()) >>> { >>> ABG_ASSERT(lookup_elf_fn_symbol_from_address(adjusted_symbol_address)); >>>- set = linux_exported_fns_set; >>>+ set = exported_fns_set; >>> } >>> else if (symbol->is_variable()) >>> { >>> ABG_ASSERT(lookup_elf_var_symbol_from_address(adjusted_symbol_address)); >>>- set = linux_exported_vars_set; >>>+ set = exported_vars_set; >>> } >>> else >>> ABG_ASSERT_NOT_REACHED; >>>@@ -7684,6 +7807,157 @@ public: >>> return true; >>> } >>> >>>+ /// Populate the symbol map by extracting the exported symbols from a >>>+ /// ksymtab rela section. >>>+ /// >>>+ /// @param section the ksymtab section to read from >>>+ /// >>>+ /// @param strings_section the __ksymtab_strings section >>>+ /// >>>+ /// @param exported_fns_set the set of exported functions >>>+ /// >>>+ /// @param exported_vars_set the set of exported variables >>>+ /// >>>+ /// @return true upon successful completion, false otherwise. >>>+ bool >>>+ populate_symbol_map_from_ksymtab_reloc(Elf_Scn *reloc_section, Elf_Scn *strings_section, >>>+ address_set_sptr exported_fns_set, >>>+ address_set_sptr exported_vars_set) >>>+ { >>>+ GElf_Shdr reloc_section_mem; >>>+ GElf_Shdr *reloc_section_shdr = gelf_getshdr(reloc_section, &reloc_section_mem); >> >>gelf_getshdr returns &reloc_section_mem. So you might want to use >>reloc_section_mem directly rather than introducing reloc_section_mem. >> >>>+ size_t reloc_count = reloc_section_shdr->sh_size / reloc_section_shdr->sh_entsize; >>>+ >>>+ Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0); >>>+ size_t strings_ndx = elf_ndxscn(strings_section); >>>+ >>>+ GElf_Rela rela; >>>+ elf_symbol_sptr symbol; >>>+ for (unsigned int i = 0; i < reloc_count; i++) >>>+ { >>>+ gelf_getrela(reloc_section_data, i, &rela); >>>+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info)); >>>+ >>>+ // This shouldn't happen.. >>>+ if (!symbol) >> >>Maybe ABG_ASSERT then? >> >>>+ return false; >>>+ >>>+ // If the symbol is from __ksymtab_strings, ignore it. >>>+ if (symbol->get_shndx() == strings_ndx) >>>+ continue; >>>+ >>>+ if (!symbol->is_function() && !symbol->is_variable()) >>>+ { >>>+ if (do_log()) >>>+ { >>>+ if (symbol->get_type() == elf_symbol::NOTYPE_TYPE) >>>+ cerr << "skipping NOTYPE symbol " >>>+ << symbol->get_name() >>>+ << " shndx: " >>>+ << symbol->get_index() >>>+ << " @" >>>+ << elf_path() >>>+ << "\n"; >>>+ else if (symbol->get_type() == elf_symbol::SECTION_TYPE) >>>+ cerr << "skipping SECTION symbol " >>>+ << "shndx: " >>>+ << symbol->get_index() >>>+ << " @" >>>+ << elf_path() >>>+ << "\n"; >>>+ } >>>+ continue; >>>+ } >>>+ >>>+ address_set_sptr set; >>>+ if (symbol->is_function()) >>>+ { >>>+ ABG_ASSERT(lookup_elf_fn_symbol_from_address(symbol->get_value())); >>>+ set = exported_fns_set; >>>+ } >>>+ else if (symbol->is_variable()) >>>+ { >>>+ ABG_ASSERT(lookup_elf_var_symbol_from_address(symbol->get_value())); >>>+ set = exported_vars_set; >>>+ } >>>+ else >>>+ ABG_ASSERT_NOT_REACHED; >>>+ set->insert(symbol->get_value()); >>>+ } >>>+ return true; >>>+ } >>>+ >>>+ /// Load a given kernel symbol table. >>>+ /// >>>+ /// One can thus retrieve the resulting symbols by using the >>>+ /// accessors read_context::linux_exported_fn_syms(), >>>+ /// read_context::linux_exported_var_syms(), >>>+ /// read_context::linux_exported_gpl_fn_syms(), or >>>+ /// read_context::linux_exported_gpl_var_syms(). >>>+ /// >>>+ /// @param kind the kind of kernel symbol table to load. >>>+ /// >>>+ /// @return true upon successful completion, false otherwise. >>>+ bool >>>+ load_kernel_symbol_table(kernel_symbol_table_kind kind) >>>+ { >>>+ size_t nb_entries = 0; >>>+ Elf_Scn *section = 0, *reloc_section = 0, *strings_section = 0; >>>+ address_set_sptr linux_exported_fns_set, linux_exported_vars_set; >>>+ >>>+ switch (kind) >>>+ { >>>+ case KERNEL_SYMBOL_TABLE_KIND_UNDEFINED: >>>+ break; >>>+ case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB: >>>+ section = find_ksymtab_section(); >>>+ reloc_section = find_ksymtab_reloc_section(); >>>+ nb_entries = get_nb_ksymtab_entries(); >>>+ linux_exported_fns_set = create_or_get_linux_exported_fn_syms(); >>>+ linux_exported_vars_set = create_or_get_linux_exported_var_syms(); >>>+ break; >>>+ case KERNEL_SYMBOL_TABLE_KIND_KSYMTAB_GPL: >>>+ section = find_ksymtab_gpl_section(); >>>+ reloc_section = find_ksymtab_gpl_reloc_section(); >>>+ nb_entries = get_nb_ksymtab_gpl_entries(); >>>+ linux_exported_fns_set = create_or_get_linux_exported_gpl_fn_syms(); >>>+ linux_exported_vars_set = create_or_get_linux_exported_gpl_var_syms(); >>>+ break; >>>+ } >>>+ >>>+ strings_section = find_ksymtab_strings_section(); >>>+ >>>+ if (!linux_exported_vars_set >>>+ || !linux_exported_fns_set >>>+ || !section >>>+ || !strings_section >>>+ || !nb_entries) >>>+ return false; >>>+ >>>+ GElf_Ehdr eh_mem; >>>+ GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem); >>>+ ksymtab_format format = get_ksymtab_format(); >>>+ >>>+ // Although pre-v4.19 kernel modules can have a relocation section for the >>>+ // __ksymtab section, libdwfl zeroes the rela section after applying >>>+ // "simple" absolute relocations via dwfl_module_getelf(). For v4.19 and >>>+ // above, we get PC-relative relocations so dwfl_module_getelf() doesn't >>>+ // apply those relocations and we're safe to read the relocation section to >>>+ // determine which exported symbols are in the ksymtab. >>>+ // >>>+ // Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is >>>+ // supported as of now. >>>+ if (!reloc_section >>>+ || format == PRE_V4_19_KSYMTAB_FORMAT >>>+ || elf_header->e_ident[EI_CLASS] == ELFCLASS32) >>>+ return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set, >>>+ linux_exported_vars_set, nb_entries); >>>+ else >>>+ return populate_symbol_map_from_ksymtab_reloc(reloc_section, strings_section, >>>+ linux_exported_fns_set, >>>+ linux_exported_vars_set); >>>+ } >>>+ >>> /// Load the special __ksymtab section. This is for linux kernel >>> /// (module) files. >>> /// >>>@@ -8352,6 +8626,17 @@ public: >>> is_linux_kernel_binary() const >>> {return find_section(elf_handle(), "__ksymtab_strings", SHT_PROGBITS);} >>> >>>+ /// Guess if the current binary is a Linux Kernel module. >>>+ /// >>>+ /// To guess that, the function looks for the presence of the >>>+ /// special ".modinfo" section in the binary. >>>+ /// >>>+ bool >>>+ is_linux_kernel_module() const >>>+ { >>>+ return find_section(elf_handle(), ".modinfo", SHT_PROGBITS); >>>+ } >>>+ >> >>Also refer to this proposal to restrict checking for a module a bit >>more: https://sourceware.org/ml/libabigail/2019-q3/msg00022.html >> >>> /// Getter of the "show_stats" flag. >>> /// >>> /// This flag tells if we should emit statistics about various >>>@@ -15734,6 +16019,8 @@ create_default_var_sym(const string& sym_name, const environment *env) >>> elf_symbol::create(env, >>> /*symbol index=*/ 0, >>> /*symbol size=*/ 0, >>>+ /*symbol value=*/ 0, >>>+ /*symbol shndx=*/ 0, >>> sym_name, >>> /*symbol type=*/ elf_symbol::OBJECT_TYPE, >>> /*symbol binding=*/ elf_symbol::GLOBAL_BINDING, >>>@@ -16173,6 +16460,8 @@ create_default_fn_sym(const string& sym_name, const environment *env) >>> elf_symbol::create(env, >>> /*symbol index=*/ 0, >>> /*symbol size=*/ 0, >>>+ /*symbol value=*/ 0, >>>+ /*symbol shndx=*/ 0, >>> sym_name, >>> /*symbol type=*/ elf_symbol::FUNC_TYPE, >>> /*symbol binding=*/ elf_symbol::GLOBAL_BINDING, >>>diff --git a/src/abg-ir.cc b/src/abg-ir.cc >>>index a5243c3..ad1c261 100644 >>>--- a/src/abg-ir.cc >>>+++ b/src/abg-ir.cc >>>@@ -1169,6 +1169,8 @@ struct elf_symbol::priv >>> const environment* env_; >>> size_t index_; >>> size_t size_; >>>+ uint64_t value_; >>>+ uint32_t shndx_; >>> string name_; >>> elf_symbol::type type_; >>> elf_symbol::binding binding_; >>>@@ -1213,6 +1215,8 @@ struct elf_symbol::priv >>> : env_(), >>> index_(), >>> size_(), >>>+ value_(), >>>+ shndx_(), >>> type_(elf_symbol::NOTYPE_TYPE), >>> binding_(elf_symbol::GLOBAL_BINDING), >>> visibility_(elf_symbol::DEFAULT_VISIBILITY), >>>@@ -1223,6 +1227,8 @@ struct elf_symbol::priv >>> priv(const environment* e, >>> size_t i, >>> size_t s, >>>+ uint64_t v, >>>+ uint32_t x, >>> const string& n, >>> elf_symbol::type t, >>> elf_symbol::binding b, >>>@@ -1233,6 +1239,8 @@ struct elf_symbol::priv >>> : env_(e), >>> index_(i), >>> size_(s), >>>+ value_(v), >>>+ shndx_(x), >>> name_(n), >>> type_(t), >>> binding_(b), >>>@@ -1269,6 +1277,10 @@ elf_symbol::elf_symbol() >>>/// >>>/// @param s the size of the symbol. >>>/// >>>+/// @param v the value of the symbol. >>>+/// >>>+/// @param x the section header index (shndx) of the symbol. >>>+/// >>>/// @param n the name of the symbol. >>>/// >>>/// @param t the type of the symbol. >>>@@ -1285,6 +1297,8 @@ elf_symbol::elf_symbol() >>>elf_symbol::elf_symbol(const environment* e, >>> size_t i, >>> size_t s, >>>+ uint64_t v, >>>+ uint32_t x, >>> const string& n, >>> type t, >>> binding b, >>>@@ -1292,7 +1306,7 @@ elf_symbol::elf_symbol(const environment* e, >>> bool c, >>> const version& ve, >>> visibility vi) >>>- : priv_(new priv(e, i, s, n, t, b, d, c, ve, vi)) >>>+ : priv_(new priv(e, i, s, v, x, n, t, b, d, c, ve, vi)) >>>{} >>> >>>/// Factory of instances of @ref elf_symbol. >>>@@ -1319,6 +1333,10 @@ elf_symbol::create() >>>/// >>>/// @param s the size of the symbol. >>>/// >>>+/// @param v the value of the symbol. >>>+/// >>>+/// @param x the section header index (shndx) of the symbol. >>>+/// >>>/// @param n the name of the symbol. >>>/// >>>/// @param t the type of the symbol. >>>@@ -1339,6 +1357,8 @@ elf_symbol_sptr >>>elf_symbol::create(const environment* e, >>> size_t i, >>> size_t s, >>>+ uint64_t v, >>>+ uint32_t x, >>> const string& n, >>> type t, >>> binding b, >>>@@ -1347,7 +1367,7 @@ elf_symbol::create(const environment* e, >>> const version& ve, >>> visibility vi) >>>{ >>>- elf_symbol_sptr sym(new elf_symbol(e, i, s, n, t, b, d, c, ve, vi)); >>>+ elf_symbol_sptr sym(new elf_symbol(e, i, s, v, x, n, t, b, d, c, ve, vi)); >>> sym->priv_->main_symbol_ = sym; >>> return sym; >>>} >>>@@ -1411,6 +1431,20 @@ void >>>elf_symbol::set_index(size_t s) >>>{priv_->index_ = s;} >>> >>>+/// Getter for the symbol value. >>>+/// >>>+/// @return the value of the symbol. >>>+uint64_t >>>+elf_symbol::get_value() const >>>+{return priv_->value_;} >>>+ >>>+/// Getter for the symbol section header index. >>>+/// >>>+/// @return the section header index of the symbol. >>>+uint32_t >>>+elf_symbol::get_shndx() const >>>+{return priv_->shndx_;} >>>+ >>>/// Getter for the name of the @ref elf_symbol. >>>/// >>>/// @return a reference to the name of the @ref symbol. >>>diff --git a/src/abg-reader.cc b/src/abg-reader.cc >>>index 810a65f..3c481ad 100644 >>>--- a/src/abg-reader.cc >>>+++ b/src/abg-reader.cc >>>@@ -2659,6 +2659,8 @@ build_elf_symbol(read_context& ctxt, const xmlNodePtr node) >>> >>> const environment* env = ctxt.get_environment(); >>> elf_symbol_sptr e = elf_symbol::create(env, /*index=*/0, size, >>>+ /*value=*/0, >>>+ /*shndx=*/0, >>> name, type, binding, >>> is_defined, is_common, >>> version, visibility); >>>-- >>>2.16.4 >>> >> >>Nice work! Thank you for that! >>The patch looks in general good to me. I am a bit concerned about the >>platform support. There are some corner cases that need to be covered. >>See the testing below. >> >>You might find the .clang-format file useful to match the coding style >>at some points (especially tabs vs spaces). Although when reviewing the >>patch, I had a bit trouble with unchanged but reformatted lines as they >>were confusing my diff. Maybe you can revert the changes that were >>purely for formatting and we take care of formatting in a later patch. >> >>I conducted some testing. I choose a minimal configuration with modules: >> >>$ export ARCH= >>$ make allnoconfig >>$ ./scripts/config \ >> -e DEBUG_INFO \ # dwarf info >> -e MODULES \ # module support >> -e USB_SUPPORT \ >> -m USB \ >> -m USB_DUMMY_HCD \ # dummy_hcd.ko is a module without exports >> -m USB_GADGET # depends on usb-common.ko, which I test with >> >>$ make olddefconfig >>$ make >>$ abidw ./drivers/usb/common/usb-common.ko | less >># then check for the existence of the elf-function-symbols section >> >> master patched >>Linux v4.18 >> >>x86_64 works works >>x86 works works >>arm64 works works >> >>Linux v4.19 >> >>x86_64 broken works >>x86 broken STILL BROKEN >>arm64 broken works >> >>So, for x86_64 and arm64 the situation got better and elf symbols are >>now identified. For x86, this patch did not improve the situation. I did >>not check that the list of symbols are entirely correct and that they >>work downstream with abidiff etc. Just the pure existence is a very good >>indicator for this patch to work. > >Hi Matthias! Thanks a bunch for your review! :-) > >I'm aware of the x86 issue and I think the diff below should fix it. >Could you please let me know if it passes your x86 testing? (note, I have >yet to address your other comments, so this is to be continued!) Hi Jessica! I confirm this patch is fixing the x86 testing as lined out above. I also retested the other platforms successfully. Thanks! Great work! Cheers, Matthias > >diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc >index 835bf3d..ed8a8db 100644 >--- a/src/abg-dwarf-reader.cc >+++ b/src/abg-dwarf-reader.cc >@@ -7536,22 +7536,33 @@ public: > GElf_Shdr *section_shdr = gelf_getshdr(section, §ion_mem); > if (section_shdr->sh_size == 0) > return PRE_V4_19_KSYMTAB_FORMAT; >- /* SHT_REL sections not supported yet */ >- if (section_shdr->sh_type == SHT_REL) >- return UNDEFINED_KSYMTAB_FORMAT; >+ >+ bool is_relasec = (section_shdr->sh_type == SHT_RELA); > > // If we still have a normal non-zeroed relocation section, we can guess > // what format the ksymtab is in depending on what types of relocs it > // contains. >- GElf_Rela rela; >- Elf_Data *rela_section_data = elf_rawdata(section, 0); >- gelf_getrela(rela_section_data, 0, &rela); >+ >+ int type; >+ Elf_Data *section_data = elf_rawdata(section, 0); >+ if (is_relasec) >+ { >+ GElf_Rela rela; >+ gelf_getrela(section_data, 0, &rela); >+ type = GELF_R_TYPE(rela.r_info); >+ } >+ else >+ { >+ GElf_Rel rel; >+ gelf_getrel(section_data, 0, &rel); >+ type = GELF_R_TYPE(rel.r_info); >+ } Maybe we can deduplicate this a bit. Also further down. Missing out that additional 'a' might be a bug source at a later time. > // Sigh, I dislike the arch-dependent code here, but this seems to be a > // reliable heuristic for kernel modules for now. Relative ksymtabs only > // supported on x86 and arm64 as of v4.19. > ksymtab_format format; >- switch (GELF_R_TYPE(rela.r_info)) >+ switch (type) > { > case R_X86_64_64: // Same as R_386_32 > case R_AARCH64_ABS64: >@@ -7831,12 +7842,22 @@ public: > Elf_Data *reloc_section_data = elf_rawdata(reloc_section, 0); > size_t strings_ndx = elf_ndxscn(strings_section); > >- GElf_Rela rela; >+ bool is_relasec = (reloc_section_shdr->sh_type == SHT_RELA); > elf_symbol_sptr symbol; > for (unsigned int i = 0; i < reloc_count; i++) > { >- gelf_getrela(reloc_section_data, i, &rela); >- symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info)); >+ if (is_relasec) >+ { >+ GElf_Rela rela; >+ gelf_getrela(reloc_section_data, i, &rela); >+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rela.r_info)); >+ } >+ else >+ { >+ GElf_Rel rel; >+ gelf_getrel(reloc_section_data, i, &rel); >+ symbol = lookup_elf_symbol_from_index(GELF_R_SYM(rel.r_info)); >+ } > > // This shouldn't happen.. > if (!symbol) >@@ -7934,8 +7955,6 @@ public: > || !nb_entries) > return false; > >- GElf_Ehdr eh_mem; >- GElf_Ehdr* elf_header = gelf_getehdr(elf_handle(), &eh_mem); > ksymtab_format format = get_ksymtab_format(); > > // Although pre-v4.19 kernel modules can have a relocation section for the >@@ -7947,9 +7966,7 @@ public: > // > // Note: For 32-bit binaries, only populate_symbol_map_from_ksymtab() is > // supported as of now. The comment might need updating. >- if (!reloc_section >- || format == PRE_V4_19_KSYMTAB_FORMAT >- || elf_header->e_ident[EI_CLASS] == ELFCLASS32) >+ if (!reloc_section || format == PRE_V4_19_KSYMTAB_FORMAT) > return populate_symbol_map_from_ksymtab(section, linux_exported_fns_set, > linux_exported_vars_set, nb_entries); > else