From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by sourceware.org (Postfix) with ESMTPS id 95B4F3857C6B for ; Tue, 15 Mar 2022 11:25:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 95B4F3857C6B Received: by mail-wr1-x42e.google.com with SMTP id u10so28469211wra.9 for ; Tue, 15 Mar 2022 04:25:52 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=lzEgCUXgcqLhHBrr4kbr0yoFDssnPTa9lY6+NZlPshg=; b=Av+HJsNT+cPfDDg1ygjallkPqrrC8ozSw/qGjt8KHpZSXUa7owe1Xft+bD0/BIMLFY cclKmalsEfojP5/Amkt77LcqWNfCEadxdOyBLG67jxeFjDhRCWoxflkDvfRb6m6TLFdX IV/K0GbJ87CpiCUGO7YZ/yfJHx50vNeYumUyRZ3vDQnLaZMv1L2tb4kJBqyNtcAIWno1 R4mkel7YDrI4y1JbFjRQ1enik0S/IpzopJiPY1GY+Io23QOkwQXnuaJPIS1vhJI+Hmy4 zWd+RxwQ45KhOPaBPLqK0Ioc/UpTed1vx8grG/0zyji85Fzez5Hovw2aC2J0YtUW9y9g S9mQ== X-Gm-Message-State: AOAM531/AxetNJxrf2+JhXaMrqIg5OD/uGNdey2joRPHbV5Fz6SOYNKe oroeRa1QAurjfbKNrGdjujCOfQ== X-Google-Smtp-Source: ABdhPJyu+CZP9qEwpC4uxyJuPDIBhKqNexAF7wbezxzsWrx8t1MuHsawVjvdlXELXCTo4nxaf7keWA== X-Received: by 2002:adf:c792:0:b0:1f0:769:9ef3 with SMTP id l18-20020adfc792000000b001f007699ef3mr19079496wrg.336.1647343550895; Tue, 15 Mar 2022 04:25:50 -0700 (PDT) Received: from google.com ([2a00:79e0:d:210:bca7:203f:7a8f:510]) by smtp.gmail.com with ESMTPSA id p1-20020a5d59a1000000b00203d83b0ae4sm958576wrr.109.2022.03.15.04.25.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Mar 2022 04:25:50 -0700 (PDT) Date: Tue, 15 Mar 2022 11:25:49 +0000 From: Matthias Maennich To: Giuliano Procida Cc: libabigail@sourceware.org, dodji@seketeli.org, kernel-team@android.com Subject: Re: [PATCH 2/2] add Linux kernel symbol namespace support Message-ID: References: <20220314181312.3436802-1-gprocida@google.com> <20220314181312.3436802-3-gprocida@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <20220314181312.3436802-3-gprocida@google.com> X-Spam-Status: No, score=-29.7 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Mar 2022 11:25:55 -0000 On Mon, Mar 14, 2022 at 06:13:12PM +0000, Giuliano Procida wrote: >Bug 28954 - add Linux Kernel symbol namespace support > >This commit adds Linux Kernel symbol namespace support roughly as >discussed. > >Each symbol can be exported to a specified named namespace or left in >the global (nameless) namespace. The latter is explicitly represented >as the empty string, at least when it comes to the value of >__kstrtabns_FOO symbols, but the common interpretation is that such >symbols are not exported to a namespace. > >I would rather not make this assumption in more than one place in the >code and would also prefer to protect against __kstrtabns_FOO for >globally export FOO disappearing with a future change to the export >macros. > >So the code here represents namespace as optional and treats >empty strings found in __ksymtab_strings as missing. > > * include/abg-ir.h (elf_symbol::elf_symbol): Add ns argument. > (elf_symbol::create): Add ns argument. > (elf_symbol::get_namespace): Declare new function. > (elf_symbol::set_namespace): Declare new function. > and set_namespace. > * src/abg-comp-filter.cc (namespace_changed): Define new > helper functions. > (categorize_harmful_diff_node): Also call namespace_changed(). > * src/abg-ir.cc (elf_symbol::priv): Add namespace_ member. > (elf_symbol::priv::priv): Add namespace_ to initialisers. > (elf_symbol::elf_symbol): Take new ns argument and pass it to > priv constructor. > (elf_symbol::create): Take new ns argument and pass it to > elf_symbol constructor. > (elf_symbol::get_namespace): Define new function. > (elf_symbol::set_namespace): Define new function. > * src/abg-reader.cc (build_elf_symbol): If ns attribute is > present, set symbol namespace. > * src/abg-reporter-priv.cc (maybe_report_diff_for_symbol): If > symbol namespaces differ, report this. > * src/abg-symtab-reader.cc (symtab::load): Try to get > __ksymtab_strings metadata and data. Use this to look up > __kstrtabns_FOO namespace entries. Set symbol namespaces where > found. > * src/abg-writer.cc (write_elf_symbol): Emit ns attribute, if > symbol has a namespace. > >Signed-off-by: Giuliano Procida Thanks for working on that! I think this is conceptually the right thing to do to allow compatibility with different kernel versions and configurations. I just had some minor comments. With those addressed, feel free to add: Reviewed-by: Matthias Maennich >--- > include/abg-ir.h | 9 ++++++++ > src/abg-comp-filter.cc | 39 +++++++++++++++++++++++++++++++- > src/abg-ir.cc | 27 +++++++++++++++++++++- > src/abg-reader.cc | 7 ++++++ > src/abg-reporter-priv.cc | 12 ++++++++++ > src/abg-symtab-reader.cc | 49 ++++++++++++++++++++++++++++++++++++++++ > src/abg-writer.cc | 4 ++++ > 7 files changed, 145 insertions(+), 2 deletions(-) > >diff --git a/include/abg-ir.h b/include/abg-ir.h >index a2f4e1a7..7c42bea7 100644 >--- a/include/abg-ir.h >+++ b/include/abg-ir.h >@@ -21,6 +21,7 @@ > #include > #include > #include >+#include "abg-cxx-compat.h" > #include "abg-fwd.h" > #include "abg-hash.h" > #include "abg-traverse.h" >@@ -921,6 +922,7 @@ private: > visibility vi, > bool is_in_ksymtab = false, > uint64_t crc = 0, >+ const abg_compat::optional& ns = {}, When implementing this in the kernel itself, I also got trapped in naming inconsistencies, ns ... NS ... namespace. I believe I went with consistently calling it 'namespace'. I suggest to not abbreviate with 'ns'. > bool is_suppressed = false); > > elf_symbol(const elf_symbol&); >@@ -946,6 +948,7 @@ public: > visibility vi, > bool is_in_ksymtab = false, > uint64_t crc = 0, >+ const abg_compat::optional& ns = {}, > bool is_suppressed = false); > > const environment* >@@ -1023,6 +1026,12 @@ public: > void > set_crc(uint64_t crc); > >+ const abg_compat::optional& >+ get_namespace() const; >+ >+ void >+ set_namespace(const abg_compat::optional& ns); >+ > bool > is_suppressed() const; > >diff --git a/src/abg-comp-filter.cc b/src/abg-comp-filter.cc >index 56251274..6ee31e35 100644 >--- a/src/abg-comp-filter.cc >+++ b/src/abg-comp-filter.cc >@@ -254,6 +254,42 @@ crc_changed(const diff* diff) > return false; > } > >+/// Test if there was a function or variable namespace change. >+/// >+/// @param f the first function or variable to consider. >+/// >+/// @param s the second function or variable to consider. >+/// >+/// @return true if the test is positive, false otherwise. >+template >+static bool >+namespace_changed(const function_or_var_decl_sptr& f, >+ const function_or_var_decl_sptr& s) >+{ >+ const auto symbol_f = f->get_symbol(), symbol_s = s->get_symbol(); const auto& to avoid some refcounts on the symbol_sptr. >+ if (!symbol_f || !symbol_s) >+ return false; >+ return symbol_f->get_namespace() != symbol_s->get_namespace(); >+} >+ >+/// test if the current diff tree node carries a namespace change in ^ Test >+/// either a function or a variable. >+/// >+/// @param diff the diff tree node to consider. >+/// >+/// @return true if the test is positive, false otherwise. >+static bool >+namespace_changed(const diff* diff) >+{ >+ if (const function_decl_diff* d = >+ dynamic_cast(diff)) >+ return namespace_changed(d->first_function_decl(), >+ d->second_function_decl()); >+ if (const var_diff* d = dynamic_cast(diff)) >+ return namespace_changed(d->first_var(), d->second_var()); >+ return false; >+} >+ > /// Test if there was a function name change, but there there was no > /// change in name of the underlying symbol. IOW, if the name of a > /// function changed, but the symbol of the new function is equal to >@@ -1781,7 +1817,8 @@ categorize_harmful_diff_node(diff *d, bool pre) > || non_static_data_member_added_or_removed(d) > || base_classes_added_or_removed(d) > || has_harmful_enum_change(d) >- || crc_changed(d))) >+ || crc_changed(d) >+ || namespace_changed(d))) > category |= SIZE_OR_OFFSET_CHANGE_CATEGORY; > > if (has_virtual_mem_fn_change(d)) >diff --git a/src/abg-ir.cc b/src/abg-ir.cc >index 0ef5e8b2..c510bc55 100644 >--- a/src/abg-ir.cc >+++ b/src/abg-ir.cc >@@ -1728,6 +1728,7 @@ struct elf_symbol::priv > bool is_common_; > bool is_in_ksymtab_; > uint64_t crc_; >+ abg_compat::optional namespace_; > bool is_suppressed_; > elf_symbol_wptr main_symbol_; > elf_symbol_wptr next_alias_; >@@ -1745,6 +1746,7 @@ struct elf_symbol::priv > is_common_(false), > is_in_ksymtab_(false), > crc_(0), >+ namespace_(), > is_suppressed_(false) > {} > >@@ -1760,6 +1762,7 @@ struct elf_symbol::priv > elf_symbol::visibility vi, > bool is_in_ksymtab, > uint64_t crc, >+ const abg_compat::optional& ns, > bool is_suppressed) > : env_(e), > index_(i), >@@ -1773,6 +1776,7 @@ struct elf_symbol::priv > is_common_(c), > is_in_ksymtab_(is_in_ksymtab), > crc_(crc), >+ namespace_(ns), > is_suppressed_(is_suppressed) > { > if (!is_common_) >@@ -1818,6 +1822,8 @@ elf_symbol::elf_symbol() > /// @param vi the visibility of the symbol. > /// > /// @param crc the CRC (modversions) value of Linux Kernel symbols >+/// >+/// @param ns the namespace of Linux Kernel symbols, if any > elf_symbol::elf_symbol(const environment* e, > size_t i, > size_t s, >@@ -1830,6 +1836,7 @@ elf_symbol::elf_symbol(const environment* e, > visibility vi, > bool is_in_ksymtab, > uint64_t crc, >+ const abg_compat::optional& ns, > bool is_suppressed) > : priv_(new priv(e, > i, >@@ -1843,6 +1850,7 @@ elf_symbol::elf_symbol(const environment* e, > vi, > is_in_ksymtab, > crc, >+ ns, > is_suppressed)) > {} > >@@ -1886,6 +1894,8 @@ elf_symbol::create() > /// > /// @param crc the CRC (modversions) value of Linux Kernel symbols > /// >+/// @param ns the namespace of Linux Kernel symbols, if any >+/// > /// @return a (smart) pointer to a newly created instance of @ref > /// elf_symbol. > elf_symbol_sptr >@@ -1901,10 +1911,11 @@ elf_symbol::create(const environment* e, > visibility vi, > bool is_in_ksymtab, > uint64_t crc, >+ const abg_compat::optional& ns, > bool is_suppressed) > { > elf_symbol_sptr sym(new elf_symbol(e, i, s, n, t, b, d, c, ve, vi, >- is_in_ksymtab, crc, is_suppressed)); >+ is_in_ksymtab, crc, ns, is_suppressed)); > sym->priv_->main_symbol_ = sym; > return sym; > } >@@ -2147,6 +2158,20 @@ void > elf_symbol::set_crc(uint64_t crc) > {priv_->crc_ = crc;} > >+/// Getter of the 'namespace' property. >+/// >+/// @return the namespace for Linux Kernel symbols, if any >+const abg_compat::optional& >+elf_symbol::get_namespace() const >+{return priv_->namespace_;} >+ >+/// Setter of the 'namespace' property. >+/// >+/// @param ns the new namespace for Linux Kernel symbols, if any >+void >+elf_symbol::set_namespace(const abg_compat::optional& ns) While I like the std::optional in the data structure, I am not sure I like it exposed to the interface. Is there ever a case where set_namespace is called with an unset optional? And if so, would that not be a better interface to offer ::unset_namespace ? >+{priv_->namespace_ = ns;} >+ > /// Getter for the 'is-suppressed' property. > /// > /// @return true iff the current symbol has been suppressed by a >diff --git a/src/abg-reader.cc b/src/abg-reader.cc >index 31885692..55b7348d 100644 >--- a/src/abg-reader.cc >+++ b/src/abg-reader.cc >@@ -3269,6 +3269,13 @@ build_elf_symbol(read_context& ctxt, const xmlNodePtr node, > if (crc != 0) > e->set_crc(crc); > >+ if (xml_char_sptr s = XML_NODE_GET_ATTRIBUTE(node, "ns")) In particular, I would call the attribute 'namespace' in the XML unless we want to save bytes here. >+ { >+ std::string ns; >+ xml::xml_char_sptr_to_string(s, ns); >+ e->set_namespace({ns}); >+ } >+ > return e; > } > >diff --git a/src/abg-reporter-priv.cc b/src/abg-reporter-priv.cc >index 7012f5dc..dd40690c 100644 >--- a/src/abg-reporter-priv.cc >+++ b/src/abg-reporter-priv.cc >@@ -1158,6 +1158,18 @@ maybe_report_diff_for_symbol(const elf_symbol_sptr& symbol1, > << std::noshowbase << std::dec > << "\n"; > } >+ >+ const abg_compat::optional& ns1 = symbol1->get_namespace(); >+ const abg_compat::optional& ns2 = symbol2->get_namespace(); >+ if (ns1 != ns2) >+ { >+ const std::string none = "(none)"; >+ out << indent << "namespace changed from " >+ << (ns1.has_value() ? ns1.value() : none) >+ << " to " >+ << (ns2.has_value() ? ns2.value() : none) >+ << "\n"; >+ } > } > > /// For a given symbol, emit a string made of its name and version. >diff --git a/src/abg-symtab-reader.cc b/src/abg-symtab-reader.cc >index b42ce87d..589f3703 100644 >--- a/src/abg-symtab-reader.cc >+++ b/src/abg-symtab-reader.cc >@@ -232,9 +232,33 @@ symtab::load_(Elf* elf_handle, > return false; > } > >+ // The __kstrtab_strings sections is basically an ELF strtab but does not >+ // support elf_strptr lookups. A single call to elf_getdata gives a handle to >+ // section data. >+ // >+ // The value of a __kstrtabns_FOO (or other similar) symbol is an address >+ // within the __kstrtab_strings section. To look up the string value, we need >+ // to translate from load address to section offset by subtracting the base >+ // address of the section. >+ Elf_Scn* strings_section = elf_helpers::find_ksymtab_strings_section(elf_handle); >+ size_t strings_offset = 0; >+ const char* strings_data = nullptr; >+ size_t strings_size = 0; >+ if (strings_section) >+ { >+ GElf_Shdr strings_sheader; >+ gelf_getshdr(strings_section, &strings_sheader); >+ strings_offset = strings_sheader.sh_addr; >+ Elf_Data* data = elf_getdata(strings_section, nullptr); >+ ABG_ASSERT(data->d_off == 0); >+ strings_data = reinterpret_cast(data->d_buf); >+ strings_size = data->d_size; >+ } >+ > const bool is_kernel = elf_helpers::is_linux_kernel(elf_handle); > std::unordered_set exported_kernel_symbols; > std::unordered_map crc_values; >+ std::unordered_map namespaces; > > const bool is_arm32 = elf_helpers::architecture_is_arm32(elf_handle); > const bool is_ppc64 = elf_helpers::architecture_is_ppc64(elf_handle); >@@ -288,6 +312,20 @@ symtab::load_(Elf* elf_handle, > ABG_ASSERT(crc_values.emplace(name.substr(6), sym->st_value).second); > continue; > } >+ if (strings_section && is_kernel && name.rfind("__kstrtabns_", 0) == 0) >+ { >+ // This symbol lives in the __ksymtab_strings section but st_value is >+ // a load address so we need to subtract an offset before looking it >+ // up in that section. >+ const size_t offset = sym->st_value - strings_offset; >+ if (offset < strings_size) >+ { >+ const char* ns = strings_data + offset; >+ if (*ns) >+ ABG_ASSERT(namespaces.emplace(name.substr(12), ns).second); constexpr auto name_offset = sizeof("__kstrtabns_"); ABG_ASSERT(namespaces.emplace(name.substr(name_offset), ns).second); Also, consider limiting the string length to the section end to not accidentally read beyond. Cheers, Matthias >+ } >+ continue; >+ } > > // filter out uninteresting entries and only keep functions/variables for > // now. The rest might be interesting in the future though. >@@ -402,6 +440,17 @@ symtab::load_(Elf* elf_handle, > symbol->set_crc(crc_entry.second); > } > >+ // Now add the namespaces >+ for (const auto& namespace_entry : namespaces) >+ { >+ const auto r = name_symbol_map_.find(namespace_entry.first); >+ if (r == name_symbol_map_.end()) >+ continue; >+ >+ for (const auto& symbol : r->second) >+ symbol->set_namespace({namespace_entry.second}); >+ } >+ > // sort the symbols for deterministic output > std::sort(symbols_.begin(), symbols_.end(), symbol_sort); > >diff --git a/src/abg-writer.cc b/src/abg-writer.cc >index 7802128d..4dfa72a6 100644 >--- a/src/abg-writer.cc >+++ b/src/abg-writer.cc >@@ -3136,6 +3136,10 @@ write_elf_symbol(const elf_symbol_sptr& sym, > << std::hex << std::showbase << sym->get_crc() << "'" > << std::dec << std::noshowbase; > >+ if (sym->get_namespace().has_value()) >+ o << " ns='" >+ << sym->get_namespace().value() << "'"; >+ > o << "/>\n"; > > return true; >-- >2.35.1.723.g4982287a31-goog >