From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 89457 invoked by alias); 2 Dec 2019 17:19:54 -0000
Mailing-List: contact libabigail-help@sourceware.org; run by ezmlm
Date: Tue, 01 Jan 2019 00:00:00 -0000
Message-Id: <20191202171933.261787-1-maennich@google.com>
Mime-Version: 1.0
X-Mailer: git-send-email 2.24.0.393.g34dc348eaf-goog
Subject: [RFC PATCH] dwarf-reader: support ksymtab symbol lookup by name
From: "Matthias Maennich via libabigail"
Reply-To: Matthias Maennich
To: libabigail@sourceware.org
Cc: dodji@seketeli.org, kernel-team@android.com, maennich@google.com
Content-Type: text/plain; charset="UTF-8"
X-SW-Source: 2019-q4/txt/msg00070.txt.bz2

The ksymtab address entry might not actually point to a GLOBAL visible
symbol in .symtab of the same binary. This happens, e.g., when compiling
the kernel with clang's -fsanitize-cfi* and friends. In contrast, the
ksymtab.name entry has to match an entry in .symtab. It would otherwise
upset linkers very much.

So, rather than relying on the address being resolvable in any way, we
might as well look up the symbol name in the .symtab directly, as this is
the symbol we later want to analyze anyway and ksymtab.name has to match
a visible symbol.

In this patch we fall back to a name lookup if anything we tried so far
was not successful.
That makes the patch a bit more reviewable. On the other hand, we should
be able to rely entirely on the name based lookup and drop the address
based one. That would require some refactoring that I did not start at
this point.

I created some local test cases to validate my work. As of now,
tests/test-read-dwarf.cc has no explicit support for the
linux_kernel_mode, hence it is not fully capable of asserting this
implementation.

Note, there is one additional case missing: if we have CFI binaries with
position relative relocations, we also need to touch how
populate_symbol_map_from_ksymtab_reloc works to apply a similar method
there.

Long story short: if we decide to go with this approach, I will refactor
the code, add tests accordingly and fill the gap for the missing cases.
Yes, this patch is a bit hacky at this point, but it is supposed to prove
that name based lookup can work here.

	* src/abg-dwarf-reader.cc
	(populate_symbol_map_from_ksymtab): add support for symtab
	lookup by ksymtab.name entries.
	(try_reading_first_ksymtab_entry): Likewise.

Signed-off-by: Matthias Maennich
---
 src/abg-dwarf-reader.cc | 135 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 135 insertions(+)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 85b8e2461721..d1a220643945 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -7681,6 +7681,66 @@ public:
       symbol_address = maybe_adjust_fn_sym_address(symbol_address);
       symbol = lookup_elf_symbol_from_address(symbol_address);
+
+      // the ksymtab.addr might not be a match in .symtab.addr (e.g. if CFI is
+      // enabled). We can still lookup the symbol by name in the symtab.
+      if (!symbol)
+        {
+          bytes += symbol_value_size; // advance now to the ksymtab.name entry
+          if (position_relative_relocations)
+            {
+              int32_t offset = 0;
+              ABG_ASSERT(read_int_from_array_of_bytes(bytes, symbol_value_size,
+                                                      is_big_endian, offset));
+              GElf_Shdr section_header;
+              gelf_getshdr(section, &section_header);
+              // the actual symbol address is relative to its position. Since we
+              // do not know the position, we take the beginning of the section,
+              // add the read_offset that we might have and finally apply the
+              // offset we read from the section.
+              symbol_address = section_header.sh_addr + read_offset + offset;
+            }
+          else
+            ABG_ASSERT(read_int_from_array_of_bytes(
+              bytes, symbol_value_size, is_big_endian, symbol_address));
+
+          // the ksymtab.name entry has a corresponding entry in ksymtab_strings
+          // that we now discover.
+
+          Elf_Data* ksymtab_strings
+            = elf_getdata(find_ksymtab_strings_section(), 0);
+          char* strings = reinterpret_cast<char*>(ksymtab_strings->d_buf);
+          GElf_Shdr section_header;
+          gelf_getshdr(find_ksymtab_strings_section(), &section_header);
+
+          GElf_Addr ksymtab_strings_offset
+            = symbol_address - section_header.sh_addr;
+
+          if (ksymtab_strings_offset < section_header.sh_size)
+            {
+              const std::string& name
+                = strings + symbol_address - section_header.sh_addr;
+
+              // now we got the name, let's find a match in the function or
+              // variable symbols
+
+              string_elf_symbols_map_type::const_iterator it
+                = fun_syms().find(name);
+              if (it != fun_syms().end())
+                {
+                  symbol = it->second[0];
+                }
+
+              if (!symbol)
+                {
+                  it = var_syms().find(name);
+                  if (it != var_syms().end())
+                    {
+                      symbol = it->second[0];
+                    }
+                }
+            }
+        }
       return symbol;
     }

@@ -8067,6 +8127,81 @@ public:
             adjusted_symbol_address = maybe_adjust_var_sym_address(symbol_address);
             symbol = lookup_elf_symbol_from_address(adjusted_symbol_address);
+
+          // the ksymtab.addr might not be a match in .symtab.addr (e.g. if
+          // CFI is enabled). We can still lookup the symbol by name in the
+          // symtab.
+          if (!symbol)
+            {
+              ABG_ASSERT(read_int_from_array_of_bytes(
+                &bytes[entry_offset
+                       + symbol_value_size], // advance to ksymtab.name
+                symbol_value_size, is_big_endian, symbol_address));
+
+              // the ksymtab.name entry has a corresponding entry in
+              // ksymtab_strings that we now discover.
+
+              Elf_Data* ksymtab_strings
+                = elf_getdata(find_ksymtab_strings_section(), 0);
+              char* strings
+                = reinterpret_cast<char*>(ksymtab_strings->d_buf);
+              GElf_Shdr section_header;
+              gelf_getshdr(find_ksymtab_strings_section(), &section_header);
+
+              GElf_Addr ksymtab_strings_offset
+                = symbol_address - section_header.sh_addr;
+              if (ksymtab_strings_offset < section_header.sh_size)
+                {
+                  const std::string& name
+                    = strings + symbol_address - section_header.sh_addr;
+
+                  // now we got the name, let's find a match in the function
+                  // or variable symbols. We also need to discover the actual
+                  // symbol address in .symtab to discover the dwarf
+                  // information later. For that, we iterate
+                  // (fun|var)_addr_sym maps to find a match. That is overly
+                  // expensive and can be optimized: TODO: use a proper
+                  // lookup map.
+
+                  addr_elf_symbol_sptr_map_type::const_iterator I, E;
+                  string_elf_symbols_map_type::const_iterator it
+                    = fun_syms().find(name);
+                  if (it != fun_syms().end())
+                    {
+                      symbol = it->second[0];
+                      for (I = fun_addr_sym_map().begin(),
+                           E = fun_addr_sym_map().end();
+                           I != E; ++I)
+                        {
+                          if (I->second->get_name() == name)
+                            {
+                              adjusted_symbol_address = I->first;
+                            }
+                        }
+                    }
+
+                  // we continue our search in the variable maps ...
+
+                  if (!symbol)
+                    {
+                      it = var_syms().find(name);
+                      if (it != var_syms().end())
+                        {
+                          symbol = it->second[0];
+                          for (I = var_addr_sym_map().begin(),
+                               E = var_addr_sym_map().end();
+                               I != E; ++I)
+                            {
+                              if (I->second->get_name() == name)
+                                {
+                                  adjusted_symbol_address = I->first;
+                                }
+                            }
+                        }
+                    }
+                }
+            }
+
           if (!symbol)
             // This must be a symbol that is of type neither FUNC
             // (function) nor OBJECT (variable). There are for intance,
-- 
2.24.0.393.g34dc348eaf-goog
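
For illustration only, here is a minimal standalone sketch of the fallback described above: try the address first, then fall back to the ksymtab name, and resolve the .symtab address through a prebuilt name index rather than the linear scan marked TODO in the patch. It is not libabigail code. Symbol, the three map aliases, build_name_index and lookup_symbol are hypothetical stand-ins for elf_symbol, the reader's symbol maps and its lookup path, and the function and variable maps are collapsed into one to keep it short.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for libabigail's elf_symbol; only the name matters here.
struct Symbol {
  std::string name;
};
using SymbolPtr = std::shared_ptr<Symbol>;

// Simplified analogues of the reader's maps (all names are illustrative).
using AddrSymbolMap = std::map<std::uint64_t, SymbolPtr>;
using NameSymbolsMap = std::unordered_map<std::string, std::vector<SymbolPtr>>;
using NameAddrMap = std::unordered_map<std::string, std::uint64_t>;

// Build the reverse index once, so a name can be mapped to its .symtab
// address without scanning the whole address map for every ksymtab entry.
NameAddrMap build_name_index(const AddrSymbolMap& by_addr) {
  NameAddrMap index;
  for (const auto& entry : by_addr) {
    // emplace keeps the first entry per name; std::map iterates in ascending
    // key order, so the lowest address wins when a name occurs twice.
    index.emplace(entry.second->name, entry.first);
  }
  return index;
}

struct LookupResult {
  SymbolPtr symbol;       // null if neither lookup matched
  std::uint64_t address;  // address to use for later DWARF lookups
};

// Address lookup first; fall back to the ksymtab name if that fails.
LookupResult lookup_symbol(std::uint64_t ksymtab_addr,
                           const std::string& ksymtab_name,
                           const AddrSymbolMap& by_addr,
                           const NameSymbolsMap& by_name,
                           const NameAddrMap& name_index) {
  auto by_addr_it = by_addr.find(ksymtab_addr);
  if (by_addr_it != by_addr.end())
    return {by_addr_it->second, ksymtab_addr};

  auto by_name_it = by_name.find(ksymtab_name);
  if (by_name_it == by_name.end() || by_name_it->second.empty())
    return {nullptr, ksymtab_addr};

  // Use the .symtab address if the index knows the name, else keep the
  // original ksymtab address.
  auto index_it = name_index.find(ksymtab_name);
  std::uint64_t address
    = index_it != name_index.end() ? index_it->second : ksymtab_addr;
  return {by_name_it->second.front(), address};
}
```

One difference from the patch: its loop has no break, so the last matching entry in the address map determines adjusted_symbol_address, whereas this sketch keeps the lowest matching address. Which of the two is right for aliased symbols is left open here.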