From mboxrd@z Thu Jan  1 00:00:00 1970
From: Mark Harmstone
To: jbeulich@suse.com, binutils@sourceware.org
Cc: Mark Harmstone
Subject: [PATCH v2] ld: Generate PDB string table
Date: Fri, 25 Nov 2022 02:53:34 +0000
Message-Id: <20221125025334.26665-1-mark@harmstone.com>

This populates the "/names" named stream of the PDB, which contains the
string table. We loop through the .debug$S sections of the object files,
deduplicate the strings found in the DEBUG_S_STRINGTABLE subsection, and
add them to the table.
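For readers who want the deduplication step in isolation, here is a minimal standalone sketch. It is not the code in this patch: the names demo_table, demo_init, demo_add and demo_parse are hypothetical, a linear scan stands in for the libiberty hash table, and the only layout assumption carried over is that offset 0 holds a single NUL byte (the initial strings_len of 1).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified string table (not the patch's struct
   string_table): unique strings stored back to back, NUL-terminated.  */
struct demo_table
{
  char *buf;       /* Concatenated strings.  */
  uint32_t len;    /* Bytes used in buf.  */
  uint32_t count;  /* Number of unique strings.  */
};

/* Start with one NUL byte at offset 0, so entries begin at offset 1.
   This mirrors the patch's initial strings_len of 1.  */
static void
demo_init (struct demo_table *t)
{
  t->buf = calloc (1, 1);
  assert (t->buf != NULL);
  t->len = 1;
  t->count = 0;
}

/* Return the offset of STR (LEN bytes), appending it if not yet present.
   A linear scan stands in for the patch's hash table lookup.  */
static uint32_t
demo_add (struct demo_table *t, const char *str, size_t len)
{
  uint32_t off = 1;

  while (off < t->len)
    {
      size_t cur = strlen (t->buf + off);

      if (cur == len && memcmp (t->buf + off, str, len) == 0)
        return off;

      off += (uint32_t) cur + 1;
    }

  t->buf = realloc (t->buf, t->len + len + 1);
  assert (t->buf != NULL);

  memcpy (t->buf + t->len, str, len);
  t->buf[t->len + len] = '\0';

  off = t->len;
  t->len += (uint32_t) len + 1;
  t->count++;

  return off;
}

/* Split a blob of NUL-separated strings, such as the payload of a
   DEBUG_S_STRINGTABLE subsection, and add each string to the table.  */
static void
demo_parse (struct demo_table *t, const char *data, size_t size)
{
  while (size > 0)
    {
      const char *nul = memchr (data, '\0', size);
      size_t len = nul ? (size_t) (nul - data) : size;

      demo_add (t, data, len);

      if (size <= len + 1)
        break;

      data += len + 1;
      size -= len + 1;
    }
}
```

Feeding this sketch the two string blobs from pdb-strings1.s and pdb-strings2.s yields six unique strings occupying 23 bytes, which agrees with the 0x17 length field in pdb-strings.d. The bucket array that follows the strings in the real stream depends on calc_hash and is not modelled here.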
---
 ld/pdb.c                          | 296 +++++++++++++++++++++++++++++-
 ld/pdb.h                          |  12 ++
 ld/testsuite/ld-pe/pdb-strings.d  |  10 +
 ld/testsuite/ld-pe/pdb-strings1.s |  19 ++
 ld/testsuite/ld-pe/pdb-strings2.s |  19 ++
 ld/testsuite/ld-pe/pdb.exp        | 122 ++++++++++++
 6 files changed, 472 insertions(+), 6 deletions(-)
 create mode 100644 ld/testsuite/ld-pe/pdb-strings.d
 create mode 100644 ld/testsuite/ld-pe/pdb-strings1.s
 create mode 100644 ld/testsuite/ld-pe/pdb-strings2.s

diff --git a/ld/pdb.c b/ld/pdb.c
index 6f69574289d..dc008bc38bb 100644
--- a/ld/pdb.c
+++ b/ld/pdb.c
@@ -41,6 +41,23 @@ struct public
   uint32_t address;
 };
 
+struct string
+{
+  struct string *next;
+  uint32_t hash;
+  uint32_t offset;
+  size_t len;
+  char s[];
+};
+
+struct string_table
+{
+  struct string *strings_head;
+  struct string *strings_tail;
+  uint32_t strings_len;
+  htab_t hashmap;
+};
+
 /* Add a new stream to the PDB archive, and return its BFD.  */
 static bfd *
 add_stream (bfd *pdb, const char *name, uint16_t *stream_num)
@@ -383,15 +400,170 @@ get_arch_number (bfd *abfd)
   return IMAGE_FILE_MACHINE_I386;
 }
 
+/* Add a string to the strings table, if it's not already there.  */
+static void
+add_string (char *str, size_t len, struct string_table *strings)
+{
+  uint32_t hash = calc_hash (str, len);
+  void **slot;
+
+  slot = htab_find_slot_with_hash (strings->hashmap, str, hash, INSERT);
+
+  if (slot && !*slot)
+    {
+      struct string *s;
+
+      *slot = xmalloc (offsetof (struct string, s) + len);
+
+      s = (struct string *) *slot;
+
+      s->next = NULL;
+      s->hash = hash;
+      s->offset = strings->strings_len;
+      s->len = len;
+      memcpy (s->s, str, len);
+
+      if (strings->strings_tail)
+        strings->strings_tail->next = s;
+      else
+        strings->strings_head = s;
+
+      strings->strings_tail = s;
+
+      strings->strings_len += len + 1;
+    }
+}
+
+/* Return the hash of an entry in the string table.  */
+static hashval_t
+hash_string_table_entry (const void *p)
+{
+  const struct string *s = (const struct string *) p;
+
+  return s->hash;
+}
+
+/* Compare an entry in the string table with a string.  */
+static int
+eq_string_table_entry (const void *a, const void *b)
+{
+  const struct string *s1 = (const struct string *) a;
+  const char *s2 = (const char *) b;
+  size_t s2_len = strlen (s2);
+
+  if (s2_len != s1->len)
+    return 0;
+
+  return memcmp (s1->s, s2, s2_len) == 0;
+}
+
+/* Parse the string table within the .debug$S section.  */
+static void
+parse_string_table (bfd_byte *data, size_t size,
+                    struct string_table *strings)
+{
+  while (true)
+    {
+      size_t len = strnlen ((char *) data, size);
+
+      add_string ((char *) data, len, strings);
+
+      data += len + 1;
+
+      if (size <= len + 1)
+        break;
+
+      size -= len + 1;
+    }
+}
+
+/* Parse the .debug$S section within an object file.  */
+static bool
+handle_debugs_section (asection *s, bfd *mod, struct string_table *strings)
+{
+  bfd_byte *data = NULL;
+  size_t off;
+
+  if (!bfd_get_full_section_contents (mod, s, &data))
+    return false;
+
+  if (!data)
+    return false;
+
+  if (bfd_getl32 (data) != CV_SIGNATURE_C13)
+    {
+      free (data);
+      return true;
+    }
+
+  off = sizeof (uint32_t);
+
+  while (off + sizeof (uint32_t) <= s->size)
+    {
+      uint32_t type, size;
+
+      type = bfd_getl32 (data + off);
+
+      off += sizeof (uint32_t);
+
+      if (off + sizeof (uint32_t) > s->size)
+        {
+          free (data);
+          bfd_set_error (bfd_error_bad_value);
+          return false;
+        }
+
+      size = bfd_getl32 (data + off);
+
+      off += sizeof (uint32_t);
+
+      if (off + size > s->size)
+        {
+          free (data);
+          bfd_set_error (bfd_error_bad_value);
+          return false;
+        }
+
+      switch (type)
+        {
+        case DEBUG_S_STRINGTABLE:
+          parse_string_table (data + off, size, strings);
+
+          break;
+        }
+
+      off += size;
+
+      if (off % sizeof (uint32_t))
+        off += sizeof (uint32_t) - (off % sizeof (uint32_t));
+    }
+
+  free (data);
+
+  return true;
+}
+
 /* Populate the module stream, which consists of the transformed .debug$S
    data for each object file.  */
 static bool
-populate_module_stream (bfd *stream, uint32_t *sym_byte_size)
+populate_module_stream (bfd *stream, bfd *mod, uint32_t *sym_byte_size,
+                        struct string_table *strings)
 {
   uint8_t int_buf[sizeof (uint32_t)];
 
   *sym_byte_size = sizeof (uint32_t);
 
+  /* Process .debug$S section(s).  */
+
+  for (asection *s = mod->sections; s; s = s->next)
+    {
+      if (!strcmp (s->name, ".debug$S") && s->size >= sizeof (uint32_t))
+        {
+          if (!handle_debugs_section (s, mod, strings))
+            return false;
+        }
+    }
+
   /* Write the signature.  */
 
   bfd_putl32 (CV_SIGNATURE_C13, int_buf);
@@ -412,7 +584,7 @@ populate_module_stream (bfd *stream, uint32_t *sym_byte_size)
 /* Create the module info substream within the DBI.  */
 static bool
 create_module_info_substream (bfd *abfd, bfd *pdb, void **data,
-                              uint32_t *size)
+                              uint32_t *size, struct string_table *strings)
 {
   uint8_t *ptr;
 
@@ -482,7 +654,8 @@ create_module_info_substream (bfd *abfd, bfd *pdb, void **data,
           return false;
         }
 
-      if (!populate_module_stream (stream, &sym_byte_size))
+      if (!populate_module_stream (stream, in, &sym_byte_size,
+                                   strings))
         {
           free (*data);
           return false;
@@ -687,14 +860,16 @@ static bool
 populate_dbi_stream (bfd *stream, bfd *abfd, bfd *pdb,
                      uint16_t section_header_stream_num,
                      uint16_t sym_rec_stream_num,
-                     uint16_t publics_stream_num)
+                     uint16_t publics_stream_num,
+                     struct string_table *strings)
 {
   struct pdb_dbi_stream_header h;
   struct optional_dbg_header opt;
   void *mod_info, *sc;
   uint32_t mod_info_size, sc_size;
 
-  if (!create_module_info_substream (abfd, pdb, &mod_info, &mod_info_size))
+  if (!create_module_info_substream (abfd, pdb, &mod_info, &mod_info_size,
+                                     strings))
     return false;
 
   if (!create_section_contrib_substream (abfd, &sc, &sc_size))
@@ -1107,6 +1282,95 @@ create_section_header_stream (bfd *pdb, bfd *abfd, uint16_t *num)
   return true;
 }
 
+/* Populate the "/names" named stream, which contains the string table.  */
+static bool
+populate_names_stream (bfd *stream, struct string_table *strings)
+{
+  char int_buf[sizeof (uint32_t)];
+  struct string_table_header h;
+  uint32_t num_strings = 0, num_buckets;
+  struct string **buckets;
+
+  bfd_putl32 (STRING_TABLE_SIGNATURE, &h.signature);
+  bfd_putl32 (STRING_TABLE_VERSION, &h.version);
+
+  if (bfd_bwrite (&h, sizeof (h), stream) != sizeof (h))
+    return false;
+
+  bfd_putl32 (strings->strings_len, int_buf);
+
+  if (bfd_bwrite (int_buf, sizeof (uint32_t), stream) != sizeof (uint32_t))
+    return false;
+
+  int_buf[0] = 0;
+
+  if (bfd_bwrite (int_buf, 1, stream) != 1)
+    return false;
+
+  for (struct string *s = strings->strings_head; s; s = s->next)
+    {
+      if (bfd_bwrite (s->s, s->len, stream) != s->len)
+        return false;
+
+      if (bfd_bwrite (int_buf, 1, stream) != 1)
+        return false;
+
+      num_strings++;
+    }
+
+  num_buckets = num_strings * 2;
+
+  buckets = xmalloc (sizeof (struct string *) * num_buckets);
+  memset (buckets, 0, sizeof (struct string *) * num_buckets);
+
+  for (struct string *s = strings->strings_head; s; s = s->next)
+    {
+      uint32_t bucket_num = s->hash % num_buckets;
+
+      while (buckets[bucket_num])
+        {
+          bucket_num++;
+
+          if (bucket_num == num_buckets)
+            bucket_num = 0;
+        }
+
+      buckets[bucket_num] = s;
+    }
+
+  bfd_putl32 (num_buckets, int_buf);
+
+  if (bfd_bwrite (int_buf, sizeof (uint32_t), stream) != sizeof (uint32_t))
+    {
+      free (buckets);
+      return false;
+    }
+
+  for (unsigned int i = 0; i < num_buckets; i++)
+    {
+      if (buckets[i])
+        bfd_putl32 (buckets[i]->offset, int_buf);
+      else
+        bfd_putl32 (0, int_buf);
+
+      if (bfd_bwrite (int_buf, sizeof (uint32_t), stream) !=
+          sizeof (uint32_t))
+        {
+          free (buckets);
+          return false;
+        }
+    }
+
+  free (buckets);
+
+  bfd_putl32 (num_strings, int_buf);
+
+  if (bfd_bwrite (int_buf, sizeof (uint32_t), stream) != sizeof (uint32_t))
+    return false;
+
+  return true;
+}
+
 /* Create a PDB debugging file for the PE image file abfd with the build ID
    guid, stored at pdb_name.  */
 bool
@@ -1117,6 +1381,7 @@ create_pdb_file (bfd *abfd, const char *pdb_name, const unsigned char *guid)
   bfd *info_stream, *dbi_stream, *names_stream, *sym_rec_stream,
     *publics_stream;
   uint16_t section_header_stream_num, sym_rec_stream_num, publics_stream_num;
+  struct string_table strings;
 
   pdb = bfd_openw (pdb_name, "pdb");
   if (!pdb)
@@ -1125,6 +1390,13 @@ create_pdb_file (bfd *abfd, const char *pdb_name, const unsigned char *guid)
       return false;
     }
 
+  strings.strings_head = NULL;
+  strings.strings_tail = NULL;
+  strings.strings_len = 1;
+  strings.hashmap = htab_create_alloc (0, hash_string_table_entry,
+                                       eq_string_table_entry, free,
+                                       xcalloc, free);
+
   bfd_set_format (pdb, bfd_archive);
 
   if (!create_old_directory_stream (pdb))
@@ -1201,13 +1473,23 @@ create_pdb_file (bfd *abfd, const char *pdb_name, const unsigned char *guid)
     }
 
   if (!populate_dbi_stream (dbi_stream, abfd, pdb, section_header_stream_num,
-                            sym_rec_stream_num, publics_stream_num))
+                            sym_rec_stream_num, publics_stream_num,
+                            &strings))
     {
       einfo (_("%P: warning: cannot populate DBI stream "
                "in PDB file: %E\n"));
       goto end;
     }
 
+  add_string ("", 0, &strings);
+
+  if (!populate_names_stream (names_stream, &strings))
+    {
+      einfo (_("%P: warning: cannot populate names stream "
+               "in PDB file: %E\n"));
+      goto end;
+    }
+
   if (!populate_publics_stream (publics_stream, abfd, sym_rec_stream))
     {
       einfo (_("%P: warning: cannot populate publics stream "
@@ -1227,5 +1509,7 @@ create_pdb_file (bfd *abfd, const char *pdb_name, const unsigned char *guid)
 end:
   bfd_close (pdb);
 
+  htab_delete (strings.hashmap);
+
   return ret;
 }
diff --git a/ld/pdb.h b/ld/pdb.h
index e22dea18eca..611f71041c0 100644
--- a/ld/pdb.h
+++ b/ld/pdb.h
@@ -155,6 +155,18 @@ struct optional_dbg_header
 
 #define CV_SIGNATURE_C13 4
 
+#define DEBUG_S_STRINGTABLE 0xf3
+
+#define STRING_TABLE_SIGNATURE 0xeffeeffe
+#define STRING_TABLE_VERSION 1
+
+/* VHdr in nmt.h */
+struct string_table_header
+{
+  uint32_t signature;
+  uint32_t version;
+};
+
 #define SECTION_CONTRIB_VERSION_60 0xf12eba2d
 
 /* SC in dbicommon.h */
diff --git a/ld/testsuite/ld-pe/pdb-strings.d b/ld/testsuite/ld-pe/pdb-strings.d
new file mode 100644
index 00000000000..8be853efb72
--- /dev/null
+++ b/ld/testsuite/ld-pe/pdb-strings.d
@@ -0,0 +1,10 @@
+
+*:     file format binary
+
+Contents of section .data:
+ 0000 feeffeef 01000000 17000000 0000666f  ..............fo
+ 0010 6f006261 72006261 7a007175 78007175  o.bar.baz.qux.qu
+ 0020 7578000c 00000001 0000000a 00000000  ux..............
+ 0030 00000000 00000000 00000012 00000000  ................
+ 0040 00000000 00000002 00000006 00000000  ................
+ 0050 0000000e 00000006 000000             ...........
\ No newline at end of file
diff --git a/ld/testsuite/ld-pe/pdb-strings1.s b/ld/testsuite/ld-pe/pdb-strings1.s
new file mode 100644
index 00000000000..09eedd93fb3
--- /dev/null
+++ b/ld/testsuite/ld-pe/pdb-strings1.s
@@ -0,0 +1,19 @@
+.equ CV_SIGNATURE_C13, 4
+.equ DEBUG_S_STRINGTABLE, 0xf3
+
+.section ".debug$S", "rn"
+.long CV_SIGNATURE_C13
+.long DEBUG_S_STRINGTABLE
+.long .strings_end - .strings_start
+
+.strings_start:
+
+.asciz ""
+.asciz "foo"
+.asciz "bar"
+.asciz "baz"
+.asciz "qux"
+
+.strings_end:
+
+.balign 4
diff --git a/ld/testsuite/ld-pe/pdb-strings2.s b/ld/testsuite/ld-pe/pdb-strings2.s
new file mode 100644
index 00000000000..33d9215e4c8
--- /dev/null
+++ b/ld/testsuite/ld-pe/pdb-strings2.s
@@ -0,0 +1,19 @@
+.equ CV_SIGNATURE_C13, 4
+.equ DEBUG_S_STRINGTABLE, 0xf3
+
+.section ".debug$S", "rn"
+.long CV_SIGNATURE_C13
+.long DEBUG_S_STRINGTABLE
+.long .strings_end - .strings_start
+
+.strings_start:
+
+.asciz ""
+.asciz "bar"
+.asciz "baz"
+.asciz "qux"
+.asciz "quux"
+
+.strings_end:
+
+.balign 4
diff --git a/ld/testsuite/ld-pe/pdb.exp b/ld/testsuite/ld-pe/pdb.exp
index 0be65e22fb6..09e9b4a8809 100644
--- a/ld/testsuite/ld-pe/pdb.exp
+++ b/ld/testsuite/ld-pe/pdb.exp
@@ -703,5 +703,127 @@ proc test2 { } {
     test_section_contrib $section_contrib
 }
 
+proc find_named_stream { pdb name } {
+    global ar
+
+    set exec_output [run_host_cmd "$ar" "x --output tmpdir $pdb 0001"]
+
+    if ![string match "" $exec_output] {
+        return 0
+    }
+
+    set fi [open tmpdir/0001]
+    fconfigure $fi -translation binary
+
+    seek $fi 0x1c
+
+    set data [read $fi 4]
+    binary scan $data i string_len
+
+    set strings [read $fi $string_len]
+
+    set string_off 0
+
+    while {[string first \000 $strings $string_off] != -1 } {
+        set str [string range $strings $string_off [expr [string first \000 $strings $string_off] - 1]]
+
+        if { $str eq $name } {
+            break
+        }
+
+        incr string_off [expr [string length $str] + 1]
+    }
+
+    if { [string length $strings] == $string_off } { # string not found
+        close $fi
+        return 0
+    }
+
+    set data [read $fi 4]
+    binary scan $data i num_entries
+
+    seek $fi 4 current
+
+    set data [read $fi 4]
+    binary scan $data i present_bitmap_len
+
+    seek $fi [expr $present_bitmap_len * 4] current
+
+    set data [read $fi 4]
+    binary scan $data i deleted_bitmap_len
+
+    seek $fi [expr $deleted_bitmap_len * 4] current
+
+    for {set i 0} {$i < $num_entries} {incr i} {
+        set data [read $fi 4]
+        binary scan $data i offset
+
+        if { $offset == $string_off } {
+            set data [read $fi 4]
+            binary scan $data i value
+            close $fi
+
+            return $value
+        }
+
+        seek $fi 4 current
+    }
+
+    close $fi
+
+    return 0
+}
+
+proc test3 { } {
+    global as
+    global ar
+    global ld
+    global objdump
+    global srcdir
+    global subdir
+
+    if ![ld_assemble $as $srcdir/$subdir/pdb-strings1.s tmpdir/pdb-strings1.o] {
+        unsupported "Build pdb-strings1.o"
+        return
+    }
+
+    if ![ld_assemble $as $srcdir/$subdir/pdb-strings2.s tmpdir/pdb-strings2.o] {
+        unsupported "Build pdb-strings2.o"
+        return
+    }
+
+    if ![ld_link $ld "tmpdir/pdb-strings.exe" "--pdb=tmpdir/pdb-strings.pdb tmpdir/pdb-strings1.o tmpdir/pdb-strings2.o"] {
+        unsupported "Create PE image with PDB file"
+        return
+    }
+
+    set index [find_named_stream "tmpdir/pdb-strings.pdb" "/names"]
+
+    if { $index == 0 } {
+        fail "Could not find /names stream"
+        return
+    } else {
+        pass "Found /names stream"
+    }
+
+    set index_str [format "%04x" $index]
+
+    set exec_output [run_host_cmd "$ar" "x --output tmpdir tmpdir/pdb-strings.pdb $index_str"]
+
+    if ![string match "" $exec_output] {
+        return 0
+    }
+
+    set exp [file_contents "$srcdir/$subdir/pdb-strings.d"]
+    set got [run_host_cmd "$objdump" "-s --target=binary tmpdir/$index_str"]
+
+    if ![string match $exp $got] {
+        fail "Strings table was not as expected"
+    } else {
+        pass "Strings table was as expected"
+    }
+}
+
 test1
 test2
+test3
-- 
2.37.4