From: Mark Harmstone
To: binutils@sourceware.org
Cc: Mark Harmstone
Subject: [PATCH 1/3] ld: Handle extended-length data structures in PDB types
Date: Mon, 26 Dec 2022 20:47:49 +0000
Message-Id: <20221226204751.23761-1-mark@harmstone.com>
X-Mailer: git-send-email 2.37.4
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

A few fixes to minor issues I've discovered in my PDB patches.

* If sizes or offsets are 0x8000 or greater, they get encoded as
  extended values in the same way as for enum values - e.g. a LF_ULONG
  .short followed by a .long.

* I've managed to coax MSVC to produce another type, LF_VFTABLE, which
  is seen when dealing with COM. I don't think LLVM emits this. Note
  that we can't just implement everything in Microsoft's header files,
  as most of it is obsolete.

* Fixes a stupid bug in the test program, where I was adding an index
  to a size. The index was hard-coded to 0, so this didn't cause any
  actual issues.
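For reference, the extended-value encoding described above can be sketched
as a small standalone C decoder. This is an illustration only, not the code
in ld/pdb.c: decode_numeric, read_le16 and read_le32 are hypothetical names,
only the unsigned leaf kinds are handled, and the LF_* values are assumed to
match Microsoft's cvinfo.h (LF_USHORT, LF_ULONG and LF_UQUADWORD also appear
in pdb-types1b.s).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Numeric leaf kinds, values as in Microsoft's cvinfo.h.  */
#define LF_USHORT    0x8002
#define LF_ULONG     0x8004
#define LF_UQUADWORD 0x800a

/* Hypothetical helpers: read little-endian integers byte by byte.  */
static uint16_t
read_le16 (const uint8_t *p)
{
  return (uint16_t) (p[0] | (p[1] << 8));
}

static uint32_t
read_le32 (const uint8_t *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Hypothetical decoder for an unsigned size or offset.  Returns the
   number of bytes consumed, or 0 if the data is truncated or the leaf
   kind is not handled by this sketch.  */
static size_t
decode_numeric (const uint8_t *p, size_t left, uint64_t *value)
{
  uint16_t first;

  assert (p != NULL && value != NULL);

  if (left < 2)
    return 0;

  first = read_le16 (p);

  /* Values below 0x8000 are stored directly in the 16-bit field.  */
  if (first < 0x8000)
    {
      *value = first;
      return 2;
    }

  /* Otherwise the field names a leaf kind and the value follows it.  */
  switch (first)
    {
    case LF_USHORT:
      if (left < 4)
        return 0;
      *value = read_le16 (p + 2);
      return 4;

    case LF_ULONG:
      if (left < 6)
        return 0;
      *value = read_le32 (p + 2);
      return 6;

    case LF_UQUADWORD:
      if (left < 10)
        return 0;
      *value = (uint64_t) read_le32 (p + 2)
               | ((uint64_t) read_le32 (p + 6) << 32);
      return 10;

    default:
      return 0;
    }
}
```

With the bytes 02 80 60 ea from the new test data, this sketch yields 60000
and consumes four bytes; 04 80 c0 d4 01 00 yields 120000 and consumes six.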
--- ld/pdb.c | 228 ++++++++++++++++++++--- ld/pdb.h | 1 - ld/testsuite/ld-pe/pdb-types1-hashlist.d | 4 +- ld/testsuite/ld-pe/pdb-types1-typelist.d | 16 +- ld/testsuite/ld-pe/pdb-types1b.s | 121 +++++++++++- ld/testsuite/ld-pe/pdb.exp | 2 +- 6 files changed, 344 insertions(+), 28 deletions(-) diff --git a/ld/pdb.c b/ld/pdb.c index 0346ccb388c..5467a9efc12 100644 --- a/ld/pdb.c +++ b/ld/pdb.c @@ -2485,6 +2485,7 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, case LF_MEMBER: { struct lf_member *mem = (struct lf_member *) ptr; + uint16_t offset; size_t name_len, subtype_len; if (left < offsetof (struct lf_member, name)) @@ -2497,9 +2498,34 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, if (!remap_type (&mem->type, map, type_num, num_types)) return false; + subtype_len = offsetof (struct lf_member, name); + + offset = bfd_getl16 (&mem->offset); + + /* If offset >= 0x8000, actual value follows. */ + if (offset >= 0x8000) + { + unsigned int param_len = extended_value_len (offset); + + if (param_len == 0) + { + einfo (_("%P: warning: unhandled type %v within" + " LF_MEMBER\n"), offset); + return false; + } + + subtype_len += param_len; + + if (left < subtype_len) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_MEMBER\n")); + return false; + } + } + name_len = - strnlen (mem->name, - left - offsetof (struct lf_member, name)); + strnlen ((char *) mem + subtype_len, left - subtype_len); if (name_len == left - offsetof (struct lf_member, name)) { @@ -2510,7 +2536,7 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, name_len++; - subtype_len = offsetof (struct lf_member, name) + name_len; + subtype_len += name_len; if (subtype_len % 4 != 0) subtype_len += 4 - (subtype_len % 4); @@ -2715,6 +2741,8 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, case LF_BCLASS: { struct lf_bclass *bc = (struct lf_bclass *) ptr; + size_t subtype_len; + uint16_t offset; if 
(left < sizeof (struct lf_bclass)) { @@ -2727,8 +2755,44 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, num_types)) return false; - ptr += sizeof (struct lf_bclass); - left -= sizeof (struct lf_bclass); + subtype_len = sizeof (struct lf_bclass); + + offset = bfd_getl16 (&bc->offset); + + /* If offset >= 0x8000, actual value follows. */ + if (offset >= 0x8000) + { + unsigned int param_len = extended_value_len (offset); + + if (param_len == 0) + { + einfo (_("%P: warning: unhandled type %v within" + " LF_BCLASS\n"), offset); + return false; + } + + subtype_len += param_len; + + if (left < subtype_len) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_BCLASS\n")); + return false; + } + } + + if (subtype_len % 4 != 0) + subtype_len += 4 - (subtype_len % 4); + + if (left < subtype_len) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_BCLASS\n")); + return false; + } + + ptr += subtype_len; + left -= subtype_len; break; } @@ -2757,6 +2821,8 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, case LF_IVBCLASS: { struct lf_vbclass *vbc = (struct lf_vbclass *) ptr; + size_t subtype_len; + uint16_t offset; if (left < sizeof (struct lf_vbclass)) { @@ -2773,8 +2839,70 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, type_num, num_types)) return false; - ptr += sizeof (struct lf_vbclass); - left -= sizeof (struct lf_vbclass); + subtype_len = offsetof (struct lf_vbclass, + virtual_base_vbtable_offset); + + offset = bfd_getl16 (&vbc->virtual_base_pointer_offset); + + /* If offset >= 0x8000, actual value follows. 
*/ + if (offset >= 0x8000) + { + unsigned int param_len = extended_value_len (offset); + + if (param_len == 0) + { + einfo (_("%P: warning: unhandled type %v within" + " LF_VBCLASS/LF_IVBCLASS\n"), offset); + return false; + } + + subtype_len += param_len; + + if (left < subtype_len) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_VBCLASS/LF_IVBCLASS\n")); + return false; + } + } + + offset = bfd_getl16 ((char *)vbc + subtype_len); + subtype_len += sizeof (uint16_t); + + /* If offset >= 0x8000, actual value follows. */ + if (offset >= 0x8000) + { + unsigned int param_len = extended_value_len (offset); + + if (param_len == 0) + { + einfo (_("%P: warning: unhandled type %v within" + " LF_VBCLASS/LF_IVBCLASS\n"), offset); + return false; + } + + subtype_len += param_len; + + if (left < subtype_len) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_VBCLASS/LF_IVBCLASS\n")); + return false; + } + } + + if (subtype_len % 4 != 0) + subtype_len += 4 - (subtype_len % 4); + + if (left < subtype_len) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_VBCLASS/LF_IVBCLASS\n")); + return false; + } + + ptr += subtype_len; + left -= subtype_len; break; } @@ -2959,8 +3087,8 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, case LF_STRUCTURE: { struct lf_class *cl = (struct lf_class *) data; - uint16_t prop; - size_t name_len; + uint16_t prop, num_bytes; + size_t name_len, name_off; if (size < offsetof (struct lf_class, name)) { @@ -2978,9 +3106,35 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, if (!remap_type (&cl->vshape, map, type_num, num_types)) return false; - name_len = strnlen (cl->name, size - offsetof (struct lf_class, name)); + name_off = offsetof (struct lf_class, name); + + num_bytes = bfd_getl16 (&cl->length); - if (name_len == size - offsetof (struct lf_class, name)) + /* If num_bytes >= 0x8000, actual value follows. 
*/ + if (num_bytes >= 0x8000) + { + unsigned int param_len = extended_value_len (num_bytes); + + if (param_len == 0) + { + einfo (_("%P: warning: unhandled type %v within" + " LF_CLASS/LF_STRUCTURE\n"), num_bytes); + return false; + } + + name_off += param_len; + + if (size < name_off) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_CLASS/LF_STRUCTURE\n")); + return false; + } + } + + name_len = strnlen ((char *) cl + name_off, size - name_off); + + if (name_len == size - name_off) { einfo (_("%P: warning: name for LF_CLASS/LF_STRUCTURE has no" " terminating zero\n")); @@ -2993,10 +3147,11 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, { /* Structure has another name following first one. */ - size_t len = offsetof (struct lf_class, name) + name_len + 1; + size_t len = name_off + name_len + 1; size_t unique_name_len; - unique_name_len = strnlen (cl->name + name_len + 1, size - len); + unique_name_len = strnlen ((char *) cl + name_off + name_len + 1, + size - len); if (unique_name_len == size - len) { @@ -3007,10 +3162,10 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, } if (!(prop & (CV_PROP_FORWARD_REF | CV_PROP_SCOPED)) - && !is_name_anonymous (cl->name, name_len)) + && !is_name_anonymous ((char *) cl + name_off, name_len)) { other_hash = true; - cv_hash = crc32 ((uint8_t *) cl->name, name_len); + cv_hash = crc32 ((uint8_t *) cl + name_off, name_len); } break; @@ -3019,8 +3174,8 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, case LF_UNION: { struct lf_union *un = (struct lf_union *) data; - uint16_t prop; - size_t name_len; + uint16_t prop, num_bytes; + size_t name_len, name_off; if (size < offsetof (struct lf_union, name)) { @@ -3032,9 +3187,35 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, if (!remap_type (&un->field_list, map, type_num, num_types)) return false; - name_len = strnlen (un->name, size - offsetof (struct lf_union, 
name)); + name_off = offsetof (struct lf_union, name); + + num_bytes = bfd_getl16 (&un->length); + + /* If num_bytes >= 0x8000, actual value follows. */ + if (num_bytes >= 0x8000) + { + unsigned int param_len = extended_value_len (num_bytes); + + if (param_len == 0) + { + einfo (_("%P: warning: unhandled type %v within" + " LF_UNION\n"), num_bytes); + return false; + } + + name_off += param_len; + + if (size < name_off) + { + einfo (_("%P: warning: truncated CodeView type record" + " LF_UNION\n")); + return false; + } + } + + name_len = strnlen ((char *) un + name_off, size - name_off); - if (name_len == size - offsetof (struct lf_union, name)) + if (name_len == size - name_off) { einfo (_("%P: warning: name for LF_UNION has no" " terminating zero\n")); @@ -3047,10 +3228,11 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, { /* Structure has another name following first one. */ - size_t len = offsetof (struct lf_union, name) + name_len + 1; + size_t len = name_off + name_len + 1; size_t unique_name_len; - unique_name_len = strnlen (un->name + name_len + 1, size - len); + unique_name_len = strnlen ((char *) un + name_off + name_len + 1, + size - len); if (unique_name_len == size - len) { @@ -3061,10 +3243,10 @@ handle_type (uint8_t *data, struct type_entry **map, uint32_t type_num, } if (!(prop & (CV_PROP_FORWARD_REF | CV_PROP_SCOPED)) - && !is_name_anonymous (un->name, name_len)) + && !is_name_anonymous ((char *) un + name_off, name_len)) { other_hash = true; - cv_hash = crc32 ((uint8_t *) un->name, name_len); + cv_hash = crc32 ((uint8_t *) un + name_off, name_len); } break; diff --git a/ld/pdb.h b/ld/pdb.h index 749a60249df..ddf731b99c9 100644 --- a/ld/pdb.h +++ b/ld/pdb.h @@ -480,7 +480,6 @@ struct lf_bclass uint16_t attributes; uint32_t base_class_type; uint16_t offset; - uint16_t padding; } ATTRIBUTE_PACKED; /* lfVFuncTab in cvinfo.h */ diff --git a/ld/testsuite/ld-pe/pdb-types1-hashlist.d b/ld/testsuite/ld-pe/pdb-types1-hashlist.d 
index b75f08c1de7..aa00aaf7593 100644 --- a/ld/testsuite/ld-pe/pdb-types1-hashlist.d +++ b/ld/testsuite/ld-pe/pdb-types1-hashlist.d @@ -10,4 +10,6 @@ Contents of section .data: 0050 ffd80200 b0260100 7c060200 e3240200 * 0060 63ff0100 fb6b0300 0ad90100 523c0200 * 0070 4d5e0200 8a940200 4b710300 6aa90300 * - 0080 0a2c0300 67e10300 4a3d0300 * \ No newline at end of file + 0080 0a2c0300 67e10300 4a3d0300 fa460300 * + 0090 db020200 ec4e0100 131e0300 fb120300 * + 00a0 aece0200 1db70100 * \ No newline at end of file diff --git a/ld/testsuite/ld-pe/pdb-types1-typelist.d b/ld/testsuite/ld-pe/pdb-types1-typelist.d index ff2d91c311e..df862c3f837 100644 --- a/ld/testsuite/ld-pe/pdb-types1-typelist.d +++ b/ld/testsuite/ld-pe/pdb-types1-typelist.d @@ -57,4 +57,18 @@ Contents of section .data: 0340 7200f2f1 10150000 20100000 6e657374 r....... ...nest 0350 65645f65 6e756d00 1a000515 01000000 ed_enum......... 0360 21100000 00000000 00000000 04007175 !.............qu - 0370 757800f1 ux.. \ No newline at end of file + 0370 757800f1 12000315 10000000 74000000 ux..........t... + 0380 028060ea 00f3f2f1 2e000312 0d150300 ..`............. + 0390 23100000 00006100 0d150300 23100000 #.....a.....#... + 03a0 028060ea 6200f2f1 0d150300 23100000 ..`.b.......#... + 03b0 0480c0d4 01006300 26000515 03000000 ......c.&....... + 03c0 24100000 00000000 00000000 048020bf $............. . + 03d0 02006c6f 6e677374 72756374 00f3f2f1 ..longstruct.... + 03e0 1a000312 0d150300 23100000 00006100 ........#.....a. + 03f0 0d150300 23100000 00006200 1a000615 ....#.....b..... + 0400 02000000 26100000 028060ea 6c6f6e67 ....&.....`.long + 0410 756e696f 6e00f2f1 1e000312 00140000 union........... + 0420 25100000 0480c0d4 0100f2f1 0d150300 %............... + 0430 23100000 00006400 26000312 01140000 #.....d.&....... + 0440 25100000 00000000 028060ea 0480c0d4 %.........`..... + 0450 0100f2f1 0d150300 23100000 00006400 ........#.....d. 
\ No newline at end of file diff --git a/ld/testsuite/ld-pe/pdb-types1b.s b/ld/testsuite/ld-pe/pdb-types1b.s index 89ee6e3840f..544b338c251 100644 --- a/ld/testsuite/ld-pe/pdb-types1b.s +++ b/ld/testsuite/ld-pe/pdb-types1b.s @@ -33,6 +33,7 @@ .equ LF_USHORT, 0x8002 .equ LF_LONG, 0x8003 +.equ LF_ULONG, 0x8004 .equ LF_UQUADWORD, 0x800a .equ CV_PTR_NEAR32, 0xa @@ -447,7 +448,7 @@ # Type 1021, struct quux, field list 1020 .struct4: -.short .types_end - .struct4 - 2 +.short .arr2 - .struct4 - 2 .short LF_STRUCTURE .short 1 # no. members .short 0 # property @@ -458,4 +459,122 @@ .asciz "quux" # name .byte 0xf1 # padding +# Type 1022, array[60000] of char +.arr2: +.short .fieldlist8 - .arr2 - 2 +.short LF_ARRAY +.long T_CHAR # element type +.long T_INT4 # index type +.short LF_USHORT +.short 60000 # size in bytes +.byte 0 # name +.byte 0xf3 # padding +.byte 0xf2 # padding +.byte 0xf1 # padding + +# Type 1023, field list for struct longstruct +.fieldlist8: +.short .struct5 - .fieldlist8 - 2 +.short LF_FIELDLIST +.short LF_MEMBER +.short 3 # public +.long 0x1022 +.short 0 # offset +.asciz "a" +.short LF_MEMBER +.short 3 # public +.long 0x1022 +.short LF_USHORT +.short 60000 # offset +.asciz "b" +.byte 0xf2 # padding +.byte 0xf1 # padding +.short LF_MEMBER +.short 3 # public +.long 0x1022 +.short LF_ULONG +.long 120000 # offset +.asciz "c" + +# Type 1024, struct longstruct +.struct5: +.short .fieldlist9 - .struct5 - 2 +.short LF_STRUCTURE +.short 3 # no. 
members +.short 0 # property +.long 0x1023 # field list +.long 0 # type derived from +.long 0 # type of vshape table +.short LF_ULONG +.long 180000 # size +.asciz "longstruct" # name +.byte 0xf3 # padding +.byte 0xf2 # padding +.byte 0xf1 # padding + +# Type 1025, field list for union longunion +.fieldlist9: +.short .union4 - .fieldlist9 - 2 +.short LF_FIELDLIST +.short LF_MEMBER +.short 3 # public +.long 0x1022 +.short 0 # offset +.asciz "a" +.short LF_MEMBER +.short 3 # public +.long 0x1022 +.short 0 # offset +.asciz "b" + +# Type 1026, union longunion (field list 1025) +.union4: +.short .fieldlist10 - .union4 - 2 +.short LF_UNION +.short 2 # no. members +.short 0 # property +.long 0x1025 # field list +.short LF_USHORT +.short 60000 # size +.asciz "longunion" +.byte 0xf2 # padding +.byte 0xf1 # padding + +# Type 1027, field list with base class longstruct +.fieldlist10: +.short .fieldlist11 - .fieldlist10 - 2 +.short LF_FIELDLIST +.short LF_BCLASS +.short 0 # attributes +.long 0x1024 # base class +.short LF_ULONG +.long 120000 # offset within class +.byte 0xf2 # padding +.byte 0xf1 # padding +.short LF_MEMBER +.short 3 # public +.long 0x1022 +.short 0 # offset +.asciz "d" + +# Type 1028, field list with virtual base class longstruct +.fieldlist11: +.short .types_end - .fieldlist11 - 2 +.short LF_FIELDLIST +.short LF_VBCLASS +.short 0 # attributes +.long 0x1024 # type index of direct virtual base class +.long 0 # type index of virtual base pointer +.short LF_USHORT +.short 60000 # virtual base pointer offset +.short LF_ULONG +.long 120000 # virtual base offset from vbtable +.byte 0xf2 # padding +.byte 0xf1 # padding +.short LF_MEMBER +.short 3 # public +.long 0x1022 +.short 0 # offset +.asciz "d" + .types_end: diff --git a/ld/testsuite/ld-pe/pdb.exp b/ld/testsuite/ld-pe/pdb.exp index bd50b2fb076..fbc0cf949f1 100644 --- a/ld/testsuite/ld-pe/pdb.exp +++ b/ld/testsuite/ld-pe/pdb.exp @@ -1029,7 +1029,7 @@ proc test5 { } { binary scan $data i end_type # end_type is one 
greater than the last type in the stream - if { $end_type != 0x1023 } { + if { $end_type != 0x102a } { fail "Incorrect end type value in TPI stream." } else { pass "Correct end type value in TPI stream." -- 2.37.4