[Public] Hi, Below patch was committed first(commit f18acc9c4e5d18f4783f3a7d59e3ec95d7af0199) and later there was few changes done in fetch_indexed_string()(commit a3f1431a5087445ec3987d38b5ee37bcb802214c), hence it required this patch to make "split-dwarf and dwarf-5" work in binutils. While dumping .debug_str_offsets.dwo section contents, proper str_offsets_base argument value needed to be passed to fetch_indexed_string(). NOTE: DWO ID is not printed for dwarf 5 during initial dwo file dump information summary. But its dumped under .debug_info section dump. Because now there is no separate attribute in dwarf5 for DWO ID like it was there in dwarf4. PATCH inlined: From fb64c0617164d5b7c36c9206974f249f7cdef428 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E2=80=9Cbhkumarn=E2=80=9D?= Date: Sat, 2 Jul 2022 23:57:47 +0530 Subject: [PATCH] [PATCH] Modified changes for split-dwarf and dwarf-5. * dwarf.c(process_debug_info): Include DW_TAG_skeleton_unit. (display_debug_str_offsets): While dumping .debug_str_offsets.dwo, pass proper str_offsets_base to fetch_indexed_string(). (load_separate_debug_files): Skip DWO ID dump for dwarf-5. Change-Id: I63c32f503333f88b0bc4c6d696ece47987f46137 --- binutils/dwarf.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/binutils/dwarf.c b/binutils/dwarf.c index 6f36e427b04..3b32fa10ccd 100644 --- a/binutils/dwarf.c +++ b/binutils/dwarf.c @@ -4039,6 +4039,7 @@ process_debug_info (struct dwarf_section * section, need_base_address = 0; break; case DW_TAG_compile_unit: + case DW_TAG_skeleton_unit: need_base_address = 1; need_dwo_info = do_loc; break; @@ -7801,6 +7802,7 @@ display_debug_str_offsets (struct dwarf_section *section, unsigned char *start = section->start; unsigned char *end = start + section->size; unsigned char *curr = start; + dwarf_vma debug_str_offsets_hdr_len; const char *suffix = strrchr (section->name, '.'); bool dwo = suffix && strcmp (suffix, ".dwo") == 0; @@ -7823,9 +7825,13 @@ display_debug_str_offsets (struct dwarf_section *section, { SAFE_BYTE_GET_AND_INC (length, curr, 8, end); entry_length = 8; + debug_str_offsets_hdr_len = 16; } else - entry_length = 4; + { + entry_length = 4; + debug_str_offsets_hdr_len = 8; + } unsigned char *entries_end; if (length == 0) @@ -7877,7 +7883,7 @@ display_debug_str_offsets (struct dwarf_section *section, SAFE_BYTE_GET_AND_INC (offset, curr, entry_length, entries_end); if (dwo) string = (const unsigned char *) - fetch_indexed_string (idx, NULL, entry_length, dwo, 0); + fetch_indexed_string (idx, NULL, entry_length, dwo, debug_str_offsets_hdr_len); else string = fetch_indirect_string (offset); @@ -12013,7 +12019,7 @@ load_separate_debug_files (void * file, const char * filename) printf (_(" Directory: %s\n"), dir ? dir : _("")); if (id != NULL) display_data (printf (_(" ID: ")), (unsigned char *) id, 8); - else + else if (debug_information[0].dwarf_version != 5) printf (_(" ID: \n")); printf ("\n\n"); } -- 2.17.1 -----Original Message----- From: Kumar N, Bhuvanendra Sent: Friday, June 10, 2022 11:04 PM To: Jan Beulich Cc: George, Jini Susan ; Natarajan, Kavitha ; binutils@sourceware.org Subject: RE: [PATCH] Binutils support for split-dwarf and dwarf-5 [AMD Official Use Only - General] Hi, Thanks for the review comments. I could address your review comments in the updated patch which is attached and also inlined below. > Can this legitimately be stored in a static variable? Don't you need to collect the value(s) into debug_info_p just like is done for DW_AT_loclists_base? I have moved this to debug_information or debug_info_p. During this I observed, debug_information struct is allocated and initialized inside function process_debug_info(), under the condition " if (do_loc || do_debug_loc || do_debug_ranges)". Now I have added do_debug_info to this list of checks for my requirement and it helps during readelf -wi and its working fine. > What if this fails? Immediately preceding load_debug_section() uses have their return values checked. I have added the proper comment now. If we don’t have the .debug_str_offsets section, then it simply ignores. regards, bhuvan Updated patch inlined: From 4d90b859146b7aacc1067ff1a77c123e77391ee0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E2=80=9Cbhkumarn=E2=80=9D?= Date: Fri, 10 Jun 2022 22:00:34 +0530 Subject: [PATCH] [PATCH] Binutils support for split-dwarf and dwarf-5. This patch adds support for split-dwarf and dwarf5, for both gcc and clang. Basic sections dump like .debug_info.dwo, .debug_abbrev.dwo, .debug_str.dwo etc are supported in this patch. There are few differences b/w dwarf4 and dwarf5, w.r.t. split-dwarf like different attributes/forms used to represent DWO name and ID, .debug_str_offsets section etc. All these are addressed. * dwarf.c (fetch_indexed_string): Added new parameter str_offsets_base to calculate the string offset. (read_and_display_attr_value): Read DW_AT_str_offsets_base attribute. (process_debug_info): While allocating memory and initializing debug_information, do it for do_debug_info also, if its true. (load_separate_debug_files): Load .debug_str_offsets if exists. * dwarf.h (struct debug_info): Add str_offsets_base field. --- binutils/dwarf.c | 54 ++++++++++++++++++++++++++++++++---------------- binutils/dwarf.h | 1 + 2 files changed, 37 insertions(+), 18 deletions(-) diff --git a/binutils/dwarf.c b/binutils/dwarf.c index e8f0ea4cf0c..6d2541039e5 100644 --- a/binutils/dwarf.c +++ b/binutils/dwarf.c @@ -118,8 +118,6 @@ int dwarf_cutoff_level = -1; unsigned long dwarf_start_die; int dwarf_check = 0; -static dwarf_vma str_offsets_base = 0; -#define DEBUG_STR_OFFSETS_HEADER_LEN 8 /* Convenient constant, to avoid having to cast -1 to dwarf_vma when testing whether e.g. a locview list is present. */ @@ -693,7 +691,7 @@ fetch_indirect_line_string (dwarf_vma offset) static const char * fetch_indexed_string (dwarf_vma idx, struct cu_tu_set *this_set, - dwarf_vma offset_size, bool dwo) + dwarf_vma offset_size, bool dwo, dwarf_vma str_offsets_base) { enum dwarf_section_display_enum str_sec_idx = dwo ? str_dwo : str; enum dwarf_section_display_enum idx_sec_idx = dwo ? str_index_dwo : str_index; @@ -781,6 +779,13 @@ fetch_indexed_string (dwarf_vma idx, struct cu_tu_set *this_set, return _(""); } + if (str_offsets_base > 0 ) + { + if (offset_size == 8) + str_offsets_base -= 16; + else + str_offsets_base -= 8; + } str_offset = byte_get (curr + index_offset + str_offsets_base, offset_size); str_offset -= str_section->address; if (str_offset >= str_section->size) @@ -2487,8 +2492,6 @@ read_and_display_attr_value (unsigned long attribute, case DW_FORM_GNU_ref_alt: case DW_FORM_GNU_strp_alt: SAFE_BYTE_GET_AND_INC (uvalue, data, offset_size, end); - if (attribute == DW_AT_str_offsets_base) - str_offsets_base = uvalue - DEBUG_STR_OFFSETS_HEADER_LEN; break; case DW_FORM_flag_present: @@ -2737,11 +2740,13 @@ read_and_display_attr_value (unsigned long attribute, /* We have already displayed the form name. */ printf (_("%c(offset: 0x%s): %s"), delimiter, dwarf_vmatoa ("x", uvalue), - fetch_indexed_string (uvalue, this_set, offset_size, dwo)); + fetch_indexed_string (uvalue, this_set, offset_size, dwo, + debug_info_p->str_offsets_base)); else printf (_("%c(indexed string: 0x%s): %s"), delimiter, dwarf_vmatoa ("x", uvalue), - fetch_indexed_string (uvalue, this_set, offset_size, dwo)); + fetch_indexed_string (uvalue, this_set, offset_size, dwo, + debug_info_p->str_offsets_base)); } break; @@ -2816,7 +2821,7 @@ read_and_display_attr_value (unsigned long attribute, break; } - if ((do_loc || do_debug_loc || do_debug_ranges) + if ((do_loc || do_debug_loc || do_debug_ranges || do_debug_info) && num_debug_info_entries == 0 && debug_info_p != NULL) { @@ -2829,6 +2834,13 @@ read_and_display_attr_value (unsigned long attribute, debug_info_p->loclists_base = uvalue; break; + case DW_AT_str_offsets_base: + if (debug_info_p->str_offsets_base) + warn (_("CU @ 0x%s has multiple str_offsets_base values"), + dwarf_vmatoa ("x", debug_info_p->cu_offset)); + debug_info_p->str_offsets_base = uvalue; + break; + case DW_AT_frame_base: have_frame_base = 1; /* Fall through. */ @@ -2967,7 +2979,8 @@ read_and_display_attr_value (unsigned long attribute, case DW_FORM_strx2: case DW_FORM_strx3: case DW_FORM_strx4: - add_dwo_name (fetch_indexed_string (uvalue, this_set, offset_size, false), cu_offset); + add_dwo_name (fetch_indexed_string (uvalue, this_set, offset_size, false, + debug_info_p->str_offsets_base), +cu_offset); break; case DW_FORM_string: add_dwo_name ((const char *) orig_data, cu_offset); @@ -2999,7 +3012,8 @@ read_and_display_attr_value (unsigned long attribute, case DW_FORM_strx2: case DW_FORM_strx3: case DW_FORM_strx4: - add_dwo_dir (fetch_indexed_string (uvalue, this_set, offset_size, false), cu_offset); + add_dwo_dir (fetch_indexed_string (uvalue, this_set, offset_size, false, + debug_info_p->str_offsets_base), +cu_offset); break; case DW_FORM_string: add_dwo_dir ((const char *) orig_data, cu_offset); @@ -3319,6 +3333,7 @@ read_and_display_attr_value (unsigned long attribute, /* Fall through. */ case DW_AT_location: case DW_AT_loclists_base: + case DW_AT_str_offsets_base: case DW_AT_string_length: case DW_AT_return_addr: case DW_AT_data_member_location: @@ -3337,8 +3352,10 @@ read_and_display_attr_value (unsigned long attribute, if ((dwarf_version < 4 && (form == DW_FORM_data4 || form == DW_FORM_data8)) || form == DW_FORM_sec_offset - || form == DW_FORM_loclistx) - printf (_(" (location list)")); + || form == DW_FORM_loclistx) { + if (attribute != DW_AT_str_offsets_base) + printf (_(" (location list)")); + } /* Fall through. */ case DW_AT_allocated: case DW_AT_associated: @@ -3569,7 +3586,7 @@ process_debug_info (struct dwarf_section * section, return false; } - if ((do_loc || do_debug_loc || do_debug_ranges) + if ((do_loc || do_debug_loc || do_debug_ranges || do_debug_info) && num_debug_info_entries == 0 && ! do_types) { @@ -3804,7 +3821,7 @@ process_debug_info (struct dwarf_section * section, continue; } - if ((do_loc || do_debug_loc || do_debug_ranges) + if ((do_loc || do_debug_loc || do_debug_ranges || do_debug_info) && num_debug_info_entries == 0 && alloc_num_debug_info_entries > unit && ! do_types) @@ -3825,6 +3842,7 @@ process_debug_info (struct dwarf_section * section, debug_information [unit].range_lists = NULL; debug_information [unit].max_range_lists= 0; debug_information [unit].num_range_lists = 0; + debug_information [unit].str_offsets_base = 0; } if (!do_loc && dwarf_start_die == 0) @@ -4064,7 +4082,6 @@ process_debug_info (struct dwarf_section * section, level); } - str_offsets_base = 0; /* If a locview attribute appears before a location one, make sure we don't associate it with an earlier loclist. */ @@ -4097,7 +4114,7 @@ process_debug_info (struct dwarf_section * section, /* Set num_debug_info_entries here so that it can be used to check if we need to process .debug_loc and .debug_ranges sections. */ - if ((do_loc || do_debug_loc || do_debug_ranges) + if ((do_loc || do_debug_loc || do_debug_ranges || do_debug_info) && num_debug_info_entries == 0 && ! do_types) { @@ -6245,7 +6262,7 @@ display_debug_macro (struct dwarf_section *section, READ_ULEB (lineno, curr, end); READ_ULEB (offset, curr, end); string = (const unsigned char *) - fetch_indexed_string (offset, NULL, offset_size, false); + fetch_indexed_string (offset, NULL, offset_size, false, 0); if (op == DW_MACRO_define_strx) printf (" DW_MACRO_define_strx "); else @@ -7860,7 +7877,7 @@ display_debug_str_offsets (struct dwarf_section *section, SAFE_BYTE_GET_AND_INC (offset, curr, entry_length, entries_end); if (dwo) string = (const unsigned char *) - fetch_indexed_string (idx, NULL, entry_length, dwo); + fetch_indexed_string (idx, NULL, entry_length, dwo, 0); else string = fetch_indirect_string (offset); @@ -11895,6 +11912,7 @@ load_separate_debug_files (void * file, const char * filename) && load_debug_section (abbrev, file) && load_debug_section (info, file)) { + /* Load .debug_str_offsets section, if exists. */ load_debug_section (str_index, file); free_dwo_info (); diff --git a/binutils/dwarf.h b/binutils/dwarf.h index 040e674c6ce..ce38858d609 100644 --- a/binutils/dwarf.h +++ b/binutils/dwarf.h @@ -192,6 +192,7 @@ typedef struct dwarf_vma * range_lists; unsigned int num_range_lists; unsigned int max_range_lists; + dwarf_vma str_offsets_base; } debug_info; -- 2.17.1 -----Original Message----- From: Jan Beulich Sent: Friday, June 3, 2022 5:34 PM To: Kumar N, Bhuvanendra Cc: George, Jini Susan ; Natarajan, Kavitha ; binutils@sourceware.org Subject: Re: [PATCH] Binutils support for split-dwarf and dwarf-5 [CAUTION: External Email] On 02.06.2022 16:55, Kumar N, Bhuvanendra via Binutils wrote: > Patch 1/2 inlined: > > From 1bda1e144f43d1f41fe30a934ba008561083dc07 Mon Sep 17 00:00:00 2001 > From: =?UTF-8?q?=E2=80=9Cbhkumarn=E2=80=9D?= > Bhuvanendra.KumarN@amd.com > Date: Thu, 2 Jun 2022 17:44:02 +0530 > Subject: [PATCH] [PATCH 1/2] Binutils support for split-dwarf and dwarf-5. > > This fix adds support for split-dwarf and dwarf5, for both gcc and clang. > There are 2 patches. Basic sections dump like .debug_info.dwo, > .debug_abbrev.dwo, .debug_str.dwo etc are supported in this patch 1/2. > But location list and ranges list sections dump with split-dwarf and > dwarf5 are supported in the next patch 2/2. Just as a remark: "patch 1/2" and "next patch" and alike aren't very meaningful in commit messages. The individual commits may end up far apart in the repo. > --- a/binutils/dwarf.c > +++ b/binutils/dwarf.c > @@ -118,6 +118,8 @@ int dwarf_cutoff_level = -1; unsigned long > dwarf_start_die; int dwarf_check = 0; > +static dwarf_vma str_offsets_base = 0; #define > +DEBUG_STR_OFFSETS_HEADER_LEN 8 Aiui 8 is correct for 32-bit Dwarf, but 16 would need using for 64-bit. > @@ -2485,6 +2487,8 @@ read_and_display_attr_value (unsigned long attribute, > case DW_FORM_GNU_ref_alt: > case DW_FORM_GNU_strp_alt: > SAFE_BYTE_GET_AND_INC (uvalue, data, offset_size, end); > + if (attribute == DW_AT_str_offsets_base) > + str_offsets_base = uvalue - DEBUG_STR_OFFSETS_HEADER_LEN; Can this legitimately be stored in a static variable? Don't you need to collect the value(s) into debug_info_p just like is done for DW_AT_loclists_base? > @@ -11889,6 +11895,7 @@ load_separate_debug_files (void * file, const char * filename) > && load_debug_section (abbrev, file) > && load_debug_section (info, file)) > { > + load_debug_section (str_index, file); What if this fails? Immediately preceding load_debug_section() uses have their return values checked. > --- /dev/null > +++ b/binutils/testsuite/binutils-all/dwo_5.s > @@ -0,0 +1,257 @@ > + .file "a.c" > + .text > +.Ltext0: > + .globl a > + .type a, @function > +a: > +.LFB0: > + .file 1 "a.c" > + .loc 1 1 13 > + .cfi_startproc > + pushq %rbp > + .cfi_def_cfa_offset 16 > + .cfi_offset 6, -16 > + movq %rsp, %rbp > + .cfi_def_cfa_register 6 > + .loc 1 1 22 > + movl $111, %eax > + .loc 1 1 1 > + popq %rbp > + .cfi_def_cfa 7, 8 > + ret > + .cfi_endproc > +.LFE0: > + .size a, .-a > +.Letext0: > + .section .debug_addr,"",@progbits > + .long 0xc > + .value 0x5 > + .byte 0x8 > + .byte 0 > +.Ldebug_addr0: > + .quad .LFB0 > + .section .debug_info.dwo,"e",@progbits > +.Ldebug_info0: > + .long 0x35 > + .value 0x5 > + .byte 0x5 > + .byte 0x8 > + .long .Ldebug_abbrev0 > + .byte 0x46 > + .byte 0x1c > + .byte 0xf1 > + .byte 0xfb > + .byte 0xe5 > + .byte 0x96 > + .byte 0x7d > + .byte 0x2b > + .uleb128 0x1 > + .uleb128 0x1 > + .byte 0x1d > + .string "a.c" > + .uleb128 0 > + .uleb128 0x2 > + .string "a" > + .byte 0x1 > + .byte 0x1 > + .byte 0x5 > + .long 0x31 > + .uleb128 0 > + .quad .LFE0-.LFB0 > + .uleb128 0x1 > + .byte 0x9c > + .uleb128 0x3 > + .byte 0x4 > + .byte 0x5 > + .string "int" > + .byte 0 > + .section .debug_info,"",@progbits > +.Lskeleton_debug_info0: > + .long 0x31 > + .value 0x5 > + .byte 0x4 > + .byte 0x8 > + .long .Lskeleton_debug_abbrev0 > + .byte 0x46 > + .byte 0x1c > + .byte 0xf1 > + .byte 0xfb > + .byte 0xe5 > + .byte 0x96 > + .byte 0x7d > + .byte 0x2b > + .uleb128 0x1 > + .quad .Ltext0 > + .quad .Letext0-.Ltext0 > + .long .Ldebug_line0 > + .long .LASF0 > + .long .LASF1 > + .long .Ldebug_addr0 > + .section .debug_abbrev,"",@progbits > +.Lskeleton_debug_abbrev0: > + .uleb128 0x1 > + .uleb128 0x4a > + .byte 0 > + .uleb128 0x11 > + .uleb128 0x1 > + .uleb128 0x12 > + .uleb128 0x7 > + .uleb128 0x10 > + .uleb128 0x17 > + .uleb128 0x76 > + .uleb128 0xe > + .uleb128 0x1b > + .uleb128 0xe > + .uleb128 0x2134 > + .uleb128 0x19 > + .uleb128 0x73 > + .uleb128 0x17 > + .byte 0 > + .byte 0 > + .byte 0 > + .section .debug_abbrev.dwo,"e",@progbits > +.Ldebug_abbrev0: > + .uleb128 0x1 > + .uleb128 0x11 > + .byte 0x1 > + .uleb128 0x25 > + .uleb128 0x1a > + .uleb128 0x13 > + .uleb128 0xb > + .uleb128 0x3 > + .uleb128 0x8 > + .uleb128 0x1b > + .uleb128 0x1a > + .byte 0 > + .byte 0 > + .uleb128 0x2 > + .uleb128 0x2e > + .byte 0 > + .uleb128 0x3f > + .uleb128 0x19 > + .uleb128 0x3 > + .uleb128 0x8 > + .uleb128 0x3a > + .uleb128 0xb > + .uleb128 0x3b > + .uleb128 0xb > + .uleb128 0x39 > + .uleb128 0xb > + .uleb128 0x27 > + .uleb128 0x19 > + .uleb128 0x49 > + .uleb128 0x13 > + .uleb128 0x11 > + .uleb128 0x1b > + .uleb128 0x12 > + .uleb128 0x7 > + .uleb128 0x40 > + .uleb128 0x18 > + .uleb128 0x7a > + .uleb128 0x19 > + .byte 0 > + .byte 0 > + .uleb128 0x3 > + .uleb128 0x24 > + .byte 0 > + .uleb128 0xb > + .uleb128 0xb > + .uleb128 0x3e > + .uleb128 0xb > + .uleb128 0x3 > + .uleb128 0x8 > + .byte 0 > + .byte 0 > + .byte 0 > + .section .debug_gnu_pubnames,"",@progbits > + .long 0x15 > + .value 0x2 > + .long .Lskeleton_debug_info0 > + .long 0x39 > + .long 0x1c > + .byte 0x30 > + .string "a" > + .long 0 > + .section .debug_gnu_pubtypes,"",@progbits > + .long 0x17 > + .value 0x2 > + .long .Lskeleton_debug_info0 > + .long 0x39 > + .long 0x31 > + .byte 0x90 > + .string "int" > + .long 0 > + .section .debug_aranges,"",@progbits > + .long 0x2c > + .value 0x2 > + .long .Lskeleton_debug_info0 > + .byte 0x8 > + .byte 0 > + .value 0 > + .value 0 > + .quad .Ltext0 > + .quad .Letext0-.Ltext0 > + .quad 0 > + .quad 0 > + .section .debug_line,"",@progbits > +.Ldebug_line0: > + .section .debug_line.dwo,"e",@progbits > +.Lskeleton_debug_line0: > + .long .LELT0-.LSLT0 > +.LSLT0: > + .value 0x5 > + .byte 0x8 > + .byte 0 > + .long .LELTP0-.LASLTP0 > +.LASLTP0: > + .byte 0x1 > + .byte 0x1 > + .byte 0x1 > + .byte 0xf6 > + .byte 0xf2 > + .byte 0xd > + .byte 0 > + .byte 0x1 > + .byte 0x1 > + .byte 0x1 > + .byte 0x1 > + .byte 0 > + .byte 0 > + .byte 0 > + .byte 0x1 > + .byte 0 > + .byte 0 > + .byte 0x1 > + .byte 0x1 > + .uleb128 0x1 > + .uleb128 0x8 > + .uleb128 0x1 > + .string "/tmp" > + .byte 0x2 > + .uleb128 0x1 > + .uleb128 0x8 > + .uleb128 0x2 > + .uleb128 0xb > + .uleb128 0x2 > + .string "a.c" > + .byte 0 > + .string "a.c" > + .byte 0 > +.LELTP0: > +.LELT0: > + .section .debug_str,"MS",@progbits,1 > +.LASF1: > + .string "/tmp" > +.LASF0: > + .string "a.dwo" > + .section .debug_str_offsets.dwo,"e",@progbits > + .long 0xc > + .value 0x5 > + .value 0 > + .long 0 > + .long 0x5 > + .section .debug_str.dwo,"e",@progbits > + .string "/tmp" > + .string "GNU C17 10.0.0 20190813 (experimental) -mtune=generic -march=x86-64 -gdwarf-5 -gsplit-dwarf" > + .ident "GCC: (GNU) 10.0.0 20190813 (experimental)" > + .section .note.GNU-stack,"",@progbits I'm a little worried about the maintainability (and reviewability) of such testcases: This looks to be compiler output, even with non- essential pieces left in. That's fine in principle, but without any comments it's close to unreadable. Plus it's then unclear whether the compiler-generated directives are actually all correct; after all compilers can also have bugs, and the compiler used wasn't even a released version, and even quite early a major-10 version. Jan