* [RFC] More compact (100x) -g3 .debug_macinfo
From: Jakub Jelinek @ 2011-07-13 17:12 UTC
To: Jason Merrill, Richard Henderson, Tom Tromey, Jan Kratochvil, Roland McGrath, Cary Coutant, Mark Wielaard
Cc: gcc-patches

Hi!

Currently .debug_macinfo is prohibitively large, because it doesn't allow for any kind of merging of duplicate debug information. This patch is an RFC for extensions that bring it down to manageable levels.

The idea for the first shrinking comes, I think, from Jason and/or Roland last year, and is similar to the introduction of DW_FORM_strp to replace DW_FORM_string in some cases. In particular, if the string in DW_MACINFO_define or DW_MACINFO_undef is larger than 4 bytes including the terminating '\0' and there is a chance the string might occur more than once, an offset into .debug_str is used instead. The usual .debug_str string merging then kicks in and removes duplicates.

The second saving comes from merging identical sequences of DW_MACINFO_define/undef ops. Usually, when you include some header, the macros it defines/undefines are the same.
Unfortunately it is hard to merge whole headers, because:

1) DW_MACINFO_start_file uses .debug_line references, which prevent merging - different CUs have different .debug_line content

2) multiple inclusion of headers with single inclusion guards is quite common and makes such merging less than satisfactory: if some header includes <stdio.h> and you include that header in one source file without prior inclusion of stdio.h and in a different one after #include <stdio.h>, the .debug_macinfo sequence for that header is suddenly different if it transitively covers the headers it includes

Unfortunately, as defined in DWARF{2,3,4}, .debug_macinfo doesn't really allow extensions. DW_MACINFO_vendor_ext doesn't count, because its argument is a string, which certainly can't include the embedded zeros needed for offsets into other sections or other portions of the same section.

The following approach just grabs a range of .debug_macinfo opcodes for vendor use, if the DWARF committee would give such an approach a green light. .debug_macinfo has 256 possible opcodes and defines just 5 (plus 1 for termination); the remaining 250 are unused. Another alternative would be to come up with a .debug_gnu_macinfo section or similar and define a new DW_AT_GNU_macro_info attribute that would be used instead of DW_AT_macro_info, but I'd prefer to stay with .debug_macinfo.

The newly added opcodes:

DW_MACINFO_GNU_define_indirect4 0xe0
This opcode has two arguments, one is uleb128 lineno and the other is 4 byte offset into .debug_str. Except for the encoding of the string it is similar to DW_MACINFO_define.

DW_MACINFO_GNU_undef_indirect4 0xe1
This opcode has two arguments, one is uleb128 lineno and the other is 4 byte offset into .debug_str. Except for the encoding of the string it is similar to DW_MACINFO_undef.

DW_MACINFO_GNU_transparent_include4 0xe2
This opcode has a single argument, a 4 byte offset into .debug_macinfo.
It instructs the debug info consumer that this opcode during reading should be replaced with the sequence of .debug_macinfo opcodes from the mentioned offset, up to a terminating 0 opcode (not including that 0).

DW_MACINFO_GNU_define_opcode 0xe3
This is an opcode for future extensibility through which a debugger could skip unknown opcodes. It has 3 arguments: a 1 byte opcode number, a uleb128 count of arguments, and an array of count bytes, each a DW_FORM_* code describing how the corresponding argument is encoded.

The debug info producers have to ensure that opcodes in DW_MACINFO_GNU_transparent_include4 chains reference the right sections for any .debug_macinfo that includes them (which essentially means that DW_MACINFO_start_file can't be used in the transparent_include4 chain).

Perhaps cleaner would be not to encode the offset size in the opcode values/names and instead have DW_MACINFO_GNU_define_indirect and DW_MACINFO_GNU_undef_indirect whose arguments would be DW_FORM_udata and DW_FORM_strp (i.e. offset size) - the producers would need to ensure that .debug_macinfo chains with different assumed offset sizes aren't merged together, which could be done e.g. by using wm4.[<filename>.]<lineno>.<md5> and wm8.* comdat groups instead of the current wm.*.
DW_MACINFO_GNU_transparent_include4 then would have a single DW_FORM_sec_offset argument and DW_MACINFO_GNU_define_opcode would have DW_FORM_data1 and DW_FORM_block arguments, and the implicit opcode definition assumed at the start of every .debug_macinfo would be:

DW_MACINFO_GNU_define_opcode <0, 0 []>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_define, 2 [DW_FORM_udata, DW_FORM_string]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_undef, 2 [DW_FORM_udata, DW_FORM_string]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_start_file, 2 [DW_FORM_udata, DW_FORM_sec_offset]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_end_file, 1 [DW_FORM_udata]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_undef_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_opcode, 2 [DW_FORM_data1, DW_FORM_block]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_vendor_ext, 1 [DW_FORM_string]>

This approach doesn't need any linker changes. The slight disadvantage is a small increase in the size of -g3 built object files (e.g. on i686-linux the -g3 -O2 gcc/*.o files were together 461.3MB before this patch and 518.6MB with it, i.e. about 12% more), but the size of cc1plus is reduced significantly, from 428.9MB down to 92.6MB. Previously, the .debug_macinfo section occupied 339MB in cc1plus and .debug_str 1MB; with the patch .debug_macinfo has 1MB and .debug_str 2.5MB. .debug_str wasn't used for macinfo before, so macinfo now takes 2.5MB in total (1MB of .debug_macinfo plus the 1.5MB growth of .debug_str) compared to 339MB before.

2011-07-13  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
	(DW_MACINFO_GNU_define_indirect4, DW_MACINFO_GNU_undef_indirect4,
	DW_MACINFO_GNU_transparent_include4, DW_MACINFO_GNU_define_opcode): Add.
	* dwarf2out.c (dwarf2out_undef): Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op): New functions.
	(output_macinfo): Use them.
If !dwarf_strict and .debug_str is mergeable, optimize longer strings using DW_MACINFO_GNU_{define,undef}_indirect4 and if HAVE_COMDAT and ELF, optimize longer sequences of define/undef ops from headers using DW_MACINFO_GNU_transparent_include4. --- include/dwarf2.h.jj 2011-06-23 10:14:06.000000000 +0200 +++ include/dwarf2.h 2011-07-13 11:39:49.000000000 +0200 @@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type DW_MACINFO_undef = 2, DW_MACINFO_start_file = 3, DW_MACINFO_end_file = 4, - DW_MACINFO_vendor_ext = 255 + DW_MACINFO_lo_user = 0xe0, + DW_MACINFO_GNU_define_indirect4 = 0xe0, + DW_MACINFO_GNU_undef_indirect4 = 0xe1, + DW_MACINFO_GNU_transparent_include4 = 0xe2, + DW_MACINFO_GNU_define_opcode = 0xe3, + DW_MACINFO_hi_user = 0xfe, + DW_MACINFO_vendor_ext = 0xff }; \f /* @@@ For use with GNU frame unwind information. */ --- gcc/dwarf2out.c.jj 2011-07-12 17:59:01.000000000 +0200 +++ gcc/dwarf2out.c 2011-07-13 17:04:17.000000000 +0200 @@ -20383,17 +20383,118 @@ dwarf2out_undef (unsigned int lineno ATT macinfo_entry e; e.code = DW_MACINFO_undef; e.lineno = lineno; - e.info = xstrdup (buffer);; + e.info = xstrdup (buffer); VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); } } +/* Routines to manipulate hash table of CUs. */ +static hashval_t +htab_macinfo_hash (const void *of) +{ + const macinfo_entry *const entry = + (const macinfo_entry *) of; + + return htab_hash_string (entry->info); +} + +static int +htab_macinfo_eq (const void *of1, const void *of2) +{ + const macinfo_entry *const entry1 = (const macinfo_entry *) of1; + const macinfo_entry *const entry2 = (const macinfo_entry *) of2; + + return !strcmp (entry1->info, entry2->info); +} + +/* Output a single .debug_macinfo entry. 
*/ + +static void +output_macinfo_op (macinfo_entry *ref) +{ + int file_num; + size_t len; + struct indirect_string_node *node; + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + + switch (ref->code) + { + case DW_MACINFO_start_file: + file_num = maybe_emit_file (lookup_filename (ref->info)); + dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); + dw2_asm_output_data_uleb128 (ref->lineno, + "Included from line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + break; + case DW_MACINFO_end_file: + dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + len = strlen (ref->info) + 1; + if (!dwarf_strict + && len > DWARF_OFFSET_SIZE + && DWARF_OFFSET_SIZE == 4 + && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET + && (debug_str_section->common.flags & SECTION_MERGE) != 0) + { + ref->code = ref->code == DW_MACINFO_define + ? DW_MACINFO_GNU_define_indirect4 + : DW_MACINFO_GNU_undef_indirect4; + output_macinfo_op (ref); + return; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_define + ? "Define macro" : "Undefine macro"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_nstring (ref->info, -1, "The macro"); + break; + case DW_MACINFO_GNU_define_indirect4: + case DW_MACINFO_GNU_undef_indirect4: + node = find_AT_string (ref->info); + if (node->form != DW_FORM_strp) + { + char label[32]; + ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter); + ++dw2_string_counter; + node->label = xstrdup (label); + node->form = DW_FORM_strp; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_GNU_define_indirect4 + ? 
"Define macro indirect4" + : "Undefine macro indirect4"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, + debug_str_section, "The macro: \"%s\"", + ref->info); + break; + case DW_MACINFO_GNU_transparent_include4: + dw2_asm_output_data (1, ref->code, "Transparent include4"); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_MACINFO_SECTION_LABEL, ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL); + break; + default: + fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", + ASM_COMMENT_START, (unsigned long) ref->code); + break; + } +} + static void output_macinfo (void) { unsigned i; unsigned long length = VEC_length (macinfo_entry, macinfo_table); - macinfo_entry *ref; + macinfo_entry *ref, *ref2; + VEC (macinfo_entry, gc) *files = NULL; + unsigned long transparent_includes = 0; + htab_t macinfo_htab = NULL; if (! length) return; @@ -20402,37 +20503,185 @@ output_macinfo (void) { switch (ref->code) { - case DW_MACINFO_start_file: + case DW_MACINFO_start_file: + VEC_safe_push (macinfo_entry, gc, files, ref); + break; + case DW_MACINFO_end_file: + if (!VEC_empty (macinfo_entry, files)) { - int file_num = maybe_emit_file (lookup_filename (ref->info)); - dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); - dw2_asm_output_data_uleb128 - (ref->lineno, "Included from line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + ref2 = VEC_last (macinfo_entry, files); + free (CONST_CAST (char *, ref2->info)); + VEC_pop (macinfo_entry, files); } - break; - case DW_MACINFO_end_file: - dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); - break; - case DW_MACINFO_define: - dw2_asm_output_data (1, DW_MACINFO_define, "Define macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, 
"The macro"); - break; - case DW_MACINFO_undef: - dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, "The macro"); - break; - default: - fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", - ASM_COMMENT_START, (unsigned long)ref->code); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: +#ifdef OBJECT_FORMAT_ELF + if (!dwarf_strict + && HAVE_COMDAT_GROUP + && DWARF_OFFSET_SIZE == 4 + && VEC_length (macinfo_entry, files) != 1 + && i > 0 + && i + 1 < length + && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0) + { + char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1]; + unsigned char checksum[16]; + struct md5_ctx ctx; + char *tmp, *tail; + const char *base; + unsigned int j = i, k, l; + void **slot; + + ref2 = VEC_index (macinfo_entry, macinfo_table, i + 1); + if (ref2->code != DW_MACINFO_define + && ref2->code != DW_MACINFO_undef) + break; + + if (VEC_empty (macinfo_entry, files)) + { + if (ref->lineno != 0 || ref2->lineno != 0) + break; + } + else if (ref->lineno == 0) + break; + md5_init_ctx (&ctx); + for (; VEC_iterate (macinfo_entry, macinfo_table, j, ref2); j++) + if (ref2->code != DW_MACINFO_define + && ref2->code != DW_MACINFO_undef) + break; + else if (ref->lineno == 0 && ref2->lineno != 0) + break; + else + { + unsigned char code = ref2->code; + md5_process_bytes (&code, 1, &ctx); + checksum_uleb128 (ref2->lineno, &ctx); + md5_process_bytes (ref2->info, strlen (ref2->info) + 1, + &ctx); + } + md5_finish_ctx (&ctx, checksum); + if (ref->lineno == 0) + base = ""; + else + base = lbasename (VEC_last (macinfo_entry, files)->info); + for (l = 0, k = 0; base[k]; k++) + if (ISIDNUM (base[k]) || base[k] == '.') + l++; + if (l) + l++; + sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, + VEC_index (macinfo_entry, macinfo_table, i)->lineno); + tmp = XNEWVEC (char, 3 + l + strlen (linebuf) + 1 + 16 * 2 
+ 1); + strcpy (tmp, "wm."); + tail = tmp + 3; + if (l) + { + for (k = 0; base[k]; k++) + if (ISIDNUM (base[k]) || base[k] == '.') + *tail++ = base[k]; + *tail++ = '.'; + } + l = strlen (linebuf); + memcpy (tail, linebuf, l); + tail += l; + *tail++ = '.'; + for (k = 0; k < 16; k++) + sprintf (tail + k * 2, "%02x", checksum[k] & 0xff); + ref2 = VEC_index (macinfo_entry, macinfo_table, i - 1); + ref2->code = DW_MACINFO_GNU_transparent_include4; + ref2->lineno = 0; + ref2->info = tmp; + if (macinfo_htab == NULL) + macinfo_htab = htab_create (10, htab_macinfo_hash, + htab_macinfo_eq, NULL); + slot = htab_find_slot (macinfo_htab, ref2, INSERT); + if (*slot != NULL) + { + free (CONST_CAST (char *, ref2->info)); + ref2->code = 0; + ref2->info = NULL; + ref2 = (macinfo_entry *) *slot; + output_macinfo_op (ref2); + for (j = i; + VEC_iterate (macinfo_entry, macinfo_table, j, ref2); + j++) + if (ref2->code != DW_MACINFO_define + && ref2->code != DW_MACINFO_undef) + break; + else if (ref->lineno == 0 && ref2->lineno != 0) + break; + else + { + ref2->code = 0; + free (CONST_CAST (char *, ref2->info)); + ref2->info = NULL; + } + } + else + { + *slot = ref2; + ref2->lineno = ++transparent_includes; + output_macinfo_op (ref2); + } + i = j - 1; + continue; + } +#endif + break; + default: break; } + output_macinfo_op (ref); + /* For DW_MACINFO_start_file ref->info has been copied into files + vector. */ + if (ref->code != DW_MACINFO_start_file) + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + ref->code = 0; } + + if (!transparent_includes) + return; + + htab_delete (macinfo_htab); + +#ifdef OBJECT_FORMAT_ELF + for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++) + switch (ref->code) + { + case 0: + continue; + case DW_MACINFO_GNU_transparent_include4: + { + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + tree comdat_key = get_identifier (ref->info); + /* Terminate the previous .debug_macinfo section. 
*/ + dw2_asm_output_data (1, 0, "End compilation unit"); + targetm.asm_out.named_section (DEBUG_MACINFO_SECTION, + SECTION_DEBUG + | SECTION_LINKONCE, + comdat_key); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_MACINFO_SECTION_LABEL, + ref->lineno); + ASM_OUTPUT_LABEL (asm_out_file, label); + ref->code = 0; + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + } + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + output_macinfo_op (ref); + ref->code = 0; + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + break; + default: + gcc_unreachable (); + } +#endif } /* Set up for Dwarf output at the start of compilation. */ Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
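As a reading aid for the comdat group naming in output_macinfo above, here is a minimal standalone sketch of how a wm.[<filename>.]<lineno>.<md5> name could be assembled. The function name macinfo_group_name and its signature are hypothetical and not part of the patch; isalnum() plus '_' stands in for GCC's ISIDNUM, and the MD5 digest is assumed to be computed elsewhere.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the patch): build
   "wm.[<filename>.]<lineno>.<md5 hex>" into DST.
   Returns the name length, or -1 if DST is too small.  */
static int
macinfo_group_name (char *dst, size_t cap, const char *base,
                    unsigned long lineno, const unsigned char md5[16])
{
  size_t n, kept = 0;
  int w, k;

  assert (dst != NULL && md5 != NULL);

  w = snprintf (dst, cap, "wm.");
  if (w < 0 || (size_t) w >= cap)
    return -1;
  n = (size_t) w;

  /* Keep only identifier characters and '.', as the patch does;
     an empty BASE corresponds to command-line macros.  */
  for (; base != NULL && *base != '\0'; base++)
    {
      unsigned char c = (unsigned char) *base;
      if (isalnum (c) || c == '_' || c == '.')
        {
          if (n + 1 >= cap)
            return -1;
          dst[n++] = (char) c;
          kept++;
        }
    }
  if (kept)
    {
      if (n + 1 >= cap)
        return -1;
      dst[n++] = '.';
    }

  w = snprintf (dst + n, cap - n, "%lu.", lineno);
  if (w < 0 || (size_t) w >= cap - n)
    return -1;
  n += (size_t) w;

  /* Append the 16-byte digest as 32 lowercase hex digits.  */
  for (k = 0; k < 16; k++)
    {
      w = snprintf (dst + n, cap - n, "%02x", md5[k]);
      if (w < 0 || (size_t) w >= cap - n)
        return -1;
      n += (size_t) w;
    }
  return (int) n;
}
```

Because the digest covers the opcode, line number and string of every entry in the run, two translation units get the same group name only when the run is byte-for-byte identical, which is what lets the linker's ordinary comdat handling discard the duplicates.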
* Re: [RFC] More compact (100x) -g3 .debug_macinfo
From: Tom Tromey @ 2011-07-13 19:59 UTC
To: Jakub Jelinek
Cc: Jason Merrill, Richard Henderson, Jan Kratochvil, Roland McGrath, Cary Coutant, Mark Wielaard, gcc-patches

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> Currently .debug_macinfo is prohibitively large, because it doesn't
Jakub> allow for any kind of merging of duplicate debug information.
Jakub> This patch is an RFC for extensions that allow it to bring it down
Jakub> to manageable levels.

I wrote a gdb patch for this. I've appended it in case you want to try it out; it is against git master. I tried it a little on an executable Jakub sent me and it seems to work fine.

It is no trouble to change this patch if you change the format. It wasn't hard to write in the first place; it is just bigger than it would otherwise be because I moved a bunch of code into a new function.

I don't think I really understood DW_MACINFO_GNU_define_opcode, so the implementation here is probably wrong.

Tom

2011-07-13  Tom Tromey  <tromey@redhat.com>

	* dwarf2read.c (read_indirect_string_at_offset): New function.
	(read_indirect_string): Use it.
	(dwarf_decode_macro_bytes): New function, taken from
	dwarf_decode_macros.  Handle DW_MACINFO_GNU_*.
	(dwarf_decode_macros): Use it.  Handle DW_MACINFO_GNU_*.
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c index fde5b6a..af35f16 100644 --- a/gdb/dwarf2read.c +++ b/gdb/dwarf2read.c @@ -10182,32 +10182,32 @@ read_direct_string (bfd *abfd, gdb_byte *buf, unsigned int *bytes_read_ptr) } static char * -read_indirect_string (bfd *abfd, gdb_byte *buf, - const struct comp_unit_head *cu_header, - unsigned int *bytes_read_ptr) +read_indirect_string_at_offset (bfd *abfd, LONGEST str_offset) { - LONGEST str_offset = read_offset (abfd, buf, cu_header, bytes_read_ptr); - dwarf2_read_section (dwarf2_per_objfile->objfile, &dwarf2_per_objfile->str); if (dwarf2_per_objfile->str.buffer == NULL) - { - error (_("DW_FORM_strp used without .debug_str section [in module %s]"), - bfd_get_filename (abfd)); - return NULL; - } + error (_("DW_FORM_strp used without .debug_str section [in module %s]"), + bfd_get_filename (abfd)); if (str_offset >= dwarf2_per_objfile->str.size) - { - error (_("DW_FORM_strp pointing outside of " - ".debug_str section [in module %s]"), - bfd_get_filename (abfd)); - return NULL; - } + error (_("DW_FORM_strp pointing outside of " + ".debug_str section [in module %s]"), + bfd_get_filename (abfd)); gdb_assert (HOST_CHAR_BIT == 8); if (dwarf2_per_objfile->str.buffer[str_offset] == '\0') return NULL; return (char *) (dwarf2_per_objfile->str.buffer + str_offset); } +static char * +read_indirect_string (bfd *abfd, gdb_byte *buf, + const struct comp_unit_head *cu_header, + unsigned int *bytes_read_ptr) +{ + LONGEST str_offset = read_offset (abfd, buf, cu_header, bytes_read_ptr); + + return read_indirect_string_at_offset (abfd, str_offset); +} + static unsigned long read_unsigned_leb128 (bfd *abfd, gdb_byte *buf, unsigned int *bytes_read_ptr) { @@ -14576,116 +14576,14 @@ parse_macro_definition (struct macro_source_file *file, int line, static void -dwarf_decode_macros (struct line_header *lh, unsigned int offset, - char *comp_dir, bfd *abfd, - struct dwarf2_cu *cu) +dwarf_decode_macro_bytes (bfd *abfd, gdb_byte *mac_ptr, 
gdb_byte *mac_end, + struct macro_source_file *current_file, + struct line_header *lh, char *comp_dir, + struct dwarf2_cu *cu) { - gdb_byte *mac_ptr, *mac_end; - struct macro_source_file *current_file = 0; enum dwarf_macinfo_record_type macinfo_type; int at_commandline; - dwarf2_read_section (dwarf2_per_objfile->objfile, - &dwarf2_per_objfile->macinfo); - if (dwarf2_per_objfile->macinfo.buffer == NULL) - { - complaint (&symfile_complaints, _("missing .debug_macinfo section")); - return; - } - - /* First pass: Find the name of the base filename. - This filename is needed in order to process all macros whose definition - (or undefinition) comes from the command line. These macros are defined - before the first DW_MACINFO_start_file entry, and yet still need to be - associated to the base file. - - To determine the base file name, we scan the macro definitions until we - reach the first DW_MACINFO_start_file entry. We then initialize - CURRENT_FILE accordingly so that any macro definition found before the - first DW_MACINFO_start_file can still be associated to the base file. */ - - mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset; - mac_end = dwarf2_per_objfile->macinfo.buffer - + dwarf2_per_objfile->macinfo.size; - - do - { - /* Do we at least have room for a macinfo type byte? */ - if (mac_ptr >= mac_end) - { - /* Complaint is printed during the second pass as GDB will probably - stop the first pass earlier upon finding - DW_MACINFO_start_file. */ - break; - } - - macinfo_type = read_1_byte (abfd, mac_ptr); - mac_ptr++; - - switch (macinfo_type) - { - /* A zero macinfo type indicates the end of the macro - information. */ - case 0: - break; - - case DW_MACINFO_define: - case DW_MACINFO_undef: - /* Only skip the data by MAC_PTR. 
*/ - { - unsigned int bytes_read; - - read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; - read_direct_string (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; - } - break; - - case DW_MACINFO_start_file: - { - unsigned int bytes_read; - int line, file; - - line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; - file = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; - - current_file = macro_start_file (file, line, current_file, - comp_dir, lh, cu->objfile); - } - break; - - case DW_MACINFO_end_file: - /* No data to skip by MAC_PTR. */ - break; - - case DW_MACINFO_vendor_ext: - /* Only skip the data by MAC_PTR. */ - { - unsigned int bytes_read; - - read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; - read_direct_string (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; - } - break; - - default: - break; - } - } while (macinfo_type != 0 && current_file == NULL); - - /* Second pass: Process all entries. - - Use the AT_COMMAND_LINE flag to determine whether we are still processing - command-line macro definitions/undefinitions. This flag is unset when we - reach the first DW_MACINFO_start_file entry. */ - - mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset; - /* Determines if GDB is still before first DW_MACINFO_start_file. If true GDB is still reading the definitions from command line. 
First DW_MACINFO_start_file will need to be ignored as it was already executed @@ -14716,27 +14614,43 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset, case DW_MACINFO_define: case DW_MACINFO_undef: + case DW_MACINFO_GNU_define_indirect4: + case DW_MACINFO_GNU_undef_indirect4: { unsigned int bytes_read; int line; char *body; + int is_define; - line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; - body = read_direct_string (abfd, mac_ptr, &bytes_read); - mac_ptr += bytes_read; + line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + + if (macinfo_type == DW_MACINFO_define + || macinfo_type == DW_MACINFO_undef) + { + body = read_direct_string (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + } + else + { + LONGEST str_offset; + + str_offset = read_offset_1 (abfd, mac_ptr, 4); + mac_ptr += 4; + + body = read_indirect_string_at_offset (abfd, str_offset); + } + is_define = (macinfo_type == DW_MACINFO_define + || macinfo_type == DW_MACINFO_GNU_define_indirect4); if (! current_file) { /* DWARF violation as no main source is present. */ complaint (&symfile_complaints, _("debug info with no main source gives macro %s " "on line %d: %s"), - macinfo_type == DW_MACINFO_define ? - _("definition") : - macinfo_type == DW_MACINFO_undef ? - _("undefinition") : - _("something-or-other"), line, body); + is_define ? _("definition") : _("undefinition"), + line, body); break; } if ((line == 0 && !at_commandline) @@ -14744,17 +14658,17 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset, complaint (&symfile_complaints, _("debug info gives %s macro %s with %s line %d: %s"), at_commandline ? _("command-line") : _("in-file"), - macinfo_type == DW_MACINFO_define ? - _("definition") : - macinfo_type == DW_MACINFO_undef ? - _("undefinition") : - _("something-or-other"), + is_define ? _("definition") : _("undefinition"), line == 0 ? 
_("zero") : _("non-zero"), line, body); - if (macinfo_type == DW_MACINFO_define) + if (is_define) parse_macro_definition (current_file, line, body); - else if (macinfo_type == DW_MACINFO_undef) - macro_undef (current_file, line, body); + else + { + gdb_assert (macinfo_type == DW_MACINFO_undef + || macinfo_type == DW_MACINFO_GNU_undef_indirect4); + macro_undef (current_file, line, body); + } } break; @@ -14825,6 +14739,33 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset, } break; + case DW_MACINFO_GNU_transparent_include4: + { + LONGEST offset; + + offset = read_offset_1 (abfd, mac_ptr, 4); + mac_ptr += 4; + + dwarf_decode_macro_bytes (abfd, + (dwarf2_per_objfile->macinfo.buffer + + offset), + mac_end, current_file, + lh, comp_dir, cu); + } + break; + + case DW_MACINFO_GNU_define_opcode: + { + unsigned int bytes_read, arg; + + /* Just ignore it. */ + mac_ptr += 1; + arg = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + mac_ptr += arg; + } + break; + case DW_MACINFO_vendor_ext: { unsigned int bytes_read; @@ -14842,6 +14783,149 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset, } while (macinfo_type != 0); } +static void +dwarf_decode_macros (struct line_header *lh, unsigned int offset, + char *comp_dir, bfd *abfd, + struct dwarf2_cu *cu) +{ + gdb_byte *mac_ptr, *mac_end; + struct macro_source_file *current_file = 0; + enum dwarf_macinfo_record_type macinfo_type; + + dwarf2_read_section (dwarf2_per_objfile->objfile, + &dwarf2_per_objfile->macinfo); + if (dwarf2_per_objfile->macinfo.buffer == NULL) + { + complaint (&symfile_complaints, _("missing .debug_macinfo section")); + return; + } + + /* First pass: Find the name of the base filename. + This filename is needed in order to process all macros whose definition + (or undefinition) comes from the command line. These macros are defined + before the first DW_MACINFO_start_file entry, and yet still need to be + associated to the base file. 
+ + To determine the base file name, we scan the macro definitions until we + reach the first DW_MACINFO_start_file entry. We then initialize + CURRENT_FILE accordingly so that any macro definition found before the + first DW_MACINFO_start_file can still be associated to the base file. */ + + mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset; + mac_end = dwarf2_per_objfile->macinfo.buffer + + dwarf2_per_objfile->macinfo.size; + + do + { + /* Do we at least have room for a macinfo type byte? */ + if (mac_ptr >= mac_end) + { + /* Complaint is printed during the second pass as GDB will probably + stop the first pass earlier upon finding + DW_MACINFO_start_file. */ + break; + } + + macinfo_type = read_1_byte (abfd, mac_ptr); + mac_ptr++; + + switch (macinfo_type) + { + /* A zero macinfo type indicates the end of the macro + information. */ + case 0: + break; + + case DW_MACINFO_define: + case DW_MACINFO_undef: + /* Only skip the data by MAC_PTR. */ + { + unsigned int bytes_read; + + read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + read_direct_string (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + } + break; + + case DW_MACINFO_start_file: + { + unsigned int bytes_read; + int line, file; + + line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + file = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + + current_file = macro_start_file (file, line, current_file, + comp_dir, lh, cu->objfile); + } + break; + + case DW_MACINFO_end_file: + /* No data to skip by MAC_PTR. */ + break; + + case DW_MACINFO_vendor_ext: + /* Only skip the data by MAC_PTR. 
*/ + { + unsigned int bytes_read; + + read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + read_direct_string (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + } + break; + + case DW_MACINFO_GNU_define_indirect4: + case DW_MACINFO_GNU_undef_indirect4: + { + unsigned int bytes_read; + + read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + mac_ptr += 4; + } + break; + + case DW_MACINFO_GNU_transparent_include4: + /* Note that, according to the spec, a transparent include + chain cannot call DW_MACINFO_start_file. So, we can just + skip this opcode. */ + mac_ptr += 4; + break; + + case DW_MACINFO_GNU_define_opcode: + { + unsigned int bytes_read, arg; + + mac_ptr += 1; + arg = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read); + mac_ptr += bytes_read; + mac_ptr += arg; + } + break; + + default: + break; + } + } while (macinfo_type != 0 && current_file == NULL); + + /* Second pass: Process all entries. + + Use the AT_COMMAND_LINE flag to determine whether we are still processing + command-line macro definitions/undefinitions. This flag is unset when we + reach the first DW_MACINFO_start_file entry. */ + + mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset; + + dwarf_decode_macro_bytes (abfd, mac_ptr, mac_end, current_file, + lh, comp_dir, cu); +} + /* Check if the attribute's form is a DW_FORM_block* if so return true else false. 
*/ static int diff --git a/include/dwarf2.h b/include/dwarf2.h index b2806ef..40a8a66 100644 --- a/include/dwarf2.h +++ b/include/dwarf2.h @@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type DW_MACINFO_undef = 2, DW_MACINFO_start_file = 3, DW_MACINFO_end_file = 4, - DW_MACINFO_vendor_ext = 255 + DW_MACINFO_lo_user = 0xe0, + DW_MACINFO_GNU_define_indirect4 = 0xe0, + DW_MACINFO_GNU_undef_indirect4 = 0xe1, + DW_MACINFO_GNU_transparent_include4 = 0xe2, + DW_MACINFO_GNU_define_opcode = 0xe3, + DW_MACINFO_hi_user = 0xfe, + DW_MACINFO_vendor_ext = 0xff }; \f /* @@@ For use with GNU frame unwind information. */ ^ permalink raw reply [flat|nested] 25+ messages in thread
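The consumer side of DW_MACINFO_GNU_transparent_include4 can be illustrated outside of gdb with a small standalone sketch. This is not the gdb code: the names walk, struct names and example_section are made up for illustration, the 4-byte offset is assumed to be little-endian (a real reader uses the target byte order), only direct define/undef entries are handled, and the recursion depth cap is an extra safety check that the patch above does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { OP_DEFINE = 1, OP_UNDEF = 2, OP_GNU_TRANSPARENT_INCLUDE4 = 0xe2 };

/* Collects the bodies of the defines seen, separated by ';'.  */
struct names { char buf[64]; size_t len; };

static int
append (struct names *out, const char *s, size_t n)
{
  if (out->len + n + 2 > sizeof out->buf)
    return -1;
  memcpy (out->buf + out->len, s, n);
  out->len += n;
  out->buf[out->len++] = ';';
  out->buf[out->len] = '\0';
  return 0;
}

/* Walk the chunk starting at OFF, expanding transparent includes in
   place.  Returns 0 on success, -1 on malformed input.  */
static int
walk (const uint8_t *sec, size_t size, size_t off, int depth,
      struct names *out)
{
  assert (sec != NULL && out != NULL);
  if (depth > 16)               /* Guard against include cycles.  */
    return -1;
  while (off < size)
    {
      uint8_t op = sec[off++];
      switch (op)
        {
        case 0:                 /* Terminator: back to the includer.  */
          return 0;
        case OP_DEFINE:
        case OP_UNDEF:
          {
            const uint8_t *nul;
            size_t n;
            /* Skip the uleb128 line number.  */
            while (off < size && (sec[off++] & 0x80))
              ;
            nul = off < size ? memchr (sec + off, 0, size - off) : NULL;
            if (nul == NULL)
              return -1;
            n = (size_t) (nul - (sec + off));
            if (op == OP_DEFINE
                && append (out, (const char *) (sec + off), n) != 0)
              return -1;
            off += n + 1;
            break;
          }
        case OP_GNU_TRANSPARENT_INCLUDE4:
          {
            uint32_t target;
            if (size - off < 4)
              return -1;
            target = (uint32_t) sec[off] | (uint32_t) sec[off + 1] << 8
                     | (uint32_t) sec[off + 2] << 16
                     | (uint32_t) sec[off + 3] << 24;
            off += 4;
            if (target >= size || walk (sec, size, target, depth + 1, out))
              return -1;
            break;
          }
        default:
          return -1;
        }
    }
  return -1;                    /* Ran off the end without a 0.  */
}

/* Outer chunk at offset 0 defines A, includes the chunk at offset 14
   (which defines B), then defines C.  */
static const uint8_t example_section[] = {
  1, 0, 'A', 0,
  0xe2, 14, 0, 0, 0,
  1, 0, 'C', 0,
  0,
  1, 1, 'B', 0,
  0
};
```

Walking example_section from offset 0 visits the defines in the order A, B, C, i.e. the included chunk behaves as if its entries had been written inline at the point of the transparent_include4 opcode.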
* Re: [RFC] More compact (100x) -g3 .debug_macinfo
From: Jakub Jelinek @ 2011-07-13 20:37 UTC
To: Tom Tromey
Cc: Jason Merrill, Richard Henderson, Jan Kratochvil, Roland McGrath, Cary Coutant, Mark Wielaard, gcc-patches

On Wed, Jul 13, 2011 at 01:36:03PM -0600, Tom Tromey wrote:
> I wrote a gdb patch for this. I've appended it in case you want to try
> it out; it is against git master. I tried it a little on an executable
> Jakub sent me and it seems to work fine.

Thanks.

> It is no trouble to change this patch if you change the format. It
> wasn't hard to write in the first place, it just bigger than it is
> because I moved a bunch of code into a new function.
>
> I don't think I really understood DW_MACINFO_GNU_define_opcode, so the
> implementation here is probably wrong.

Well, I think you've skipped it correctly, and furthermore even the patched GCC doesn't emit it. The point of it was to allow skipping unknown opcodes. If you implement this opcode fully and, say, GCC 4.8 adds a new vendor opcode, the old implementation would be able to silently skip over such opcodes.

So, the reader implementation could keep an array of 256 pointers. At the start of parsing a particular .debug_macinfo chunk it clears the array (or, when the chunk is read because of DW_MACINFO_GNU_transparent_include4, it instead makes a copy of the current array and makes the copy current). When you encounter DW_MACINFO_GNU_define_opcode, you store a pointer to the encoded operands of that opcode into the table. And, when you find an unknown opcode (reach the default: case) and array[op] is non-NULL, you read the uleb128 from that location to get the count, iterate over the DW_FORM_* values in the array, and for each of them skip the corresponding bytes of the opcode's operands.
Say a .debug_macinfo chunk starts with

  DW_MACINFO_GNU_define_opcode, 0xe5, 2, DW_FORM_udata, DW_FORM_block,
  DW_MACINFO_define, 0, "A 1",
  0xe5, 0x80, 0x7f, 5, 1, 2, 3, 4, 5,
  DW_MACINFO_define, 0, "B 1",
  0

and you'd be able to grok both defines in it, because you'd understand that
after seeing 0xe5 you need to read one uleb128, then another uleb128, and
skip as many bytes as that second number says.

The table is copied so that the producer can emit define_opcode just at the
.debug_macinfo spot referenced from DW_AT_macro_info, without repeating it
in the transparent include chains, provided it ensures that the chains are
not merged unless all the referencing .debug_macinfo sections contain that
define_opcode.  Copying the array also allows the transparent chain to add
new opcodes or redefine them without affecting the outer sequence.

	Jakub
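The skipping scheme described above can be sketched roughly as follows. This is only an illustration, not the gdb patch: the function names (read_uleb128, skip_form, count_macro_entries) and the example_chunk array are made up for this sketch, only the two forms used in the example (DW_FORM_udata and DW_FORM_block) are handled, transparent includes are left out, and the input is trusted, so there is no bounds checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opcode values from the patch; form values from DWARF.  */
enum {
  DW_MACINFO_define = 1, DW_MACINFO_undef = 2,
  DW_MACINFO_start_file = 3, DW_MACINFO_end_file = 4,
  DW_MACINFO_GNU_define_opcode = 0xe3,
  DW_FORM_block = 0x09, DW_FORM_udata = 0x0f
};

/* Decode one unsigned LEB128 value and advance *PP past it.  */
static uint64_t
read_uleb128 (const unsigned char **pp)
{
  uint64_t value = 0;
  unsigned shift = 0;
  unsigned char byte;

  do
    {
      byte = *(*pp)++;
      if (shift < 64)
        value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return value;
}

/* Skip one operand encoded with FORM; return 0 for a form this
   sketch does not know.  */
static int
skip_form (unsigned form, const unsigned char **pp)
{
  switch (form)
    {
    case DW_FORM_udata:
      read_uleb128 (pp);
      return 1;
    case DW_FORM_block:
      *pp += read_uleb128 (pp);
      return 1;
    default:
      return 0;
    }
}

/* Walk one chunk up to its terminating 0 opcode.  Return the number of
   define/undef entries, or -1 if some opcode cannot be skipped.  */
static int
count_macro_entries (const unsigned char *p)
{
  const unsigned char *opcode_defs[256] = { 0 };  /* cleared per chunk */
  const unsigned char *forms;
  uint64_t n;
  unsigned op;
  int count = 0;

  assert (p != NULL);
  for (;;)
    switch (op = *p++)
      {
      case 0:
        return count;
      case DW_MACINFO_define:
      case DW_MACINFO_undef:
        read_uleb128 (&p);                    /* line number */
        p += strlen ((const char *) p) + 1;   /* inline string */
        count++;
        break;
      case DW_MACINFO_start_file:
        read_uleb128 (&p);                    /* line number */
        read_uleb128 (&p);                    /* file index */
        break;
      case DW_MACINFO_end_file:
        break;
      case DW_MACINFO_GNU_define_opcode:
        op = *p++;                            /* opcode being described */
        opcode_defs[op] = p;                  /* points at the count */
        p += read_uleb128 (&p);               /* one form byte per operand */
        break;
      default:
        forms = opcode_defs[op];
        if (forms == NULL)
          return -1;
        for (n = read_uleb128 (&forms); n > 0; n--)
          if (!skip_form (*forms++, &p))
            return -1;
        break;
      }
}

/* The byte sequence from the example above.  */
static const unsigned char example_chunk[] = {
  DW_MACINFO_GNU_define_opcode, 0xe5, 2, DW_FORM_udata, DW_FORM_block,
  DW_MACINFO_define, 0, 'A', ' ', '1', 0,
  0xe5, 0x80, 0x7f, 5, 1, 2, 3, 4, 5,
  DW_MACINFO_define, 0, 'B', ' ', '1', 0,
  0
};
```

Calling count_macro_entries (example_chunk) walks past the 0xe5 entry and sees both defines. Handling a transparent include would additionally mean copying opcode_defs before descending into the included chain, as described above.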
* Re: [RFC] More compact (100x) -g3 .debug_macinfo
From: Tom Tromey @ 2011-07-18 15:42 UTC
To: Jakub Jelinek
Cc: Jason Merrill, Richard Henderson, Jan Kratochvil, Roland McGrath, Cary Coutant, Mark Wielaard, gcc-patches

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Tom> I don't think I really understood DW_MACINFO_GNU_define_opcode, so the
Tom> implementation here is probably wrong.

Jakub> Well, I think you've skipped it correctly and furthermore even patched
Jakub> GCC doesn't emit it.  The point of it was to allow skipping unknown
Jakub> opcodes.  If you implement this opcode fully and say GCC 4.8 adds a new
Jakub> vendor opcode, the old implementation would be able to silently skip
Jakub> over such opcodes.

I implemented this part today, so I think the gdb patch is complete now.

Tom
* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
From: Jakub Jelinek @ 2011-07-15 15:52 UTC
To: gcc-patches
Cc: Jason Merrill, Richard Henderson, Tom Tromey, Jan Kratochvil, Roland McGrath, Cary Coutant, Mark Wielaard

On Wed, Jul 13, 2011 at 07:00:53PM +0200, Jakub Jelinek wrote:

The patch below implements that slight change.  In particular, the "4"
suffixes were dropped from the op names, DW_MACINFO_GNU_*_indirect now take
DW_FORM_udata and DW_FORM_strp arguments (the latter DWARF_OFFSET_SIZE
bytes large), and DW_MACINFO_GNU_transparent_include takes a
DW_FORM_sec_offset argument (again 4 bytes long for 32-bit DWARF and
8 bytes long for 64-bit DWARF).  GCC ensures that no merging will happen
between .debug_macinfo chunks with 32-bit and 64-bit DWARF by adding the
byte size to the comdat GROUP name.  I think that's cleaner than hardcoding
4 bytes and not optimizing anything on MIPS.

The newly added opcodes:

DW_MACINFO_GNU_define_indirect		0xe0
	This opcode has two arguments, one is uleb128 lineno and the
	other is an offset size long byte offset into .debug_str.  Except
	for the encoding of the string it is similar to DW_MACINFO_define.
DW_MACINFO_GNU_undef_indirect		0xe1
	This opcode has two arguments, one is uleb128 lineno and the
	other is an offset size long byte offset into .debug_str.  Except
	for the encoding of the string it is similar to DW_MACINFO_undef.
DW_MACINFO_GNU_transparent_include	0xe2
	This opcode has a single argument, an offset size long byte offset
	into .debug_macinfo.
	It instructs the debug info consumer that this opcode during
	reading should be replaced with the sequence of .debug_macinfo
	opcodes from the mentioned offset, up to a terminating 0 opcode
	(not including that 0).
DW_MACINFO_GNU_define_opcode		0xe3
	This is an opcode for future extensibility through which
	a debugger could skip unknown opcodes.  It has 3 arguments:
	a 1 byte opcode number, a uleb128 count of arguments, and
	a count bytes long array in which each byte is a DW_FORM_* code
	saying how the corresponding argument is encoded.

	DW_MACINFO_GNU_define_opcode <0, 0 []>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_define, 2 [DW_FORM_udata, DW_FORM_string]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_undef, 2 [DW_FORM_udata, DW_FORM_string]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_start_file, 2 [DW_FORM_udata, DW_FORM_udata]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_end_file, 1 [DW_FORM_udata]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_undef_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_transparent_include, 1 [DW_FORM_sec_offset]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_opcode, 2 [DW_FORM_data1, DW_FORM_block]>
	DW_MACINFO_GNU_define_opcode <DW_MACINFO_vendor_ext, 2 [DW_FORM_udata, DW_FORM_string]>

2011-07-15  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
	(DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
	DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode): Add.

	* dwarf2out.c (dwarf2out_undef): Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op): New
	functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT and ELF,
	optimize longer sequences of define/undef ops from headers
	using DW_MACINFO_GNU_transparent_include.
--- include/dwarf2.h.jj 2011-06-23 10:14:06.000000000 +0200 +++ include/dwarf2.h 2011-07-13 11:39:49.000000000 +0200 @@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type DW_MACINFO_undef = 2, DW_MACINFO_start_file = 3, DW_MACINFO_end_file = 4, - DW_MACINFO_vendor_ext = 255 + DW_MACINFO_lo_user = 0xe0, + DW_MACINFO_GNU_define_indirect = 0xe0, + DW_MACINFO_GNU_undef_indirect = 0xe1, + DW_MACINFO_GNU_transparent_include = 0xe2, + DW_MACINFO_GNU_define_opcode = 0xe3, + DW_MACINFO_hi_user = 0xfe, + DW_MACINFO_vendor_ext = 0xff }; \f /* @@@ For use with GNU frame unwind information. */ --- gcc/dwarf2out.c.jj 2011-07-12 17:59:01.000000000 +0200 +++ gcc/dwarf2out.c 2011-07-13 17:04:17.000000000 +0200 @@ -20383,17 +20383,117 @@ dwarf2out_undef (unsigned int lineno ATT macinfo_entry e; e.code = DW_MACINFO_undef; e.lineno = lineno; - e.info = xstrdup (buffer);; + e.info = xstrdup (buffer); VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); } } +/* Routines to manipulate hash table of CUs. */ +static hashval_t +htab_macinfo_hash (const void *of) +{ + const macinfo_entry *const entry = + (const macinfo_entry *) of; + + return htab_hash_string (entry->info); +} + +static int +htab_macinfo_eq (const void *of1, const void *of2) +{ + const macinfo_entry *const entry1 = (const macinfo_entry *) of1; + const macinfo_entry *const entry2 = (const macinfo_entry *) of2; + + return !strcmp (entry1->info, entry2->info); +} + +/* Output a single .debug_macinfo entry. 
*/ + +static void +output_macinfo_op (macinfo_entry *ref) +{ + int file_num; + size_t len; + struct indirect_string_node *node; + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + + switch (ref->code) + { + case DW_MACINFO_start_file: + file_num = maybe_emit_file (lookup_filename (ref->info)); + dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); + dw2_asm_output_data_uleb128 (ref->lineno, + "Included from line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + break; + case DW_MACINFO_end_file: + dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + len = strlen (ref->info) + 1; + if (!dwarf_strict + && len > DWARF_OFFSET_SIZE + && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET + && (debug_str_section->common.flags & SECTION_MERGE) != 0) + { + ref->code = ref->code == DW_MACINFO_define + ? DW_MACINFO_GNU_define_indirect + : DW_MACINFO_GNU_undef_indirect; + output_macinfo_op (ref); + return; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_define + ? "Define macro" : "Undefine macro"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_nstring (ref->info, -1, "The macro"); + break; + case DW_MACINFO_GNU_define_indirect: + case DW_MACINFO_GNU_undef_indirect: + node = find_AT_string (ref->info); + if (node->form != DW_FORM_strp) + { + char label[32]; + ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter); + ++dw2_string_counter; + node->label = xstrdup (label); + node->form = DW_FORM_strp; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_GNU_define_indirect + ? 
"Define macro indirect" + : "Undefine macro indirect"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, + debug_str_section, "The macro: \"%s\"", + ref->info); + break; + case DW_MACINFO_GNU_transparent_include: + dw2_asm_output_data (1, ref->code, "Transparent include"); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_MACINFO_SECTION_LABEL, ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL); + break; + default: + fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", + ASM_COMMENT_START, (unsigned long) ref->code); + break; + } +} + static void output_macinfo (void) { unsigned i; unsigned long length = VEC_length (macinfo_entry, macinfo_table); - macinfo_entry *ref; + macinfo_entry *ref, *ref2; + VEC (macinfo_entry, gc) *files = NULL; + unsigned long transparent_includes = 0; + htab_t macinfo_htab = NULL; if (! length) return; @@ -20402,37 +20503,184 @@ output_macinfo (void) { switch (ref->code) { - case DW_MACINFO_start_file: + case DW_MACINFO_start_file: + VEC_safe_push (macinfo_entry, gc, files, ref); + break; + case DW_MACINFO_end_file: + if (!VEC_empty (macinfo_entry, files)) { - int file_num = maybe_emit_file (lookup_filename (ref->info)); - dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); - dw2_asm_output_data_uleb128 - (ref->lineno, "Included from line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + ref2 = VEC_last (macinfo_entry, files); + free (CONST_CAST (char *, ref2->info)); + VEC_pop (macinfo_entry, files); } - break; - case DW_MACINFO_end_file: - dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); - break; - case DW_MACINFO_define: - dw2_asm_output_data (1, DW_MACINFO_define, "Define macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, 
"The macro"); - break; - case DW_MACINFO_undef: - dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, "The macro"); - break; - default: - fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", - ASM_COMMENT_START, (unsigned long)ref->code); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: +#ifdef OBJECT_FORMAT_ELF + if (!dwarf_strict + && HAVE_COMDAT_GROUP + && VEC_length (macinfo_entry, files) != 1 + && i > 0 + && i + 1 < length + && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0) + { + char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1]; + unsigned char checksum[16]; + struct md5_ctx ctx; + char *tmp, *tail; + const char *base; + unsigned int j = i, k, l; + void **slot; + + ref2 = VEC_index (macinfo_entry, macinfo_table, i + 1); + if (ref2->code != DW_MACINFO_define + && ref2->code != DW_MACINFO_undef) + break; + + if (VEC_empty (macinfo_entry, files)) + { + if (ref->lineno != 0 || ref2->lineno != 0) + break; + } + else if (ref->lineno == 0) + break; + md5_init_ctx (&ctx); + for (; VEC_iterate (macinfo_entry, macinfo_table, j, ref2); j++) + if (ref2->code != DW_MACINFO_define + && ref2->code != DW_MACINFO_undef) + break; + else if (ref->lineno == 0 && ref2->lineno != 0) + break; + else + { + unsigned char code = ref2->code; + md5_process_bytes (&code, 1, &ctx); + checksum_uleb128 (ref2->lineno, &ctx); + md5_process_bytes (ref2->info, strlen (ref2->info) + 1, + &ctx); + } + md5_finish_ctx (&ctx, checksum); + if (ref->lineno == 0) + base = ""; + else + base = lbasename (VEC_last (macinfo_entry, files)->info); + for (l = 0, k = 0; base[k]; k++) + if (ISIDNUM (base[k]) || base[k] == '.') + l++; + if (l) + l++; + sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, + VEC_index (macinfo_entry, macinfo_table, i)->lineno); + tmp = XNEWVEC (char, 4 + l + strlen (linebuf) + 1 + 16 * 2 + 1); + strcpy (tmp, 
DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8."); + tail = tmp + 4; + if (l) + { + for (k = 0; base[k]; k++) + if (ISIDNUM (base[k]) || base[k] == '.') + *tail++ = base[k]; + *tail++ = '.'; + } + l = strlen (linebuf); + memcpy (tail, linebuf, l); + tail += l; + *tail++ = '.'; + for (k = 0; k < 16; k++) + sprintf (tail + k * 2, "%02x", checksum[k] & 0xff); + ref2 = VEC_index (macinfo_entry, macinfo_table, i - 1); + ref2->code = DW_MACINFO_GNU_transparent_include; + ref2->lineno = 0; + ref2->info = tmp; + if (macinfo_htab == NULL) + macinfo_htab = htab_create (10, htab_macinfo_hash, + htab_macinfo_eq, NULL); + slot = htab_find_slot (macinfo_htab, ref2, INSERT); + if (*slot != NULL) + { + free (CONST_CAST (char *, ref2->info)); + ref2->code = 0; + ref2->info = NULL; + ref2 = (macinfo_entry *) *slot; + output_macinfo_op (ref2); + for (j = i; + VEC_iterate (macinfo_entry, macinfo_table, j, ref2); + j++) + if (ref2->code != DW_MACINFO_define + && ref2->code != DW_MACINFO_undef) + break; + else if (ref->lineno == 0 && ref2->lineno != 0) + break; + else + { + ref2->code = 0; + free (CONST_CAST (char *, ref2->info)); + ref2->info = NULL; + } + } + else + { + *slot = ref2; + ref2->lineno = ++transparent_includes; + output_macinfo_op (ref2); + } + i = j - 1; + continue; + } +#endif + break; + default: break; } + output_macinfo_op (ref); + /* For DW_MACINFO_start_file ref->info has been copied into files + vector. */ + if (ref->code != DW_MACINFO_start_file) + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + ref->code = 0; } + + if (!transparent_includes) + return; + + htab_delete (macinfo_htab); + +#ifdef OBJECT_FORMAT_ELF + for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++) + switch (ref->code) + { + case 0: + continue; + case DW_MACINFO_GNU_transparent_include: + { + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + tree comdat_key = get_identifier (ref->info); + /* Terminate the previous .debug_macinfo section. 
 */
+	  dw2_asm_output_data (1, 0, "End compilation unit");
+	  targetm.asm_out.named_section (DEBUG_MACINFO_SECTION,
+					 SECTION_DEBUG
+					 | SECTION_LINKONCE,
+					 comdat_key);
+	  ASM_GENERATE_INTERNAL_LABEL (label,
+				       DEBUG_MACINFO_SECTION_LABEL,
+				       ref->lineno);
+	  ASM_OUTPUT_LABEL (asm_out_file, label);
+	  ref->code = 0;
+	  free (CONST_CAST (char *, ref->info));
+	  ref->info = NULL;
+	}
+	break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+	output_macinfo_op (ref);
+	ref->code = 0;
+	free (CONST_CAST (char *, ref->info));
+	ref->info = NULL;
+	break;
+      default:
+	gcc_unreachable ();
+      }
+#endif
 }
 
 /* Set up for Dwarf output at the start of compilation.  */

	Jakub
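As a reading aid for the patch above: the comdat group name it builds has the shape wmN.[<filtered basename>.]<lineno>.<md5 in hex>, where N is the DWARF offset size. A standalone sketch of just that string construction might look like the following. The names macinfo_group_name and is_idnum are hypothetical, is_idnum is assumed to behave like GCC's ISIDNUM (alphanumeric or underscore), plain malloc and unsigned long stand in for XNEWVEC and HOST_WIDE_INT, and the MD5 digest is taken as an input rather than computed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed equivalent of GCC's ISIDNUM: alphanumeric or underscore.  */
static int
is_idnum (unsigned char c)
{
  return isalnum (c) || c == '_';
}

/* Build "wmN.[<filtered basename>.]<lineno>.<32 hex digits>".
   BASE is the header's basename, or "" for the predefined macro block.
   Returns a malloc'ed string (caller frees), or NULL on allocation
   failure.  */
static char *
macinfo_group_name (int offset_size, const char *base,
                    unsigned long lineno,
                    const unsigned char checksum[16])
{
  char linebuf[sizeof (unsigned long) * 3 + 1];
  size_t filtered_len = 0, linebuf_len, i;
  char *name, *tail;

  /* Only identifier characters and '.' from the basename are kept.  */
  for (i = 0; base[i]; i++)
    if (is_idnum ((unsigned char) base[i]) || base[i] == '.')
      filtered_len++;
  if (filtered_len)
    filtered_len++;             /* room for the '.' separator */

  sprintf (linebuf, "%lu", lineno);
  linebuf_len = strlen (linebuf);

  name = malloc (4 + filtered_len + linebuf_len + 1 + 16 * 2 + 1);
  if (name == NULL)
    return NULL;

  /* The offset size is part of the name, so 32-bit and 64-bit DWARF
     chunks can never be merged with each other.  */
  memcpy (name, offset_size == 4 ? "wm4." : "wm8.", 4);
  tail = name + 4;
  if (filtered_len)
    {
      for (i = 0; base[i]; i++)
        if (is_idnum ((unsigned char) base[i]) || base[i] == '.')
          *tail++ = base[i];
      *tail++ = '.';
    }
  memcpy (tail, linebuf, linebuf_len);
  tail += linebuf_len;
  *tail++ = '.';
  /* Each sprintf also writes a NUL, so the result ends up terminated.  */
  for (i = 0; i < 16; i++)
    sprintf (tail + i * 2, "%02x", (unsigned) checksum[i]);
  return name;
}
```

Because the checksum covers the opcodes, line numbers and strings of the whole define/undef run, two compilation units produce the same group name only when the run is byte-for-byte the same, which is what lets the linker keep a single copy.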
* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
From: Richard Henderson @ 2011-07-15 17:19 UTC
To: Jakub Jelinek
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/15/2011 08:42 AM, Jakub Jelinek wrote:
> The newly added opcodes:
> DW_MACINFO_GNU_define_indirect		0xe0
> 	This opcode has two arguments, one is uleb128 lineno and the
> 	other is offset size long byte offset into .debug_str.  Except
> 	for the encoding of the string it is similar to DW_MACINFO_define.
> DW_MACINFO_GNU_undef_indirect		0xe1
> 	This opcode has two arguments, one is uleb128 lineno and the
> 	other is offset size long byte offset into .debug_str.  Except
> 	for the encoding of the string it is similar to DW_MACINFO_undef.
> DW_MACINFO_GNU_transparent_include	0xe2
> 	This opcode has a single argument, a offset size long byte offset into
> 	.debug_macinfo.  It instructs the debug info consumer that
> 	this opcode during reading should be replaced with the sequence
> 	of .debug_macinfo opcodes from the mentioned offset, up to
> 	a terminating 0 opcode (not including that 0).
> DW_MACINFO_GNU_define_opcode		0xe3
> 	This is an opcode for future extensibility through which
> 	a debugger could skip unknown opcodes.  It has 3 arguments:
> 	1 byte opcode number, uleb128 count of arguments and
> 	a count bytes long array, with a DW_FORM_* code how the
> 	argument is encoded.

I do like the new opcodes.

Elsewhere you described transparent_include as also saving state
about defined opcodes around the include.  Do you want to either
describe that or drop it?

> +	case DW_MACINFO_define:
> +	case DW_MACINFO_undef:
> +#ifdef OBJECT_FORMAT_ELF
> +	  if (!dwarf_strict
> +	      && HAVE_COMDAT_GROUP
> +	      && VEC_length (macinfo_entry, files) != 1
> +	      && i > 0
> +	      && i + 1 < length
> +	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
> +	    {
> +	      char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
> +	      unsigned char checksum[16];
> +	      struct md5_ctx ctx;

I'd like to see this broken out into some functions, and avoid
as much code as possible within ifdefs.  Perhaps

  some_function (...)
  {
  #ifndef OBJECT_FORMAT_ELF
    return;
  #endif
    // everything else
  }

I think it also doesn't help review that there are no comments
at all, and a preponderance of description-less variable names
like "ref" and "ref2".

r~
* [RFC] More compact (100x) -g3 .debug_macinfo (take 3)
From: Jakub Jelinek @ 2011-07-15 21:18 UTC
To: Richard Henderson
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil, Cary Coutant, Mark Wielaard

On Fri, Jul 15, 2011 at 09:22:42AM -0700, Richard Henderson wrote:
> On 07/15/2011 08:42 AM, Jakub Jelinek wrote:
> > The newly added opcodes:
> > DW_MACINFO_GNU_define_indirect		0xe0
> > 	This opcode has two arguments, one is uleb128 lineno and the
> > 	other is offset size long byte offset into .debug_str.  Except
> > 	for the encoding of the string it is similar to DW_MACINFO_define.
> > DW_MACINFO_GNU_undef_indirect		0xe1
> > 	This opcode has two arguments, one is uleb128 lineno and the
> > 	other is offset size long byte offset into .debug_str.  Except
> > 	for the encoding of the string it is similar to DW_MACINFO_undef.
> > DW_MACINFO_GNU_transparent_include	0xe2
> > 	This opcode has a single argument, a offset size long byte offset into
> > 	.debug_macinfo.  It instructs the debug info consumer that
> > 	this opcode during reading should be replaced with the sequence
> > 	of .debug_macinfo opcodes from the mentioned offset, up to
> > 	a terminating 0 opcode (not including that 0).
> > DW_MACINFO_GNU_define_opcode		0xe3
> > 	This is an opcode for future extensibility through which
> > 	a debugger could skip unknown opcodes.  It has 3 arguments:
> > 	1 byte opcode number, uleb128 count of arguments and
> > 	a count bytes long array, with a DW_FORM_* code how the
> > 	argument is encoded.
>
> I do like the new opcodes.
>
> Elsewhere you described transparent_include as also saving state
> about defined opcodes around the include.  Do you want to either
> describe that or drop it?

Ok, so how about this way (as DWARF4 modifications; of course for a DWARF5
proposal GNU_ would be gone and the ops would have different codes):

6.3.1 The valid macinfo types are as follows:
...
DW_MACINFO_GNU_define_indirect		A macro definition.
DW_MACINFO_GNU_undef_indirect		A macro undefinition.
DW_MACINFO_GNU_transparent_include	Include a sequence of entries from
					given offset.
DW_MACINFO_GNU_define_opcode		Define extension opcode and its
					arguments.

6.3.1.1
All DW_MACINFO_GNU_define_indirect and DW_MACINFO_GNU_undef_indirect
entries have two operands.  The first operand encodes the line number of
the source line on which the relevant defining or undefining macro
directives appeared.  The second operand consists of an offset into
a string table contained in the .debug_str section of the object file.
In the 32-bit DWARF format, the representation of the operand value is
a 4-byte unsigned offset; in the 64-bit DWARF format, it is an 8-byte
unsigned offset.  Apart from the encoding of the operands these entries
are equivalent to DW_MACINFO_define and DW_MACINFO_undef respectively.

6.3.1.5 Transparent inclusion of a sequence of entries
A DW_MACINFO_GNU_transparent_include entry has one operand, an offset into
another part of the .debug_macinfo section.  In the 32-bit DWARF format,
the representation of the operand value is a 4-byte unsigned offset;
in the 64-bit DWARF format, it is an 8-byte unsigned offset.
This entry instructs the consumer to replace this entry with
the sequence of macinfo entries found at the given .debug_macinfo
offset, up to, but excluding, the terminating entry with type code 0.
This entry type is aimed at sharing duplicate sequences of macinfo
entries between macinfo from different compilation units.
The producer should ensure that only sequences with matching DWARF
format size (either all 32-bit DWARF or all 64-bit DWARF) are merged
together, and that either DW_MACINFO_start_file entries aren't in those
sequences, or only macinfo entries referencing the same .debug_line
section part include the sequence.

6.3.1.6 Defining new opcodes and operands
A DW_MACINFO_GNU_define_opcode entry has 2 operands.  The first operand
is a one byte constant with the type code it defines operand types for;
the second operand is a DW_FORM_block encoded array of operand forms.
The second operand starts with an unsigned LEB128 encoded number of
operands, and for each of the operands there is one byte containing
a form that says how the corresponding operand is encoded.
This entry allows defining new vendor extension entry types which
consumers will be able to skip over and ignore.  Each opcode defined
this way is valid for subsequent entries until the terminating entry
with type code 0, including any sequences included from those entries
using DW_MACINFO_GNU_transparent_include.  However, opcodes defined
using this entry in a chain included through
DW_MACINFO_GNU_transparent_include are not valid in the parent sequence
after the DW_MACINFO_GNU_transparent_include entry that included them.

7.22 Macro Information
Add
DW_MACINFO_lo_user			0xe0
DW_MACINFO_GNU_define_indirect		0xe0
DW_MACINFO_GNU_undef_indirect		0xe1
DW_MACINFO_GNU_transparent_include	0xe2
DW_MACINFO_GNU_define_opcode		0xe3
DW_MACINFO_hi_user			0xfe
to the table.

> I'd like to see this broken out into some functions, and avoid
> as much code as possible within ifdefs.  Perhaps
>
>   some_function (...)
>   {
>   #ifndef OBJECT_FORMAT_ELF
>     return;
>   #endif
>     // everything else
>   }
>
> I think it also doesn't help review that there are no comments
> at all, and a preponderance of description-less variable names
> like "ref" and "ref2".

I've tried to cure these issues in the following (so far just lightly
tested) patch:

2011-07-15  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
	(DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
	DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode): Add.

	* dwarf2out.c (dwarf2out_define): If the vector is empty and
	lineno is 0, emit a dummy entry first.
	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
	optimize_macinfo_range): New functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
	optimize longer sequences of define/undef ops from headers
	using DW_MACINFO_GNU_transparent_include.

--- include/dwarf2.h.jj	2011-06-23 10:14:06.000000000 +0200
+++ include/dwarf2.h	2011-07-13 11:39:49.000000000 +0200
@@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_undef = 2,
     DW_MACINFO_start_file = 3,
     DW_MACINFO_end_file = 4,
-    DW_MACINFO_vendor_ext = 255
+    DW_MACINFO_lo_user = 0xe0,
+    DW_MACINFO_GNU_define_indirect = 0xe0,
+    DW_MACINFO_GNU_undef_indirect = 0xe1,
+    DW_MACINFO_GNU_transparent_include = 0xe2,
+    DW_MACINFO_GNU_define_opcode = 0xe3,
+    DW_MACINFO_hi_user = 0xfe,
+    DW_MACINFO_vendor_ext = 0xff
   };
 \f
 /* @@@ For use with GNU frame unwind information.  */
--- gcc/dwarf2out.c.jj	2011-07-15 20:46:32.000000000 +0200
+++ gcc/dwarf2out.c	2011-07-15 22:15:14.000000000 +0200
@@ -20291,6 +20291,15 @@ dwarf2out_define (unsigned int lineno AT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_MACINFO_GNU_transparent_include.
*/ + if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0) + { + e.code = 0; + e.lineno = 0; + e.info = NULL; + VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); + } e.code = DW_MACINFO_define; e.lineno = lineno; e.info = xstrdup (buffer);; @@ -20309,58 +20318,363 @@ dwarf2out_undef (unsigned int lineno ATT if (debug_info_level >= DINFO_LEVEL_VERBOSE) { macinfo_entry e; + /* Insert a dummy first entry to be able to optimize the whole + predefined macro block using DW_MACINFO_GNU_transparent_include. */ + if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0) + { + e.code = 0; + e.lineno = 0; + e.info = NULL; + VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); + } e.code = DW_MACINFO_undef; e.lineno = lineno; - e.info = xstrdup (buffer);; + e.info = xstrdup (buffer); VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); } } +/* Routines to manipulate hash table of CUs. */ + +static hashval_t +htab_macinfo_hash (const void *of) +{ + const macinfo_entry *const entry = + (const macinfo_entry *) of; + + return htab_hash_string (entry->info); +} + +static int +htab_macinfo_eq (const void *of1, const void *of2) +{ + const macinfo_entry *const entry1 = (const macinfo_entry *) of1; + const macinfo_entry *const entry2 = (const macinfo_entry *) of2; + + return !strcmp (entry1->info, entry2->info); +} + +/* Output a single .debug_macinfo entry. 
*/ + +static void +output_macinfo_op (macinfo_entry *ref) +{ + int file_num; + size_t len; + struct indirect_string_node *node; + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + + switch (ref->code) + { + case DW_MACINFO_start_file: + file_num = maybe_emit_file (lookup_filename (ref->info)); + dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); + dw2_asm_output_data_uleb128 (ref->lineno, + "Included from line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + break; + case DW_MACINFO_end_file: + dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + len = strlen (ref->info) + 1; + if (!dwarf_strict + && len > DWARF_OFFSET_SIZE + && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET + && (debug_str_section->common.flags & SECTION_MERGE) != 0) + { + ref->code = ref->code == DW_MACINFO_define + ? DW_MACINFO_GNU_define_indirect + : DW_MACINFO_GNU_undef_indirect; + output_macinfo_op (ref); + return; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_define + ? "Define macro" : "Undefine macro"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_nstring (ref->info, -1, "The macro"); + break; + case DW_MACINFO_GNU_define_indirect: + case DW_MACINFO_GNU_undef_indirect: + node = find_AT_string (ref->info); + if (node->form != DW_FORM_strp) + { + char label[32]; + ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter); + ++dw2_string_counter; + node->label = xstrdup (label); + node->form = DW_FORM_strp; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_GNU_define_indirect + ? 
"Define macro indirect" + : "Undefine macro indirect"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, + debug_str_section, "The macro: \"%s\"", + ref->info); + break; + case DW_MACINFO_GNU_transparent_include: + dw2_asm_output_data (1, ref->code, "Transparent include"); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_MACINFO_SECTION_LABEL, ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL); + break; + default: + fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", + ASM_COMMENT_START, (unsigned long) ref->code); + break; + } +} + +/* Attempt to make a sequence of define/undef macinfo ops shareable with + other compilation unit .debug_macinfo sections. IDX is the first + index of a define/undef, return the number of ops that should be + emitted in a comdat .debug_macinfo section and emit + a DW_MACINFO_GNU_transparent_include entry referencing it. + If the define/undef entry should be emitted normally, return 0. */ + +static unsigned +optimize_macinfo_range (unsigned int idx, VEC (macinfo_entry, gc) *files, + htab_t *macinfo_htab) +{ + macinfo_entry *first, *second, *cur, *inc; + char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1]; + unsigned char checksum[16]; + struct md5_ctx ctx; + char *grp_name, *tail; + const char *base; + unsigned int i, count, encoded_filename_len, linebuf_len; + void **slot; + + first = VEC_index (macinfo_entry, macinfo_table, idx); + second = VEC_index (macinfo_entry, macinfo_table, idx + 1); + + /* Optimize only if there are at least two consecutive define/undef ops, + and either all of them are before first DW_MACINFO_start_file + with lineno 0 (i.e. predefined macro block), or all of them are + in some included header file. 
*/ + if (second->code != DW_MACINFO_define && second->code != DW_MACINFO_undef) + return 0; + if (VEC_empty (macinfo_entry, files)) + { + if (first->lineno != 0 || second->lineno != 0) + return 0; + } + else if (first->lineno == 0) + return 0; + + /* Find the last define/undef entry that can be grouped together + with first and at the same time compute md5 checksum of their + codes, linenumbers and strings. */ + md5_init_ctx (&ctx); + for (i = idx; VEC_iterate (macinfo_entry, macinfo_table, i, cur); i++) + if (cur->code != DW_MACINFO_define && cur->code != DW_MACINFO_undef) + break; + else if (first->lineno == 0 && cur->lineno != 0) + break; + else + { + unsigned char code = cur->code; + md5_process_bytes (&code, 1, &ctx); + checksum_uleb128 (cur->lineno, &ctx); + md5_process_bytes (cur->info, strlen (cur->info) + 1, &ctx); + } + md5_finish_ctx (&ctx, checksum); + count = i - idx; + + /* From the containing include filename (if any) pick up just + usable characters from its basename. */ + if (first->lineno == 0) + base = ""; + else + base = lbasename (VEC_last (macinfo_entry, files)->info); + for (encoded_filename_len = 0, i = 0; base[i]; i++) + if (ISIDNUM (base[i]) || base[i] == '.') + encoded_filename_len++; + /* Count . at the end. */ + if (encoded_filename_len) + encoded_filename_len++; + + sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, first->lineno); + linebuf_len = strlen (linebuf); + + /* The group name format is: wmN.[<encoded filename>.]<lineno>.<md5sum> */ + grp_name = XNEWVEC (char, 4 + encoded_filename_len + linebuf_len + 1 + + 16 * 2 + 1); + memcpy (grp_name, DWARF_OFFSET_SIZE == 4 ? "wm4." 
: "wm8.", 4); + tail = grp_name + 4; + if (encoded_filename_len) + { + for (i = 0; base[i]; i++) + if (ISIDNUM (base[i]) || base[i] == '.') + *tail++ = base[i]; + *tail++ = '.'; + } + memcpy (tail, linebuf, linebuf_len); + tail += linebuf_len; + *tail++ = '.'; + for (i = 0; i < 16; i++) + sprintf (tail + i * 2, "%02x", checksum[i] & 0xff); + + /* Construct a macinfo_entry for DW_MACINFO_GNU_transparent_include + in the empty vector entry before the first define/undef. */ + inc = VEC_index (macinfo_entry, macinfo_table, idx - 1); + inc->code = DW_MACINFO_GNU_transparent_include; + inc->lineno = 0; + inc->info = grp_name; + if (*macinfo_htab == NULL) + *macinfo_htab = htab_create (10, htab_macinfo_hash, htab_macinfo_eq, NULL); + /* Avoid emitting duplicates. */ + slot = htab_find_slot (*macinfo_htab, inc, INSERT); + if (*slot != NULL) + { + free (CONST_CAST (char *, inc->info)); + inc->code = 0; + inc->info = NULL; + /* If such an entry has been used before, just emit + a DW_MACINFO_GNU_transparent_include op. */ + inc = (macinfo_entry *) *slot; + output_macinfo_op (inc); + /* And clear all macinfo_entry in the range to avoid emitting them + in the second pass. */ + for (i = idx; + VEC_iterate (macinfo_entry, macinfo_table, i, cur) + && i < idx + count; + i++) + { + cur->code = 0; + free (CONST_CAST (char *, cur->info)); + cur->info = NULL; + } + } + else + { + *slot = inc; + inc->lineno = htab_elements (*macinfo_htab); + output_macinfo_op (inc); + } + return count; +} + +/* Output macinfo section(s). */ + static void output_macinfo (void) { unsigned i; unsigned long length = VEC_length (macinfo_entry, macinfo_table); macinfo_entry *ref; + VEC (macinfo_entry, gc) *files = NULL; + htab_t macinfo_htab = NULL; if (! length) return; + /* In the first loop, it emits the primary .debug_macinfo section + and after each emitted op the macinfo_entry is cleared. 
+ If a longer range of define/undef ops can be optimized using + DW_MACINFO_GNU_transparent_include, the + DW_MACINFO_GNU_transparent_include op is emitted and kept in + the vector before the first define/undef in the range and the + whole range of define/undef ops is not emitted and kept. */ for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++) { switch (ref->code) { - case DW_MACINFO_start_file: + case DW_MACINFO_start_file: + VEC_safe_push (macinfo_entry, gc, files, ref); + break; + case DW_MACINFO_end_file: + if (!VEC_empty (macinfo_entry, files)) { - int file_num = maybe_emit_file (lookup_filename (ref->info)); - dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); - dw2_asm_output_data_uleb128 - (ref->lineno, "Included from line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + macinfo_entry *file = VEC_last (macinfo_entry, files); + free (CONST_CAST (char *, file->info)); + VEC_pop (macinfo_entry, files); } - break; - case DW_MACINFO_end_file: - dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); - break; - case DW_MACINFO_define: - dw2_asm_output_data (1, DW_MACINFO_define, "Define macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, "The macro"); - break; - case DW_MACINFO_undef: - dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, "The macro"); - break; - default: - fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", - ASM_COMMENT_START, (unsigned long)ref->code); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + if (!dwarf_strict + && HAVE_COMDAT_GROUP + && VEC_length (macinfo_entry, files) != 1 + && i > 0 + && i + 1 < length + && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0) + { + unsigned 
count = optimize_macinfo_range (i, files, &macinfo_htab); + if (count) + { + i += count - 1; + continue; + } + } + break; + case 0: + /* A dummy entry may be inserted at the beginning to be able + to optimize the whole block of predefined macros. */ + if (i == 0) + continue; + default: break; } + output_macinfo_op (ref); + /* For DW_MACINFO_start_file ref->info has been copied into files + vector. */ + if (ref->code != DW_MACINFO_start_file) + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + ref->code = 0; } + + if (macinfo_htab == NULL) + return; + + htab_delete (macinfo_htab); + + /* If any DW_MACINFO_GNU_transparent_include were used, on those + DW_MACINFO_GNU_transparent_include entries terminate the + current chain and switch to a new comdat .debug_macinfo + section and emit the define/undef entries within it. */ + for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++) + switch (ref->code) + { + case 0: + continue; + case DW_MACINFO_GNU_transparent_include: + { + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + tree comdat_key = get_identifier (ref->info); + /* Terminate the previous .debug_macinfo section. */ + dw2_asm_output_data (1, 0, "End compilation unit"); + targetm.asm_out.named_section (DEBUG_MACINFO_SECTION, + SECTION_DEBUG + | SECTION_LINKONCE, + comdat_key); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_MACINFO_SECTION_LABEL, + ref->lineno); + ASM_OUTPUT_LABEL (asm_out_file, label); + ref->code = 0; + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + } + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + output_macinfo_op (ref); + ref->code = 0; + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + break; + default: + gcc_unreachable (); + } } /* Set up for Dwarf output at the start of compilation. */ Jakub ^ permalink raw reply [flat|nested] 25+ messages in thread
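As a rough illustration of the comdat group name format used by optimize_macinfo_range above (wmN.[<encoded filename>.]<lineno>.<md5sum>), the following standalone sketch builds such a name outside of GCC. It is not code from the patch: build_group_name and is_idnum are hypothetical names, is_idnum only approximates GCC's ISIDNUM, and the 16-byte checksum is assumed to have been computed elsewhere.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for GCC's ISIDNUM: identifier characters only.  */
static int
is_idnum (unsigned char c)
{
  return isalnum (c) || c == '_';
}

/* Build "wmN.[<encoded filename>.]<lineno>.<md5sum>" into OUT.
   OFFSET_SIZE is 4 or 8, BASE is the header basename ("" for the
   predefined macro block) and CHECKSUM is a 16-byte digest computed
   elsewhere.  Returns 0 on success, -1 if OUT is too small.  */
static int
build_group_name (char *out, size_t out_size, int offset_size,
                  const char *base, unsigned long lineno,
                  const unsigned char checksum[16])
{
  char encoded[256];
  size_t n = 0, i;
  int len;

  assert (offset_size == 4 || offset_size == 8);

  /* Keep only identifier characters and dots from the basename.  */
  for (i = 0; base[i] && n + 1 < sizeof encoded; i++)
    if (is_idnum ((unsigned char) base[i]) || base[i] == '.')
      encoded[n++] = base[i];
  encoded[n] = '\0';

  /* The filename component and its trailing dot are optional.  */
  len = snprintf (out, out_size, "wm%d.%s%s%lu.", offset_size,
                  encoded, n ? "." : "", lineno);
  if (len < 0 || (size_t) len + 32 + 1 > out_size)
    return -1;

  /* Append the checksum as 32 lowercase hex digits.  */
  for (i = 0; i < 16; i++)
    sprintf (out + len + i * 2, "%02x", checksum[i]);
  return 0;
}
```

Because the name depends only on the offset size, the filtered basename, the first line number and the checksum of the define/undef ops, two compilation units that see the same macro sequence from the same header end up with the same group name, which is what lets the linker keep a single copy.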
* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 3)
From: Tom Tromey @ 2011-07-18 15:09 UTC
To: Jakub Jelinek
Cc: Richard Henderson, gcc-patches, Jason Merrill, Jan Kratochvil, Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> Ok, so how about this way (as DWARF4 modifications, of course for
Jakub> DWARF5 proposal GNU_ would be gone and the ops would have different
Jakub> codes):

Thanks very much for writing it up this way. I think it is very
important that all our DWARF extensions be well-documented.

Jakub> 6.3.1.6 Defining new opcodes and operands
Jakub> The second operand starts with an unsigned LEB128 encoded number
Jakub> of operands and for each of the operands there is one byte,
Jakub> containing a form encoding how the corresponding operand is
Jakub> encoded.

It seems to me that DW_FORM_flag_present is not useful here.

Jakub> Each so defined opcode is valid for subsequent entries until the
Jakub> terminating entry with type code 0, including any sequences
Jakub> included from those entries using
Jakub> DW_MACINFO_GNU_transparent_include. Opcodes defined using this
Jakub> entry in a chain included through
Jakub> DW_MACINFO_GNU_transparent_include isn't valid in the parent
Jakub> sequence after the DW_MACINFO_GNU_transparent_include entry that
Jakub> included it though.

I think you can remove this second sentence. It is implied by the
first one.

Tom
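The operand description quoted above (an unsigned LEB128 count followed by one form byte per operand) can be sketched as a small decoder. This is only an illustration of the proposed encoding, not code from the patch: read_uleb128 and parse_operand_forms are hypothetical helper names. The DW_FORM_* values are the standard DWARF form codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard DWARF form codes used in the examples of this thread.  */
enum { DW_FORM_sdata = 0x0d, DW_FORM_strp = 0x0e, DW_FORM_udata = 0x0f };

/* Decode one unsigned LEB128 value from BUF (at most LEN bytes).
   Stores the value in *VALUE and returns the number of bytes consumed,
   or 0 if the input is truncated or does not fit in 64 bits.  */
static size_t
read_uleb128 (const unsigned char *buf, size_t len, uint64_t *value)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t i;

  assert (buf != NULL && value != NULL);
  for (i = 0; i < len && shift < 64; i++, shift += 7)
    {
      /* Each byte contributes its low 7 bits, least significant first.  */
      result |= (uint64_t) (buf[i] & 0x7f) << shift;
      if ((buf[i] & 0x80) == 0)
        {
          *value = result;
          return i + 1;
        }
    }
  return 0;
}

/* Parse the operand description: a ULEB128 operand count followed by
   one form byte per operand.  Copies the form bytes into FORMS (room
   for MAX_FORMS entries), stores the count in *COUNT and returns the
   number of bytes consumed, or 0 on malformed input.  */
static size_t
parse_operand_forms (const unsigned char *buf, size_t len,
                     unsigned char *forms, size_t max_forms,
                     size_t *count)
{
  uint64_t n;
  size_t used = read_uleb128 (buf, len, &n);
  size_t i;

  if (used == 0 || n > max_forms || n > len - used)
    return 0;
  for (i = 0; i < n; i++)
    forms[i] = buf[used + i];
  *count = (size_t) n;
  return used + (size_t) n;
}
```

A consumer that meets an unknown opcode could use the recorded forms to skip its operands without understanding them, which is the point of describing the operands by form rather than by meaning.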
* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 3)
From: Richard Henderson @ 2011-07-20 1:17 UTC
To: Jakub Jelinek
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/15/2011 01:58 PM, Jakub Jelinek wrote:
> * dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
> (DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
> DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode):
> Add.
>
> * dwarf2out.c (dwarf2out_define): If the vector is empty and
> lineno is 0, emit a dummy entry first.
> (dwarf2out_undef): Likewise. Remove redundant semicolon.
> (htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
> optimize_macinfo_range): New functions.
> (output_macinfo): Use them. If !dwarf_strict and .debug_str is
> mergeable, optimize longer strings using
> DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
> optimize longer sequences of define/undef ops from headers
> using DW_MACINFO_GNU_transparent_include.

This looks much better. Barring any other feedback from
other interested Dwarf parties, I think this can go in.

r~
* [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
From: Jakub Jelinek @ 2011-07-21 11:38 UTC
To: Richard Henderson
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil, Cary Coutant, Mark Wielaard

[-- Attachment #1: Type: text/plain, Size: 2391 bytes --]

On Tue, Jul 19, 2011 at 04:26:08PM -0700, Richard Henderson wrote:
> On 07/15/2011 01:58 PM, Jakub Jelinek wrote:
> > * dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
> > (DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
> > DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode):
> > Add.
> >
> > * dwarf2out.c (dwarf2out_define): If the vector is empty and
> > lineno is 0, emit a dummy entry first.
> > (dwarf2out_undef): Likewise. Remove redundant semicolon.
> > (htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
> > optimize_macinfo_range): New functions.
> > (output_macinfo): Use them. If !dwarf_strict and .debug_str is
> > mergeable, optimize longer strings using
> > DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
> > optimize longer sequences of define/undef ops from headers
> > using DW_MACINFO_GNU_transparent_include.
>
> This looks much better. Barring any other feedback from
> other interested Dwarf parties, I think this can go in.

Based on dwarf-discuss discussions, here is an alternative that
introduces .debug_gnu_macro section instead and DW_AT_GNU_macros
referencing it.

Currently, the patch emits 3 byte section headers at the start of
the .debug_gnu_macro chunks referenced from .debug_info (through
DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
1 byte section offset, but the DW_GNU_MACRO_transparent_include
referenced sequences don't have it.
The .debug_gnu_macro section isn't completely usable without the referencing
CUs anyway, so IMHO we could still get away completely without
any section header, but if we need it, the question is if
the offset size there is useful and if the section header shouldn't
go before the transparent_include chains as well (only with that
e.g. readelf -wm would be able to dump .debug_gnu_macro without
reading .debug_info and tracking offsets to it).

In x86_64 cc1plus for which I've been posting figures, I see
395 CUs referencing .debug_gnu_macro and at most 511 different
.debug_gnu_macro chains with unique md5sums. So, the cost of the
3 byte headers is for cc1plus just in CU referenced chunks
1185 bytes, 3 byte headers in all .debug_gnu_macro chunks
2718 bytes.

Also, should the decision whether to emit .debug_gnu_macro or .debug_macinfo
depend on -gstrict-dwarf, or should we have a separate switch for that?

Jakub

[-- Attachment #2: X200 --]
[-- Type: text/plain, Size: 19219 bytes --]

2011-07-21  Jakub Jelinek  <jakub@redhat.com>

        * dwarf2.h (DW_AT_GNU_macros): New.
        (enum dwarf_gnu_macro_record_type): New enum. Add DW_GNU_MACRO_*.

        * dwarf2out.c (struct macinfo_struct): Change code to unsigned char.
        (DEBUG_GNU_MACRO_SECTION, DEBUG_GNU_MACRO_SECTION_LABEL): Define.
        (dwarf_attr_name): Handle DW_AT_GNU_macros.
        (dwarf2out_define): If the vector is empty and
        lineno is 0, emit a dummy entry first.
        (dwarf2out_undef): Likewise. Remove redundant semicolon.
        (htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
        optimize_macinfo_range): New functions.
        (output_macinfo): Use them. If !dwarf_strict and .debug_str is
        mergeable, optimize longer strings using
        DW_GNU_MACRO_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
        optimize longer sequences of define/undef ops from headers
        using DW_GNU_MACRO_transparent_include. For !dwarf_strict
        emit a section header.
        (dwarf2out_init): For !dwarf_strict set debug_macinfo_section
        and macinfo_section_label to DEBUG_GNU_MACRO_SECTION resp.
DEBUG_GNU_MACRO_SECTION_LABEL. (dwarf2out_finish): For !dwarf_strict emit DW_AT_GNU_macros instead of DW_AT_macro_info. --- include/dwarf2.h.jj 2011-07-15 20:46:32.000000000 +0200 +++ include/dwarf2.h 2011-07-21 10:11:25.000000000 +0200 @@ -366,6 +366,8 @@ enum dwarf_attribute DW_AT_GNU_all_tail_call_sites = 0x2116, DW_AT_GNU_all_call_sites = 0x2117, DW_AT_GNU_all_source_call_sites = 0x2118, + /* Section offset into .debug_gnu_macro section. */ + DW_AT_GNU_macros = 0x2119, /* VMS extensions. */ DW_AT_VMS_rtnbeg_pd_address = 0x2201, /* GNAT extensions. */ @@ -879,6 +881,21 @@ enum dwarf_macinfo_record_type DW_MACINFO_end_file = 4, DW_MACINFO_vendor_ext = 255 }; + +/* Names and codes for new style macro information. */ +enum dwarf_gnu_macro_record_type + { + DW_GNU_MACRO_define = 1, + DW_GNU_MACRO_undef = 2, + DW_GNU_MACRO_start_file = 3, + DW_GNU_MACRO_end_file = 4, + DW_GNU_MACRO_define_indirect = 5, + DW_GNU_MACRO_undef_indirect = 6, + DW_GNU_MACRO_transparent_include = 7, + DW_GNU_MACRO_define_opcode = 8, + DW_GNU_MACRO_lo_user = 0xe0, + DW_GNU_MACRO_hi_user = 0xff + }; \f /* @@@ For use with GNU frame unwind information. */ --- gcc/dwarf2out.c.jj 2011-07-21 09:54:49.000000000 +0200 +++ gcc/dwarf2out.c 2011-07-21 10:47:03.000000000 +0200 @@ -2770,7 +2770,7 @@ struct GTY(()) dw_ranges_struct { /* A structure to hold a macinfo entry. 
*/ typedef struct GTY(()) macinfo_struct { - unsigned HOST_WIDE_INT code; + unsigned char code; unsigned HOST_WIDE_INT lineno; const char *info; } @@ -3417,6 +3417,9 @@ static void gen_scheduled_generic_parms_ #ifndef DEBUG_MACINFO_SECTION #define DEBUG_MACINFO_SECTION ".debug_macinfo" #endif +#ifndef DEBUG_GNU_MACRO_SECTION +#define DEBUG_GNU_MACRO_SECTION ".debug_gnu_macro" +#endif #ifndef DEBUG_LINE_SECTION #define DEBUG_LINE_SECTION ".debug_line" #endif @@ -3474,6 +3477,9 @@ static void gen_scheduled_generic_parms_ #ifndef DEBUG_MACINFO_SECTION_LABEL #define DEBUG_MACINFO_SECTION_LABEL "Ldebug_macinfo" #endif +#ifndef DEBUG_GNU_MACRO_SECTION_LABEL +#define DEBUG_GNU_MACRO_SECTION_LABEL "Ldebug_gnu_macro" +#endif /* Definitions of defaults for formats and names of various special @@ -4015,6 +4021,8 @@ dwarf_attr_name (unsigned int attr) return "DW_AT_GNU_all_call_sites"; case DW_AT_GNU_all_source_call_sites: return "DW_AT_GNU_all_source_call_sites"; + case DW_AT_GNU_macros: + return "DW_AT_GNU_macros"; case DW_AT_GNAT_descriptive_type: return "DW_AT_GNAT_descriptive_type"; @@ -20291,6 +20299,15 @@ dwarf2out_define (unsigned int lineno AT if (debug_info_level >= DINFO_LEVEL_VERBOSE) { macinfo_entry e; + /* Insert a dummy first entry to be able to optimize the whole + predefined macro block using DW_GNU_MACRO_transparent_include. */ + if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0) + { + e.code = 0; + e.lineno = 0; + e.info = NULL; + VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); + } e.code = DW_MACINFO_define; e.lineno = lineno; e.info = xstrdup (buffer);; @@ -20309,58 +20326,376 @@ dwarf2out_undef (unsigned int lineno ATT if (debug_info_level >= DINFO_LEVEL_VERBOSE) { macinfo_entry e; + /* Insert a dummy first entry to be able to optimize the whole + predefined macro block using DW_GNU_MACRO_transparent_include. 
*/ + if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0) + { + e.code = 0; + e.lineno = 0; + e.info = NULL; + VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); + } e.code = DW_MACINFO_undef; e.lineno = lineno; - e.info = xstrdup (buffer);; + e.info = xstrdup (buffer); VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); } } +/* Routines to manipulate hash table of CUs. */ + +static hashval_t +htab_macinfo_hash (const void *of) +{ + const macinfo_entry *const entry = + (const macinfo_entry *) of; + + return htab_hash_string (entry->info); +} + +static int +htab_macinfo_eq (const void *of1, const void *of2) +{ + const macinfo_entry *const entry1 = (const macinfo_entry *) of1; + const macinfo_entry *const entry2 = (const macinfo_entry *) of2; + + return !strcmp (entry1->info, entry2->info); +} + +/* Output a single .debug_macinfo entry. */ + +static void +output_macinfo_op (macinfo_entry *ref) +{ + int file_num; + size_t len; + struct indirect_string_node *node; + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + + switch (ref->code) + { + case DW_MACINFO_start_file: + file_num = maybe_emit_file (lookup_filename (ref->info)); + dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); + dw2_asm_output_data_uleb128 (ref->lineno, + "Included from line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + break; + case DW_MACINFO_end_file: + dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + len = strlen (ref->info) + 1; + if (!dwarf_strict + && len > DWARF_OFFSET_SIZE + && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET + && (debug_str_section->common.flags & SECTION_MERGE) != 0) + { + ref->code = ref->code == DW_MACINFO_define + ? DW_GNU_MACRO_define_indirect + : DW_GNU_MACRO_undef_indirect; + output_macinfo_op (ref); + return; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_define + ? 
"Define macro" : "Undefine macro"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_nstring (ref->info, -1, "The macro"); + break; + case DW_GNU_MACRO_define_indirect: + case DW_GNU_MACRO_undef_indirect: + node = find_AT_string (ref->info); + if (node->form != DW_FORM_strp) + { + char label[32]; + ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter); + ++dw2_string_counter; + node->label = xstrdup (label); + node->form = DW_FORM_strp; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_GNU_MACRO_define_indirect + ? "Define macro indirect" + : "Undefine macro indirect"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, + debug_str_section, "The macro: \"%s\"", + ref->info); + break; + case DW_GNU_MACRO_transparent_include: + dw2_asm_output_data (1, ref->code, "Transparent include"); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_GNU_MACRO_SECTION_LABEL, ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL); + break; + default: + fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", + ASM_COMMENT_START, (unsigned long) ref->code); + break; + } +} + +/* Attempt to make a sequence of define/undef macinfo ops shareable with + other compilation unit .debug_macinfo sections. IDX is the first + index of a define/undef, return the number of ops that should be + emitted in a comdat .debug_macinfo section and emit + a DW_GNU_MACRO_transparent_include entry referencing it. + If the define/undef entry should be emitted normally, return 0. 
*/ + +static unsigned +optimize_macinfo_range (unsigned int idx, VEC (macinfo_entry, gc) *files, + htab_t *macinfo_htab) +{ + macinfo_entry *first, *second, *cur, *inc; + char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1]; + unsigned char checksum[16]; + struct md5_ctx ctx; + char *grp_name, *tail; + const char *base; + unsigned int i, count, encoded_filename_len, linebuf_len; + void **slot; + + first = VEC_index (macinfo_entry, macinfo_table, idx); + second = VEC_index (macinfo_entry, macinfo_table, idx + 1); + + /* Optimize only if there are at least two consecutive define/undef ops, + and either all of them are before first DW_MACINFO_start_file + with lineno 0 (i.e. predefined macro block), or all of them are + in some included header file. */ + if (second->code != DW_MACINFO_define && second->code != DW_MACINFO_undef) + return 0; + if (VEC_empty (macinfo_entry, files)) + { + if (first->lineno != 0 || second->lineno != 0) + return 0; + } + else if (first->lineno == 0) + return 0; + + /* Find the last define/undef entry that can be grouped together + with first and at the same time compute md5 checksum of their + codes, linenumbers and strings. */ + md5_init_ctx (&ctx); + for (i = idx; VEC_iterate (macinfo_entry, macinfo_table, i, cur); i++) + if (cur->code != DW_MACINFO_define && cur->code != DW_MACINFO_undef) + break; + else if (first->lineno == 0 && cur->lineno != 0) + break; + else + { + unsigned char code = cur->code; + md5_process_bytes (&code, 1, &ctx); + checksum_uleb128 (cur->lineno, &ctx); + md5_process_bytes (cur->info, strlen (cur->info) + 1, &ctx); + } + md5_finish_ctx (&ctx, checksum); + count = i - idx; + + /* From the containing include filename (if any) pick up just + usable characters from its basename. */ + if (first->lineno == 0) + base = ""; + else + base = lbasename (VEC_last (macinfo_entry, files)->info); + for (encoded_filename_len = 0, i = 0; base[i]; i++) + if (ISIDNUM (base[i]) || base[i] == '.') + encoded_filename_len++; + /* Count . 
at the end. */ + if (encoded_filename_len) + encoded_filename_len++; + + sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, first->lineno); + linebuf_len = strlen (linebuf); + + /* The group name format is: wmN.[<encoded filename>.]<lineno>.<md5sum> */ + grp_name = XNEWVEC (char, 4 + encoded_filename_len + linebuf_len + 1 + + 16 * 2 + 1); + memcpy (grp_name, DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8.", 4); + tail = grp_name + 4; + if (encoded_filename_len) + { + for (i = 0; base[i]; i++) + if (ISIDNUM (base[i]) || base[i] == '.') + *tail++ = base[i]; + *tail++ = '.'; + } + memcpy (tail, linebuf, linebuf_len); + tail += linebuf_len; + *tail++ = '.'; + for (i = 0; i < 16; i++) + sprintf (tail + i * 2, "%02x", checksum[i] & 0xff); + + /* Construct a macinfo_entry for DW_GNU_MACRO_transparent_include + in the empty vector entry before the first define/undef. */ + inc = VEC_index (macinfo_entry, macinfo_table, idx - 1); + inc->code = DW_GNU_MACRO_transparent_include; + inc->lineno = 0; + inc->info = grp_name; + if (*macinfo_htab == NULL) + *macinfo_htab = htab_create (10, htab_macinfo_hash, htab_macinfo_eq, NULL); + /* Avoid emitting duplicates. */ + slot = htab_find_slot (*macinfo_htab, inc, INSERT); + if (*slot != NULL) + { + free (CONST_CAST (char *, inc->info)); + inc->code = 0; + inc->info = NULL; + /* If such an entry has been used before, just emit + a DW_GNU_MACRO_transparent_include op. */ + inc = (macinfo_entry *) *slot; + output_macinfo_op (inc); + /* And clear all macinfo_entry in the range to avoid emitting them + in the second pass. */ + for (i = idx; + VEC_iterate (macinfo_entry, macinfo_table, i, cur) + && i < idx + count; + i++) + { + cur->code = 0; + free (CONST_CAST (char *, cur->info)); + cur->info = NULL; + } + } + else + { + *slot = inc; + inc->lineno = htab_elements (*macinfo_htab); + output_macinfo_op (inc); + } + return count; +} + +/* Output macinfo section(s). 
*/ + static void output_macinfo (void) { unsigned i; unsigned long length = VEC_length (macinfo_entry, macinfo_table); macinfo_entry *ref; + VEC (macinfo_entry, gc) *files = NULL; + htab_t macinfo_htab = NULL; if (! length) return; + /* output_macinfo* uses these interchangeably. */ + gcc_assert ((int) DW_MACINFO_define == (int) DW_GNU_MACRO_define + && (int) DW_MACINFO_undef == (int) DW_GNU_MACRO_undef + && (int) DW_MACINFO_start_file == (int) DW_GNU_MACRO_start_file + && (int) DW_MACINFO_end_file == (int) DW_GNU_MACRO_end_file); + + /* For .debug_gnu_macro emit the section header. */ + if (!dwarf_strict) + { + dw2_asm_output_data (2, 4, "DWARF GNU macro version number"); + dw2_asm_output_data (1, DWARF_OFFSET_SIZE, "Offset size"); + } + + /* In the first loop, it emits the primary .debug_macinfo section + and after each emitted op the macinfo_entry is cleared. + If a longer range of define/undef ops can be optimized using + DW_GNU_MACRO_transparent_include, the + DW_GNU_MACRO_transparent_include op is emitted and kept in + the vector before the first define/undef in the range and the + whole range of define/undef ops is not emitted and kept. 
*/ for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++) { switch (ref->code) { - case DW_MACINFO_start_file: + case DW_MACINFO_start_file: + VEC_safe_push (macinfo_entry, gc, files, ref); + break; + case DW_MACINFO_end_file: + if (!VEC_empty (macinfo_entry, files)) { - int file_num = maybe_emit_file (lookup_filename (ref->info)); - dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); - dw2_asm_output_data_uleb128 - (ref->lineno, "Included from line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + macinfo_entry *file = VEC_last (macinfo_entry, files); + free (CONST_CAST (char *, file->info)); + VEC_pop (macinfo_entry, files); } - break; - case DW_MACINFO_end_file: - dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); - break; - case DW_MACINFO_define: - dw2_asm_output_data (1, DW_MACINFO_define, "Define macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, "The macro"); - break; - case DW_MACINFO_undef: - dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro"); - dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", - (unsigned long)ref->lineno); - dw2_asm_output_nstring (ref->info, -1, "The macro"); - break; - default: - fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", - ASM_COMMENT_START, (unsigned long)ref->code); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + if (!dwarf_strict + && HAVE_COMDAT_GROUP + && VEC_length (macinfo_entry, files) != 1 + && i > 0 + && i + 1 < length + && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0) + { + unsigned count = optimize_macinfo_range (i, files, &macinfo_htab); + if (count) + { + i += count - 1; + continue; + } + } + break; + case 0: + /* A dummy entry may be inserted at the beginning to be able + to optimize the whole block of predefined macros. 
*/ + if (i == 0) + continue; + default: break; } + output_macinfo_op (ref); + /* For DW_MACINFO_start_file ref->info has been copied into files + vector. */ + if (ref->code != DW_MACINFO_start_file) + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + ref->code = 0; } + + if (macinfo_htab == NULL) + return; + + htab_delete (macinfo_htab); + + /* If any DW_GNU_MACRO_transparent_include were used, on those + DW_GNU_MACRO_transparent_include entries terminate the + current chain and switch to a new comdat .debug_macinfo + section and emit the define/undef entries within it. */ + for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++) + switch (ref->code) + { + case 0: + continue; + case DW_GNU_MACRO_transparent_include: + { + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + tree comdat_key = get_identifier (ref->info); + /* Terminate the previous .debug_macinfo section. */ + dw2_asm_output_data (1, 0, "End compilation unit"); + targetm.asm_out.named_section (DEBUG_GNU_MACRO_SECTION, + SECTION_DEBUG + | SECTION_LINKONCE, + comdat_key); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_GNU_MACRO_SECTION_LABEL, + ref->lineno); + ASM_OUTPUT_LABEL (asm_out_file, label); + ref->code = 0; + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + } + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + output_macinfo_op (ref); + ref->code = 0; + free (CONST_CAST (char *, ref->info)); + ref->info = NULL; + break; + default: + gcc_unreachable (); + } } /* Set up for Dwarf output at the start of compilation. */ @@ -20409,7 +20744,9 @@ dwarf2out_init (const char *filename ATT SECTION_DEBUG, NULL); debug_aranges_section = get_section (DEBUG_ARANGES_SECTION, SECTION_DEBUG, NULL); - debug_macinfo_section = get_section (DEBUG_MACINFO_SECTION, + debug_macinfo_section = get_section (dwarf_strict + ? 
DEBUG_MACINFO_SECTION + : DEBUG_GNU_MACRO_SECTION, SECTION_DEBUG, NULL); debug_line_section = get_section (DEBUG_LINE_SECTION, SECTION_DEBUG, NULL); @@ -20441,7 +20778,9 @@ dwarf2out_init (const char *filename ATT ASM_GENERATE_INTERNAL_LABEL (ranges_section_label, DEBUG_RANGES_SECTION_LABEL, 0); ASM_GENERATE_INTERNAL_LABEL (macinfo_section_label, - DEBUG_MACINFO_SECTION_LABEL, 0); + dwarf_strict + ? DEBUG_MACINFO_SECTION_LABEL + : DEBUG_GNU_MACRO_SECTION_LABEL, 0); if (debug_info_level >= DINFO_LEVEL_VERBOSE) macinfo_table = VEC_alloc (macinfo_entry, gc, 64); @@ -21984,7 +22323,9 @@ dwarf2out_finish (const char *filename) debug_line_section_label); if (debug_info_level >= DINFO_LEVEL_VERBOSE) - add_AT_macptr (comp_unit_die (), DW_AT_macro_info, macinfo_section_label); + add_AT_macptr (comp_unit_die (), + dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros, + macinfo_section_label); if (have_location_lists) optimize_location_lists (comp_unit_die ()); ^ permalink raw reply [flat|nested] 25+ messages in thread
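For readers of the take 4 patch, the 3 byte header that output_macinfo emits for !dwarf_strict (a 2 byte version, currently 4, followed by a 1 byte offset size) could be consumed roughly as follows. This is a sketch under stated assumptions, not part of the patch: struct gnu_macro_header and parse_gnu_macro_header are hypothetical names, and the 2 byte version is assumed to be stored little-endian, as it would be on x86_64.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of the 3 byte .debug_gnu_macro header.  */
struct gnu_macro_header
{
  unsigned version;      /* 2 bytes in the section, 4 in this patch.  */
  unsigned offset_size;  /* 1 byte in the section, 4 or 8.  */
};

/* Parse the header at BUF (LEN bytes available), assuming a
   little-endian target.  Returns 0 on success, -1 if the data is
   truncated or does not look like a version 4 header.  */
static int
parse_gnu_macro_header (const unsigned char *buf, size_t len,
                        struct gnu_macro_header *hdr)
{
  assert (buf != NULL && hdr != NULL);
  if (len < 3)
    return -1;
  hdr->version = buf[0] | (buf[1] << 8);
  hdr->offset_size = buf[2];
  if (hdr->version != 4)
    return -1;
  if (hdr->offset_size != 4 && hdr->offset_size != 8)
    return -1;
  return 0;
}
```

Note that in this version of the patch only the chunks referenced through DW_AT_GNU_macros start with such a header. The chains reached through DW_GNU_MACRO_transparent_include do not, so a dumper cannot apply this parser at an arbitrary chain start.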
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
From: Richard Henderson @ 2011-07-21 17:25 UTC
To: Jakub Jelinek
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/21/2011 04:22 AM, Jakub Jelinek wrote:
> Currently, the patch emits 3 byte section headers at the start of
> the .debug_gnu_macro chunks referenced from .debug_info (through
> DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
> 1 byte section offset, but the DW_GNU_MACRO_transparent_include
> referenced sequences don't have it.
> The .debug_gnu_macro section isn't completely usable without the referencing
> CUs anyway, so IMHO we could still get away completely without
> any section header, but if we need it, the question is if
> the offset size there is useful and if the section header shouldn't
> go before the transparent_include chains as well (only with that
> e.g. readelf -wm would be able to dump .debug_gnu_macro without
> reading .debug_info and tracking offsets to it).

I've been wondering if the header shouldn't contain the opcode
definitions, similar to .debug_line, and drop your _define_opcode.
It does mean that you couldn't re-define opcodes within any one
sequence, but does that actually seem useful?

Defining the opcodes in the header makes it clear that there
should be a header for the include sequences, and that makes it
clear that the defined opcodes are local to a given sequence,
without having to have awkward wording as for _define_opcode.

I do like mjw's idea of using the version number to distinguish
our implementation and one with the dwarf5 stamp of approval.
This suggests going ahead with .debug_macro as the section name.
> In x86_64 cc1plus for which I've been posting figures, I see
> 395 CUs referencing .debug_gnu_macro and at most 511 different
> .debug_gnu_macro chains with unique md5sums. So, the cost of the
> 3 byte headers is for cc1plus just in CU referenced chunks
> 1185 bytes, 3 byte headers in all .debug_gnu_macro chunks
> 2718 bytes.

Putting the opcode definitions into the header would increase
the overhead more, somewhere between 12 and 20 bytes per chain.
Which is, I think, still manageable.

> Also, should the decision whether to emit .debug_gnu_macro or .debug_macinfo
> depend on -gstrict-dwarf, or should we have a separate switch for that?

I'm fine with strict. Anyone else have an opinion?

r~
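The header cost figures quoted above follow directly from the chunk counts, and the same arithmetic gives a feel for the 12 to 20 byte estimate. The sketch below is only a sanity check: header_overhead is a hypothetical helper, 511 is the upper bound on include chains given in the thread, and applying the 12 to 20 byte figure to all 906 chunks is an assumption rather than a number anyone posted.

```c
#include <assert.h>

/* Chunk counts for x86_64 cc1plus as reported in the thread.  */
enum
{
  CU_CHUNKS = 395,          /* chunks referenced from .debug_info */
  INCLUDE_CHAINS_MAX = 511  /* upper bound on transparent_include chains */
};

/* Total header cost in bytes for CHUNKS chunks, each starting with a
   header of BYTES_PER_HEADER bytes.  */
static unsigned long
header_overhead (unsigned long chunks, unsigned long bytes_per_header)
{
  return chunks * bytes_per_header;
}
```

With these inputs, header_overhead (CU_CHUNKS, 3) is 1185 and header_overhead (CU_CHUNKS + INCLUDE_CHAINS_MAX, 3) is 2718, matching the posted figures. Under the assumption above, headers of 12 to 20 bytes on every chunk would cost between 10872 and 18120 bytes for the same binary.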
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
From: Jakub Jelinek @ 2011-07-21 18:13 UTC
To: Richard Henderson
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil, Cary Coutant, Mark Wielaard, dwarf-discuss

On Thu, Jul 21, 2011 at 10:10:39AM -0700, Richard Henderson wrote:
> On 07/21/2011 04:22 AM, Jakub Jelinek wrote:
> > Currently, the patch emits 3 byte section headers at the start of
> > the .debug_gnu_macro chunks referenced from .debug_info (through
> > DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
> > 1 byte section offset, but the DW_GNU_MACRO_transparent_include
> > referenced sequences don't have it.
> > The .debug_gnu_macro section isn't completely usable without the referencing
> > CUs anyway, so IMHO we could still get away completely without
> > any section header, but if we need it, the question is if
> > the offset size there is useful and if the section header shouldn't
> > go before the transparent_include chains as well (only with that
> > e.g. readelf -wm would be able to dump .debug_gnu_macro without
> > reading .debug_info and tracking offsets to it).
>
> I've been wondering if the header shouldn't contain the opcode
> definitions, similar to .debug_line, and drop your _define_opcode.
> It does mean that you couldn't re-define opcodes within any one
> sequence, but does that actually seem useful?

I've talked to Tom about it last night. The advantage of not having
it in the header is saving 1 byte for the case when no extension opcodes
need to be defined, and perhaps if we changed the wording that the defined
opcodes end at 0 termination to allow it to last, then we could with many
opcodes share the opcode arguments descriptions.
So we could have
  DW_GNU_MACRO_transparent_include .Ldebug_macro17
after the header of many sections and then
.Ldebug_macro17:
  DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+0 1 DW_FORM_udata
  DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+1 2 DW_FORM_udata DW_FORM_sdata
  DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+2 1 DW_FORM_strp
  0

If the opcode definitions were in the header, then they could be either
after a uleb128 that would say how many of the definitions there are,
followed by what I've been proposing as DW_GNU_MACRO_define_opcode
arguments alone (i.e. opcode number and DW_FORM_block array of forms
for arguments). Or it could be instead without the uleb128, but zero
terminated.

> Defining the opcodes in the header makes it clear that there
> should be a header for the include sequences, and that makes it
> clear that the defined opcodes are local to a given sequence,
> without having to have awkward wording as for _define_opcode.
>
> I do like mjw's idea of using the version number to distinguish
> our implementation and one with the dwarf5 stamp of approval.
> This suggests going ahead with .debug_macro as the section name.

If we knew that DWARF5 would either start the .debug_macro sections
with a header starting with the 2 byte version and the version there
would be 5 (I think if it does start with a 2 byte version number,
it would use 5), then perhaps it would be safe to use .debug_macro
section with version 4 (or 1?).

Shall we use DW_GNU_MACRO_* names, or DW_MACRO_GNU_* names?

> > In x86_64 cc1plus for which I've been posting figures, I see
> > 395 CUs referencing .debug_gnu_macro and at most 511 different
> > .debug_gnu_macro chains with unique md5sums. So, the cost of the
> > 3 byte headers is for cc1plus just in CU referenced chunks
> > 1185 bytes, 3 byte headers in all .debug_gnu_macro chunks
> > 2718 bytes.
>
> Putting the opcode definitions into the header would increase
> the overhead more, somewhere between 12 and 20 bytes per chain.
> Which is, I think still manageable.

The question is, do we want to always describe all the opcodes we use,
or can we assume the ops described in the corresponding standard as given?
Say if DWARF 5 (and our version 4) defines 8 standard opcodes, and
DWARF 6 adds another 3, and we want to use the new opcodes, with
-gdwarf-5 -gno-strict-dwarf we'd define the opcode arguments for the
3 DWARF 6 ops (or a subset of them that we actually use), while for
-gdwarf-6 we wouldn't define any and just put version 6 into the section.

> > Also, should the decision whether to emit .debug_gnu_macro or .debug_macinfo
> > depend on -gdwarf-strict, or should we have a separate switch for that?
>
> I'm fine with strict. Anyone else have an opinion?

Ok.

	Jakub
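The policy question raised above (describe every opcode, or only those the section's version does not already standardize) can be sketched as follows. This is only an illustration of the second option, not anything from the patch: both function names are hypothetical, the numbers 8 and 11 come from the "say if" example in the message, and the sketch further assumes that standard opcodes are numbered contiguously from 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: highest opcode number treated as standard for a given
   section version.  The values mirror the example in the message
   (8 standard opcodes up to version 5, 3 more from version 6 on).  */
static unsigned
highest_standard_opcode (unsigned version)
{
  return version >= 6 ? 11 : 8;
}

/* Count how many of the opcodes USED[0..N-1] a producer would have to
   describe in the opcode table, assuming opcodes standardized by
   VERSION are taken as given and need no description.  */
static size_t
opcodes_to_describe (unsigned version, const unsigned char *used, size_t n)
{
  size_t i, count = 0;

  assert (used != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (used[i] > highest_standard_opcode (version))
      count++;
  return count;
}
```

For a unit that uses opcodes 1, 2, 9, 10 and 11, this yields three table entries when the section carries version 5 and none when it carries version 6, which is the distinction the message draws between -gdwarf-5 -gno-strict-dwarf and -gdwarf-6.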
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5)
From: Jakub Jelinek @ 2011-07-22 13:49 UTC
To: Richard Henderson
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil, Cary Coutant, Mark Wielaard

On Thu, Jul 21, 2011 at 10:10:39AM -0700, Richard Henderson wrote:
> On 07/21/2011 04:22 AM, Jakub Jelinek wrote:
> > Currently, the patch emits 3 byte section headers at the start of
> > the .debug_gnu_macro chunks referenced from .debug_info (through
> > DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
> > 1 byte section offset, but the DW_GNU_MACRO_transparent_include
> > referenced sequences don't have it.
> > The .debug_gnu_macro section isn't completely usable without the referencing
> > CUs anyway, so IMHO we could still get away completely without
> > any section header, but if we need it, the question is if
> > the offset size there is useful and if the section header shouldn't
> > go before the transparent_include chains as well (only with that
> > e.g. readelf -wm would be able to dump .debug_gnu_macro without
> > reading .debug_info and tracking offsets to it).
>
> I've been wondering if the header shouldn't contain the opcode
> definitions, similar to .debug_line, and drop your _define_opcode.
> It does mean that you couldn't re-define opcodes within any one
> sequence, but does that actually seem useful?

Ok, based on further discussions here, on Dwarf-discuss and on IRC
here is a hopefully final version.

Changes since last version:
1) uses .debug_macro section instead of .debug_gnu_macro
2) GNU_MACRO -> MACRO_GNU
3) define_opcode is gone
4) each .debug_macro part has a header, 3+ byte long
   2 byte version (4)
   1 byte flags
     (flag & 1) ? "64-bit DWARF format" : "32-bit DWARF format"
     (flag & 2) ? "lineptr present" : "lineptr missing"
     (flag & 4) ? "defopcode table present" : "defopcode table missing"
   [{4,8} byte lineptr offset (if flag & 2)]
   [1-byte count; count x { 1-byte opcode; uleb128 len; len x { 1-byte DW_FORM_* } } (if flag & 4)]
5) for the defopcode the standard lists DW_FORM_* values that are
   allowed, which rules out e.g. DW_FORM_addr (which would require
   us to know address size), DW_FORM_flag_present (which is meaningless
   in that case), etc.

Currently GCC emits the base macro sequences with lineptr included,
while the chains referenced from transparent_include with no lineptr
and no DW_MACRO_GNU_*_file in it. Perhaps we could gain some more
savings by creating a transparent_include chain also from everything
after depth > 1 start_file up to (and including) corresponding end_file,
if it is at least a few ops long, probably still by using
transparent_includes for the consecutive define/undef ranges.
The reason for the define/undef only sequences is now primarily
include guards in the headers (different .debug_line reason is gone
with the possibility to put the .debug_line reference in the header).
By keeping it two level we could have bigger savings if in several
CUs include guards didn't make a difference or the header inclusion
is the same, but still not lose completely if include guards make
a difference. Anyway, this proposal format allows to add it any time
if it results in significant savings.

DWARF edits:
============================
2.2
Change "Macro information (#define, #undef)" to
"Legacy macro information (#define, #undef)".
Add "DW_AT_macros Macro information (#define, #undef)" row to the figure 2.

3.1.1 5.
Replace "DW_AT_macro_info" with "DW_AT_macros". Replace ".debug_macinfo"
with ".debug_macro". Add at the end of the paragraph
"The DW_AT_macro_info attribute instead might refer to the .debug_macinfo
section as defined in DWARF version 4."

6.3
Replace ".debug_macinfo" with ".debug_macro".
Replace:
"The macro information for each compilation unit is represented as a series
of “macinfo” entries. Each macinfo entry consists of a “type code” and up to
two additional operands."
with:
"The macro information for each compilation unit starts with a section
header followed by a series of “macinfo” entries. Each macinfo entry
consists of a “type code” and zero or more operands."
Add at the end:
"The section header starts with a 2-byte version number, followed by
1-byte flags value. If the least significant bit (bit 0) in the flags
is cleared, this is 32-bit DWARF format macro section and offsets are
4 byte long, if it is set, it is 64-bit DWARF format macro section and
offsets are 8 byte long. If the second least significant bit (bit 1)
in the flags is set, the flags byte is followed by an offset in the
.debug_line section of the beginning of the line number information,
encoded as 4 byte offset for 32-bit DWARF format macro section and
8 byte offset for 64-bit DWARF format macro section. If the third
least significant bit (bit 2) in the flags is set, this is followed by
a table describing arguments of the macinfo entry types. The macinfo
entry types defined in this standard may, but might not, be described
in the table, other macinfo entry types used in the section should be
described there. Vendor extension macinfo entry types should be
allocated in the range from DW_MACRO_lo_user to DW_MACRO_hi_user, other
unassigned codes are reserved for future DWARF standards.
The table starts with a 1-byte count of the defined opcodes, followed by
an entry for each of those opcodes. Each entry starts with a 1-byte
opcode number, followed by unsigned LEB128 encoded number of arguments
and for each argument there is a single byte describing the form in
which the argument is encoded. The allowed values are DW_FORM_data1,
DW_FORM_data2, DW_FORM_data4, DW_FORM_data8, DW_FORM_sdata,
DW_FORM_udata, DW_FORM_block, DW_FORM_block1, DW_FORM_block2,
DW_FORM_block4, DW_FORM_flag, DW_FORM_string, DW_FORM_strp and
DW_FORM_sec_offset. This table allows a consumer to skip over unknown
macinfo entry types."

6.3.1
Replace:
"DW_MACINFO_define
A macro definition.
DW_MACINFO_undef
A macro undefinition.
DW_MACINFO_start_file
The start of a new source file inclusion.
DW_MACINFO_end_file
The end of the current source file inclusion.
DW_MACINFO_vendor_ext
Vendor specific macro information directives."
with
"DW_MACRO_define
A macro definition.
DW_MACRO_undef
A macro undefinition.
DW_MACRO_start_file
The start of a new source file inclusion.
DW_MACRO_end_file
The end of the current source file inclusion.
DW_MACRO_define_indirect
A macro definition.
DW_MACRO_undef_indirect
A macro undefinition.
DW_MACRO_transparent_include
Include a sequence of entries from given offset."

6.3.1.1
Replace all "DW_MACINFO_define" occurrences with "DW_MACRO_define" and
all "DW_MACINFO_undef" with "DW_MACRO_undef".
Append:
"All DW_MACRO_define_indirect and DW_MACRO_undef_indirect entries have
two operands. The first operand encodes the line number of the source
line on which the relevant defining or undefining macro directives
appeared. The second operand consists of an offset into a string table
contained in the .debug_str section of the object file. The size of the
operand is given in the section header offset size. Apart from the
encoding of the operands these entries are equivalent to DW_MACRO_define
resp. DW_MACRO_undef."

6.3.1.2
Replace all "DW_MACINFO_start_file" occurrences with "DW_MACRO_start_file".
Append:
"If any DW_MACRO_start_file entries are present, the header should
contain a reference to .debug_line section."

6.3.1.3
Replace all "DW_MACINFO_end_file" occurrences with "DW_MACRO_end_file".

6.3.1.4
Replace the whole section with:
"6.3.1.4 Transparent inclusion of a sequence of entries
A DW_MACRO_transparent_include entry has one operand, offset into
another part of the .debug_macro section. The size of the operand is
given in the section header offset size. This entry instructs the
consumer to replace this entry with a sequence of macinfo entries found
after the section header at the given .debug_macro offset, up to, but
excluding, the terminating entry with type code 0. This entry type is
aimed at sharing duplicate sequences of macinfo entries between macinfo
from different compilation units."

6.3.2
Replace "DW_MACINFO_start_file" with "DW_MACRO_start_file" and
"DW_MACINFO_end_file" with "DW_MACRO_end_file".

6.3.3
Replace "DW_MACINFO_define and DW_MACINFO_undef" with
"DW_MACRO_define, DW_MACRO_define_indirect, DW_MACRO_undef and
DW_MACRO_undef_indirect" and replace
"DW_MACINFO_define or DW_MACINFO_undef" with
"DW_MACRO_define, DW_MACRO_define_indirect, DW_MACRO_undef or
DW_MACRO_undef_indirect".
Replace "DW_MACINFO_start_file" with "DW_MACRO_start_file".

6.3.4
Replace ".debug_macinfo" with ".debug_macro".

7.5.4
Replace ".debug_macinfo section to the first byte" with
".debug_macro section to the header".
Add:
"DW_AT_macros 0xXX macptr" to figure 20.

7.22
Remove:
"as are the constants in a DW_MACINFO_vendor_ext entry".
Replace figure 39 with:
"Macinfo Type Name              Value
DW_MACRO_define                 0x01
DW_MACRO_undef                  0x02
DW_MACRO_start_file             0x03
DW_MACRO_end_file               0x04
DW_MACRO_define_indirect        0x05
DW_MACRO_undef_indirect         0x06
DW_MACRO_transparent_include    0x07
DW_MACRO_lo_user                0xe0
DW_MACRO_hi_user                0xff
Figure 39. Macinfo Type Encodings"

Appendix A
Add "DW_AT_macros" to allowable attributes for "DW_TAG_compile_unit"
and "DW_TAG_partial_unit".

Appendix B
Replace "DW_AT_macinfo" with "DW_AT_macros". Replace ".debug_macinfo"
with ".debug_macro".
Add
  ( .debug_macro ) -> [ DW_MACRO_transparent_include (j) ] -> ( .debug_macro )
and
  ( .debug_macro ) -> [ DW_MACRO_define_indirect, DW_MACRO_undef_indirect (k) ] -> ( .debug_str )
  ( .debug_macro ) -> [ DW_MACRO_start_file (l) ] -> ( .debug_line )
to the picture.
In (h) replace ".debug_macinfo" with ".debug_macro".
Add:
(j) .debug_macro
A macinfo operand of the form DW_FORM_sec_offset is an offset into
another part of the .debug_macro section, to the first macinfo entry
in the sequence instead of a section header.
(k) .debug_macro
A macinfo operand of the form DW_FORM_strp is an offset into the
.debug_str section of the corresponding string.
(l) .debug_macro
DW_MACRO_start_file second operand refers to a file entry in the
.debug_line section, with the .debug_macro header containing an offset
to the start of the referenced .debug_line section."

Appendix F
Change ".debug_macinfo - - -" to ".debug_macinfo - - - x"
Add ".debug_macro x x x 5" row to the table.
============================

Patch has been bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk?

2011-07-22  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_AT_GNU_macros): New.
	(enum dwarf_macro_record_type): New enum.  Add DW_MACRO_GNU_*.

	* dwarf2out.c (struct macinfo_struct): Change code to unsigned char.
	(DEBUG_MACRO_SECTION, DEBUG_MACRO_SECTION_LABEL): Define.
	(dwarf_attr_name): Handle DW_AT_GNU_macros.
	(dwarf2out_define): If the vector is empty and
	lineno is 0, emit a dummy entry first.
	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
	optimize_macinfo_range): New functions.
	(output_macinfo): Use them.
If !dwarf_strict and .debug_str is mergeable, optimize longer strings using DW_MACRO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP, optimize longer sequences of define/undef ops from headers using DW_MACRO_GNU_transparent_include. For !dwarf_strict emit a section headers. (dwarf2out_init): For !dwarf_strict set debug_macinfo_section and macinfo_section_label to DEBUG_MACRO_SECTION resp. DEBUG_MACRO_SECTION_LABEL. (dwarf2out_finish): For !dwarf_strict emit DW_AT_GNU_macros instead of DW_AT_macro_info. --- include/dwarf2.h.jj 2011-07-15 20:46:32.000000000 +0200 +++ include/dwarf2.h 2011-07-22 09:06:47.000000000 +0200 @@ -366,6 +366,8 @@ enum dwarf_attribute DW_AT_GNU_all_tail_call_sites = 0x2116, DW_AT_GNU_all_call_sites = 0x2117, DW_AT_GNU_all_source_call_sites = 0x2118, + /* Section offset into .debug_macro section. */ + DW_AT_GNU_macros = 0x2119, /* VMS extensions. */ DW_AT_VMS_rtnbeg_pd_address = 0x2201, /* GNAT extensions. */ @@ -879,6 +881,20 @@ enum dwarf_macinfo_record_type DW_MACINFO_end_file = 4, DW_MACINFO_vendor_ext = 255 }; + +/* Names and codes for new style macro information. */ +enum dwarf_macro_record_type + { + DW_MACRO_GNU_define = 1, + DW_MACRO_GNU_undef = 2, + DW_MACRO_GNU_start_file = 3, + DW_MACRO_GNU_end_file = 4, + DW_MACRO_GNU_define_indirect = 5, + DW_MACRO_GNU_undef_indirect = 6, + DW_MACRO_GNU_transparent_include = 7, + DW_MACRO_GNU_lo_user = 0xe0, + DW_MACRO_GNU_hi_user = 0xff + }; \f /* @@@ For use with GNU frame unwind information. */ --- gcc/dwarf2out.c.jj 2011-07-21 09:54:49.000000000 +0200 +++ gcc/dwarf2out.c 2011-07-22 09:18:49.000000000 +0200 @@ -2770,7 +2770,7 @@ struct GTY(()) dw_ranges_struct { /* A structure to hold a macinfo entry. 
*/ typedef struct GTY(()) macinfo_struct { - unsigned HOST_WIDE_INT code; + unsigned char code; unsigned HOST_WIDE_INT lineno; const char *info; } @@ -3417,6 +3417,9 @@ static void gen_scheduled_generic_parms_ #ifndef DEBUG_MACINFO_SECTION #define DEBUG_MACINFO_SECTION ".debug_macinfo" #endif +#ifndef DEBUG_MACRO_SECTION +#define DEBUG_MACRO_SECTION ".debug_macro" +#endif #ifndef DEBUG_LINE_SECTION #define DEBUG_LINE_SECTION ".debug_line" #endif @@ -3474,6 +3477,9 @@ static void gen_scheduled_generic_parms_ #ifndef DEBUG_MACINFO_SECTION_LABEL #define DEBUG_MACINFO_SECTION_LABEL "Ldebug_macinfo" #endif +#ifndef DEBUG_MACRO_SECTION_LABEL +#define DEBUG_MACRO_SECTION_LABEL "Ldebug_macro" +#endif /* Definitions of defaults for formats and names of various special @@ -4015,6 +4021,8 @@ dwarf_attr_name (unsigned int attr) return "DW_AT_GNU_all_call_sites"; case DW_AT_GNU_all_source_call_sites: return "DW_AT_GNU_all_source_call_sites"; + case DW_AT_GNU_macros: + return "DW_AT_GNU_macros"; case DW_AT_GNAT_descriptive_type: return "DW_AT_GNAT_descriptive_type"; @@ -20291,6 +20299,15 @@ dwarf2out_define (unsigned int lineno AT if (debug_info_level >= DINFO_LEVEL_VERBOSE) { macinfo_entry e; + /* Insert a dummy first entry to be able to optimize the whole + predefined macro block using DW_MACRO_GNU_transparent_include. */ + if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0) + { + e.code = 0; + e.lineno = 0; + e.info = NULL; + VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); + } e.code = DW_MACINFO_define; e.lineno = lineno; e.info = xstrdup (buffer);; @@ -20309,58 +20326,386 @@ dwarf2out_undef (unsigned int lineno ATT if (debug_info_level >= DINFO_LEVEL_VERBOSE) { macinfo_entry e; + /* Insert a dummy first entry to be able to optimize the whole + predefined macro block using DW_MACRO_GNU_transparent_include. 
*/ + if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0) + { + e.code = 0; + e.lineno = 0; + e.info = NULL; + VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); + } e.code = DW_MACINFO_undef; e.lineno = lineno; - e.info = xstrdup (buffer);; + e.info = xstrdup (buffer); VEC_safe_push (macinfo_entry, gc, macinfo_table, &e); } } +/* Routines to manipulate hash table of CUs. */ + +static hashval_t +htab_macinfo_hash (const void *of) +{ + const macinfo_entry *const entry = + (const macinfo_entry *) of; + + return htab_hash_string (entry->info); +} + +static int +htab_macinfo_eq (const void *of1, const void *of2) +{ + const macinfo_entry *const entry1 = (const macinfo_entry *) of1; + const macinfo_entry *const entry2 = (const macinfo_entry *) of2; + + return !strcmp (entry1->info, entry2->info); +} + +/* Output a single .debug_macinfo entry. */ + +static void +output_macinfo_op (macinfo_entry *ref) +{ + int file_num; + size_t len; + struct indirect_string_node *node; + char label[MAX_ARTIFICIAL_LABEL_BYTES]; + + switch (ref->code) + { + case DW_MACINFO_start_file: + file_num = maybe_emit_file (lookup_filename (ref->info)); + dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file"); + dw2_asm_output_data_uleb128 (ref->lineno, + "Included from line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info); + break; + case DW_MACINFO_end_file: + dw2_asm_output_data (1, DW_MACINFO_end_file, "End file"); + break; + case DW_MACINFO_define: + case DW_MACINFO_undef: + len = strlen (ref->info) + 1; + if (!dwarf_strict + && len > DWARF_OFFSET_SIZE + && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET + && (debug_str_section->common.flags & SECTION_MERGE) != 0) + { + ref->code = ref->code == DW_MACINFO_define + ? DW_MACRO_GNU_define_indirect + : DW_MACRO_GNU_undef_indirect; + output_macinfo_op (ref); + return; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACINFO_define + ? 
"Define macro" : "Undefine macro"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_nstring (ref->info, -1, "The macro"); + break; + case DW_MACRO_GNU_define_indirect: + case DW_MACRO_GNU_undef_indirect: + node = find_AT_string (ref->info); + if (node->form != DW_FORM_strp) + { + char label[32]; + ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter); + ++dw2_string_counter; + node->label = xstrdup (label); + node->form = DW_FORM_strp; + } + dw2_asm_output_data (1, ref->code, + ref->code == DW_MACRO_GNU_define_indirect + ? "Define macro indirect" + : "Undefine macro indirect"); + dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", + (unsigned long) ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label, + debug_str_section, "The macro: \"%s\"", + ref->info); + break; + case DW_MACRO_GNU_transparent_include: + dw2_asm_output_data (1, ref->code, "Transparent include"); + ASM_GENERATE_INTERNAL_LABEL (label, + DEBUG_MACRO_SECTION_LABEL, ref->lineno); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL); + break; + default: + fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n", + ASM_COMMENT_START, (unsigned long) ref->code); + break; + } +} + +/* Attempt to make a sequence of define/undef macinfo ops shareable with + other compilation unit .debug_macinfo sections. IDX is the first + index of a define/undef, return the number of ops that should be + emitted in a comdat .debug_macinfo section and emit + a DW_MACRO_GNU_transparent_include entry referencing it. + If the define/undef entry should be emitted normally, return 0. 
*/ + +static unsigned +optimize_macinfo_range (unsigned int idx, VEC (macinfo_entry, gc) *files, + htab_t *macinfo_htab) +{ + macinfo_entry *first, *second, *cur, *inc; + char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1]; + unsigned char checksum[16]; + struct md5_ctx ctx; + char *grp_name, *tail; + const char *base; + unsigned int i, count, encoded_filename_len, linebuf_len; + void **slot; + + first = VEC_index (macinfo_entry, macinfo_table, idx); + second = VEC_index (macinfo_entry, macinfo_table, idx + 1); + + /* Optimize only if there are at least two consecutive define/undef ops, + and either all of them are before first DW_MACINFO_start_file + with lineno 0 (i.e. predefined macro block), or all of them are + in some included header file. */ + if (second->code != DW_MACINFO_define && second->code != DW_MACINFO_undef) + return 0; + if (VEC_empty (macinfo_entry, files)) + { + if (first->lineno != 0 || second->lineno != 0) + return 0; + } + else if (first->lineno == 0) + return 0; + + /* Find the last define/undef entry that can be grouped together + with first and at the same time compute md5 checksum of their + codes, linenumbers and strings. */ + md5_init_ctx (&ctx); + for (i = idx; VEC_iterate (macinfo_entry, macinfo_table, i, cur); i++) + if (cur->code != DW_MACINFO_define && cur->code != DW_MACINFO_undef) + break; + else if (first->lineno == 0 && cur->lineno != 0) + break; + else + { + unsigned char code = cur->code; + md5_process_bytes (&code, 1, &ctx); + checksum_uleb128 (cur->lineno, &ctx); + md5_process_bytes (cur->info, strlen (cur->info) + 1, &ctx); + } + md5_finish_ctx (&ctx, checksum); + count = i - idx; + + /* From the containing include filename (if any) pick up just + usable characters from its basename. */ + if (first->lineno == 0) + base = ""; + else + base = lbasename (VEC_last (macinfo_entry, files)->info); + for (encoded_filename_len = 0, i = 0; base[i]; i++) + if (ISIDNUM (base[i]) || base[i] == '.') + encoded_filename_len++; + /* Count . 
at the end. */ + if (encoded_filename_len) + encoded_filename_len++; + + sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, first->lineno); + linebuf_len = strlen (linebuf); + + /* The group name format is: wmN.[<encoded filename>.]<lineno>.<md5sum> */ + grp_name = XNEWVEC (char, 4 + encoded_filename_len + linebuf_len + 1 + + 16 * 2 + 1); + memcpy (grp_name, DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8.", 4); + tail = grp_name + 4; + if (encoded_filename_len) + { + for (i = 0; base[i]; i++) + if (ISIDNUM (base[i]) || base[i] == '.') + *tail++ = base[i]; + *tail++ = '.'; + } + memcpy (tail, linebuf, linebuf_len); + tail += linebuf_len; + *tail++ = '.'; + for (i = 0; i < 16; i++) + sprintf (tail + i * 2, "%02x", checksum[i] & 0xff); + + /* Construct a macinfo_entry for DW_MACRO_GNU_transparent_include + in the empty vector entry before the first define/undef. */ + inc = VEC_index (macinfo_entry, macinfo_table, idx - 1); + inc->code = DW_MACRO_GNU_transparent_include; + inc->lineno = 0; + inc->info = grp_name; + if (*macinfo_htab == NULL) + *macinfo_htab = htab_create (10, htab_macinfo_hash, htab_macinfo_eq, NULL); + /* Avoid emitting duplicates. */ + slot = htab_find_slot (*macinfo_htab, inc, INSERT); + if (*slot != NULL) + { + free (CONST_CAST (char *, inc->info)); + inc->code = 0; + inc->info = NULL; + /* If such an entry has been used before, just emit + a DW_MACRO_GNU_transparent_include op. */ + inc = (macinfo_entry *) *slot; + output_macinfo_op (inc); + /* And clear all macinfo_entry in the range to avoid emitting them + in the second pass. */ + for (i = idx; + VEC_iterate (macinfo_entry, macinfo_table, i, cur) + && i < idx + count; + i++) + { + cur->code = 0; + free (CONST_CAST (char *, cur->info)); + cur->info = NULL; + } + } + else + { + *slot = inc; + inc->lineno = htab_elements (*macinfo_htab); + output_macinfo_op (inc); + } + return count; +} + +/* Output macinfo section(s). 
*/ + static void output_macinfo (void) { unsigned i; unsigned long length = VEC_length (macinfo_entry, macinfo_table); macinfo_entry *ref; + VEC (macinfo_entry, gc) *files = NULL; + htab_t macinfo_htab = NULL; if (! length) return; + /* output_macinfo* uses these interchangeably. */ + gcc_assert ((int) DW_MACINFO_define == (int) DW_MACRO_GNU_define + && (int) DW_MACINFO_undef == (int) DW_MACRO_GNU_undef + && (int) DW_MACINFO_start_file == (int) DW_MACRO_GNU_start_file + && (int) DW_MACINFO_end_file == (int) DW_MACRO_GNU_end_file); + + /* For .debug_macro emit the section header. */ + if (!dwarf_strict) + { + dw2_asm_output_data (2, 4, "DWARF macro version number"); + if (DWARF_OFFSET_SIZE == 8) + dw2_asm_output_data (1, 3, "Flags: 64-bit, lineptr present"); + else + dw2_asm_output_data (1, 2, "Flags: 32-bit, lineptr present"); + dw2_asm_output_offset (DWARF_OFFSET_SIZE, debug_line_section_label, + debug_line_section, NULL); + } + + /* In the first loop, it emits the primary .debug_macinfo section + and after each emitted op the macinfo_entry is cleared. + If a longer range of define/undef ops can be optimized using + DW_MACRO_GNU_transparent_include, the + DW_MACRO_GNU_transparent_include op is emitted and kept in + the vector before the first define/undef in the range and the + whole range of define/undef ops is not emitted and kept. 
*/
   for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
     {
       switch (ref->code)
         {
-          case DW_MACINFO_start_file:
+        case DW_MACINFO_start_file:
+          VEC_safe_push (macinfo_entry, gc, files, ref);
+          break;
+        case DW_MACINFO_end_file:
+          if (!VEC_empty (macinfo_entry, files))
             {
-              int file_num = maybe_emit_file (lookup_filename (ref->info));
-              dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-              dw2_asm_output_data_uleb128
-                (ref->lineno, "Included from line number %lu",
-                 (unsigned long)ref->lineno);
-              dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+              macinfo_entry *file = VEC_last (macinfo_entry, files);
+              free (CONST_CAST (char *, file->info));
+              VEC_pop (macinfo_entry, files);
             }
-          break;
-          case DW_MACINFO_end_file:
-          dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
-          break;
-          case DW_MACINFO_define:
-          dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-          dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-                                       (unsigned long)ref->lineno);
-          dw2_asm_output_nstring (ref->info, -1, "The macro");
-          break;
-          case DW_MACINFO_undef:
-          dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-          dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-                                       (unsigned long)ref->lineno);
-          dw2_asm_output_nstring (ref->info, -1, "The macro");
-          break;
-          default:
-          fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
-                   ASM_COMMENT_START, (unsigned long)ref->code);
+          break;
+        case DW_MACINFO_define:
+        case DW_MACINFO_undef:
+          if (!dwarf_strict
+              && HAVE_COMDAT_GROUP
+              && VEC_length (macinfo_entry, files) != 1
+              && i > 0
+              && i + 1 < length
+              && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
+            {
+              unsigned count = optimize_macinfo_range (i, files, &macinfo_htab);
+              if (count)
+                {
+                  i += count - 1;
+                  continue;
+                }
+            }
+          break;
+        case 0:
+          /* A dummy entry may be inserted at the beginning to be able
+             to optimize the whole block of predefined macros.  */
+          if (i == 0)
+            continue;
+        default:
           break;
         }
+      output_macinfo_op (ref);
+      /* For DW_MACINFO_start_file ref->info has been copied into files
+         vector.  */
+      if (ref->code != DW_MACINFO_start_file)
+        free (CONST_CAST (char *, ref->info));
+      ref->info = NULL;
+      ref->code = 0;
     }
+
+  if (macinfo_htab == NULL)
+    return;
+
+  htab_delete (macinfo_htab);
+
+  /* If any DW_MACRO_GNU_transparent_include were used, on those
+     DW_MACRO_GNU_transparent_include entries terminate the
+     current chain and switch to a new comdat .debug_macinfo
+     section and emit the define/undef entries within it.  */
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    switch (ref->code)
+      {
+      case 0:
+        continue;
+      case DW_MACRO_GNU_transparent_include:
+        {
+          char label[MAX_ARTIFICIAL_LABEL_BYTES];
+          tree comdat_key = get_identifier (ref->info);
+          /* Terminate the previous .debug_macinfo section.  */
+          dw2_asm_output_data (1, 0, "End compilation unit");
+          targetm.asm_out.named_section (DEBUG_MACRO_SECTION,
+                                         SECTION_DEBUG
+                                         | SECTION_LINKONCE,
+                                         comdat_key);
+          ASM_GENERATE_INTERNAL_LABEL (label,
+                                       DEBUG_MACRO_SECTION_LABEL,
+                                       ref->lineno);
+          ASM_OUTPUT_LABEL (asm_out_file, label);
+          ref->code = 0;
+          free (CONST_CAST (char *, ref->info));
+          ref->info = NULL;
+          dw2_asm_output_data (2, 4, "DWARF macro version number");
+          if (DWARF_OFFSET_SIZE == 8)
+            dw2_asm_output_data (1, 1, "Flags: 64-bit");
+          else
+            dw2_asm_output_data (1, 0, "Flags: 32-bit");
+        }
+        break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+        output_macinfo_op (ref);
+        ref->code = 0;
+        free (CONST_CAST (char *, ref->info));
+        ref->info = NULL;
+        break;
+      default:
+        gcc_unreachable ();
+      }
 }
 
 /* Set up for Dwarf output at the start of compilation.  */
@@ -20409,7 +20754,9 @@ dwarf2out_init (const char *filename ATT
                                       SECTION_DEBUG, NULL);
   debug_aranges_section = get_section (DEBUG_ARANGES_SECTION,
                                        SECTION_DEBUG, NULL);
-  debug_macinfo_section = get_section (DEBUG_MACINFO_SECTION,
+  debug_macinfo_section = get_section (dwarf_strict
+                                       ? DEBUG_MACINFO_SECTION
+                                       : DEBUG_MACRO_SECTION,
                                        SECTION_DEBUG, NULL);
   debug_line_section = get_section (DEBUG_LINE_SECTION,
                                     SECTION_DEBUG, NULL);
@@ -20441,7 +20788,9 @@ dwarf2out_init (const char *filename ATT
   ASM_GENERATE_INTERNAL_LABEL (ranges_section_label,
                                DEBUG_RANGES_SECTION_LABEL, 0);
   ASM_GENERATE_INTERNAL_LABEL (macinfo_section_label,
-                               DEBUG_MACINFO_SECTION_LABEL, 0);
+                               dwarf_strict
+                               ? DEBUG_MACINFO_SECTION_LABEL
+                               : DEBUG_MACRO_SECTION_LABEL, 0);
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     macinfo_table = VEC_alloc (macinfo_entry, gc, 64);
@@ -21984,7 +22333,9 @@ dwarf2out_finish (const char *filename)
                     debug_line_section_label);
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
-    add_AT_macptr (comp_unit_die (), DW_AT_macro_info, macinfo_section_label);
+    add_AT_macptr (comp_unit_die (),
+                   dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros,
+                   macinfo_section_label);
 
   if (have_location_lists)
     optimize_location_lists (comp_unit_die ());

	Jakub
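For readers following the second loop in the patch: each comdat chunk it emits begins with a 2-byte version number (4) followed by a 1-byte flags value (1 for 64-bit DWARF, 0 for 32-bit). A consumer could read that header roughly as sketched below. This is an illustration only and not part of the patch: the struct and function names are hypothetical, little-endian byte order is assumed, and treating the low flag bit as the offset-size indicator is an assumption that merely matches the two values the patch writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical consumer-side view of the chunk header written by the patch. */
struct macro_chunk_header {
    uint16_t version;      /* the patch writes 4 */
    uint8_t flags;         /* the patch writes 1 (64-bit) or 0 (32-bit) */
    unsigned offset_size;  /* derived from flags: 8 or 4 bytes */
};

/* Parse the 3-byte header at buf.  Returns the number of bytes consumed,
   or 0 if fewer than 3 bytes are available.  Assumes little-endian data
   and that bit 0 of flags selects 64-bit offsets (an assumption).  */
static size_t parse_macro_chunk_header(const uint8_t *buf, size_t len,
                                       struct macro_chunk_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 3)
        return 0;
    out->version = (uint16_t)(buf[0] | (buf[1] << 8));
    out->flags = buf[2];
    out->offset_size = (out->flags & 1) ? 8 : 4;
    return 3;
}
```

With such a header in each chunk, the offset size travels with the chunk itself, so a reader does not have to consult the referencing CU to decode the string or section offsets that follow.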
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5)
  2011-07-22 13:49 ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
@ 2011-07-22 15:34 ` Tom Tromey
  2011-07-22 17:24 ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Tom Tromey @ 2011-07-22 15:34 UTC
To: Jakub Jelinek
Cc: Richard Henderson, gcc-patches, Jason Merrill, Jan Kratochvil,
    Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> Ok, based on further discussions here, on Dwarf-discuss and on IRC
Jakub> here is a hopefully final version.

I've updated the gdb patch.

Tom
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5)
  2011-07-22 13:49 ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
  2011-07-22 15:34 ` Tom Tromey
@ 2011-07-22 17:24 ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2011-07-22 17:24 UTC
To: Jakub Jelinek
Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
    Cary Coutant, Mark Wielaard

On 07/22/2011 06:20 AM, Jakub Jelinek wrote:
> Ok, based on further discussions here, on Dwarf-discuss and on IRC
> here is a hopefully final version.

Ok by me.  The dwarf edits look good too.

r~
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-21 17:25 ` Richard Henderson
  2011-07-21 18:13 ` Jakub Jelinek
  2011-07-22 13:49 ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
@ 2011-07-22 20:33 ` Michael Eager
  2011-07-22 21:50 ` Richard Henderson
  2 siblings, 1 reply; 25+ messages in thread
From: Michael Eager @ 2011-07-22 20:33 UTC
To: Richard Henderson
Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
    Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/21/2011 10:10 AM, Richard Henderson wrote:
> I've been wondering if the header shouldn't contain the opcode
> definitions, similar to .debug_line, and drop your _define_opcode.
> It does mean that you couldn't re-define opcodes within any one
> sequence, but does that actually seem useful?

The definition of opcodes in the line number table is different from
opcodes in other tables, including a modified macro table.  There
are many opcodes (essentially every possible value is used) and the
specific meaning of the opcodes may be different for different targets.

It seems unlikely that different targets would have different meanings
for the macro opcodes.

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 20:33 ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Michael Eager
@ 2011-07-22 21:50 ` Richard Henderson
  2011-07-22 21:51 ` Michael Eager
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-22 21:50 UTC
To: Michael Eager
Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
    Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 12:54 PM, Michael Eager wrote:
> The definition of opcodes in the line number table is different from
> opcodes in other tables, including a modified macro table.  There
> are many opcodes (essentially every possible value is used) and the
> specific meaning of the opcodes may be different for different targets.

I'm referring to the standard_opcode_lengths section of the .debug_line
header here.  We're trying to do something similar for the .debug_macro
section.

r~
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 21:50 ` Richard Henderson
@ 2011-07-22 21:51 ` Michael Eager
  2011-07-22 22:10 ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Eager @ 2011-07-22 21:51 UTC
To: Richard Henderson
Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
    Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 02:08 PM, Richard Henderson wrote:
> On 07/22/2011 12:54 PM, Michael Eager wrote:
>> The definition of opcodes in the line number table is different from
>> opcodes in other tables, including a modified macro table.  There
>> are many opcodes (essentially every possible value is used) and the
>> specific meaning of the opcodes may be different for different targets.
>
> I'm referring to the standard_opcode_lengths section of the .debug_line
> header here.  We're trying to do something similar for the .debug_macro
> section.

There doesn't seem to be any need.  standard_opcode_lengths is only needed
because the opcode meanings can vary for different targets.

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 21:51 ` Michael Eager
@ 2011-07-22 22:10 ` Richard Henderson
  2011-07-23  0:32 ` Michael Eager
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-22 22:10 UTC
To: Michael Eager
Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
    Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 02:16 PM, Michael Eager wrote:
> On 07/22/2011 02:08 PM, Richard Henderson wrote:
>> On 07/22/2011 12:54 PM, Michael Eager wrote:
>>> The definition of opcodes in the line number table is different from
>>> opcodes in other tables, including a modified macro table.  There
>>> are many opcodes (essentially every possible value is used) and the
>>> specific meaning of the opcodes may be different for different targets.
>>
>> I'm referring to the standard_opcode_lengths section of the .debug_line
>> header here.  We're trying to do something similar for the .debug_macro
>> section.
>
> There doesn't seem to be any need.  standard_opcode_lengths is only needed
> because the opcode meanings can vary for different targets.

I beg your pardon, but no, the meanings of the *standard* opcodes
cannot vary.  Only the special opcode meanings vary.

See 6.2.4 #10:

# By increasing opcode_base, and adding elements to this array,
# new standard opcodes can be added, while allowing consumers who
# do not know about these new opcodes to be able to skip them.

r~
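The skipping mechanism quoted from 6.2.4 can be sketched as follows. In the .debug_line header, standard_opcode_lengths[opcode - 1] gives the number of LEB128 operands of each standard opcode below opcode_base, so a consumer that meets an opcode it does not understand can still step over its operands. This is a rough illustration and not code from the thread: the function names are hypothetical and error handling for truncated input is minimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one unsigned LEB128 value from buf, advancing *pos.
   Stops early if the buffer ends in the middle of a value.  */
static uint64_t read_uleb128(const uint8_t *buf, size_t len, size_t *pos)
{
    uint64_t result = 0;
    unsigned shift = 0;
    while (*pos < len) {
        uint8_t byte = buf[(*pos)++];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0)
            break;
    }
    return result;
}

/* Skip the operands of a standard opcode the consumer does not know.
   standard_opcode_lengths[opcode - 1] is the operand count taken from
   the .debug_line header.  Returns the position after the operands.  */
static size_t skip_unknown_standard_opcode(const uint8_t *buf, size_t len,
                                           size_t pos, uint8_t opcode,
                                           const uint8_t *standard_opcode_lengths,
                                           uint8_t opcode_base)
{
    assert(opcode >= 1 && opcode < opcode_base);
    unsigned nargs = standard_opcode_lengths[opcode - 1];
    for (unsigned i = 0; i < nargs; i++)
        (void) read_uleb128(buf, len, &pos);
    return pos;
}
```

The same idea would carry over to a macro section whose header lists operand counts or forms for extended opcodes, which is the design question being discussed here.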
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 22:10 ` Richard Henderson
@ 2011-07-23  0:32 ` Michael Eager
  2011-07-23  0:36 ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Eager @ 2011-07-23 0:32 UTC
To: Richard Henderson
Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
    Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 02:20 PM, Richard Henderson wrote:
> On 07/22/2011 02:16 PM, Michael Eager wrote:
>> On 07/22/2011 02:08 PM, Richard Henderson wrote:
>>> On 07/22/2011 12:54 PM, Michael Eager wrote:
>>>> The definition of opcodes in the line number table is different from
>>>> opcodes in other tables, including a modified macro table.  There
>>>> are many opcodes (essentially every possible value is used) and the
>>>> specific meaning of the opcodes may be different for different targets.
>>>
>>> I'm referring to the standard_opcode_lengths section of the .debug_line
>>> header here.  We're trying to do something similar for the .debug_macro
>>> section.
>>
>> There doesn't seem to be any need.  standard_opcode_lengths is only needed
>> because the opcode meanings can vary for different targets.
>
> I beg your pardon, but no, the meanings of the *standard* opcodes
> cannot vary.  Only the special opcode meanings vary.
>
> See 6.2.4 #10:
>
> # By increasing opcode_base, and adding elements to this array,
> # new standard opcodes can be added, while allowing consumers who
> # do not know about these new opcodes to be able to skip them.

Which part of "not needed" did you misunderstand?

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-23  0:32 ` Michael Eager
@ 2011-07-23  0:36 ` Richard Henderson
  2011-07-26  7:34 ` Jason Merrill
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-23 0:36 UTC
To: Michael Eager
Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
    Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 04:15 PM, Michael Eager wrote:
> On 07/22/2011 02:20 PM, Richard Henderson wrote:
>> On 07/22/2011 02:16 PM, Michael Eager wrote:
>>> On 07/22/2011 02:08 PM, Richard Henderson wrote:
>>>> On 07/22/2011 12:54 PM, Michael Eager wrote:
>>>>> The definition of opcodes in the line number table is different from
>>>>> opcodes in other tables, including a modified macro table.  There
>>>>> are many opcodes (essentially every possible value is used) and the
>>>>> specific meaning of the opcodes may be different for different targets.
>>>>
>>>> I'm referring to the standard_opcode_lengths section of the .debug_line
>>>> header here.  We're trying to do something similar for the .debug_macro
>>>> section.
>>>
>>> There doesn't seem to be any need.  standard_opcode_lengths is only needed
>>> because the opcode meanings can vary for different targets.
>>
>> I beg your pardon, but no, the meanings of the *standard* opcodes
>> cannot vary.  Only the special opcode meanings vary.
>>
>> See 6.2.4 #10:
>>
>> # By increasing opcode_base, and adding elements to this array,
>> # new standard opcodes can be added, while allowing consumers who
>> # do not know about these new opcodes to be able to skip them.
>
> Which part of "not needed" did you misunderstand?

The part in which "not needed" appears.  I'm afraid I have no
idea what you're talking about anymore.

r~
* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-23  0:36 ` Richard Henderson
@ 2011-07-26  7:34 ` Jason Merrill
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Merrill @ 2011-07-26 7:34 UTC
To: Richard Henderson
Cc: Michael Eager, Jakub Jelinek, gcc-patches, Tom Tromey,
    Jan Kratochvil, Cary Coutant, Mark Wielaard

There seems to be some violent agreement going on here.  I think
everyone agrees that we don't need to define anything about standard
.debug_macro opcodes in the binary, that they will always mean the same
thing.  The question was how to establish extended opcodes, whether via
a define_opcode operation or in the header.

Jason
* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-15 15:52 ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Jakub Jelinek
  2011-07-15 17:19 ` Richard Henderson
@ 2011-07-15 18:28 ` Tom Tromey
  2011-07-15 19:21 ` Jakub Jelinek
  1 sibling, 1 reply; 25+ messages in thread
From: Tom Tromey @ 2011-07-15 18:28 UTC
To: Jakub Jelinek
Cc: gcc-patches, Jason Merrill, Richard Henderson, Jan Kratochvil,
    Roland McGrath, Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> The patch below implements that slight change, in particular the
Jakub> "4" suffixes from the op names were dropped,
Jakub> DW_MACINFO_GNU_*_indirect have DW_FORM_udata and DW_FORM_strp
Jakub> arguments now (i.e. DWARF_OFFSET_SIZE large) and
Jakub> DW_MACINFO_GNU_transparent_include has DW_FORM_sec_offset
Jakub> argument (i.e. again 4 bytes long for 32-bit DWARF and 8 bytes
Jakub> long for 64-bit DWARF).  GCC assures that no merging will happen
Jakub> between .debug_macinfo chunks with 32-bit and 64-bit DWARF by
Jakub> adding the byte size in the comdat GROUP name.  I think that's
Jakub> cleaner than hardcoding 4 bytes and not optimizing anything on
Jakub> MIPS.

The .debug_macinfo section doesn't have any header describing its
contents.  How would a consumer know which offset size to use?

Tom
* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-15 18:28 ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Tom Tromey
@ 2011-07-15 19:21 ` Jakub Jelinek
  2011-07-15 19:30 ` Tom Tromey
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-15 19:21 UTC
To: Tom Tromey
Cc: gcc-patches, Jason Merrill, Richard Henderson, Jan Kratochvil,
    Roland McGrath, Cary Coutant, Mark Wielaard

On Fri, Jul 15, 2011 at 12:15:48PM -0600, Tom Tromey wrote:
> >>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:
>
> Jakub> The patch below implements that slight change, in particular the
> Jakub> "4" suffixes from the op names were dropped,
> Jakub> DW_MACINFO_GNU_*_indirect have DW_FORM_udata and DW_FORM_strp
> Jakub> arguments now (i.e. DWARF_OFFSET_SIZE large) and
> Jakub> DW_MACINFO_GNU_transparent_include has DW_FORM_sec_offset
> Jakub> argument (i.e. again 4 bytes long for 32-bit DWARF and 8 bytes
> Jakub> long for 64-bit DWARF).  GCC assures that no merging will happen
> Jakub> between .debug_macinfo chunks with 32-bit and 64-bit DWARF by
> Jakub> adding the byte size in the comdat GROUP name.  I think that's
> Jakub> cleaner than hardcoding 4 bytes and not optimizing anything on
> Jakub> MIPS.
>
> The .debug_macinfo section doesn't have any header describing its
> contents.  How would a consumer know which offset size to use?

The same way as it knows how to interpret the second operands of
DW_MACINFO_start_file.  They aren't meaningful without knowing what
.debug_line section they refer to.  For .debug_line, you need to remember
DW_AT_stmt_list of the CU that refers to the .debug_macinfo section
through DW_AT_macro_info, and you'd remember whether the referencing CU
is 32-bit DWARF or 64-bit DWARF.  And the producer would need to arrange
that DW_MACINFO_GNU_transparent_include referenced chunks have the same
properties (i.e. same offset size, and, if they use
DW_MACINFO_start_file, also the same .debug_line).

	Jakub
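The rule Jakub describes, where the consumer carries the offset size and DW_AT_stmt_list over from the CU that points at the macro data, might look like this on the reading side. This is only a sketch under stated assumptions: the struct and function names are hypothetical, and the byte order is assumed to be little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CU context a consumer remembers while walking the
   macro data reached through that CU's DW_AT_macro_info.  */
struct cu_context {
    unsigned offset_size;  /* 4 for 32-bit DWARF, 8 for 64-bit DWARF */
    uint64_t stmt_list;    /* DW_AT_stmt_list: this CU's .debug_line offset */
};

/* Read a DW_FORM_strp or DW_FORM_sec_offset style operand whose width
   is dictated by the referencing CU.  Assumes little-endian data and
   that at least cu->offset_size bytes are readable at p.  */
static uint64_t read_section_offset(const struct cu_context *cu,
                                    const uint8_t *p)
{
    assert(cu->offset_size == 4 || cu->offset_size == 8);
    uint64_t value = 0;
    for (unsigned i = 0; i < cu->offset_size; i++)
        value |= (uint64_t)p[i] << (8 * i);
    return value;
}
```

The same eight bytes decode differently depending on the context, which is why the producer has to keep chunks with different offset sizes from being merged.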
* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-15 19:21 ` Jakub Jelinek
@ 2011-07-15 19:30 ` Tom Tromey
  0 siblings, 0 replies; 25+ messages in thread
From: Tom Tromey @ 2011-07-15 19:30 UTC
To: Jakub Jelinek
Cc: gcc-patches, Jason Merrill, Richard Henderson, Jan Kratochvil,
    Roland McGrath, Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

>> The .debug_macinfo section doesn't have any header describing its
>> contents.  How would a consumer know which offset size to use?

Jakub> The same way as it knows how to interpret the second operands of
Jakub> DW_MACINFO_start_file.

Ok, duh.

I updated my gdb patch.  I can send it if you want.

Tom