* [RFC] More compact (100x) -g3 .debug_macinfo
@ 2011-07-13 17:12 Jakub Jelinek
  2011-07-13 19:59 ` Tom Tromey
  2011-07-15 15:52 ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Jakub Jelinek
  0 siblings, 2 replies; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-13 17:12 UTC
  To: Jason Merrill, Richard Henderson, Tom Tromey, Jan Kratochvil,
	Roland McGrath, Cary Coutant, Mark Wielaard
  Cc: gcc-patches

Hi!

Currently .debug_macinfo is prohibitively large, because it doesn't
allow for any kind of merging of duplicate debug information.

This patch is an RFC for extensions that bring it down
to manageable levels.  The idea for the first shrinking comes from Jason
and/or Roland, I think from last year, and is similar to the introduction of
DW_FORM_strp to replace DW_FORM_string in some cases.
In particular, if the string in DW_MACINFO_define or DW_MACINFO_undef is
larger than 4 bytes including terminating '\0' and there is a chance the
string might occur more than once, instead an offset into .debug_str
is used.  The usual .debug_str string merging then kicks in and removes
duplicates.
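
A minimal sketch of that size test, not the patch's code: the helper name
macro_string_goes_indirect and its parameters are hypothetical, and the real
condition in output_macinfo_op also checks dwarf_strict and target support.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): return 1 when a macro
   string is worth emitting as an offset into .debug_str, 0 when it
   should stay inline in .debug_macinfo.  */
static int
macro_string_goes_indirect (const char *str, size_t offset_size,
                            int debug_str_mergeable)
{
  size_t len;

  assert (str != NULL);
  len = strlen (str) + 1;   /* Length including the terminating '\0'.  */

  /* An inline string costs LEN bytes in every CU that defines the macro.
     An indirect one costs OFFSET_SIZE bytes per use plus one shared copy
     in .debug_str, so it only pays off when LEN exceeds OFFSET_SIZE and
     the linker can actually merge duplicate strings.  */
  return debug_str_mergeable && len > offset_size;
}
```

With a 4 byte offset, "FOO 1" (6 bytes with the terminator) would go to
.debug_str, while "A 1" (exactly 4 bytes) would stay inline.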

The second saving comes from merging identical sequences of
DW_MACINFO_define/undef ops.  Usually, when you include some header,
the macros it defines/undefines are the same.  Unfortunately it is hard
to merge whole headers, because:
1) DW_MACINFO_start_file uses .debug_line references, which prevent merging
   - different CUs have different .debug_line content
2) multiple inclusion of headers with single inclusion guards is quite
   common and makes such merging less than satisfactory: if some header
   includes <stdio.h>, and one source file includes that header without
   prior inclusion of stdio.h while a different one includes it after
   #include <stdio.h>, the .debug_macinfo sequence for that header
   differs between the two, because the headers it transitively
   includes contribute their macros only on their first inclusion

Unfortunately, as defined in DWARF{2,3,4}, .debug_macinfo does not really
allow extensions.  DW_MACINFO_vendor_ext doesn't count, because its
argument is a string, which certainly can't include embedded zeros needed
for the offsets into other sections or other portions of the same section.
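
To illustrate the embedded zero problem, here is a small sketch under stated
assumptions: little-endian encoding, and two hypothetical helpers (put_le32,
bytes_seen_as_string) that exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store OFFSET as 4 little-endian bytes, the way a 4 byte section
   offset would be laid out on a little-endian target.  */
static void
put_le32 (unsigned char buf[4], uint32_t offset)
{
  for (int i = 0; i < 4; i++)
    buf[i] = (unsigned char) (offset >> (8 * i));
}

/* Hypothetical helper: number of bytes a consumer would take as string
   payload when it scans BUF for the terminating '\0'.  */
static size_t
bytes_seen_as_string (const unsigned char *buf, size_t size)
{
  const unsigned char *nul;

  assert (buf != NULL);
  nul = (const unsigned char *) memchr (buf, 0, size);
  return nul ? (size_t) (nul - buf) : size;
}
```

An offset such as 0x100 encodes as 00 01 00 00, so a consumer reading it as
the DW_MACINFO_vendor_ext string would stop at the very first byte.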

The following approach just grabs a range of .debug_macinfo opcodes for
vendor use, provided the DWARF committee gives such an approach a green
light.  .debug_macinfo has 256 possible opcodes and defines just 5 (plus
1 for termination); the remaining 250 are unused.
Another alternative would be to come up with a .debug_gnu_macinfo section
or similar and define a new DW_AT_GNU_macro_info attribute that would be
used instead of DW_AT_macro_info, but I'd prefer to stay with
.debug_macinfo.

The newly added opcodes:
DW_MACINFO_GNU_define_indirect4		0xe0
	This opcode has two arguments, one is uleb128 lineno and the
	other is 4 byte offset into .debug_str.  Except for the
	encoding of the string it is similar to DW_MACINFO_define.
DW_MACINFO_GNU_undef_indirect4		0xe1
	This opcode has two arguments, one is uleb128 lineno and the
	other is 4 byte offset into .debug_str.  Except for the
	encoding of the string it is similar to DW_MACINFO_undef.
DW_MACINFO_GNU_transparent_include4	0xe2
	This opcode has a single argument, a 4 byte offset into
	.debug_macinfo.  It instructs the debug info consumer that
	this opcode during reading should be replaced with the sequence
	of .debug_macinfo opcodes from the mentioned offset, up to
	a terminating 0 opcode (not including that 0).
DW_MACINFO_GNU_define_opcode		0xe3
	This is an opcode for future extensibility through which
	a debugger could skip unknown opcodes.  It has 3 arguments:
	a 1 byte opcode number, a uleb128 count of arguments and
	an array count bytes long, each byte being the DW_FORM_* code
	that says how the corresponding argument is encoded.
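
As a sketch of how a consumer might decode the operands of the two
indirect4 opcodes (the struct and function names here are hypothetical, the
offset is assumed to be little-endian, and bounds checks are omitted for
brevity):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value at *P and advance *P past it.  */
static uint64_t
read_uleb128 (const unsigned char **p)
{
  uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;

  assert (p != NULL && *p != NULL);
  do
    {
      byte = *(*p)++;
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return result;
}

/* Operands of DW_MACINFO_GNU_{define,undef}_indirect4.  */
struct indirect4_operands
{
  uint64_t lineno;      /* uleb128 line number.  */
  uint32_t str_offset;  /* 4 byte offset into .debug_str.  */
};

/* Decode the operands that follow the opcode byte and advance *P.
   Assumes a little-endian producer; a real reader would use the
   object file's byte order and check for the end of the section.  */
static struct indirect4_operands
read_indirect4 (const unsigned char **p)
{
  struct indirect4_operands ops;
  const unsigned char *q;

  ops.lineno = read_uleb128 (p);
  q = *p;
  ops.str_offset = (uint32_t) q[0] | (uint32_t) q[1] << 8
                   | (uint32_t) q[2] << 16 | (uint32_t) q[3] << 24;
  *p = q + 4;
  return ops;
}
```

The string itself is then looked up at str_offset in .debug_str, exactly as
for DW_FORM_strp.
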
The debug info producers have to ensure that opcodes in
DW_MACINFO_GNU_transparent_include4 chains reference the right sections
for any .debug_macinfo that includes them (which essentially means
that DW_MACINFO_start_file can't be used in the transparent_include4
chain).

Perhaps cleaner would be not to encode the offset size in the
opcode values/names and instead have DW_MACINFO_GNU_define_indirect
and DW_MACINFO_GNU_undef_indirect whose arguments would be
DW_FORM_udata and DW_FORM_strp (i.e. offset size).  The producers
would then need to ensure that .debug_macinfo chains with different
assumed offset sizes aren't merged together, which could be done
e.g. by using wm4.[<filename>.]<lineno>.<md5> and wm8.* comdat
groups instead of the current wm.*.  DW_MACINFO_GNU_transparent_include4
would then have a single DW_FORM_sec_offset argument,
DW_MACINFO_GNU_define_opcode would have DW_FORM_data1 and DW_FORM_block
arguments, and the implicit opcode definition assumed at the start
of every .debug_macinfo would be:
DW_MACINFO_GNU_define_opcode <0, 0 []>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_define, 2 [DW_FORM_udata, DW_FORM_string]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_undef, 2 [DW_FORM_udata, DW_FORM_string]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_start_file, 2 [DW_FORM_udata, DW_FORM_udata]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_end_file, 0 []>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_undef_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_opcode, 2 [DW_FORM_data1, DW_FORM_block]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_vendor_ext, 2 [DW_FORM_udata, DW_FORM_string]>
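
With such a table, a consumer could skip the operands of an opcode it does
not understand.  The following is only a sketch of that idea: skip_operands
and the FORM_* enum are hypothetical names, only the forms used in the table
above are handled, and the numeric values are the DW_FORM_* codes from the
DWARF 4 specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The DW_FORM_* codes used by the opcode table (DWARF 4 values).  */
enum
{
  FORM_string = 0x08,
  FORM_block = 0x09,
  FORM_data1 = 0x0b,
  FORM_strp = 0x0e,
  FORM_udata = 0x0f,
  FORM_sec_offset = 0x17
};

/* Read a uleb128 from [P, END).  Return the pointer past it and store
   the value in *VALUE, or return NULL if the data is truncated.  */
static const unsigned char *
read_uleb128 (const unsigned char *p, const unsigned char *end,
              uint64_t *value)
{
  uint64_t result = 0;
  unsigned shift = 0;

  while (p < end)
    {
      unsigned char byte = *p++;
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
        {
          *value = result;
          return p;
        }
    }
  return NULL;
}

/* Skip NFORMS operands encoded as listed in FORMS.  Return the pointer
   past the last operand, or NULL on an unknown form or truncated data.  */
static const unsigned char *
skip_operands (const unsigned char *p, const unsigned char *end,
               const unsigned char *forms, size_t nforms,
               size_t offset_size)
{
  uint64_t n;
  size_t i;

  assert (p <= end);
  for (i = 0; i < nforms && p != NULL; i++)
    switch (forms[i])
      {
      case FORM_data1:
        p = p < end ? p + 1 : NULL;
        break;
      case FORM_strp:
      case FORM_sec_offset:
        p = (size_t) (end - p) >= offset_size ? p + offset_size : NULL;
        break;
      case FORM_udata:
        p = read_uleb128 (p, end, &n);
        break;
      case FORM_string:
        /* Find the terminating '\0' and step over it.  */
        p = (const unsigned char *) memchr (p, 0, (size_t) (end - p));
        if (p != NULL)
          p++;
        break;
      case FORM_block:
        /* uleb128 length followed by that many bytes.  */
        p = read_uleb128 (p, end, &n);
        if (p != NULL)
          p = (uint64_t) (end - p) >= n ? p + n : NULL;
        break;
      default:
        p = NULL;
        break;
      }
  return p;
}
```

A reader would keep one forms array per opcode, seeded with the implicit
definitions and updated whenever it meets DW_MACINFO_GNU_define_opcode.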

This approach doesn't need any linker changes.  The slight disadvantage is
a small increase in the size of -g3 built object files (e.g. on i686-linux
-g3 -O2 gcc/*.o together were 461.3MB before this patch and 518.6MB with
it, i.e. about 12% more), but the size of cc1plus is reduced
significantly, from 428.9MB down to 92.6MB.  Previously, the .debug_macinfo
section occupied 339MB in cc1plus and .debug_str 1MB; with the patch
.debug_macinfo has 1MB and .debug_str 2.5MB.  .debug_str wasn't used
for macinfo before, so macinfo now takes 2.5MB in total, compared to
339MB before.

2011-07-13  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
	(DW_MACINFO_GNU_define_indirect4, DW_MACINFO_GNU_undef_indirect4,
	DW_MACINFO_GNU_transparent_include4, DW_MACINFO_GNU_define_opcode):
	Add.

	* dwarf2out.c (dwarf2out_undef): Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op): New
	functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_MACINFO_GNU_{define,undef}_indirect4 and if HAVE_COMDAT and ELF,
	optimize longer sequences of define/undef ops from headers
	using DW_MACINFO_GNU_transparent_include4.

--- include/dwarf2.h.jj	2011-06-23 10:14:06.000000000 +0200
+++ include/dwarf2.h	2011-07-13 11:39:49.000000000 +0200
@@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_undef = 2,
     DW_MACINFO_start_file = 3,
     DW_MACINFO_end_file = 4,
-    DW_MACINFO_vendor_ext = 255
+    DW_MACINFO_lo_user = 0xe0,
+    DW_MACINFO_GNU_define_indirect4 = 0xe0,
+    DW_MACINFO_GNU_undef_indirect4 = 0xe1,
+    DW_MACINFO_GNU_transparent_include4 = 0xe2,
+    DW_MACINFO_GNU_define_opcode = 0xe3,
+    DW_MACINFO_hi_user = 0xfe,
+    DW_MACINFO_vendor_ext = 0xff
   };
 \f
 /* @@@ For use with GNU frame unwind information.  */
--- gcc/dwarf2out.c.jj	2011-07-12 17:59:01.000000000 +0200
+++ gcc/dwarf2out.c	2011-07-13 17:04:17.000000000 +0200
@@ -20383,17 +20383,118 @@ dwarf2out_undef (unsigned int lineno ATT
       macinfo_entry e;
       e.code = DW_MACINFO_undef;
       e.lineno = lineno;
-      e.info = xstrdup (buffer);;
+      e.info = xstrdup (buffer);
       VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
+/* Hash table routines for macinfo entries, keyed on the info string.  */
+static hashval_t
+htab_macinfo_hash (const void *of)
+{
+  const macinfo_entry *const entry =
+    (const macinfo_entry *) of;
+
+  return htab_hash_string (entry->info);
+}
+
+static int
+htab_macinfo_eq (const void *of1, const void *of2)
+{
+  const macinfo_entry *const entry1 = (const macinfo_entry *) of1;
+  const macinfo_entry *const entry2 = (const macinfo_entry *) of2;
+
+  return !strcmp (entry1->info, entry2->info);
+}
+
+/* Output a single .debug_macinfo entry.  */
+
+static void
+output_macinfo_op (macinfo_entry *ref)
+{
+  int file_num;
+  size_t len;
+  struct indirect_string_node *node;
+  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+
+  switch (ref->code)
+    {
+    case DW_MACINFO_start_file:
+      file_num = maybe_emit_file (lookup_filename (ref->info));
+      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
+      dw2_asm_output_data_uleb128 (ref->lineno,
+				   "Included from line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+      break;
+    case DW_MACINFO_end_file:
+      dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+      break;
+    case DW_MACINFO_define:
+    case DW_MACINFO_undef:
+      len = strlen (ref->info) + 1;
+      if (!dwarf_strict
+	  && len > DWARF_OFFSET_SIZE
+	  && DWARF_OFFSET_SIZE == 4
+	  && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+	  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
+	{
+	  ref->code = ref->code == DW_MACINFO_define
+		      ? DW_MACINFO_GNU_define_indirect4
+		      : DW_MACINFO_GNU_undef_indirect4;
+	  output_macinfo_op (ref);
+	  return;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_define
+			   ? "Define macro" : "Undefine macro");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_nstring (ref->info, -1, "The macro");
+      break;
+    case DW_MACINFO_GNU_define_indirect4:
+    case DW_MACINFO_GNU_undef_indirect4:
+      node = find_AT_string (ref->info);
+      if (node->form != DW_FORM_strp)
+	{
+	  char label[32];
+	  ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter);
+	  ++dw2_string_counter;
+	  node->label = xstrdup (label);
+	  node->form = DW_FORM_strp;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_GNU_define_indirect4
+			   ? "Define macro indirect4"
+			   : "Undefine macro indirect4");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+			     debug_str_section, "The macro: \"%s\"",
+			     ref->info);
+      break;
+    case DW_MACINFO_GNU_transparent_include4:
+      dw2_asm_output_data (1, ref->code, "Transparent include4");
+      ASM_GENERATE_INTERNAL_LABEL (label,
+				   DEBUG_MACINFO_SECTION_LABEL, ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL);
+      break;
+    default:
+      fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
+	       ASM_COMMENT_START, (unsigned long) ref->code);
+      break;
+    }
+}
+
 static void
 output_macinfo (void)
 {
   unsigned i;
   unsigned long length = VEC_length (macinfo_entry, macinfo_table);
-  macinfo_entry *ref;
+  macinfo_entry *ref, *ref2;
+  VEC (macinfo_entry, gc) *files = NULL;
+  unsigned long transparent_includes = 0;
+  htab_t macinfo_htab = NULL;
 
   if (! length)
     return;
@@ -20402,37 +20503,185 @@ output_macinfo (void)
     {
       switch (ref->code)
 	{
-	  case DW_MACINFO_start_file:
+	case DW_MACINFO_start_file:
+	  VEC_safe_push (macinfo_entry, gc, files, ref);
+	  break;
+	case DW_MACINFO_end_file:
+	  if (!VEC_empty (macinfo_entry, files))
 	    {
-	      int file_num = maybe_emit_file (lookup_filename (ref->info));
-	      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-	      dw2_asm_output_data_uleb128 
-			(ref->lineno, "Included from line number %lu", 
-			 			(unsigned long)ref->lineno);
-	      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+	      ref2 = VEC_last (macinfo_entry, files);
+	      free (CONST_CAST (char *, ref2->info));
+	      VEC_pop (macinfo_entry, files);
 	    }
-	    break;
-	  case DW_MACINFO_end_file:
-	    dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
-	    break;
-	  case DW_MACINFO_define:
-	    dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  case DW_MACINFO_undef:
-	    dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  default:
-	   fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
-	     ASM_COMMENT_START, (unsigned long)ref->code);
+	  break;
+	case DW_MACINFO_define:
+	case DW_MACINFO_undef:
+#ifdef OBJECT_FORMAT_ELF
+	  if (!dwarf_strict
+	      && HAVE_COMDAT_GROUP
+	      && DWARF_OFFSET_SIZE == 4
+	      && VEC_length (macinfo_entry, files) != 1
+	      && i > 0
+	      && i + 1 < length
+	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
+	    {
+	      char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
+	      unsigned char checksum[16];
+	      struct md5_ctx ctx;
+	      char *tmp, *tail;
+	      const char *base;
+	      unsigned int j = i, k, l;
+	      void **slot;
+
+	      ref2 = VEC_index (macinfo_entry, macinfo_table, i + 1);
+	      if (ref2->code != DW_MACINFO_define
+		  && ref2->code != DW_MACINFO_undef)
+		break;
+
+	      if (VEC_empty (macinfo_entry, files))
+		{
+		  if (ref->lineno != 0 || ref2->lineno != 0)
+		    break;
+		}
+	      else if (ref->lineno == 0)
+		break;
+	      md5_init_ctx (&ctx);
+	      for (; VEC_iterate (macinfo_entry, macinfo_table, j, ref2); j++)
+		if (ref2->code != DW_MACINFO_define
+		    && ref2->code != DW_MACINFO_undef)
+		  break;
+		else if (ref->lineno == 0 && ref2->lineno != 0)
+		  break;
+		else
+		  {
+		    unsigned char code = ref2->code;
+		    md5_process_bytes (&code, 1, &ctx);
+		    checksum_uleb128 (ref2->lineno, &ctx);
+		    md5_process_bytes (ref2->info, strlen (ref2->info) + 1,
+				       &ctx);
+		  }
+	      md5_finish_ctx (&ctx, checksum);
+	      if (ref->lineno == 0)
+		base = "";
+	      else
+		base = lbasename (VEC_last (macinfo_entry, files)->info);
+	      for (l = 0, k = 0; base[k]; k++)
+		if (ISIDNUM (base[k]) || base[k] == '.')
+		  l++;
+	      if (l)
+		l++;
+	      sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED,
+		       VEC_index (macinfo_entry, macinfo_table, i)->lineno);
+	      tmp = XNEWVEC (char, 3 + l + strlen (linebuf) + 1 + 16 * 2 + 1);
+	      strcpy (tmp, "wm.");
+	      tail = tmp + 3;
+	      if (l)
+		{
+		  for (k = 0; base[k]; k++)
+		    if (ISIDNUM (base[k]) || base[k] == '.')
+		      *tail++ = base[k];
+		  *tail++ = '.';
+		}
+	      l = strlen (linebuf);
+	      memcpy (tail, linebuf, l);
+	      tail += l;
+	      *tail++ = '.';
+	      for (k = 0; k < 16; k++)
+		sprintf (tail + k * 2, "%02x", checksum[k] & 0xff);
+	      ref2 = VEC_index (macinfo_entry, macinfo_table, i - 1);
+	      ref2->code = DW_MACINFO_GNU_transparent_include4;
+	      ref2->lineno = 0;
+	      ref2->info = tmp;
+	      if (macinfo_htab == NULL)
+		macinfo_htab = htab_create (10, htab_macinfo_hash,
+					    htab_macinfo_eq, NULL);
+	      slot = htab_find_slot (macinfo_htab, ref2, INSERT);
+	      if (*slot != NULL)
+		{
+		  free (CONST_CAST (char *, ref2->info));
+		  ref2->code = 0;
+		  ref2->info = NULL;
+		  ref2 = (macinfo_entry *) *slot;
+		  output_macinfo_op (ref2);
+		  for (j = i;
+		       VEC_iterate (macinfo_entry, macinfo_table, j, ref2);
+		       j++)
+		    if (ref2->code != DW_MACINFO_define
+			&& ref2->code != DW_MACINFO_undef)
+		      break;
+		    else if (ref->lineno == 0 && ref2->lineno != 0)
+		      break;
+		    else
+		      {
+			ref2->code = 0;
+			free (CONST_CAST (char *, ref2->info));
+			ref2->info = NULL;
+		      }
+		}
+	      else
+		{
+		  *slot = ref2;
+		  ref2->lineno = ++transparent_includes;
+		  output_macinfo_op (ref2);
+		}
+	      i = j - 1;
+	      continue;
+	    }
+#endif
+	  break;
+	default:
 	  break;
 	}
+      output_macinfo_op (ref);
+      /* For DW_MACINFO_start_file ref->info has been copied into files
+	 vector.  */
+      if (ref->code != DW_MACINFO_start_file)
+	free (CONST_CAST (char *, ref->info));
+      ref->info = NULL;
+      ref->code = 0;
     }
+
+  if (!transparent_includes)
+    return;
+
+  htab_delete (macinfo_htab);
+
+#ifdef OBJECT_FORMAT_ELF
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    switch (ref->code)
+      {
+      case 0:
+	continue;
+      case DW_MACINFO_GNU_transparent_include4:
+	{
+	  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+	  tree comdat_key = get_identifier (ref->info);
+	  /* Terminate the previous .debug_macinfo section.  */
+	  dw2_asm_output_data (1, 0, "End compilation unit");
+	  targetm.asm_out.named_section (DEBUG_MACINFO_SECTION,
+					 SECTION_DEBUG
+					 | SECTION_LINKONCE,
+					 comdat_key);
+	  ASM_GENERATE_INTERNAL_LABEL (label,
+				       DEBUG_MACINFO_SECTION_LABEL,
+				       ref->lineno);
+	  ASM_OUTPUT_LABEL (asm_out_file, label);
+	  ref->code = 0;
+	  free (CONST_CAST (char *, ref->info));
+	  ref->info = NULL;
+	}
+	break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+	output_macinfo_op (ref);
+	ref->code = 0;
+	free (CONST_CAST (char *, ref->info));
+	ref->info = NULL;
+	break;
+      default:
+	gcc_unreachable ();
+      }
+#endif
 }
 
 /* Set up for Dwarf output at the start of compilation.  */

	Jakub

* Re: [RFC] More compact (100x) -g3 .debug_macinfo
  2011-07-13 17:12 [RFC] More compact (100x) -g3 .debug_macinfo Jakub Jelinek
@ 2011-07-13 19:59 ` Tom Tromey
  2011-07-13 20:37   ` Jakub Jelinek
  2011-07-15 15:52 ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Jakub Jelinek
  1 sibling, 1 reply; 25+ messages in thread
From: Tom Tromey @ 2011-07-13 19:59 UTC
  To: Jakub Jelinek
  Cc: Jason Merrill, Richard Henderson, Jan Kratochvil, Roland McGrath,
	Cary Coutant, Mark Wielaard, gcc-patches

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> Currently .debug_macinfo is prohibitively large, because it doesn't
Jakub> allow for any kind of merging of duplicate debug information.

Jakub> This patch is an RFC for extensions that allow it to bring it down
Jakub> to manageable levels.

I wrote a gdb patch for this.  I've appended it in case you want to try
it out; it is against git master.  I tried it a little on an executable
Jakub sent me and it seems to work fine.

It is no trouble to change this patch if you change the format.  It
wasn't hard to write in the first place; it is just bigger than it
otherwise would be because I moved a bunch of code into a new function.

I don't think I really understood DW_MACINFO_GNU_define_opcode, so the
implementation here is probably wrong.

Tom

2011-07-13  Tom Tromey  <tromey@redhat.com>

	* dwarf2read.c (read_indirect_string_at_offset): New function.
	(read_indirect_string): Use it.
	(dwarf_decode_macro_bytes): New function, taken from
	dwarf_decode_macros.  Handle DW_MACINFO_GNU_*.
	(dwarf_decode_macros): Use it.  Handle DW_MACINFO_GNU_*.

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index fde5b6a..af35f16 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -10182,32 +10182,32 @@ read_direct_string (bfd *abfd, gdb_byte *buf, unsigned int *bytes_read_ptr)
 }
 
 static char *
-read_indirect_string (bfd *abfd, gdb_byte *buf,
-		      const struct comp_unit_head *cu_header,
-		      unsigned int *bytes_read_ptr)
+read_indirect_string_at_offset (bfd *abfd, LONGEST str_offset)
 {
-  LONGEST str_offset = read_offset (abfd, buf, cu_header, bytes_read_ptr);
-
   dwarf2_read_section (dwarf2_per_objfile->objfile, &dwarf2_per_objfile->str);
   if (dwarf2_per_objfile->str.buffer == NULL)
-    {
-      error (_("DW_FORM_strp used without .debug_str section [in module %s]"),
-		      bfd_get_filename (abfd));
-      return NULL;
-    }
+    error (_("DW_FORM_strp used without .debug_str section [in module %s]"),
+	   bfd_get_filename (abfd));
   if (str_offset >= dwarf2_per_objfile->str.size)
-    {
-      error (_("DW_FORM_strp pointing outside of "
-	       ".debug_str section [in module %s]"),
-	     bfd_get_filename (abfd));
-      return NULL;
-    }
+    error (_("DW_FORM_strp pointing outside of "
+	     ".debug_str section [in module %s]"),
+	   bfd_get_filename (abfd));
   gdb_assert (HOST_CHAR_BIT == 8);
   if (dwarf2_per_objfile->str.buffer[str_offset] == '\0')
     return NULL;
   return (char *) (dwarf2_per_objfile->str.buffer + str_offset);
 }
 
+static char *
+read_indirect_string (bfd *abfd, gdb_byte *buf,
+		      const struct comp_unit_head *cu_header,
+		      unsigned int *bytes_read_ptr)
+{
+  LONGEST str_offset = read_offset (abfd, buf, cu_header, bytes_read_ptr);
+
+  return read_indirect_string_at_offset (abfd, str_offset);
+}
+
 static unsigned long
 read_unsigned_leb128 (bfd *abfd, gdb_byte *buf, unsigned int *bytes_read_ptr)
 {
@@ -14576,116 +14576,14 @@ parse_macro_definition (struct macro_source_file *file, int line,
 
 
 static void
-dwarf_decode_macros (struct line_header *lh, unsigned int offset,
-                     char *comp_dir, bfd *abfd,
-                     struct dwarf2_cu *cu)
+dwarf_decode_macro_bytes (bfd *abfd, gdb_byte *mac_ptr, gdb_byte *mac_end,
+			  struct macro_source_file *current_file,
+			  struct line_header *lh, char *comp_dir,
+			  struct dwarf2_cu *cu)
 {
-  gdb_byte *mac_ptr, *mac_end;
-  struct macro_source_file *current_file = 0;
   enum dwarf_macinfo_record_type macinfo_type;
   int at_commandline;
 
-  dwarf2_read_section (dwarf2_per_objfile->objfile,
-		       &dwarf2_per_objfile->macinfo);
-  if (dwarf2_per_objfile->macinfo.buffer == NULL)
-    {
-      complaint (&symfile_complaints, _("missing .debug_macinfo section"));
-      return;
-    }
-
-  /* First pass: Find the name of the base filename.
-     This filename is needed in order to process all macros whose definition
-     (or undefinition) comes from the command line.  These macros are defined
-     before the first DW_MACINFO_start_file entry, and yet still need to be
-     associated to the base file.
-
-     To determine the base file name, we scan the macro definitions until we
-     reach the first DW_MACINFO_start_file entry.  We then initialize
-     CURRENT_FILE accordingly so that any macro definition found before the
-     first DW_MACINFO_start_file can still be associated to the base file.  */
-
-  mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset;
-  mac_end = dwarf2_per_objfile->macinfo.buffer
-    + dwarf2_per_objfile->macinfo.size;
-
-  do
-    {
-      /* Do we at least have room for a macinfo type byte?  */
-      if (mac_ptr >= mac_end)
-        {
-	  /* Complaint is printed during the second pass as GDB will probably
-	     stop the first pass earlier upon finding
-	     DW_MACINFO_start_file.  */
-	  break;
-        }
-
-      macinfo_type = read_1_byte (abfd, mac_ptr);
-      mac_ptr++;
-
-      switch (macinfo_type)
-        {
-          /* A zero macinfo type indicates the end of the macro
-             information.  */
-        case 0:
-	  break;
-
-	case DW_MACINFO_define:
-	case DW_MACINFO_undef:
-	  /* Only skip the data by MAC_PTR.  */
-	  {
-	    unsigned int bytes_read;
-
-	    read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
-	    mac_ptr += bytes_read;
-	    read_direct_string (abfd, mac_ptr, &bytes_read);
-	    mac_ptr += bytes_read;
-	  }
-	  break;
-
-	case DW_MACINFO_start_file:
-	  {
-	    unsigned int bytes_read;
-	    int line, file;
-
-	    line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
-	    mac_ptr += bytes_read;
-	    file = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
-	    mac_ptr += bytes_read;
-
-	    current_file = macro_start_file (file, line, current_file,
-					     comp_dir, lh, cu->objfile);
-	  }
-	  break;
-
-	case DW_MACINFO_end_file:
-	  /* No data to skip by MAC_PTR.  */
-	  break;
-
-	case DW_MACINFO_vendor_ext:
-	  /* Only skip the data by MAC_PTR.  */
-	  {
-	    unsigned int bytes_read;
-
-	    read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
-	    mac_ptr += bytes_read;
-	    read_direct_string (abfd, mac_ptr, &bytes_read);
-	    mac_ptr += bytes_read;
-	  }
-	  break;
-
-	default:
-	  break;
-	}
-    } while (macinfo_type != 0 && current_file == NULL);
-
-  /* Second pass: Process all entries.
-
-     Use the AT_COMMAND_LINE flag to determine whether we are still processing
-     command-line macro definitions/undefinitions.  This flag is unset when we
-     reach the first DW_MACINFO_start_file entry.  */
-
-  mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset;
-
   /* Determines if GDB is still before first DW_MACINFO_start_file.  If true
      GDB is still reading the definitions from command line.  First
      DW_MACINFO_start_file will need to be ignored as it was already executed
@@ -14716,27 +14614,43 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset,
 
         case DW_MACINFO_define:
         case DW_MACINFO_undef:
+	case DW_MACINFO_GNU_define_indirect4:
+	case DW_MACINFO_GNU_undef_indirect4:
           {
             unsigned int bytes_read;
             int line;
             char *body;
+	    int is_define;
 
-            line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
-            mac_ptr += bytes_read;
-            body = read_direct_string (abfd, mac_ptr, &bytes_read);
-            mac_ptr += bytes_read;
+	    line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+
+	    if (macinfo_type == DW_MACINFO_define
+		|| macinfo_type == DW_MACINFO_undef)
+	      {
+		body = read_direct_string (abfd, mac_ptr, &bytes_read);
+		mac_ptr += bytes_read;
+	      }
+	    else
+	      {
+		LONGEST str_offset;
+
+		str_offset = read_offset_1 (abfd, mac_ptr, 4);
+		mac_ptr += 4;
+
+		body = read_indirect_string_at_offset (abfd, str_offset);
+	      }
 
+	    is_define = (macinfo_type == DW_MACINFO_define
+			 || macinfo_type == DW_MACINFO_GNU_define_indirect4);
             if (! current_file)
 	      {
 		/* DWARF violation as no main source is present.  */
 		complaint (&symfile_complaints,
 			   _("debug info with no main source gives macro %s "
 			     "on line %d: %s"),
-			   macinfo_type == DW_MACINFO_define ?
-			     _("definition") :
-			       macinfo_type == DW_MACINFO_undef ?
-				 _("undefinition") :
-				 _("something-or-other"), line, body);
+			   is_define ? _("definition") : _("undefinition"),
+			   line, body);
 		break;
 	      }
 	    if ((line == 0 && !at_commandline)
@@ -14744,17 +14658,17 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset,
 	      complaint (&symfile_complaints,
 			 _("debug info gives %s macro %s with %s line %d: %s"),
 			 at_commandline ? _("command-line") : _("in-file"),
-			 macinfo_type == DW_MACINFO_define ?
-			   _("definition") :
-			     macinfo_type == DW_MACINFO_undef ?
-			       _("undefinition") :
-			       _("something-or-other"),
+			 is_define ? _("definition") : _("undefinition"),
 			 line == 0 ? _("zero") : _("non-zero"), line, body);
 
-	    if (macinfo_type == DW_MACINFO_define)
+	    if (is_define)
 	      parse_macro_definition (current_file, line, body);
-	    else if (macinfo_type == DW_MACINFO_undef)
-	      macro_undef (current_file, line, body);
+	    else
+	      {
+		gdb_assert (macinfo_type == DW_MACINFO_undef
+			    || macinfo_type == DW_MACINFO_GNU_undef_indirect4);
+		macro_undef (current_file, line, body);
+	      }
           }
           break;
 
@@ -14825,6 +14739,33 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset,
             }
           break;
 
+	case DW_MACINFO_GNU_transparent_include4:
+	  {
+	    LONGEST offset;
+
+	    offset = read_offset_1 (abfd, mac_ptr, 4);
+	    mac_ptr += 4;
+
+	    dwarf_decode_macro_bytes (abfd,
+				      (dwarf2_per_objfile->macinfo.buffer
+				       + offset),
+				      mac_end, current_file,
+				      lh, comp_dir, cu);
+	  }
+	  break;
+
+	case DW_MACINFO_GNU_define_opcode:
+	  {
+	    unsigned int bytes_read, arg;
+
+	    /* Just ignore it.  */
+	    mac_ptr += 1;
+	    arg = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	    mac_ptr += arg;
+	  }
+	  break;
+
         case DW_MACINFO_vendor_ext:
           {
             unsigned int bytes_read;
@@ -14842,6 +14783,149 @@ dwarf_decode_macros (struct line_header *lh, unsigned int offset,
     } while (macinfo_type != 0);
 }
 
+static void
+dwarf_decode_macros (struct line_header *lh, unsigned int offset,
+                     char *comp_dir, bfd *abfd,
+                     struct dwarf2_cu *cu)
+{
+  gdb_byte *mac_ptr, *mac_end;
+  struct macro_source_file *current_file = 0;
+  enum dwarf_macinfo_record_type macinfo_type;
+
+  dwarf2_read_section (dwarf2_per_objfile->objfile,
+		       &dwarf2_per_objfile->macinfo);
+  if (dwarf2_per_objfile->macinfo.buffer == NULL)
+    {
+      complaint (&symfile_complaints, _("missing .debug_macinfo section"));
+      return;
+    }
+
+  /* First pass: Find the name of the base filename.
+     This filename is needed in order to process all macros whose definition
+     (or undefinition) comes from the command line.  These macros are defined
+     before the first DW_MACINFO_start_file entry, and yet still need to be
+     associated to the base file.
+
+     To determine the base file name, we scan the macro definitions until we
+     reach the first DW_MACINFO_start_file entry.  We then initialize
+     CURRENT_FILE accordingly so that any macro definition found before the
+     first DW_MACINFO_start_file can still be associated to the base file.  */
+
+  mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset;
+  mac_end = dwarf2_per_objfile->macinfo.buffer
+    + dwarf2_per_objfile->macinfo.size;
+
+  do
+    {
+      /* Do we at least have room for a macinfo type byte?  */
+      if (mac_ptr >= mac_end)
+        {
+	  /* Complaint is printed during the second pass as GDB will probably
+	     stop the first pass earlier upon finding
+	     DW_MACINFO_start_file.  */
+	  break;
+        }
+
+      macinfo_type = read_1_byte (abfd, mac_ptr);
+      mac_ptr++;
+
+      switch (macinfo_type)
+        {
+          /* A zero macinfo type indicates the end of the macro
+             information.  */
+        case 0:
+	  break;
+
+	case DW_MACINFO_define:
+	case DW_MACINFO_undef:
+	  /* Only skip the data by MAC_PTR.  */
+	  {
+	    unsigned int bytes_read;
+
+	    read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	    read_direct_string (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	  }
+	  break;
+
+	case DW_MACINFO_start_file:
+	  {
+	    unsigned int bytes_read;
+	    int line, file;
+
+	    line = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	    file = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+
+	    current_file = macro_start_file (file, line, current_file,
+					     comp_dir, lh, cu->objfile);
+	  }
+	  break;
+
+	case DW_MACINFO_end_file:
+	  /* No data to skip by MAC_PTR.  */
+	  break;
+
+	case DW_MACINFO_vendor_ext:
+	  /* Only skip the data by MAC_PTR.  */
+	  {
+	    unsigned int bytes_read;
+
+	    read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	    read_direct_string (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	  }
+	  break;
+
+	case DW_MACINFO_GNU_define_indirect4:
+	case DW_MACINFO_GNU_undef_indirect4:
+	  {
+	    unsigned int bytes_read;
+
+	    read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	    mac_ptr += 4;
+	  }
+	  break;
+
+	case DW_MACINFO_GNU_transparent_include4:
+	  /* Note that, according to the spec, a transparent include
+	     chain cannot call DW_MACINFO_start_file.  So, we can just
+	     skip this opcode.  */
+	  mac_ptr += 4;
+	  break;
+
+	case DW_MACINFO_GNU_define_opcode:
+	  {
+	    unsigned int bytes_read, arg;
+
+	    mac_ptr += 1;
+	    arg = read_unsigned_leb128 (abfd, mac_ptr, &bytes_read);
+	    mac_ptr += bytes_read;
+	    mac_ptr += arg;
+	  }
+	  break;
+
+	default:
+	  break;
+	}
+    } while (macinfo_type != 0 && current_file == NULL);
+
+  /* Second pass: Process all entries.
+
+     Use the AT_COMMAND_LINE flag to determine whether we are still processing
+     command-line macro definitions/undefinitions.  This flag is unset when we
+     reach the first DW_MACINFO_start_file entry.  */
+
+  mac_ptr = dwarf2_per_objfile->macinfo.buffer + offset;
+
+  dwarf_decode_macro_bytes (abfd, mac_ptr, mac_end, current_file,
+			    lh, comp_dir, cu);
+}
+
 /* Check if the attribute's form is a DW_FORM_block*
    if so return true else false.  */
 static int
diff --git a/include/dwarf2.h b/include/dwarf2.h
index b2806ef..40a8a66 100644
--- a/include/dwarf2.h
+++ b/include/dwarf2.h
@@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_undef = 2,
     DW_MACINFO_start_file = 3,
     DW_MACINFO_end_file = 4,
-    DW_MACINFO_vendor_ext = 255
+    DW_MACINFO_lo_user = 0xe0,
+    DW_MACINFO_GNU_define_indirect4 = 0xe0,
+    DW_MACINFO_GNU_undef_indirect4 = 0xe1,
+    DW_MACINFO_GNU_transparent_include4 = 0xe2,
+    DW_MACINFO_GNU_define_opcode = 0xe3,
+    DW_MACINFO_hi_user = 0xfe,
+    DW_MACINFO_vendor_ext = 0xff
   };
 \f
 /* @@@ For use with GNU frame unwind information.  */

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] More compact (100x) -g3 .debug_macinfo
  2011-07-13 19:59 ` Tom Tromey
@ 2011-07-13 20:37   ` Jakub Jelinek
  2011-07-18 15:42     ` Tom Tromey
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-13 20:37 UTC (permalink / raw)
  To: Tom Tromey
  Cc: Jason Merrill, Richard Henderson, Jan Kratochvil, Roland McGrath,
	Cary Coutant, Mark Wielaard, gcc-patches

On Wed, Jul 13, 2011 at 01:36:03PM -0600, Tom Tromey wrote:
> I wrote a gdb patch for this.  I've appended it in case you want to try
> it out; it is against git master.  I tried it a little on an executable
> Jakub sent me and it seems to work fine.

Thanks.

> It is no trouble to change this patch if you change the format.  It
> wasn't hard to write in the first place; it is just bigger than it would
> otherwise be because I moved a bunch of code into a new function.
> 
> I don't think I really understood DW_MACINFO_GNU_define_opcode, so the
> implementation here is probably wrong.

Well, I think you've skipped it correctly and furthermore even patched
GCC doesn't emit it.  The point of it was to allow skipping unknown
opcodes.  If you implement this opcode fully and say GCC 4.8 adds a new
vendor opcode, the old implementation would be able to silently skip
over such opcodes.
So, the reader implementation could do something like have an array
of 256 pointers.  At the start of parsing a particular .debug_macinfo
chunk it would clear the array (or, when the chunk is read because of
DW_MACINFO_GNU_transparent_include4, it would instead make a copy
of the current array and make the copy current), and when you encounter
DW_MACINFO_GNU_define_opcode, you store a pointer to the encoded operands
of that opcode into the table.  And, when you find an unknown opcode
(reach the default: case) and array[op] is non-NULL, you read the uleb128
from that location to get the count, iterate over the DW_FORM_* values
that follow it, and for each of them skip the corresponding bytes from the
unknown opcode's operands.  Say a .debug_macinfo chunk could start with
DW_MACINFO_GNU_define_opcode, 0xe5, 2, DW_FORM_udata, DW_FORM_block,
DW_MACINFO_define, 0, "A 1",
0xe5, 0x80, 0x7f, 5, 1, 2, 3, 4, 5,
DW_MACINFO_define, 0, "B 1",
0
and you'd be able to grok both defines in it, because you'd understand
that after seeing 0xe5 you need to read one uleb128, then another uleb128,
and skip that many bytes after it.
The table is copied so that the producer could use define_opcode just
in the .debug_macinfo spot referenced from DW_AT_macro_info and wouldn't
have to repeat it in the transparent include chains, provided it ensured
that the chains wouldn't be merged unless the define_opcode is present in
all the referencing .debug_macinfo sections.  And the copy of the array
allows the transparent chain to add new opcodes or redefine them without
affecting the outer sequence.
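
A minimal sketch of the table-driven skipping described above, not taken
from the gdb patch.  The function names (read_uleb128, skip_form,
count_defines) are made up for illustration, only DW_FORM_udata and
DW_FORM_block are handled, and the input is assumed to be well formed and
terminated by a 0 opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DW_FORM_* values are the standard DWARF ones; 0xe3 is the opcode
   value proposed in this thread.  */
enum
{
  DW_MACINFO_define = 1,
  DW_MACINFO_GNU_define_opcode = 0xe3,
  DW_FORM_block = 0x09,
  DW_FORM_udata = 0x0f
};

/* Decode an unsigned LEB128 value at *P and advance *P past it.  */
static uint64_t
read_uleb128 (const unsigned char **p)
{
  uint64_t value = 0;
  unsigned shift = 0;
  unsigned char byte;

  do
    {
      byte = *(*p)++;
      value |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  return value;
}

/* Skip one operand encoded with FORM.  Returns 0 for a form this
   sketch does not know how to skip.  */
static int
skip_form (const unsigned char **p, unsigned form)
{
  switch (form)
    {
    case DW_FORM_udata:
      read_uleb128 (p);
      return 1;
    case DW_FORM_block:
      {
        uint64_t len = read_uleb128 (p);
        *p += len;
        return 1;
      }
    default:
      return 0;
    }
}

/* Count the DW_MACINFO_define entries in the chunk at P, skipping any
   opcode described by an earlier DW_MACINFO_GNU_define_opcode.
   Returns -1 on an opcode that cannot be skipped.  */
static int
count_defines (const unsigned char *p)
{
  /* Hypothetical per-chunk table: one slot per opcode, pointing at the
     uleb128 operand count followed by the DW_FORM_* bytes.  */
  const unsigned char *opcode_table[256] = { NULL };
  int defines = 0;

  assert (p != NULL);
  for (;;)
    {
      unsigned op = *p++;

      if (op == 0)
        return defines;
      if (op == DW_MACINFO_define)
        {
          read_uleb128 (&p);                    /* line number */
          p += strlen ((const char *) p) + 1;   /* "NAME VALUE" string */
          defines++;
        }
      else if (op == DW_MACINFO_GNU_define_opcode)
        {
          unsigned defined = *p++;
          const unsigned char *forms = p;
          uint64_t count = read_uleb128 (&forms);

          opcode_table[defined] = p;
          p = forms + count;                    /* past the form bytes */
        }
      else if (opcode_table[op] != NULL)
        {
          const unsigned char *forms = opcode_table[op];
          uint64_t count = read_uleb128 (&forms);

          while (count-- > 0)
            if (!skip_form (&p, *forms++))
              return -1;
        }
      else
        return -1;
    }
}
```

Fed the byte sequence from the example above, count_defines finds both
defines; without the leading define_opcode entry the 0xe5 opcode cannot be
skipped and the function gives up.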

	Jakub

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-13 17:12 [RFC] More compact (100x) -g3 .debug_macinfo Jakub Jelinek
  2011-07-13 19:59 ` Tom Tromey
@ 2011-07-15 15:52 ` Jakub Jelinek
  2011-07-15 17:19   ` Richard Henderson
  2011-07-15 18:28   ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Tom Tromey
  1 sibling, 2 replies; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-15 15:52 UTC (permalink / raw)
  To: gcc-patches
  Cc: Jason Merrill, Richard Henderson, Tom Tromey, Jan Kratochvil,
	Roland McGrath, Cary Coutant, Mark Wielaard

On Wed, Jul 13, 2011 at 07:00:53PM +0200, Jakub Jelinek wrote:
The patch below implements that slight change, in particular the "4"
suffixes from the op names were dropped, DW_MACINFO_GNU_*_indirect
have DW_FORM_udata and DW_FORM_strp arguments now (i.e. DWARF_OFFSET_SIZE
large) and DW_MACINFO_GNU_transparent_include has DW_FORM_sec_offset
argument (i.e. again 4 bytes long for 32-bit DWARF and 8 bytes long for
64-bit DWARF).  GCC ensures that no merging will happen between
.debug_macinfo chunks with 32-bit and 64-bit DWARF by adding the byte size
in the comdat GROUP name.  I think that's cleaner than hardcoding
4 bytes and not optimizing anything on MIPS.
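
As an illustration of how the offset size ends up in the comdat group
name, here is a sketch of the name layout the patch below builds:
"wm4." or "wm8.", the identifier characters and dots of the header's
base name, the line number of the first entry, and the checksum in hex.
The function name macinfo_group_name is made up, and the MD5 computation
is left out: the 16 checksum bytes are simply passed in.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Write "wm<size>.<filtered base>.<lineno>.<checksum hex>" into OUT.
   Returns the length written, or -1 if OUT is too small.  */
static int
macinfo_group_name (char *out, size_t out_size, int offset_size,
                    const char *base, unsigned long lineno,
                    const unsigned char checksum[16])
{
  size_t pos, k;
  int n, kept = 0;

  assert (offset_size == 4 || offset_size == 8);
  n = snprintf (out, out_size, "wm%d.", offset_size);
  if (n < 0 || (size_t) n >= out_size)
    return -1;
  pos = (size_t) n;

  /* Keep only identifier characters and '.', as the patch does.  */
  for (k = 0; base[k]; k++)
    if (isalnum ((unsigned char) base[k]) || base[k] == '_'
        || base[k] == '.')
      {
        if (pos + 1 >= out_size)
          return -1;
        out[pos++] = base[k];
        kept = 1;
      }
  if (kept)
    {
      if (pos + 1 >= out_size)
        return -1;
      out[pos++] = '.';
    }

  n = snprintf (out + pos, out_size - pos, "%lu.", lineno);
  if (n < 0 || (size_t) n >= out_size - pos)
    return -1;
  pos += (size_t) n;

  for (k = 0; k < 16; k++)
    {
      n = snprintf (out + pos, out_size - pos, "%02x",
                    (unsigned) checksum[k]);
      if (n < 0 || (size_t) n >= out_size - pos)
        return -1;
      pos += (size_t) n;
    }
  return (int) pos;
}
```

Two otherwise identical sequences built with different offset sizes get
names starting with "wm4." and "wm8." respectively, so the linker never
treats them as the same comdat group.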

The newly added opcodes:
DW_MACINFO_GNU_define_indirect		0xe0
	This opcode has two arguments, one is uleb128 lineno and the
	other is offset size long byte offset into .debug_str.  Except
	for the encoding of the string it is similar to DW_MACINFO_define.
DW_MACINFO_GNU_undef_indirect		0xe1
	This opcode has two arguments, one is uleb128 lineno and the
	other is offset size long byte offset into .debug_str.  Except
	for the encoding of the string it is similar to DW_MACINFO_undef.
DW_MACINFO_GNU_transparent_include	0xe2
	This opcode has a single argument, an offset size long byte offset into
	.debug_macinfo.  It instructs the debug info consumer that
	this opcode during reading should be replaced with the sequence
	of .debug_macinfo opcodes from the mentioned offset, up to
	a terminating 0 opcode (not including that 0).
DW_MACINFO_GNU_define_opcode		0xe3
	This is an opcode for future extensibility through which
	a debugger could skip unknown opcodes.  It has 3 arguments:
	a 1 byte opcode number, a uleb128 count of arguments and
	an array count bytes long, each byte a DW_FORM_* code saying how
	the corresponding argument is encoded.

DW_MACINFO_GNU_define_opcode <0, 0 []>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_define, 2 [DW_FORM_udata, DW_FORM_string]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_undef, 2 [DW_FORM_udata, DW_FORM_string]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_start_file, 2 [DW_FORM_udata, DW_FORM_udata]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_end_file, 0 []>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_undef_indirect, 2 [DW_FORM_udata, DW_FORM_strp]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_transparent_include, 1 [DW_FORM_sec_offset]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_GNU_define_opcode, 2 [DW_FORM_data1, DW_FORM_block]>
DW_MACINFO_GNU_define_opcode <DW_MACINFO_vendor_ext, 2 [DW_FORM_udata, DW_FORM_string]>

2011-07-15  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
	(DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
	DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode):
	Add.

	* dwarf2out.c (dwarf2out_undef): Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op): New
	functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP and ELF,
	optimize longer sequences of define/undef ops from headers
	using DW_MACINFO_GNU_transparent_include.

--- include/dwarf2.h.jj	2011-06-23 10:14:06.000000000 +0200
+++ include/dwarf2.h	2011-07-13 11:39:49.000000000 +0200
@@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_undef = 2,
     DW_MACINFO_start_file = 3,
     DW_MACINFO_end_file = 4,
-    DW_MACINFO_vendor_ext = 255
+    DW_MACINFO_lo_user = 0xe0,
+    DW_MACINFO_GNU_define_indirect = 0xe0,
+    DW_MACINFO_GNU_undef_indirect = 0xe1,
+    DW_MACINFO_GNU_transparent_include = 0xe2,
+    DW_MACINFO_GNU_define_opcode = 0xe3,
+    DW_MACINFO_hi_user = 0xfe,
+    DW_MACINFO_vendor_ext = 0xff
   };
 \f
 /* @@@ For use with GNU frame unwind information.  */
--- gcc/dwarf2out.c.jj	2011-07-12 17:59:01.000000000 +0200
+++ gcc/dwarf2out.c	2011-07-13 17:04:17.000000000 +0200
@@ -20383,17 +20383,117 @@ dwarf2out_undef (unsigned int lineno ATT
       macinfo_entry e;
       e.code = DW_MACINFO_undef;
       e.lineno = lineno;
-      e.info = xstrdup (buffer);;
+      e.info = xstrdup (buffer);
       VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
+/* Routines to manipulate the hash table of transparent include entries.  */
+static hashval_t
+htab_macinfo_hash (const void *of)
+{
+  const macinfo_entry *const entry =
+    (const macinfo_entry *) of;
+
+  return htab_hash_string (entry->info);
+}
+
+static int
+htab_macinfo_eq (const void *of1, const void *of2)
+{
+  const macinfo_entry *const entry1 = (const macinfo_entry *) of1;
+  const macinfo_entry *const entry2 = (const macinfo_entry *) of2;
+
+  return !strcmp (entry1->info, entry2->info);
+}
+
+/* Output a single .debug_macinfo entry.  */
+
+static void
+output_macinfo_op (macinfo_entry *ref)
+{
+  int file_num;
+  size_t len;
+  struct indirect_string_node *node;
+  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+
+  switch (ref->code)
+    {
+    case DW_MACINFO_start_file:
+      file_num = maybe_emit_file (lookup_filename (ref->info));
+      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
+      dw2_asm_output_data_uleb128 (ref->lineno,
+				   "Included from line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+      break;
+    case DW_MACINFO_end_file:
+      dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+      break;
+    case DW_MACINFO_define:
+    case DW_MACINFO_undef:
+      len = strlen (ref->info) + 1;
+      if (!dwarf_strict
+	  && len > DWARF_OFFSET_SIZE
+	  && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+	  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
+	{
+	  ref->code = ref->code == DW_MACINFO_define
+		      ? DW_MACINFO_GNU_define_indirect
+		      : DW_MACINFO_GNU_undef_indirect;
+	  output_macinfo_op (ref);
+	  return;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_define
+			   ? "Define macro" : "Undefine macro");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_nstring (ref->info, -1, "The macro");
+      break;
+    case DW_MACINFO_GNU_define_indirect:
+    case DW_MACINFO_GNU_undef_indirect:
+      node = find_AT_string (ref->info);
+      if (node->form != DW_FORM_strp)
+	{
+	  char label[32];
+	  ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter);
+	  ++dw2_string_counter;
+	  node->label = xstrdup (label);
+	  node->form = DW_FORM_strp;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_GNU_define_indirect
+			   ? "Define macro indirect"
+			   : "Undefine macro indirect");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+			     debug_str_section, "The macro: \"%s\"",
+			     ref->info);
+      break;
+    case DW_MACINFO_GNU_transparent_include:
+      dw2_asm_output_data (1, ref->code, "Transparent include");
+      ASM_GENERATE_INTERNAL_LABEL (label,
+				   DEBUG_MACINFO_SECTION_LABEL, ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL);
+      break;
+    default:
+      fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
+	       ASM_COMMENT_START, (unsigned long) ref->code);
+      break;
+    }
+}
+
 static void
 output_macinfo (void)
 {
   unsigned i;
   unsigned long length = VEC_length (macinfo_entry, macinfo_table);
-  macinfo_entry *ref;
+  macinfo_entry *ref, *ref2;
+  VEC (macinfo_entry, gc) *files = NULL;
+  unsigned long transparent_includes = 0;
+  htab_t macinfo_htab = NULL;
 
   if (! length)
     return;
@@ -20402,37 +20503,184 @@ output_macinfo (void)
     {
       switch (ref->code)
 	{
-	  case DW_MACINFO_start_file:
+	case DW_MACINFO_start_file:
+	  VEC_safe_push (macinfo_entry, gc, files, ref);
+	  break;
+	case DW_MACINFO_end_file:
+	  if (!VEC_empty (macinfo_entry, files))
 	    {
-	      int file_num = maybe_emit_file (lookup_filename (ref->info));
-	      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-	      dw2_asm_output_data_uleb128 
-			(ref->lineno, "Included from line number %lu", 
-			 			(unsigned long)ref->lineno);
-	      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+	      ref2 = VEC_last (macinfo_entry, files);
+	      free (CONST_CAST (char *, ref2->info));
+	      VEC_pop (macinfo_entry, files);
 	    }
-	    break;
-	  case DW_MACINFO_end_file:
-	    dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
-	    break;
-	  case DW_MACINFO_define:
-	    dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  case DW_MACINFO_undef:
-	    dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  default:
-	   fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
-	     ASM_COMMENT_START, (unsigned long)ref->code);
+	  break;
+	case DW_MACINFO_define:
+	case DW_MACINFO_undef:
+#ifdef OBJECT_FORMAT_ELF
+	  if (!dwarf_strict
+	      && HAVE_COMDAT_GROUP
+	      && VEC_length (macinfo_entry, files) != 1
+	      && i > 0
+	      && i + 1 < length
+	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
+	    {
+	      char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
+	      unsigned char checksum[16];
+	      struct md5_ctx ctx;
+	      char *tmp, *tail;
+	      const char *base;
+	      unsigned int j = i, k, l;
+	      void **slot;
+
+	      ref2 = VEC_index (macinfo_entry, macinfo_table, i + 1);
+	      if (ref2->code != DW_MACINFO_define
+		  && ref2->code != DW_MACINFO_undef)
+		break;
+
+	      if (VEC_empty (macinfo_entry, files))
+		{
+		  if (ref->lineno != 0 || ref2->lineno != 0)
+		    break;
+		}
+	      else if (ref->lineno == 0)
+		break;
+	      md5_init_ctx (&ctx);
+	      for (; VEC_iterate (macinfo_entry, macinfo_table, j, ref2); j++)
+		if (ref2->code != DW_MACINFO_define
+		    && ref2->code != DW_MACINFO_undef)
+		  break;
+		else if (ref->lineno == 0 && ref2->lineno != 0)
+		  break;
+		else
+		  {
+		    unsigned char code = ref2->code;
+		    md5_process_bytes (&code, 1, &ctx);
+		    checksum_uleb128 (ref2->lineno, &ctx);
+		    md5_process_bytes (ref2->info, strlen (ref2->info) + 1,
+				       &ctx);
+		  }
+	      md5_finish_ctx (&ctx, checksum);
+	      if (ref->lineno == 0)
+		base = "";
+	      else
+		base = lbasename (VEC_last (macinfo_entry, files)->info);
+	      for (l = 0, k = 0; base[k]; k++)
+		if (ISIDNUM (base[k]) || base[k] == '.')
+		  l++;
+	      if (l)
+		l++;
+	      sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED,
+		       VEC_index (macinfo_entry, macinfo_table, i)->lineno);
+	      tmp = XNEWVEC (char, 4 + l + strlen (linebuf) + 1 + 16 * 2 + 1);
+	      strcpy (tmp, DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8.");
+	      tail = tmp + 4;
+	      if (l)
+		{
+		  for (k = 0; base[k]; k++)
+		    if (ISIDNUM (base[k]) || base[k] == '.')
+		      *tail++ = base[k];
+		  *tail++ = '.';
+		}
+	      l = strlen (linebuf);
+	      memcpy (tail, linebuf, l);
+	      tail += l;
+	      *tail++ = '.';
+	      for (k = 0; k < 16; k++)
+		sprintf (tail + k * 2, "%02x", checksum[k] & 0xff);
+	      ref2 = VEC_index (macinfo_entry, macinfo_table, i - 1);
+	      ref2->code = DW_MACINFO_GNU_transparent_include;
+	      ref2->lineno = 0;
+	      ref2->info = tmp;
+	      if (macinfo_htab == NULL)
+		macinfo_htab = htab_create (10, htab_macinfo_hash,
+					    htab_macinfo_eq, NULL);
+	      slot = htab_find_slot (macinfo_htab, ref2, INSERT);
+	      if (*slot != NULL)
+		{
+		  free (CONST_CAST (char *, ref2->info));
+		  ref2->code = 0;
+		  ref2->info = NULL;
+		  ref2 = (macinfo_entry *) *slot;
+		  output_macinfo_op (ref2);
+		  for (j = i;
+		       VEC_iterate (macinfo_entry, macinfo_table, j, ref2);
+		       j++)
+		    if (ref2->code != DW_MACINFO_define
+			&& ref2->code != DW_MACINFO_undef)
+		      break;
+		    else if (ref->lineno == 0 && ref2->lineno != 0)
+		      break;
+		    else
+		      {
+			ref2->code = 0;
+			free (CONST_CAST (char *, ref2->info));
+			ref2->info = NULL;
+		      }
+		}
+	      else
+		{
+		  *slot = ref2;
+		  ref2->lineno = ++transparent_includes;
+		  output_macinfo_op (ref2);
+		}
+	      i = j - 1;
+	      continue;
+	    }
+#endif
+	  break;
+	default:
 	  break;
 	}
+      output_macinfo_op (ref);
+      /* For DW_MACINFO_start_file ref->info has been copied into files
+	 vector.  */
+      if (ref->code != DW_MACINFO_start_file)
+	free (CONST_CAST (char *, ref->info));
+      ref->info = NULL;
+      ref->code = 0;
     }
+
+  if (!transparent_includes)
+    return;
+
+  htab_delete (macinfo_htab);
+
+#ifdef OBJECT_FORMAT_ELF
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    switch (ref->code)
+      {
+      case 0:
+	continue;
+      case DW_MACINFO_GNU_transparent_include:
+	{
+	  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+	  tree comdat_key = get_identifier (ref->info);
+	  /* Terminate the previous .debug_macinfo section.  */
+	  dw2_asm_output_data (1, 0, "End compilation unit");
+	  targetm.asm_out.named_section (DEBUG_MACINFO_SECTION,
+					 SECTION_DEBUG
+					 | SECTION_LINKONCE,
+					 comdat_key);
+	  ASM_GENERATE_INTERNAL_LABEL (label,
+				       DEBUG_MACINFO_SECTION_LABEL,
+				       ref->lineno);
+	  ASM_OUTPUT_LABEL (asm_out_file, label);
+	  ref->code = 0;
+	  free (CONST_CAST (char *, ref->info));
+	  ref->info = NULL;
+	}
+	break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+	output_macinfo_op (ref);
+	ref->code = 0;
+	free (CONST_CAST (char *, ref->info));
+	ref->info = NULL;
+	break;
+      default:
+	gcc_unreachable ();
+      }
+#endif
 }
 
 /* Set up for Dwarf output at the start of compilation.  */

	Jakub

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-15 15:52 ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Jakub Jelinek
@ 2011-07-15 17:19   ` Richard Henderson
  2011-07-15 21:18     ` [RFC] More compact (100x) -g3 .debug_macinfo (take 3) Jakub Jelinek
  2011-07-15 18:28   ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Tom Tromey
  1 sibling, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-15 17:19 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

On 07/15/2011 08:42 AM, Jakub Jelinek wrote:

> The newly added opcodes:
> DW_MACINFO_GNU_define_indirect		0xe0
> 	This opcode has two arguments, one is uleb128 lineno and the
> 	other is offset size long byte offset into .debug_str.  Except
> 	for the encoding of the string it is similar to DW_MACINFO_define.
> DW_MACINFO_GNU_undef_indirect		0xe1
> 	This opcode has two arguments, one is uleb128 lineno and the
> 	other is offset size long byte offset into .debug_str.  Except
> 	for the encoding of the string it is similar to DW_MACINFO_undef.
> DW_MACINFO_GNU_transparent_include	0xe2
> 	This opcode has a single argument, a offset size long byte offset into
> 	.debug_macinfo.  It instructs the debug info consumer that
> 	this opcode during reading should be replaced with the sequence
> 	of .debug_macinfo opcodes from the mentioned offset, up to
> 	a terminating 0 opcode (not including that 0).
> DW_MACINFO_GNU_define_opcode		0xe3
> 	This is an opcode for future extensibility through which
> 	a debugger could skip unknown opcodes.  It has 3 arguments:
> 	1 byte opcode number, uleb128 count of arguments and
> 	a count bytes long array, with a DW_FORM_* code how the
> 	argument is encoded.

I do like the new opcodes.

Elsewhere you described transparent_include as also saving state
about defined opcodes around the include.  Do you want to either
describe that or drop it?



> +	case DW_MACINFO_define:
> +	case DW_MACINFO_undef:
> +#ifdef OBJECT_FORMAT_ELF
> +	  if (!dwarf_strict
> +	      && HAVE_COMDAT_GROUP
> +	      && VEC_length (macinfo_entry, files) != 1
> +	      && i > 0
> +	      && i + 1 < length
> +	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
> +	    {
> +	      char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
> +	      unsigned char checksum[16];
> +	      struct md5_ctx ctx;

I'd like to see this broken out into some functions, and avoid
as much code as possible within ifdefs.  Perhaps

some_function (...)
{
#ifndef OBJECT_FORMAT_ELF
  return;
#endif
  // everything else
}

I think it also doesn't help review that there are no comments
at all, and a preponderance of description-less variable names
like "ref" and "ref2".


r~

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-15 15:52 ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Jakub Jelinek
  2011-07-15 17:19   ` Richard Henderson
@ 2011-07-15 18:28   ` Tom Tromey
  2011-07-15 19:21     ` Jakub Jelinek
  1 sibling, 1 reply; 25+ messages in thread
From: Tom Tromey @ 2011-07-15 18:28 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Jason Merrill, Richard Henderson, Jan Kratochvil,
	Roland McGrath, Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> The patch below implements that slight change, in particular the
Jakub> "4" suffixes from the op names were dropped,
Jakub> DW_MACINFO_GNU_*_indirect have DW_FORM_udata and DW_FORM_strp
Jakub> arguments now (i.e. DWARF_OFFSET_SIZE large) and
Jakub> DW_MACINFO_GNU_transparent_include has DW_FORM_sec_offset
Jakub> argument (i.e. again 4 bytes long for 32-bit DWARF and 8 bytes
Jakub> long for 64-bit DWARF).  GCC assures that no merging will happen
Jakub> between .debug_macinfo chunks with 32-bit and 64-bit DWARF by
Jakub> adding the byte size in the comdat GROUP name.  I think that's
Jakub> cleaner than hardcoding 4 bytes and not optimizing anything on
Jakub> MIPS.

The .debug_macinfo section doesn't have any header describing its
contents.  How would a consumer know which offset size to use?

Tom

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-15 18:28   ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Tom Tromey
@ 2011-07-15 19:21     ` Jakub Jelinek
  2011-07-15 19:30       ` Tom Tromey
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-15 19:21 UTC (permalink / raw)
  To: Tom Tromey
  Cc: gcc-patches, Jason Merrill, Richard Henderson, Jan Kratochvil,
	Roland McGrath, Cary Coutant, Mark Wielaard

On Fri, Jul 15, 2011 at 12:15:48PM -0600, Tom Tromey wrote:
> >>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:
> 
> Jakub> The patch below implements that slight change, in particular the
> Jakub> "4" suffixes from the op names were dropped,
> Jakub> DW_MACINFO_GNU_*_indirect have DW_FORM_udata and DW_FORM_strp
> Jakub> arguments now (i.e. DWARF_OFFSET_SIZE large) and
> Jakub> DW_MACINFO_GNU_transparent_include has DW_FORM_sec_offset
> Jakub> argument (i.e. again 4 bytes long for 32-bit DWARF and 8 bytes
> Jakub> long for 64-bit DWARF).  GCC assures that no merging will happen
> Jakub> between .debug_macinfo chunks with 32-bit and 64-bit DWARF by
> Jakub> adding the byte size in the comdat GROUP name.  I think that's
> Jakub> cleaner than hardcoding 4 bytes and not optimizing anything on
> Jakub> MIPS.
> 
> The .debug_macinfo section doesn't have any header describing its
> contents.  How would a consumer know which offset size to use?

The same way as it knows how to interpret the second operands of
DW_MACINFO_start_file.  They aren't meaningful without knowing what
.debug_line section they refer to.  For .debug_line, you need to remember
DW_AT_stmt_list of the CU that refers to the .debug_macinfo section
through DW_AT_macro_info, and you'd remember whether the referencing
CU is 32-bit DWARF or 64-bit DWARF.  And the producer would need to arrange
that DW_MACINFO_GNU_transparent_include referenced chunks have the same
properties (i.e. same offset size, and, if they use DW_MACINFO_start_file,
also the same .debug_line).
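
A small sketch of what that means for a reader, assuming it has already
taken the offset size (4 or 8) from the referencing CU and that the
section data is little-endian.  The struct and function names are
hypothetical, not from gdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operands of a DW_MACINFO_GNU_{define,undef}_indirect entry.  */
struct indirect_entry
{
  uint64_t lineno;      /* first operand, uleb128 */
  uint64_t str_offset;  /* second operand, offset into .debug_str */
  size_t length;        /* bytes consumed by both operands */
};

/* Decode the operands starting at P, just past the opcode byte.
   OFFSET_SIZE comes from the CU whose DW_AT_macro_info led here;
   nothing in .debug_macinfo itself records it.  */
static struct indirect_entry
read_indirect_operands (const unsigned char *p, unsigned offset_size)
{
  struct indirect_entry entry = { 0, 0, 0 };
  unsigned shift = 0;
  unsigned i;
  unsigned char byte;

  assert (offset_size == 4 || offset_size == 8);

  /* Line number: unsigned LEB128.  */
  do
    {
      byte = p[entry.length++];
      entry.lineno |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);

  /* String offset: OFFSET_SIZE bytes, little-endian in this sketch.  */
  for (i = 0; i < offset_size; i++)
    entry.str_offset |= (uint64_t) p[entry.length + i] << (8 * i);
  entry.length += offset_size;
  return entry;
}
```

The same bytes decode differently depending on the offset size passed
in, which is why chunks shared through transparent includes have to
agree with every CU that references them.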

	Jakub

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 2)
  2011-07-15 19:21     ` Jakub Jelinek
@ 2011-07-15 19:30       ` Tom Tromey
  0 siblings, 0 replies; 25+ messages in thread
From: Tom Tromey @ 2011-07-15 19:30 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Jason Merrill, Richard Henderson, Jan Kratochvil,
	Roland McGrath, Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

>> The .debug_macinfo section doesn't have any header describing its
>> contents.  How would a consumer know which offset size to use?

Jakub> The same way as it knows how to interpret the second operands of
Jakub> DW_MACINFO_start_file.

Ok, duh.

I updated my gdb patch.  I can send it if you want.

Tom

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [RFC] More compact (100x) -g3 .debug_macinfo (take 3)
  2011-07-15 17:19   ` Richard Henderson
@ 2011-07-15 21:18     ` Jakub Jelinek
  2011-07-18 15:09       ` Tom Tromey
  2011-07-20  1:17       ` Richard Henderson
  0 siblings, 2 replies; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-15 21:18 UTC (permalink / raw)
  To: Richard Henderson
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

On Fri, Jul 15, 2011 at 09:22:42AM -0700, Richard Henderson wrote:
> On 07/15/2011 08:42 AM, Jakub Jelinek wrote:
> 
> > The newly added opcodes:
> > DW_MACINFO_GNU_define_indirect		0xe0
> > 	This opcode has two arguments, one is uleb128 lineno and the
> > 	other is offset size long byte offset into .debug_str.  Except
> > 	for the encoding of the string it is similar to DW_MACINFO_define.
> > DW_MACINFO_GNU_undef_indirect		0xe1
> > 	This opcode has two arguments, one is uleb128 lineno and the
> > 	other is offset size long byte offset into .debug_str.  Except
> > 	for the encoding of the string it is similar to DW_MACINFO_undef.
> > DW_MACINFO_GNU_transparent_include	0xe2
> > 	This opcode has a single argument, a offset size long byte offset into
> > 	.debug_macinfo.  It instructs the debug info consumer that
> > 	this opcode during reading should be replaced with the sequence
> > 	of .debug_macinfo opcodes from the mentioned offset, up to
> > 	a terminating 0 opcode (not including that 0).
> > DW_MACINFO_GNU_define_opcode		0xe3
> > 	This is an opcode for future extensibility through which
> > 	a debugger could skip unknown opcodes.  It has 3 arguments:
> > 	1 byte opcode number, uleb128 count of arguments and
> > 	a count bytes long array, with a DW_FORM_* code how the
> > 	argument is encoded.
> 
> I do like the new opcodes.
> 
> Elsewhere you described transparent_include as also saving state
> about defined opcodes around the include.  Do you want to either
> describe that or drop it?

Ok, so how about this way (as DWARF4 modifications; of course, for a
DWARF5 proposal the GNU_ prefix would be gone and the ops would have
different codes):

6.3.1

The valid macinfo types are as follows:
...
DW_MACINFO_GNU_define_indirect		A macro definition.
DW_MACINFO_GNU_undef_indirect		A macro undefinition.
DW_MACINFO_GNU_transparent_include	Include a sequence of entries from given offset.
DW_MACINFO_GNU_define_opcode		Define extension opcode and its arguments.

6.3.1.1

All DW_MACINFO_GNU_define_indirect and DW_MACINFO_GNU_undef_indirect entries
have two operands.  The first operand encodes the line number of the source
line on which the relevant defining or undefining macro directives appeared.
The second operand consists of an offset into a string table contained in
the .debug_str section of the object file.  In the 32-bit DWARF format, the
representation of the operand value is a 4-byte unsigned offset; in the
64-bit DWARF format, it is an 8-byte unsigned offset.  Apart from the
encoding of the operands these entries are equivalent to DW_MACINFO_define
and DW_MACINFO_undef, respectively.

6.3.1.5  Transparent inclusion of a sequence of entries

A DW_MACINFO_GNU_transparent_include entry has one operand, an offset into
another part of the .debug_macinfo section.  In the 32-bit DWARF format, the
representation of the operand value is a 4-byte unsigned offset; in the
64-bit DWARF format, it is an 8-byte unsigned offset.  This entry instructs
the consumer to replace this entry with a sequence of macinfo entries found
at the given .debug_macinfo offset, up to, but excluding, the terminating
entry with type code 0.  This entry type is aimed at sharing duplicate
sequences of macinfo entries between macinfo from different compilation
units.  The producer should ensure that only sequences with matching
DWARF format size (either all 32-bit DWARF or all 64-bit DWARF) are
merged together, and that either DW_MACINFO_start_file entries aren't
in those sequences, or the sequence is included only from macinfo entries
that reference the same part of the .debug_line section.
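
A rough sketch of how a consumer might expand such entries, assuming
little-endian data and only DW_MACINFO_define, DW_MACINFO_undef and the
new include opcode.  The function walk_macinfo and its depth limit are
hypothetical; the draft text above does not mandate any nesting limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum
{
  DW_MACINFO_define = 1,
  DW_MACINFO_undef = 2,
  DW_MACINFO_GNU_transparent_include = 0xe2
};

/* Walk the entries starting at OFFSET in SECTION, splicing in included
   sequences in place.  Each defined macro string is appended to MACROS.
   Returns 0 on success, -1 on malformed or unsupported input.  */
static int
walk_macinfo (const unsigned char *section, size_t size, size_t offset,
              unsigned offset_size, const char **macros,
              size_t max_macros, size_t *n_macros, unsigned depth)
{
  const unsigned char *p = section + offset;
  const unsigned char *end = section + size;

  assert (offset_size == 4 || offset_size == 8);
  if (offset >= size || depth > 16)   /* arbitrary guard against cycles */
    return -1;

  while (p < end)
    {
      unsigned op = *p++;

      if (op == 0)
        return 0;   /* terminator: not part of the included sequence */
      if (op == DW_MACINFO_define || op == DW_MACINFO_undef)
        {
          const unsigned char *nul;

          while (p < end && (*p & 0x80))   /* skip uleb128 line number */
            p++;
          if (p >= end)
            return -1;
          p++;
          nul = memchr (p, 0, (size_t) (end - p));
          if (nul == NULL)
            return -1;
          if (op == DW_MACINFO_define)
            {
              if (*n_macros >= max_macros)
                return -1;
              macros[(*n_macros)++] = (const char *) p;
            }
          p = nul + 1;
        }
      else if (op == DW_MACINFO_GNU_transparent_include)
        {
          uint64_t target = 0;
          unsigned i;

          if ((size_t) (end - p) < offset_size)
            return -1;
          for (i = 0; i < offset_size; i++)
            target |= (uint64_t) p[i] << (8 * i);
          p += offset_size;
          if (target >= size
              || walk_macinfo (section, size, (size_t) target, offset_size,
                               macros, max_macros, n_macros, depth + 1) != 0)
            return -1;
        }
      else
        return -1;
    }
  return -1;   /* ran off the section without a terminator */
}
```

Entries from the included sequence show up between the entries that
surround the include in the parent, in section order, as if they had been
written there directly.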

6.3.1.6  Defining new opcodes and operands

A DW_MACINFO_GNU_define_opcode entry has 2 operands.  The first operand
is a one byte constant with the type code it defines operand types for,
the second operand is a DW_FORM_block encoded array of operand forms.
The second operand starts with an unsigned LEB128 encoded number of operands
and for each of the operands there is one byte, containing a form encoding
how the corresponding operand is encoded.  This entry makes it possible to
define new vendor extension entry types which consumers will be able to skip
over and ignore.  Each opcode defined this way is valid for subsequent entries
until the terminating entry with type code 0, including any sequences
included from those entries using DW_MACINFO_GNU_transparent_include.
Opcodes defined using this entry in a chain included through
DW_MACINFO_GNU_transparent_include are not valid in the parent sequence
after the DW_MACINFO_GNU_transparent_include entry that included it, though.
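
One way a consumer could get this scoping, sketched under the assumption
that it keeps a table with one slot per type code: copy the table by value
when entering an included sequence and drop the copy on return.  All
names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One slot per type code; a non-NULL slot points at the uleb128
   operand count and DW_FORM_* bytes recorded for that code.  */
struct opcode_table
{
  const unsigned char *forms[256];
};

/* Start of a sequence referenced from DW_AT_macro_info: nothing
   is defined yet.  */
static void
opcode_table_init (struct opcode_table *table)
{
  size_t i;

  for (i = 0; i < 256; i++)
    table->forms[i] = NULL;
}

/* Record a DW_MACINFO_GNU_define_opcode entry for CODE.  */
static void
opcode_table_define (struct opcode_table *table, unsigned char code,
                     const unsigned char *forms)
{
  assert (forms != NULL);
  table->forms[code] = forms;
}

/* Table to use while reading an included sequence.  It inherits every
   definition of PARENT; being a copy, anything defined or redefined
   through it never reaches PARENT.  */
static struct opcode_table
opcode_table_enter_include (const struct opcode_table *parent)
{
  return *parent;
}
```

The copy costs 256 pointers per nested include, which seems acceptable
for a sketch; a real reader might prefer something cheaper.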

7.22 Macro Information

Add
	DW_MACINFO_lo_user			0xe0
	DW_MACINFO_GNU_define_indirect		0xe0
	DW_MACINFO_GNU_undef_indirect		0xe1
	DW_MACINFO_GNU_transparent_include	0xe2
	DW_MACINFO_GNU_define_opcode		0xe3
	DW_MACINFO_hi_user			0xfe
to the table.

> I'd like to see this broken out into some functions, and avoid
> as much code as possible within ifdefs.  Perhaps
> 
> some_function (...)
> {
> #ifndef OBJECT_FORMAT_ELF
>   return;
> #endif
>   // everything else
> }
> 
> I think it also doesn't help review that there are no comments
> at all, and a preponderance of description-less variable names
> like "ref" and "ref2".

I've tried to cure these issues in the following (so far just
lightly tested) patch:

2011-07-15  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
	(DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
	DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode):
	Add.

	* dwarf2out.c (dwarf2out_define): If the vector is empty and
	lineno is 0, emit a dummy entry first.
	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
	optimize_macinfo_range): New functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
	optimize longer sequences of define/undef ops from headers
	using DW_MACINFO_GNU_transparent_include.

--- include/dwarf2.h.jj	2011-06-23 10:14:06.000000000 +0200
+++ include/dwarf2.h	2011-07-13 11:39:49.000000000 +0200
@@ -877,7 +877,13 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_undef = 2,
     DW_MACINFO_start_file = 3,
     DW_MACINFO_end_file = 4,
-    DW_MACINFO_vendor_ext = 255
+    DW_MACINFO_lo_user = 0xe0,
+    DW_MACINFO_GNU_define_indirect = 0xe0,
+    DW_MACINFO_GNU_undef_indirect = 0xe1,
+    DW_MACINFO_GNU_transparent_include = 0xe2,
+    DW_MACINFO_GNU_define_opcode = 0xe3,
+    DW_MACINFO_hi_user = 0xfe,
+    DW_MACINFO_vendor_ext = 0xff
   };
 \f
 /* @@@ For use with GNU frame unwind information.  */
--- gcc/dwarf2out.c.jj	2011-07-15 20:46:32.000000000 +0200
+++ gcc/dwarf2out.c	2011-07-15 22:15:14.000000000 +0200
@@ -20291,6 +20291,15 @@ dwarf2out_define (unsigned int lineno AT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_MACINFO_GNU_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_define;
       e.lineno = lineno;
       e.info = xstrdup (buffer);;
@@ -20309,58 +20318,363 @@ dwarf2out_undef (unsigned int lineno ATT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_MACINFO_GNU_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_undef;
       e.lineno = lineno;
-      e.info = xstrdup (buffer);;
+      e.info = xstrdup (buffer);
       VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
+/* Routines to manipulate hash table of CUs.  */
+
+static hashval_t
+htab_macinfo_hash (const void *of)
+{
+  const macinfo_entry *const entry =
+    (const macinfo_entry *) of;
+
+  return htab_hash_string (entry->info);
+}
+
+static int
+htab_macinfo_eq (const void *of1, const void *of2)
+{
+  const macinfo_entry *const entry1 = (const macinfo_entry *) of1;
+  const macinfo_entry *const entry2 = (const macinfo_entry *) of2;
+
+  return !strcmp (entry1->info, entry2->info);
+}
+
+/* Output a single .debug_macinfo entry.  */
+
+static void
+output_macinfo_op (macinfo_entry *ref)
+{
+  int file_num;
+  size_t len;
+  struct indirect_string_node *node;
+  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+
+  switch (ref->code)
+    {
+    case DW_MACINFO_start_file:
+      file_num = maybe_emit_file (lookup_filename (ref->info));
+      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
+      dw2_asm_output_data_uleb128 (ref->lineno,
+				   "Included from line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+      break;
+    case DW_MACINFO_end_file:
+      dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+      break;
+    case DW_MACINFO_define:
+    case DW_MACINFO_undef:
+      len = strlen (ref->info) + 1;
+      if (!dwarf_strict
+	  && len > DWARF_OFFSET_SIZE
+	  && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+	  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
+	{
+	  ref->code = ref->code == DW_MACINFO_define
+		      ? DW_MACINFO_GNU_define_indirect
+		      : DW_MACINFO_GNU_undef_indirect;
+	  output_macinfo_op (ref);
+	  return;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_define
+			   ? "Define macro" : "Undefine macro");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_nstring (ref->info, -1, "The macro");
+      break;
+    case DW_MACINFO_GNU_define_indirect:
+    case DW_MACINFO_GNU_undef_indirect:
+      node = find_AT_string (ref->info);
+      if (node->form != DW_FORM_strp)
+	{
+	  char label[32];
+	  ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter);
+	  ++dw2_string_counter;
+	  node->label = xstrdup (label);
+	  node->form = DW_FORM_strp;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_GNU_define_indirect
+			   ? "Define macro indirect"
+			   : "Undefine macro indirect");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+			     debug_str_section, "The macro: \"%s\"",
+			     ref->info);
+      break;
+    case DW_MACINFO_GNU_transparent_include:
+      dw2_asm_output_data (1, ref->code, "Transparent include");
+      ASM_GENERATE_INTERNAL_LABEL (label,
+				   DEBUG_MACINFO_SECTION_LABEL, ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL);
+      break;
+    default:
+      fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
+	       ASM_COMMENT_START, (unsigned long) ref->code);
+      break;
+    }
+}
+
+/* Attempt to make a sequence of define/undef macinfo ops shareable with
+   other compilation unit .debug_macinfo sections.  IDX is the first
+   index of a define/undef, return the number of ops that should be
+   emitted in a comdat .debug_macinfo section and emit
+   a DW_MACINFO_GNU_transparent_include entry referencing it.
+   If the define/undef entry should be emitted normally, return 0.  */
+
+static unsigned
+optimize_macinfo_range (unsigned int idx, VEC (macinfo_entry, gc) *files,
+			htab_t *macinfo_htab)
+{
+  macinfo_entry *first, *second, *cur, *inc;
+  char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
+  unsigned char checksum[16];
+  struct md5_ctx ctx;
+  char *grp_name, *tail;
+  const char *base;
+  unsigned int i, count, encoded_filename_len, linebuf_len;
+  void **slot;
+
+  first = VEC_index (macinfo_entry, macinfo_table, idx);
+  second = VEC_index (macinfo_entry, macinfo_table, idx + 1);
+
+  /* Optimize only if there are at least two consecutive define/undef ops,
+     and either all of them are before first DW_MACINFO_start_file
+     with lineno 0 (i.e. predefined macro block), or all of them are
+     in some included header file.  */
+  if (second->code != DW_MACINFO_define && second->code != DW_MACINFO_undef)
+    return 0;
+  if (VEC_empty (macinfo_entry, files))
+    {
+      if (first->lineno != 0 || second->lineno != 0)
+	return 0;
+    }
+  else if (first->lineno == 0)
+    return 0;
+
+  /* Find the last define/undef entry that can be grouped together
+     with first and at the same time compute md5 checksum of their
+     codes, linenumbers and strings.  */
+  md5_init_ctx (&ctx);
+  for (i = idx; VEC_iterate (macinfo_entry, macinfo_table, i, cur); i++)
+    if (cur->code != DW_MACINFO_define && cur->code != DW_MACINFO_undef)
+      break;
+    else if (first->lineno == 0 && cur->lineno != 0)
+      break;
+    else
+      {
+	unsigned char code = cur->code;
+	md5_process_bytes (&code, 1, &ctx);
+	checksum_uleb128 (cur->lineno, &ctx);
+	md5_process_bytes (cur->info, strlen (cur->info) + 1, &ctx);
+      }
+  md5_finish_ctx (&ctx, checksum);
+  count = i - idx;
+
+  /* From the containing include filename (if any) pick up just
+     usable characters from its basename.  */
+  if (first->lineno == 0)
+    base = "";
+  else
+    base = lbasename (VEC_last (macinfo_entry, files)->info);
+  for (encoded_filename_len = 0, i = 0; base[i]; i++)
+    if (ISIDNUM (base[i]) || base[i] == '.')
+      encoded_filename_len++;
+  /* Count . at the end.  */
+  if (encoded_filename_len)
+    encoded_filename_len++;
+
+  sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, first->lineno);
+  linebuf_len = strlen (linebuf);
+
+  /* The group name format is: wmN.[<encoded filename>.]<lineno>.<md5sum>  */
+  grp_name = XNEWVEC (char, 4 + encoded_filename_len + linebuf_len + 1
+		      + 16 * 2 + 1);
+  memcpy (grp_name, DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8.", 4);
+  tail = grp_name + 4;
+  if (encoded_filename_len)
+    {
+      for (i = 0; base[i]; i++)
+	if (ISIDNUM (base[i]) || base[i] == '.')
+	  *tail++ = base[i];
+      *tail++ = '.';
+    }
+  memcpy (tail, linebuf, linebuf_len);
+  tail += linebuf_len;
+  *tail++ = '.';
+  for (i = 0; i < 16; i++)
+    sprintf (tail + i * 2, "%02x", checksum[i] & 0xff);
+
+  /* Construct a macinfo_entry for DW_MACINFO_GNU_transparent_include
+     in the empty vector entry before the first define/undef.  */
+  inc = VEC_index (macinfo_entry, macinfo_table, idx - 1);
+  inc->code = DW_MACINFO_GNU_transparent_include;
+  inc->lineno = 0;
+  inc->info = grp_name;
+  if (*macinfo_htab == NULL)
+    *macinfo_htab = htab_create (10, htab_macinfo_hash, htab_macinfo_eq, NULL);
+  /* Avoid emitting duplicates.  */
+  slot = htab_find_slot (*macinfo_htab, inc, INSERT);
+  if (*slot != NULL)
+    {
+      free (CONST_CAST (char *, inc->info));
+      inc->code = 0;
+      inc->info = NULL;
+      /* If such an entry has been used before, just emit
+	 a DW_MACINFO_GNU_transparent_include op.  */
+      inc = (macinfo_entry *) *slot;
+      output_macinfo_op (inc);
+      /* And clear all macinfo_entry in the range to avoid emitting them
+	 in the second pass.  */
+      for (i = idx;
+	   VEC_iterate (macinfo_entry, macinfo_table, i, cur)
+	   && i < idx + count;
+	   i++)
+	{
+	  cur->code = 0;
+	  free (CONST_CAST (char *, cur->info));
+	  cur->info = NULL;
+	}
+    }
+  else
+    {
+      *slot = inc;
+      inc->lineno = htab_elements (*macinfo_htab);
+      output_macinfo_op (inc);
+    }
+  return count;
+}
+
+/* Output macinfo section(s).  */
+
 static void
 output_macinfo (void)
 {
   unsigned i;
   unsigned long length = VEC_length (macinfo_entry, macinfo_table);
   macinfo_entry *ref;
+  VEC (macinfo_entry, gc) *files = NULL;
+  htab_t macinfo_htab = NULL;
 
   if (! length)
     return;
 
+  /* In the first loop, it emits the primary .debug_macinfo section
+     and after each emitted op the macinfo_entry is cleared.
+     If a longer range of define/undef ops can be optimized using
+     DW_MACINFO_GNU_transparent_include, the
+     DW_MACINFO_GNU_transparent_include op is emitted and kept in
+     the vector before the first define/undef in the range and the
+     whole range of define/undef ops is not emitted and kept.  */
   for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
     {
       switch (ref->code)
 	{
-	  case DW_MACINFO_start_file:
+	case DW_MACINFO_start_file:
+	  VEC_safe_push (macinfo_entry, gc, files, ref);
+	  break;
+	case DW_MACINFO_end_file:
+	  if (!VEC_empty (macinfo_entry, files))
 	    {
-	      int file_num = maybe_emit_file (lookup_filename (ref->info));
-	      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-	      dw2_asm_output_data_uleb128 
-			(ref->lineno, "Included from line number %lu", 
-			 			(unsigned long)ref->lineno);
-	      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+	      macinfo_entry *file = VEC_last (macinfo_entry, files);
+	      free (CONST_CAST (char *, file->info));
+	      VEC_pop (macinfo_entry, files);
 	    }
-	    break;
-	  case DW_MACINFO_end_file:
-	    dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
-	    break;
-	  case DW_MACINFO_define:
-	    dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  case DW_MACINFO_undef:
-	    dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  default:
-	   fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
-	     ASM_COMMENT_START, (unsigned long)ref->code);
+	  break;
+	case DW_MACINFO_define:
+	case DW_MACINFO_undef:
+	  if (!dwarf_strict
+	      && HAVE_COMDAT_GROUP
+	      && VEC_length (macinfo_entry, files) != 1
+	      && i > 0
+	      && i + 1 < length
+	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
+	    {
+	      unsigned count = optimize_macinfo_range (i, files, &macinfo_htab);
+	      if (count)
+		{
+		  i += count - 1;
+		  continue;
+		}
+	    }
+	  break;
+	case 0:
+	  /* A dummy entry may be inserted at the beginning to be able
+	     to optimize the whole block of predefined macros.  */
+	  if (i == 0)
+	    continue;
+	default:
 	  break;
 	}
+      output_macinfo_op (ref);
+      /* For DW_MACINFO_start_file ref->info has been copied into files
+	 vector.  */
+      if (ref->code != DW_MACINFO_start_file)
+	free (CONST_CAST (char *, ref->info));
+      ref->info = NULL;
+      ref->code = 0;
     }
+
+  if (macinfo_htab == NULL)
+    return;
+
+  htab_delete (macinfo_htab);
+
+  /* If any DW_MACINFO_GNU_transparent_include were used, on those
+     DW_MACINFO_GNU_transparent_include entries terminate the
+     current chain and switch to a new comdat .debug_macinfo
+     section and emit the define/undef entries within it.  */
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    switch (ref->code)
+      {
+      case 0:
+	continue;
+      case DW_MACINFO_GNU_transparent_include:
+	{
+	  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+	  tree comdat_key = get_identifier (ref->info);
+	  /* Terminate the previous .debug_macinfo section.  */
+	  dw2_asm_output_data (1, 0, "End compilation unit");
+	  targetm.asm_out.named_section (DEBUG_MACINFO_SECTION,
+					 SECTION_DEBUG
+					 | SECTION_LINKONCE,
+					 comdat_key);
+	  ASM_GENERATE_INTERNAL_LABEL (label,
+				       DEBUG_MACINFO_SECTION_LABEL,
+				       ref->lineno);
+	  ASM_OUTPUT_LABEL (asm_out_file, label);
+	  ref->code = 0;
+	  free (CONST_CAST (char *, ref->info));
+	  ref->info = NULL;
+	}
+	break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+	output_macinfo_op (ref);
+	ref->code = 0;
+	free (CONST_CAST (char *, ref->info));
+	ref->info = NULL;
+	break;
+      default:
+	gcc_unreachable ();
+      }
 }
 
 /* Set up for Dwarf output at the start of compilation.  */
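
For readers following the comdat group naming in optimize_macinfo_range
above, the wmN.[<encoded filename>.]<lineno>.<md5sum> format can be sketched
as a standalone C function.  This is only an illustration: build_group_name
is a hypothetical helper, the MD5 checksum is assumed to be computed
elsewhere and passed in, and isalnum plus '_' stands in for GCC's ISIDNUM.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Write "wmN.[<encoded filename>.]<lineno>.<md5sum>" into OUT, where N is
   OFFSET_SIZE (4 or 8).  BASE is the basename of the including header, or
   "" for the predefined macro block.  Return 0 on success, -1 if OUT is
   too small.  */
static int
build_group_name (char *out, size_t outsz, unsigned offset_size,
                  const char *base, unsigned long lineno,
                  const unsigned char checksum[16])
{
  size_t pos, prefix_len, i;
  int n;

  assert (out != NULL && base != NULL && checksum != NULL);
  n = snprintf (out, outsz, "wm%u.", offset_size);
  if (n < 0 || (size_t) n >= outsz)
    return -1;
  pos = prefix_len = (size_t) n;

  /* Keep only identifier characters and dots from the basename.  */
  for (i = 0; base[i]; i++)
    if (isalnum ((unsigned char) base[i])
        || base[i] == '_' || base[i] == '.')
      {
        if (pos + 1 >= outsz)
          return -1;
        out[pos++] = base[i];
      }
  /* Add the separating '.' only if some filename character was kept.  */
  if (pos > prefix_len)
    {
      if (pos + 1 >= outsz)
        return -1;
      out[pos++] = '.';
    }

  n = snprintf (out + pos, outsz - pos, "%lu.", lineno);
  if (n < 0 || (size_t) n >= outsz - pos)
    return -1;
  pos += (size_t) n;

  /* 16 checksum bytes as 32 hex digits plus the terminating NUL.  */
  if (outsz - pos < 16 * 2 + 1)
    return -1;
  for (i = 0; i < 16; i++)
    sprintf (out + pos + i * 2, "%02x", (unsigned) checksum[i]);
  return 0;
}
```

Because the checksum covers the opcodes, line numbers and strings of the
whole range, two compilation units produce the same group name only when
their define/undef sequences match, which is what lets the linker keep a
single copy of the comdat section.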

	Jakub

* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 3)
  2011-07-15 21:18     ` [RFC] More compact (100x) -g3 .debug_macinfo (take 3) Jakub Jelinek
@ 2011-07-18 15:09       ` Tom Tromey
  2011-07-20  1:17       ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Tom Tromey @ 2011-07-18 15:09 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Henderson, gcc-patches, Jason Merrill, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> Ok, so how about this way (as DWARF4 modifications, of course for
Jakub> DWARF5 proposal GNU_ would be gone and the ops would have different
Jakub> codes):

Thanks very much for writing it up this way.  I think it is very
important that all our DWARF extensions be well-documented.

Jakub> 6.3.1.6  Defining new opcodes and operands

Jakub> The second operand starts with an unsigned LEB128 encoded number
Jakub> of operands and for each of the operands there is one byte,
Jakub> containing a form encoding how the corresponding operand is
Jakub> encoded.

It seems to me that DW_FORM_flag_present is not useful here.

Jakub> Each so defined opcode is valid for subsequent entries until the
Jakub> terminating entry with type code 0, including any sequences
Jakub> included from those entries using
Jakub> DW_MACINFO_GNU_transparent_include.  Opcodes defined using this
Jakub> entry in a chain included through
Jakub> DW_MACINFO_GNU_transparent_include isn't valid in the parent
Jakub> sequence after the DW_MACINFO_GNU_transparent_include entry that
Jakub> included it though.

I think you can remove this second sentence.  It is implied by the first
one.

Tom

* Re: [RFC] More compact (100x) -g3 .debug_macinfo
  2011-07-13 20:37   ` Jakub Jelinek
@ 2011-07-18 15:42     ` Tom Tromey
  0 siblings, 0 replies; 25+ messages in thread
From: Tom Tromey @ 2011-07-18 15:42 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Jason Merrill, Richard Henderson, Jan Kratochvil, Roland McGrath,
	Cary Coutant, Mark Wielaard, gcc-patches

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Tom> I don't think I really understood DW_MACINFO_GNU_define_opcode, so the
Tom> implementation here is probably wrong.

Jakub> Well, I think you've skipped it correctly and furthermore even patched
Jakub> GCC doesn't emit it.  The point of it was to allow skipping unknown
Jakub> opcodes.  If you implement this opcode fully and say GCC 4.8 adds a new
Jakub> vendor opcode, the old implementation would be able to silently skip
Jakub> over such opcodes.

I implemented this part today, so I think the gdb patch is complete now.

Tom

* Re: [RFC] More compact (100x) -g3 .debug_macinfo (take 3)
  2011-07-15 21:18     ` [RFC] More compact (100x) -g3 .debug_macinfo (take 3) Jakub Jelinek
  2011-07-18 15:09       ` Tom Tromey
@ 2011-07-20  1:17       ` Richard Henderson
  2011-07-21 11:38         ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Jakub Jelinek
  1 sibling, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-20  1:17 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

On 07/15/2011 01:58 PM, Jakub Jelinek wrote:
> 	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
> 	(DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
> 	DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode):
> 	Add.
> 
> 	* dwarf2out.c (dwarf2out_define): If the vector is empty and
> 	lineno is 0, emit a dummy entry first.
> 	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
> 	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
> 	optimize_macinfo_range): New functions.
> 	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
> 	mergeable, optimize longer strings using
> 	DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
> 	optimize longer sequences of define/undef ops from headers
> 	using DW_MACINFO_GNU_transparent_include.

This looks much better.  Barring any other feedback from
other interested Dwarf parties, I think this can go in.


r~

* [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-20  1:17       ` Richard Henderson
@ 2011-07-21 11:38         ` Jakub Jelinek
  2011-07-21 17:25           ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-21 11:38 UTC (permalink / raw)
  To: Richard Henderson
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

[-- Attachment #1: Type: text/plain, Size: 2391 bytes --]

On Tue, Jul 19, 2011 at 04:26:08PM -0700, Richard Henderson wrote:
> On 07/15/2011 01:58 PM, Jakub Jelinek wrote:
> > 	* dwarf2.h (DW_MACINFO_lo_user, DW_MACINFO_hi_user): Add.
> > 	(DW_MACINFO_GNU_define_indirect, DW_MACINFO_GNU_undef_indirect,
> > 	DW_MACINFO_GNU_transparent_include, DW_MACINFO_GNU_define_opcode):
> > 	Add.
> > 
> > 	* dwarf2out.c (dwarf2out_define): If the vector is empty and
> > 	lineno is 0, emit a dummy entry first.
> > 	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
> > 	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
> > 	optimize_macinfo_range): New functions.
> > 	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
> > 	mergeable, optimize longer strings using
> > 	DW_MACINFO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
> > 	optimize longer sequences of define/undef ops from headers
> > 	using DW_MACINFO_GNU_transparent_include.
> 
> This looks much better.  Barring any other feedback from
> other interested Dwarf parties, I think this can go in.

Based on dwarf-discuss discussions, here is an alternative that
introduces a .debug_gnu_macro section instead, with DW_AT_GNU_macros
referencing it.

Currently, the patch emits 3-byte section headers at the start of
the .debug_gnu_macro chunks referenced from .debug_info (through
DW_AT_GNU_macros), containing a version number (2 bytes, currently 4) and
a 1-byte section offset size, but the sequences referenced by
DW_GNU_MACRO_transparent_include don't have them.
The .debug_gnu_macro section isn't completely usable without the referencing
CUs anyway, so IMHO we could still get away completely without
any section header, but if we need it, the question is if
the offset size there is useful and if the section header shouldn't
go before the transparent_include chains as well (only with that
e.g. readelf -wm would be able to dump .debug_gnu_macro without
reading .debug_info and tracking offsets to it).
In the x86_64 cc1plus for which I've been posting figures, I see
395 CUs referencing .debug_gnu_macro and at most 511 different
.debug_gnu_macro chains with unique md5sums.  So, for cc1plus the
3-byte headers cost 1185 bytes (3 * 395) if emitted only in the
CU-referenced chunks, and 2718 bytes (3 * (395 + 511)) if emitted
in all .debug_gnu_macro chunks.

Also, should the decision whether to emit .debug_gnu_macro or .debug_macinfo
depend on -gdwarf-strict, or should we have a separate switch for that?

	Jakub

[-- Attachment #2: X200 --]
[-- Type: text/plain, Size: 19219 bytes --]

2011-07-21  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_AT_GNU_macros): New.
	(enum dwarf_gnu_macro_record_type): New enum.  Add DW_GNU_MACRO_*.

	* dwarf2out.c (struct macinfo_struct): Change code to unsigned char.
	(DEBUG_GNU_MACRO_SECTION, DEBUG_GNU_MACRO_SECTION_LABEL): Define.
	(dwarf_attr_name): Handle DW_AT_GNU_macros.
	(dwarf2out_define): If the vector is empty and
	lineno is 0, emit a dummy entry first.
	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
	optimize_macinfo_range): New functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_GNU_MACRO_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
	optimize longer sequences of define/undef ops from headers
	using DW_GNU_MACRO_transparent_include.  For !dwarf_strict
	emit a section header.
	(dwarf2out_init): For !dwarf_strict set debug_macinfo_section
	and macinfo_section_label to DEBUG_GNU_MACRO_SECTION
	resp. DEBUG_GNU_MACRO_SECTION_LABEL.
	(dwarf2out_finish): For !dwarf_strict emit DW_AT_GNU_macros
	instead of DW_AT_macro_info.

--- include/dwarf2.h.jj	2011-07-15 20:46:32.000000000 +0200
+++ include/dwarf2.h	2011-07-21 10:11:25.000000000 +0200
@@ -366,6 +366,8 @@ enum dwarf_attribute
     DW_AT_GNU_all_tail_call_sites = 0x2116,
     DW_AT_GNU_all_call_sites = 0x2117,
     DW_AT_GNU_all_source_call_sites = 0x2118,
+    /* Section offset into .debug_gnu_macro section.  */
+    DW_AT_GNU_macros = 0x2119,
     /* VMS extensions.  */
     DW_AT_VMS_rtnbeg_pd_address = 0x2201,
     /* GNAT extensions.  */
@@ -879,6 +881,21 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_end_file = 4,
     DW_MACINFO_vendor_ext = 255
   };
+
+/* Names and codes for new style macro information.  */
+enum dwarf_gnu_macro_record_type
+  {
+    DW_GNU_MACRO_define = 1,
+    DW_GNU_MACRO_undef = 2,
+    DW_GNU_MACRO_start_file = 3,
+    DW_GNU_MACRO_end_file = 4,
+    DW_GNU_MACRO_define_indirect = 5,
+    DW_GNU_MACRO_undef_indirect = 6,
+    DW_GNU_MACRO_transparent_include = 7,
+    DW_GNU_MACRO_define_opcode = 8,
+    DW_GNU_MACRO_lo_user = 0xe0,
+    DW_GNU_MACRO_hi_user = 0xff
+  };
 \f
 /* @@@ For use with GNU frame unwind information.  */
 
--- gcc/dwarf2out.c.jj	2011-07-21 09:54:49.000000000 +0200
+++ gcc/dwarf2out.c	2011-07-21 10:47:03.000000000 +0200
@@ -2770,7 +2770,7 @@ struct GTY(()) dw_ranges_struct {
 /* A structure to hold a macinfo entry.  */
 
 typedef struct GTY(()) macinfo_struct {
-  unsigned HOST_WIDE_INT code;
+  unsigned char code;
   unsigned HOST_WIDE_INT lineno;
   const char *info;
 }
@@ -3417,6 +3417,9 @@ static void gen_scheduled_generic_parms_
 #ifndef DEBUG_MACINFO_SECTION
 #define DEBUG_MACINFO_SECTION	".debug_macinfo"
 #endif
+#ifndef DEBUG_GNU_MACRO_SECTION
+#define DEBUG_GNU_MACRO_SECTION	".debug_gnu_macro"
+#endif
 #ifndef DEBUG_LINE_SECTION
 #define DEBUG_LINE_SECTION	".debug_line"
 #endif
@@ -3474,6 +3477,9 @@ static void gen_scheduled_generic_parms_
 #ifndef DEBUG_MACINFO_SECTION_LABEL
 #define DEBUG_MACINFO_SECTION_LABEL     "Ldebug_macinfo"
 #endif
+#ifndef DEBUG_GNU_MACRO_SECTION_LABEL
+#define DEBUG_GNU_MACRO_SECTION_LABEL	"Ldebug_gnu_macro"
+#endif
 
 
 /* Definitions of defaults for formats and names of various special
@@ -4015,6 +4021,8 @@ dwarf_attr_name (unsigned int attr)
       return "DW_AT_GNU_all_call_sites";
     case DW_AT_GNU_all_source_call_sites:
       return "DW_AT_GNU_all_source_call_sites";
+    case DW_AT_GNU_macros:
+      return "DW_AT_GNU_macros";
 
     case DW_AT_GNAT_descriptive_type:
       return "DW_AT_GNAT_descriptive_type";
@@ -20291,6 +20299,15 @@ dwarf2out_define (unsigned int lineno AT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_GNU_MACRO_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_define;
       e.lineno = lineno;
       e.info = xstrdup (buffer);;
@@ -20309,58 +20326,376 @@ dwarf2out_undef (unsigned int lineno ATT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_GNU_MACRO_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_undef;
       e.lineno = lineno;
-      e.info = xstrdup (buffer);;
+      e.info = xstrdup (buffer);
       VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
+/* Routines to manipulate hash table of CUs.  */
+
+static hashval_t
+htab_macinfo_hash (const void *of)
+{
+  const macinfo_entry *const entry =
+    (const macinfo_entry *) of;
+
+  return htab_hash_string (entry->info);
+}
+
+static int
+htab_macinfo_eq (const void *of1, const void *of2)
+{
+  const macinfo_entry *const entry1 = (const macinfo_entry *) of1;
+  const macinfo_entry *const entry2 = (const macinfo_entry *) of2;
+
+  return !strcmp (entry1->info, entry2->info);
+}
+
+/* Output a single .debug_macinfo entry.  */
+
+static void
+output_macinfo_op (macinfo_entry *ref)
+{
+  int file_num;
+  size_t len;
+  struct indirect_string_node *node;
+  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+
+  switch (ref->code)
+    {
+    case DW_MACINFO_start_file:
+      file_num = maybe_emit_file (lookup_filename (ref->info));
+      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
+      dw2_asm_output_data_uleb128 (ref->lineno,
+				   "Included from line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+      break;
+    case DW_MACINFO_end_file:
+      dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+      break;
+    case DW_MACINFO_define:
+    case DW_MACINFO_undef:
+      len = strlen (ref->info) + 1;
+      if (!dwarf_strict
+	  && len > DWARF_OFFSET_SIZE
+	  && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+	  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
+	{
+	  ref->code = ref->code == DW_MACINFO_define
+		      ? DW_GNU_MACRO_define_indirect
+		      : DW_GNU_MACRO_undef_indirect;
+	  output_macinfo_op (ref);
+	  return;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_define
+			   ? "Define macro" : "Undefine macro");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_nstring (ref->info, -1, "The macro");
+      break;
+    case DW_GNU_MACRO_define_indirect:
+    case DW_GNU_MACRO_undef_indirect:
+      node = find_AT_string (ref->info);
+      if (node->form != DW_FORM_strp)
+	{
+	  char label[32];
+	  ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter);
+	  ++dw2_string_counter;
+	  node->label = xstrdup (label);
+	  node->form = DW_FORM_strp;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_GNU_MACRO_define_indirect
+			   ? "Define macro indirect"
+			   : "Undefine macro indirect");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+			     debug_str_section, "The macro: \"%s\"",
+			     ref->info);
+      break;
+    case DW_GNU_MACRO_transparent_include:
+      dw2_asm_output_data (1, ref->code, "Transparent include");
+      ASM_GENERATE_INTERNAL_LABEL (label,
+				   DEBUG_GNU_MACRO_SECTION_LABEL, ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL);
+      break;
+    default:
+      fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
+	       ASM_COMMENT_START, (unsigned long) ref->code);
+      break;
+    }
+}
+
+/* Attempt to make a sequence of define/undef macinfo ops shareable with
+   other compilation unit .debug_macinfo sections.  IDX is the first
+   index of a define/undef, return the number of ops that should be
+   emitted in a comdat .debug_macinfo section and emit
+   a DW_GNU_MACRO_transparent_include entry referencing it.
+   If the define/undef entry should be emitted normally, return 0.  */
+
+static unsigned
+optimize_macinfo_range (unsigned int idx, VEC (macinfo_entry, gc) *files,
+			htab_t *macinfo_htab)
+{
+  macinfo_entry *first, *second, *cur, *inc;
+  char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
+  unsigned char checksum[16];
+  struct md5_ctx ctx;
+  char *grp_name, *tail;
+  const char *base;
+  unsigned int i, count, encoded_filename_len, linebuf_len;
+  void **slot;
+
+  first = VEC_index (macinfo_entry, macinfo_table, idx);
+  second = VEC_index (macinfo_entry, macinfo_table, idx + 1);
+
+  /* Optimize only if there are at least two consecutive define/undef ops,
+     and either all of them are before first DW_MACINFO_start_file
+     with lineno 0 (i.e. predefined macro block), or all of them are
+     in some included header file.  */
+  if (second->code != DW_MACINFO_define && second->code != DW_MACINFO_undef)
+    return 0;
+  if (VEC_empty (macinfo_entry, files))
+    {
+      if (first->lineno != 0 || second->lineno != 0)
+	return 0;
+    }
+  else if (first->lineno == 0)
+    return 0;
+
+  /* Find the last define/undef entry that can be grouped together
+     with first and at the same time compute md5 checksum of their
+     codes, linenumbers and strings.  */
+  md5_init_ctx (&ctx);
+  for (i = idx; VEC_iterate (macinfo_entry, macinfo_table, i, cur); i++)
+    if (cur->code != DW_MACINFO_define && cur->code != DW_MACINFO_undef)
+      break;
+    else if (first->lineno == 0 && cur->lineno != 0)
+      break;
+    else
+      {
+	unsigned char code = cur->code;
+	md5_process_bytes (&code, 1, &ctx);
+	checksum_uleb128 (cur->lineno, &ctx);
+	md5_process_bytes (cur->info, strlen (cur->info) + 1, &ctx);
+      }
+  md5_finish_ctx (&ctx, checksum);
+  count = i - idx;
+
+  /* From the containing include filename (if any) pick up just
+     usable characters from its basename.  */
+  if (first->lineno == 0)
+    base = "";
+  else
+    base = lbasename (VEC_last (macinfo_entry, files)->info);
+  for (encoded_filename_len = 0, i = 0; base[i]; i++)
+    if (ISIDNUM (base[i]) || base[i] == '.')
+      encoded_filename_len++;
+  /* Count . at the end.  */
+  if (encoded_filename_len)
+    encoded_filename_len++;
+
+  sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, first->lineno);
+  linebuf_len = strlen (linebuf);
+
+  /* The group name format is: wmN.[<encoded filename>.]<lineno>.<md5sum>  */
+  grp_name = XNEWVEC (char, 4 + encoded_filename_len + linebuf_len + 1
+		      + 16 * 2 + 1);
+  memcpy (grp_name, DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8.", 4);
+  tail = grp_name + 4;
+  if (encoded_filename_len)
+    {
+      for (i = 0; base[i]; i++)
+	if (ISIDNUM (base[i]) || base[i] == '.')
+	  *tail++ = base[i];
+      *tail++ = '.';
+    }
+  memcpy (tail, linebuf, linebuf_len);
+  tail += linebuf_len;
+  *tail++ = '.';
+  for (i = 0; i < 16; i++)
+    sprintf (tail + i * 2, "%02x", checksum[i] & 0xff);
+
+  /* Construct a macinfo_entry for DW_GNU_MACRO_transparent_include
+     in the empty vector entry before the first define/undef.  */
+  inc = VEC_index (macinfo_entry, macinfo_table, idx - 1);
+  inc->code = DW_GNU_MACRO_transparent_include;
+  inc->lineno = 0;
+  inc->info = grp_name;
+  if (*macinfo_htab == NULL)
+    *macinfo_htab = htab_create (10, htab_macinfo_hash, htab_macinfo_eq, NULL);
+  /* Avoid emitting duplicates.  */
+  slot = htab_find_slot (*macinfo_htab, inc, INSERT);
+  if (*slot != NULL)
+    {
+      free (CONST_CAST (char *, inc->info));
+      inc->code = 0;
+      inc->info = NULL;
+      /* If such an entry has been used before, just emit
+	 a DW_GNU_MACRO_transparent_include op.  */
+      inc = (macinfo_entry *) *slot;
+      output_macinfo_op (inc);
+      /* And clear all macinfo_entry in the range to avoid emitting them
+	 in the second pass.  */
+      for (i = idx;
+	   VEC_iterate (macinfo_entry, macinfo_table, i, cur)
+	   && i < idx + count;
+	   i++)
+	{
+	  cur->code = 0;
+	  free (CONST_CAST (char *, cur->info));
+	  cur->info = NULL;
+	}
+    }
+  else
+    {
+      *slot = inc;
+      inc->lineno = htab_elements (*macinfo_htab);
+      output_macinfo_op (inc);
+    }
+  return count;
+}
+
+/* Output macinfo section(s).  */
+
 static void
 output_macinfo (void)
 {
   unsigned i;
   unsigned long length = VEC_length (macinfo_entry, macinfo_table);
   macinfo_entry *ref;
+  VEC (macinfo_entry, gc) *files = NULL;
+  htab_t macinfo_htab = NULL;
 
   if (! length)
     return;
 
+  /* output_macinfo* uses these interchangeably.  */
+  gcc_assert ((int) DW_MACINFO_define == (int) DW_GNU_MACRO_define
+	      && (int) DW_MACINFO_undef == (int) DW_GNU_MACRO_undef
+	      && (int) DW_MACINFO_start_file == (int) DW_GNU_MACRO_start_file
+	      && (int) DW_MACINFO_end_file == (int) DW_GNU_MACRO_end_file);
+
+  /* For .debug_gnu_macro emit the section header.  */
+  if (!dwarf_strict)
+    {
+      dw2_asm_output_data (2, 4, "DWARF GNU macro version number");
+      dw2_asm_output_data (1, DWARF_OFFSET_SIZE, "Offset size");
+    }
+
+  /* In the first loop, it emits the primary .debug_macinfo section
+     and after each emitted op the macinfo_entry is cleared.
+     If a longer range of define/undef ops can be optimized using
+     DW_GNU_MACRO_transparent_include, the
+     DW_GNU_MACRO_transparent_include op is emitted and kept in
+     the vector before the first define/undef in the range and the
+     whole range of define/undef ops is not emitted and kept.  */
   for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
     {
       switch (ref->code)
 	{
-	  case DW_MACINFO_start_file:
+	case DW_MACINFO_start_file:
+	  VEC_safe_push (macinfo_entry, gc, files, ref);
+	  break;
+	case DW_MACINFO_end_file:
+	  if (!VEC_empty (macinfo_entry, files))
 	    {
-	      int file_num = maybe_emit_file (lookup_filename (ref->info));
-	      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-	      dw2_asm_output_data_uleb128 
-			(ref->lineno, "Included from line number %lu", 
-			 			(unsigned long)ref->lineno);
-	      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+	      macinfo_entry *file = VEC_last (macinfo_entry, files);
+	      free (CONST_CAST (char *, file->info));
+	      VEC_pop (macinfo_entry, files);
 	    }
-	    break;
-	  case DW_MACINFO_end_file:
-	    dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
-	    break;
-	  case DW_MACINFO_define:
-	    dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  case DW_MACINFO_undef:
-	    dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  default:
-	   fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
-	     ASM_COMMENT_START, (unsigned long)ref->code);
+	  break;
+	case DW_MACINFO_define:
+	case DW_MACINFO_undef:
+	  if (!dwarf_strict
+	      && HAVE_COMDAT_GROUP
+	      && VEC_length (macinfo_entry, files) != 1
+	      && i > 0
+	      && i + 1 < length
+	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
+	    {
+	      unsigned count = optimize_macinfo_range (i, files, &macinfo_htab);
+	      if (count)
+		{
+		  i += count - 1;
+		  continue;
+		}
+	    }
+	  break;
+	case 0:
+	  /* A dummy entry may be inserted at the beginning to be able
+	     to optimize the whole block of predefined macros.  */
+	  if (i == 0)
+	    continue;
+	default:
 	  break;
 	}
+      output_macinfo_op (ref);
+      /* For DW_MACINFO_start_file ref->info has been copied into files
+	 vector.  */
+      if (ref->code != DW_MACINFO_start_file)
+	free (CONST_CAST (char *, ref->info));
+      ref->info = NULL;
+      ref->code = 0;
     }
+
+  if (macinfo_htab == NULL)
+    return;
+
+  htab_delete (macinfo_htab);
+
+  /* If any DW_GNU_MACRO_transparent_include were used, on those
+     DW_GNU_MACRO_transparent_include entries terminate the
+     current chain and switch to a new comdat .debug_macinfo
+     section and emit the define/undef entries within it.  */
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    switch (ref->code)
+      {
+      case 0:
+	continue;
+      case DW_GNU_MACRO_transparent_include:
+	{
+	  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+	  tree comdat_key = get_identifier (ref->info);
+	  /* Terminate the previous .debug_macinfo section.  */
+	  dw2_asm_output_data (1, 0, "End compilation unit");
+	  targetm.asm_out.named_section (DEBUG_GNU_MACRO_SECTION,
+					 SECTION_DEBUG
+					 | SECTION_LINKONCE,
+					 comdat_key);
+	  ASM_GENERATE_INTERNAL_LABEL (label,
+				       DEBUG_GNU_MACRO_SECTION_LABEL,
+				       ref->lineno);
+	  ASM_OUTPUT_LABEL (asm_out_file, label);
+	  ref->code = 0;
+	  free (CONST_CAST (char *, ref->info));
+	  ref->info = NULL;
+	}
+	break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+	output_macinfo_op (ref);
+	ref->code = 0;
+	free (CONST_CAST (char *, ref->info));
+	ref->info = NULL;
+	break;
+      default:
+	gcc_unreachable ();
+      }
 }
 
 /* Set up for Dwarf output at the start of compilation.  */
@@ -20409,7 +20744,9 @@ dwarf2out_init (const char *filename ATT
 				      SECTION_DEBUG, NULL);
   debug_aranges_section = get_section (DEBUG_ARANGES_SECTION,
 				       SECTION_DEBUG, NULL);
-  debug_macinfo_section = get_section (DEBUG_MACINFO_SECTION,
+  debug_macinfo_section = get_section (dwarf_strict
+				       ? DEBUG_MACINFO_SECTION
+				       : DEBUG_GNU_MACRO_SECTION,
 				       SECTION_DEBUG, NULL);
   debug_line_section = get_section (DEBUG_LINE_SECTION,
 				    SECTION_DEBUG, NULL);
@@ -20441,7 +20778,9 @@ dwarf2out_init (const char *filename ATT
   ASM_GENERATE_INTERNAL_LABEL (ranges_section_label,
 			       DEBUG_RANGES_SECTION_LABEL, 0);
   ASM_GENERATE_INTERNAL_LABEL (macinfo_section_label,
-			       DEBUG_MACINFO_SECTION_LABEL, 0);
+			       dwarf_strict
+			       ? DEBUG_MACINFO_SECTION_LABEL
+			       : DEBUG_GNU_MACRO_SECTION_LABEL, 0);
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     macinfo_table = VEC_alloc (macinfo_entry, gc, 64);
@@ -21984,7 +22323,9 @@ dwarf2out_finish (const char *filename)
 		    debug_line_section_label);
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
-    add_AT_macptr (comp_unit_die (), DW_AT_macro_info, macinfo_section_label);
+    add_AT_macptr (comp_unit_die (),
+		   dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros,
+		   macinfo_section_label);
 
   if (have_location_lists)
     optimize_location_lists (comp_unit_die ());
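
The comdat group name built in optimize_macinfo_range above,
wmN.[<encoded filename>.]<lineno>.<md5sum>, can be illustrated in isolation.
This is only a sketch, not code from the patch: build_group_name and its
parameters are hypothetical, the MD5 digest is passed in rather than computed,
and isalnum plus '_' stands in for GCC's ISIDNUM.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the patch): write a group name of the form
   wmN.[<encoded filename>.]<lineno>.<md5sum> into OUT.
   OFFSET_SIZE is 4 or 8, BASE is the header basename ("" for the
   predefined macro block), CHECKSUM is a 16-byte MD5 digest.
   Returns 0 on success, -1 if OUT is too small.  */
int
build_group_name (char *out, size_t out_size, int offset_size,
                  const char *base, unsigned long lineno,
                  const unsigned char checksum[16])
{
  size_t pos, i;
  int n, have_name = 0;

  n = snprintf (out, out_size, "wm%d.", offset_size);
  if (n < 0 || (size_t) n >= out_size)
    return -1;
  pos = (size_t) n;

  /* Keep only identifier characters and '.' from the basename.  */
  for (i = 0; base[i]; i++)
    if (isalnum ((unsigned char) base[i]) || base[i] == '_' || base[i] == '.')
      {
        if (pos + 1 >= out_size)
          return -1;
        out[pos++] = base[i];
        have_name = 1;
      }
  /* The separating '.' is emitted only if a filename part exists.  */
  if (have_name)
    {
      if (pos + 1 >= out_size)
        return -1;
      out[pos++] = '.';
    }

  n = snprintf (out + pos, out_size - pos, "%lu.", lineno);
  if (n < 0 || (size_t) n >= out_size - pos)
    return -1;
  pos += (size_t) n;

  /* 32 hex digits plus the terminating NUL.  */
  if (pos + 32 + 1 > out_size)
    return -1;
  for (i = 0; i < 16; i++)
    sprintf (out + pos + i * 2, "%02x", checksum[i]);
  return 0;
}
```

Because the name contains the checksum of the codes, line numbers and strings,
two CUs that see the same define/undef run from the same header line end up
with the same group name, which is what lets the linker keep a single copy.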


* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-21 11:38         ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Jakub Jelinek
@ 2011-07-21 17:25           ` Richard Henderson
  2011-07-21 18:13             ` Jakub Jelinek
                               ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Richard Henderson @ 2011-07-21 17:25 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

On 07/21/2011 04:22 AM, Jakub Jelinek wrote:
> Currently, the patch emits 3 byte section headers at the start of
> the .debug_gnu_macro chunks referenced from .debug_info (through
> DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
> 1 byte section offset, but the DW_GNU_MACRO_transparent_include
> referenced sequences don't have it.
> The .debug_gnu_macro section isn't completely usable without the referencing
> CUs anyway, so IMHO we could still get away completely without
> any section header, but if we need it, the question is if
> the offset size there is useful and if the section header shouldn't
> go before the transparent_include chains as well (only with that
> e.g. readelf -wm would be able to dump .debug_gnu_macro without
> reading .debug_info and tracking offsets to it).

I've been wondering if the header shouldn't contain the opcode
definitions, similar to .debug_line, and drop your _define_opcode.
It does mean that you couldn't re-define opcodes within any one
sequence, but does that actually seem useful?

Defining the opcodes in the header makes it clear that there 
should be a header for the include sequences, and that makes it
clear that the defined opcodes are local to a given sequence,
without having to have awkward wording as for _define_opcode.

I do like mjw's idea of using the version number to distinguish
our implementation and one with the dwarf5 stamp of approval.
This suggests going ahead with .debug_macro as the section name.

> In x86_64 cc1plus for which I've been posting figures, I see
> 395 CUs referencing .debug_gnu_macro and at most 511 different
> .debug_gnu_macro chains with unique md5sums.  So, the cost of the
> 3 byte headers is for cc1plus just in CU referenced chunks
> 1185 bytes, 3 byte headers in all .debug_gnu_macro chunks
> 2718 bytes.

Putting the opcode definitions into the header would increase
the overhead more, somewhere between 12 and 20 bytes per chain.
Which is, I think still manageable.
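
The figures quoted above can be checked with a little arithmetic.  The sketch
below assumes that 2718 bytes corresponds to 395 CU chunks plus 511 include
chains (906 headers of 3 bytes each), and it reads "12 to 20 bytes per chain"
as the full per-chunk header size applied to all 906 chunks.  Both readings
are assumptions, and header_overhead is a made-up helper name.

```c
#include <stdio.h>

/* Counts quoted in the thread for x86_64 cc1plus.  */
enum { CU_CHUNKS = 395, INCLUDE_CHAINS = 511 };

/* Total header overhead in bytes for CHUNKS chunks, each carrying a
   header of BYTES_PER_HEADER bytes.  */
unsigned long
header_overhead (unsigned long chunks, unsigned long bytes_per_header)
{
  return chunks * bytes_per_header;
}

/* Print the 3-byte figures from the thread next to the 12..20 byte range.  */
void
print_overheads (void)
{
  unsigned long all = CU_CHUNKS + INCLUDE_CHAINS;

  printf ("3-byte headers, CU chunks only: %lu\n",
          header_overhead (CU_CHUNKS, 3));
  printf ("3-byte headers, all chunks:     %lu\n", header_overhead (all, 3));
  printf ("12..20 byte headers, all:       %lu..%lu\n",
          header_overhead (all, 12), header_overhead (all, 20));
}
```

Under those assumptions the header overhead for cc1plus would grow from
2718 bytes to somewhere between 10872 and 18120 bytes.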

> Also, should the decision whether to emit .debug_gnu_macro or .debug_macinfo
> depend on -gdwarf-strict, or should we have a separate switch for that?

I'm fine with strict.  Anyone else have an opinion?


r~


* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-21 17:25           ` Richard Henderson
@ 2011-07-21 18:13             ` Jakub Jelinek
  2011-07-22 13:49             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
  2011-07-22 20:33             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Michael Eager
  2 siblings, 0 replies; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-21 18:13 UTC (permalink / raw)
  To: Richard Henderson
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard, dwarf-discuss

On Thu, Jul 21, 2011 at 10:10:39AM -0700, Richard Henderson wrote:
> On 07/21/2011 04:22 AM, Jakub Jelinek wrote:
> > Currently, the patch emits 3 byte section headers at the start of
> > the .debug_gnu_macro chunks referenced from .debug_info (through
> > DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
> > 1 byte section offset, but the DW_GNU_MACRO_transparent_include
> > referenced sequences don't have it.
> > The .debug_gnu_macro section isn't completely usable without the referencing
> > CUs anyway, so IMHO we could still get away completely without
> > any section header, but if we need it, the question is if
> > the offset size there is useful and if the section header shouldn't
> > go before the transparent_include chains as well (only with that
> > e.g. readelf -wm would be able to dump .debug_gnu_macro without
> > reading .debug_info and tracking offsets to it).
> 
> I've been wondering if the header shouldn't contain the opcode
> definitions, similar to .debug_line, and drop your _define_opcode.
> It does mean that you couldn't re-define opcodes within any one
> sequence, but does that actually seem useful?

I talked to Tom about it last night.  The advantage of
not having it in the header is saving 1 byte in the case when
no extension opcodes need to be defined.  Also, perhaps if we changed
the wording so that opcode definitions stay in effect past the 0
terminator of the sequence that defines them, then many sections
could share the opcode argument descriptions.
So we could have
DW_GNU_MACRO_transparent_include .Ldebug_macro17
after the header of many sections and then
.Ldebug_macro17:
DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+0 1 DW_FORM_udata
DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+1 2 DW_FORM_udata DW_FORM_sdata
DW_GNU_MACRO_define_opcode DW_GNU_MACRO_lo_user+2 1 DW_FORM_strp
0

If the opcode definitions were in the header, they could either
follow a uleb128 giving the number of definitions, each definition
being just what I've been proposing as the DW_GNU_MACRO_define_opcode
arguments (i.e. opcode number and DW_FORM_block array of forms
for arguments), or they could come without the uleb128 and be zero
terminated instead.

> Defining the opcodes in the header makes it clear that there 
> should be a header for the include sequences, and that makes it
> clear that the defined opcodes are local to a given sequence,
> without having to have awkward wording as for _define_opcode.
> 
> I do like mjw's idea of using the version number to distinguish
> our implementation and one with the dwarf5 stamp of approval.
> This suggests going ahead with .debug_macro as the section name.

If we knew that DWARF5 would start the .debug_macro sections
with a header beginning with a 2 byte version, and that the version there
would be 5 (I think if it does start with a 2 byte version number, it would
use 5), then perhaps it would be safe to use the .debug_macro section with
version 4 (or 1?).  Shall we use DW_GNU_MACRO_* names, or DW_MACRO_GNU_* names?

> > In x86_64 cc1plus for which I've been posting figures, I see
> > 395 CUs referencing .debug_gnu_macro and at most 511 different
> > .debug_gnu_macro chains with unique md5sums.  So, the cost of the
> > 3 byte headers is for cc1plus just in CU referenced chunks
> > 1185 bytes, 3 byte headers in all .debug_gnu_macro chunks
> > 2718 bytes.
> 
> Putting the opcode definitions into the header would increase
> the overhead more, somewhere between 12 and 20 bytes per chain.
> Which is, I think still manageable.

The question is, do we want to always describe all the opcodes we use,
or can we assume the ops described in the corresponding standard as
given?  Say if DWARF 5 (and our version 4) defines 8 standard opcodes,
and DWARF 6 adds another 3, and we want to use the new opcodes, with
-gdwarf-5 -gno-strict-dwarf we'd define the opcode arguments for the
3 DWARF 6 ops (or a subset of them that we actually use), while for
-gdwarf-6 we wouldn't define any and just put version 6 into the section.

> > Also, should the decision whether to emit .debug_gnu_macro or .debug_macinfo
> > depend on -gdwarf-strict, or should we have a separate switch for that?
> 
> I'm fine with strict.  Anyone else have an opinion?

Ok.

	Jakub


* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5)
  2011-07-21 17:25           ` Richard Henderson
  2011-07-21 18:13             ` Jakub Jelinek
@ 2011-07-22 13:49             ` Jakub Jelinek
  2011-07-22 15:34               ` Tom Tromey
  2011-07-22 17:24               ` Richard Henderson
  2011-07-22 20:33             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Michael Eager
  2 siblings, 2 replies; 25+ messages in thread
From: Jakub Jelinek @ 2011-07-22 13:49 UTC (permalink / raw)
  To: Richard Henderson
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

On Thu, Jul 21, 2011 at 10:10:39AM -0700, Richard Henderson wrote:
> On 07/21/2011 04:22 AM, Jakub Jelinek wrote:
> > Currently, the patch emits 3 byte section headers at the start of
> > the .debug_gnu_macro chunks referenced from .debug_info (through
> > DW_AT_GNU_macros), containing version number (2 byte, 4 ATM) and
> > 1 byte section offset, but the DW_GNU_MACRO_transparent_include
> > referenced sequences don't have it.
> > The .debug_gnu_macro section isn't completely usable without the referencing
> > CUs anyway, so IMHO we could still get away completely without
> > any section header, but if we need it, the question is if
> > the offset size there is useful and if the section header shouldn't
> > go before the transparent_include chains as well (only with that
> > e.g. readelf -wm would be able to dump .debug_gnu_macro without
> > reading .debug_info and tracking offsets to it).
> 
> I've been wondering if the header shouldn't contain the opcode
> definitions, similar to .debug_line, and drop your _define_opcode.
> It does mean that you couldn't re-define opcodes within any one
> sequence, but does that actually seem useful?

Ok, based on further discussions here, on Dwarf-discuss and on IRC
here is a hopefully final version.

Changes since last version:
1) uses .debug_macro section instead of .debug_gnu_macro
2) GNU_MACRO -> MACRO_GNU
3) define_opcode is gone
4) each .debug_macro part has a header, 3+ byte long
   2 byte version (4)
   1 byte flags
	(flag & 1) ? "64-bit DWARF format" : "32-bit DWARF format"
	(flag & 2) ? "lineptr present" : "lineptr missing"
	(flag & 4) ? "defopcode table present" : "defopcode table missing"
  [{4,8} byte lineptr offset (if flag & 2)]
  [1-byte count; count x { 1-byte opcode; uleb128 len; len x { 1-byte DW_FORM_* } } (if flag & 4)]
5) for the defopcode the standard lists DW_FORM_* values that are allowed,
   which rules out e.g. DW_FORM_addr (which would require us to know address size),
   DW_FORM_flag_present (which is meaningless in that case), etc.
Currently GCC emits the base macro sequences with the lineptr included, while
the chains referenced from transparent_include have no lineptr and
no DW_MACRO_GNU_*_file ops in them.  Perhaps we could gain some more savings
by also creating a transparent_include chain from everything after a depth > 1
start_file up to (and including) the corresponding end_file, if it is at least a
few ops long, probably while still using transparent_includes for the
consecutive define/undef ranges.  The reason for the define/undef only
sequences is now primarily include guards in the headers (the differing
.debug_line reason is gone now that the .debug_line reference can be put
in the header).  By keeping it two level we could have bigger
savings if in several CUs include guards didn't make a difference or the
header inclusion is the same, but still not lose completely if include
guards make a difference.  Anyway, the proposed format allows adding this at
any time if it results in significant savings.
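
As a rough illustration of the header layout in 4) above, a consumer could
decode it along these lines.  This is only a sketch under stated assumptions:
struct macro_header and parse_macro_header are made-up names, multi-byte
fields are assumed to be little-endian, and the forms in the opcode table are
skipped rather than recorded.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed view of one .debug_macro header.  */
struct macro_header
{
  uint16_t version;
  unsigned offset_size;   /* 4 or 8, from flag bit 0 */
  int has_lineptr;        /* flag bit 1 */
  uint64_t lineptr;       /* .debug_line offset, valid if has_lineptr */
  unsigned opcode_count;  /* entries in the opcode table (flag bit 2) */
  size_t header_len;      /* bytes consumed by the header */
};

/* Read a little-endian value of SIZE bytes (assumption: LE target).  */
static uint64_t
read_le (const unsigned char *p, unsigned size)
{
  uint64_t v = 0;
  unsigned i;
  for (i = 0; i < size; i++)
    v |= (uint64_t) p[i] << (8 * i);
  return v;
}

/* Decode a ULEB128 at BUF[*POS]; 0 on success, -1 if input is truncated.  */
static int
read_uleb128 (const unsigned char *buf, size_t len, size_t *pos, uint64_t *out)
{
  uint64_t result = 0;
  unsigned shift = 0;
  while (*pos < len)
    {
      unsigned char byte = buf[(*pos)++];
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
        {
          *out = result;
          return 0;
        }
    }
  return -1;
}

/* Parse the header at the start of BUF.  Returns 0 on success, -1 on
   truncated input.  */
int
parse_macro_header (const unsigned char *buf, size_t len,
                    struct macro_header *h)
{
  size_t pos = 3;
  unsigned char flags;
  unsigned i;

  if (len < 3)
    return -1;
  h->version = (uint16_t) read_le (buf, 2);
  flags = buf[2];
  h->offset_size = (flags & 1) ? 8 : 4;
  h->has_lineptr = (flags & 2) != 0;
  h->lineptr = 0;
  h->opcode_count = 0;
  if (h->has_lineptr)
    {
      if (len - pos < h->offset_size)
        return -1;
      h->lineptr = read_le (buf + pos, h->offset_size);
      pos += h->offset_size;
    }
  if (flags & 4)
    {
      if (pos >= len)
        return -1;
      h->opcode_count = buf[pos++];
      for (i = 0; i < h->opcode_count; i++)
        {
          uint64_t nforms;
          if (pos >= len)
            return -1;
          pos++;                        /* 1-byte opcode number */
          if (read_uleb128 (buf, len, &pos, &nforms) != 0)
            return -1;
          if (nforms > len - pos)
            return -1;
          pos += (size_t) nforms;       /* one DW_FORM_* byte per operand */
        }
    }
  h->header_len = pos;
  return 0;
}
```

The first macro op of the chunk then starts at header_len bytes past the
start of the header.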

DWARF edits:
============================
2.2
Change "Macro information (#define, #undef)" to "Legacy macro information (#define, #undef)".
Add
"DW_AT_macros		Macro information (#define, #undef)"
row to the figure 2.

3.1.1
5. Replace "DW_AT_macro_info" with "DW_AT_macros".  Replace ".debug_macinfo"
with ".debug_macro".  Add at the end of the paragraph
"The DW_AT_macro_info attribute instead might refer to the .debug_macinfo
section as defined in DWARF version 4."

6.3
Replace ".debug_macinfo" with ".debug_macro".
Replace:
"The macro information for each compilation unit is represented as a series
of “macinfo” entries. Each macinfo entry consists of a “type code”
and up to two additional operands."
with:
"The macro information for each compilation unit starts with a section
header followed by a series of “macinfo” entries. Each macinfo entry
consists of a “type code” and zero or more operands."
Add at the end:
"The section header starts with a 2-byte version number, followed by
a 1-byte flags value.  If the least significant bit (bit 0) in the flags is
clear, this is a 32-bit DWARF format macro section and offsets are 4 bytes long;
if it is set, it is a 64-bit DWARF format macro section and offsets are 8 bytes long.
If the second least significant bit (bit 1) in the flags is set,
the flags byte is followed by the offset in the .debug_line section of the
beginning of the line number information, encoded as a 4 byte offset for
a 32-bit DWARF format macro section and an 8 byte offset for a 64-bit DWARF
format macro section.  If the third least significant bit (bit 2) in the flags
is set, this is followed by a table describing the arguments of the macinfo
entry types.  The macinfo entry types defined in this standard may, but need
not, be described in the table; other macinfo entry types used in the section
should be described there.  Vendor extension macinfo entry types should be
allocated in the range from DW_MACRO_lo_user to DW_MACRO_hi_user, other
unassigned codes are reserved for future DWARF standards.
The table starts with a 1-byte count of the defined opcodes, followed by
an entry for each of those opcodes.  Each entry starts with a 1-byte
opcode number, followed by unsigned LEB128 encoded number of arguments
and for each argument there is a single byte describing the form in which
the argument is encoded.  The allowed values are DW_FORM_data1,
DW_FORM_data2, DW_FORM_data4, DW_FORM_data8, DW_FORM_sdata, DW_FORM_udata,
DW_FORM_block, DW_FORM_block1, DW_FORM_block2, DW_FORM_block4, DW_FORM_flag,
DW_FORM_string, DW_FORM_strp and DW_FORM_sec_offset.
This table allows a consumer to skip over unknown macinfo entry types."
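
To show how the form list lets a consumer step over an operand it does not
understand, here is a minimal sketch.  The DW_FORM_* values are those of
DWARF 4; skip_operand and SKIP_ERROR are hypothetical names, and fixed-size
block lengths are assumed to be little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/* DW_FORM_* codes as defined in DWARF 4, limited to the forms the
   proposed text allows in the opcode table.  */
enum
{
  DW_FORM_block2 = 0x03, DW_FORM_block4 = 0x04, DW_FORM_data2 = 0x05,
  DW_FORM_data4 = 0x06, DW_FORM_data8 = 0x07, DW_FORM_string = 0x08,
  DW_FORM_block = 0x09, DW_FORM_block1 = 0x0a, DW_FORM_data1 = 0x0b,
  DW_FORM_flag = 0x0c, DW_FORM_sdata = 0x0d, DW_FORM_strp = 0x0e,
  DW_FORM_udata = 0x0f, DW_FORM_sec_offset = 0x17
};

#define SKIP_ERROR ((size_t) -1)

/* Skip one operand encoded as FORM starting at BUF[POS].  OFFSET_SIZE is
   4 or 8 as given by the header flags.  Returns the position just past
   the operand, or SKIP_ERROR for a disallowed form or truncated data.  */
size_t
skip_operand (const unsigned char *buf, size_t len, size_t pos,
              unsigned form, unsigned offset_size)
{
  uint64_t n = 0;      /* payload bytes still to skip */
  unsigned width = 0;  /* size of a fixed-width block length prefix */
  unsigned shift = 0, i;

  if (pos > len)
    return SKIP_ERROR;
  switch (form)
    {
    case DW_FORM_data1: case DW_FORM_flag: n = 1; break;
    case DW_FORM_data2: n = 2; break;
    case DW_FORM_data4: n = 4; break;
    case DW_FORM_data8: n = 8; break;
    case DW_FORM_strp: case DW_FORM_sec_offset: n = offset_size; break;
    case DW_FORM_sdata: case DW_FORM_udata:
      /* LEB128: the last byte has its high bit clear.  */
      while (pos < len)
        if ((buf[pos++] & 0x80) == 0)
          return pos;
      return SKIP_ERROR;
    case DW_FORM_string:
      /* Inline NUL-terminated string.  */
      while (pos < len)
        if (buf[pos++] == 0)
          return pos;
      return SKIP_ERROR;
    case DW_FORM_block1: width = 1; break;
    case DW_FORM_block2: width = 2; break;
    case DW_FORM_block4: width = 4; break;
    case DW_FORM_block:
      /* ULEB128 length prefix.  */
      for (;;)
        {
          unsigned char byte;
          if (pos >= len)
            return SKIP_ERROR;
          byte = buf[pos++];
          if (shift < 64)
            n |= (uint64_t) (byte & 0x7f) << shift;
          shift += 7;
          if ((byte & 0x80) == 0)
            break;
        }
      break;
    default:
      return SKIP_ERROR;
    }
  if (width)
    {
      if (len - pos < width)
        return SKIP_ERROR;
      for (i = 0; i < width; i++)
        n |= (uint64_t) buf[pos + i] << (8 * i);
      pos += width;
    }
  if (n > len - pos)
    return SKIP_ERROR;
  return pos + (size_t) n;
}
```

A consumer meeting an unknown opcode would look up its form list in the
header table and call such a routine once per listed form.  Forms outside the
allowed list, such as DW_FORM_addr, are rejected because their size cannot be
known from the macro section alone.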

6.3.1
Replace:
"DW_MACINFO_define	A macro definition.
DW_MACINFO_undef	A macro undefinition.
DW_MACINFO_start_file	The start of a new source file inclusion.
DW_MACINFO_end_file	The end of the current source file inclusion.
DW_MACINFO_vendor_ext	Vendor specific macro information directives."
with
"DW_MACRO_define		A macro definition.
DW_MACRO_undef			A macro undefinition.
DW_MACRO_start_file		The start of a new source file inclusion.
DW_MACRO_end_file		The end of the current source file inclusion.
DW_MACRO_define_indirect	A macro definition.
DW_MACRO_undef_indirect		A macro undefinition.
DW_MACRO_transparent_include	Include a sequence of entries from given offset."

6.3.1.1
Replace all "DW_MACINFO_define" occurrences with "DW_MACRO_define" and
all "DW_MACINFO_undef" with "DW_MACRO_undef".  Append:
"All DW_MACRO_define_indirect and DW_MACRO_undef_indirect entries have
two operands.  The first operand encodes the line number of the source line
on which the relevant defining or undefining macro directives appeared.
The second operand consists of an offset into a string table contained in
the .debug_str section of the object file.  The size of the operand is
determined by the DWARF format flag (bit 0) in the section header flags.
Apart from the encoding of the operands these entries are equivalent to
DW_MACRO_define resp. DW_MACRO_undef."

6.3.1.2
Replace all "DW_MACINFO_start_file" occurrences with "DW_MACRO_start_file".

Append: "If any DW_MACRO_start_file entries are present, the header should
contain a reference to the .debug_line section."

6.3.1.3
Replace all "DW_MACINFO_end_file" occurrences with "DW_MACRO_end_file".

6.3.1.4
Replace the whole section with:
"6.3.1.4  Transparent inclusion of a sequence of entries

A DW_MACRO_transparent_include entry has one operand, an offset into
another part of the .debug_macro section.  The size of the operand
is determined by the DWARF format flag (bit 0) in the section header
flags.  This entry instructs
the consumer to replace this entry with a sequence of macinfo entries found
after the section header at the given .debug_macro offset, up to, but excluding,
the terminating entry with type code 0.  This entry type is aimed at sharing duplicate
sequences of macinfo entries between macinfo from different compilation
units."
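
The substitution rule for DW_MACRO_transparent_include can be sketched as a
recursive walk.  This is an illustration only: count_macro_entries is a
made-up name, offsets are assumed to be little-endian, headers carrying an
opcode table are rejected to keep the example short, and the depth limit is
an arbitrary guard against cyclic includes that the proposed text does not
mention.

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode values from the proposed figure 39.  */
enum
{
  MACRO_define = 1, MACRO_undef = 2, MACRO_start_file = 3,
  MACRO_end_file = 4, MACRO_define_indirect = 5,
  MACRO_undef_indirect = 6, MACRO_transparent_include = 7
};

/* Read a little-endian value of SIZE bytes (assumption: LE target).  */
static uint64_t
read_le (const unsigned char *p, unsigned size)
{
  uint64_t v = 0;
  unsigned i;
  for (i = 0; i < size; i++)
    v |= (uint64_t) p[i] << (8 * i);
  return v;
}

/* Step over one ULEB128; 0 on success, -1 on truncation.  */
static int
skip_uleb128 (const unsigned char *sec, size_t len, size_t *pos)
{
  while (*pos < len)
    if ((sec[(*pos)++] & 0x80) == 0)
      return 0;
  return -1;
}

/* Count define/undef entries in the sequence whose header starts at
   SEC[OFF], expanding transparent includes in place.  Returns the count,
   or -1 on malformed input.  */
long
count_macro_entries (const unsigned char *sec, size_t len, size_t off,
                     int depth)
{
  long total = 0;
  unsigned offset_size;
  unsigned char flags;
  size_t pos;

  if (depth > 16 || off > len || len - off < 3)
    return -1;
  flags = sec[off + 2];
  if (flags & 4)
    return -1;                  /* opcode table not handled in this sketch */
  offset_size = (flags & 1) ? 8 : 4;
  pos = off + 3;
  if (flags & 2)
    {
      if (len - pos < offset_size)
        return -1;
      pos += offset_size;       /* skip the .debug_line offset */
    }
  while (pos < len)
    {
      unsigned char op = sec[pos++];
      switch (op)
        {
        case 0:
          return total;         /* terminator ends this sequence */
        case MACRO_define: case MACRO_undef:
          if (skip_uleb128 (sec, len, &pos) != 0)
            return -1;
          while (pos < len && sec[pos] != 0)
            pos++;
          if (pos >= len)
            return -1;
          pos++;                /* the string's NUL */
          total++;
          break;
        case MACRO_define_indirect: case MACRO_undef_indirect:
          if (skip_uleb128 (sec, len, &pos) != 0 || len - pos < offset_size)
            return -1;
          pos += offset_size;   /* .debug_str offset */
          total++;
          break;
        case MACRO_start_file:
          if (skip_uleb128 (sec, len, &pos) != 0
              || skip_uleb128 (sec, len, &pos) != 0)
            return -1;
          break;
        case MACRO_end_file:
          break;
        case MACRO_transparent_include:
          {
            uint64_t target;
            long sub;
            if (len - pos < offset_size)
              return -1;
            target = read_le (sec + pos, offset_size);
            pos += offset_size;
            if (target > len)
              return -1;
            sub = count_macro_entries (sec, len, (size_t) target, depth + 1);
            if (sub < 0)
              return -1;
            total += sub;
          }
          break;
        default:
          return -1;
        }
    }
  return -1;                    /* ran off the end without a terminator */
}
```

The included sequence is processed exactly as if its entries had appeared in
place of the DW_MACRO_transparent_include entry, and processing resumes with
the entry after it.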

6.3.2
Replace "DW_MACINFO_start_file" with "DW_MACRO_start_file" and
"DW_MACINFO_end_file" with "DW_MACRO_end_file".

6.3.3
Replace "DW_MACINFO_define and DW_MACINFO_undef" with
"DW_MACRO_define, DW_MACRO_define_indirect, DW_MACRO_undef and
DW_MACRO_undef_indirect" and replace
"DW_MACINFO_define or DW_MACINFO_undef" with
"DW_MACRO_define, DW_MACRO_define_indirect, DW_MACRO_undef or
DW_MACRO_undef_indirect".
Replace "DW_MACINFO_start_file" with "DW_MACRO_start_file".

6.3.4
Replace ".debug_macinfo" with ".debug_macro".

7.5.4
Replace ".debug_macinfo section to the first byte" with
".debug_macro section to the header".

Add:
"DW_AT_macros	0xXX	macptr"
to figure 20.

7.22
Remove:
"as are the constants in a DW_MACINFO_vendor_ext entry".

Replace figure 39 with:
"Macinfo Type Name		Value
DW_MACRO_define			0x01
DW_MACRO_undef			0x02
DW_MACRO_start_file		0x03
DW_MACRO_end_file		0x04
DW_MACRO_define_indirect	0x05
DW_MACRO_undef_indirect		0x06
DW_MACRO_transparent_include	0x07
DW_MACRO_lo_user		0xe0
DW_MACRO_hi_user		0xff

Figure 39. Macinfo Type Encodings"

Appendix A
Add "DW_AT_macros" to allowable attributes for "DW_TAG_compile_unit"
and "DW_TAG_partial_unit".

Appendix B
Replace "DW_AT_macro_info" with "DW_AT_macros".  Replace
".debug_macinfo" with ".debug_macro".  Add
( .debug_macro ) -> [ DW_MACRO_transparent_include (j) ] -> ( .debug_macro ) and
( .debug_macro ) -> [ DW_MACRO_define_indirect, DW_MACRO_undef_indirect (k) ] -> ( .debug_str )
( .debug_macro ) -> [ DW_MACRO_start_file (l) ] -> ( .debug_line )
to the picture.
In (h) replace ".debug_macinfo" with ".debug_macro".
Add:
"(j) .debug_macro A macinfo operand of the form DW_FORM_sec_offset is an
		 offset into another part of the .debug_macro section,
		 to the section header that precedes the first macinfo
		 entry of the referenced sequence.
(k) .debug_macro A macinfo operand of the form DW_FORM_strp is an offset
		 into the .debug_str section of the corresponding string.
(l) .debug_macro DW_MACRO_start_file second operand refers to a file entry
		 in the .debug_line section, with the .debug_macro header
		 containing an offset to the start of the referenced
		 .debug_line section."

Appendix F
Change
".debug_macinfo	-	-	-"
to
".debug_macinfo	-	-	-	x"
Add
".debug_macro	x	x	x	5"
row to the table.
============================

Patch has been bootstrapped/regtested on x86_64-linux and i686-linux.
Ok for trunk?

2011-07-22  Jakub Jelinek  <jakub@redhat.com>

	* dwarf2.h (DW_AT_GNU_macros): New.
	(enum dwarf_macro_record_type): New enum.  Add DW_MACRO_GNU_*.

	* dwarf2out.c (struct macinfo_struct): Change code to unsigned char.
	(DEBUG_MACRO_SECTION, DEBUG_MACRO_SECTION_LABEL): Define.
	(dwarf_attr_name): Handle DW_AT_GNU_macros.
	(dwarf2out_define): If the vector is empty and
	lineno is 0, emit a dummy entry first.
	(dwarf2out_undef): Likewise.  Remove redundant semicolon.
	(htab_macinfo_hash, htab_macinfo_eq, output_macinfo_op,
	optimize_macinfo_range): New functions.
	(output_macinfo): Use them.  If !dwarf_strict and .debug_str is
	mergeable, optimize longer strings using
	DW_MACRO_GNU_{define,undef}_indirect and if HAVE_COMDAT_GROUP,
	optimize longer sequences of define/undef ops from headers
	using DW_MACRO_GNU_transparent_include.  For !dwarf_strict
	emit section headers.
	(dwarf2out_init): For !dwarf_strict set debug_macinfo_section
	and macinfo_section_label to DEBUG_MACRO_SECTION
	resp. DEBUG_MACRO_SECTION_LABEL.
	(dwarf2out_finish): For !dwarf_strict emit DW_AT_GNU_macros
	instead of DW_AT_macro_info.

--- include/dwarf2.h.jj	2011-07-15 20:46:32.000000000 +0200
+++ include/dwarf2.h	2011-07-22 09:06:47.000000000 +0200
@@ -366,6 +366,8 @@ enum dwarf_attribute
     DW_AT_GNU_all_tail_call_sites = 0x2116,
     DW_AT_GNU_all_call_sites = 0x2117,
     DW_AT_GNU_all_source_call_sites = 0x2118,
+    /* Section offset into .debug_macro section.  */
+    DW_AT_GNU_macros = 0x2119,
     /* VMS extensions.  */
     DW_AT_VMS_rtnbeg_pd_address = 0x2201,
     /* GNAT extensions.  */
@@ -879,6 +881,20 @@ enum dwarf_macinfo_record_type
     DW_MACINFO_end_file = 4,
     DW_MACINFO_vendor_ext = 255
   };
+
+/* Names and codes for new style macro information.  */
+enum dwarf_macro_record_type
+  {
+    DW_MACRO_GNU_define = 1,
+    DW_MACRO_GNU_undef = 2,
+    DW_MACRO_GNU_start_file = 3,
+    DW_MACRO_GNU_end_file = 4,
+    DW_MACRO_GNU_define_indirect = 5,
+    DW_MACRO_GNU_undef_indirect = 6,
+    DW_MACRO_GNU_transparent_include = 7,
+    DW_MACRO_GNU_lo_user = 0xe0,
+    DW_MACRO_GNU_hi_user = 0xff
+  };
 \f
 /* @@@ For use with GNU frame unwind information.  */
 
--- gcc/dwarf2out.c.jj	2011-07-21 09:54:49.000000000 +0200
+++ gcc/dwarf2out.c	2011-07-22 09:18:49.000000000 +0200
@@ -2770,7 +2770,7 @@ struct GTY(()) dw_ranges_struct {
 /* A structure to hold a macinfo entry.  */
 
 typedef struct GTY(()) macinfo_struct {
-  unsigned HOST_WIDE_INT code;
+  unsigned char code;
   unsigned HOST_WIDE_INT lineno;
   const char *info;
 }
@@ -3417,6 +3417,9 @@ static void gen_scheduled_generic_parms_
 #ifndef DEBUG_MACINFO_SECTION
 #define DEBUG_MACINFO_SECTION	".debug_macinfo"
 #endif
+#ifndef DEBUG_MACRO_SECTION
+#define DEBUG_MACRO_SECTION	".debug_macro"
+#endif
 #ifndef DEBUG_LINE_SECTION
 #define DEBUG_LINE_SECTION	".debug_line"
 #endif
@@ -3474,6 +3477,9 @@ static void gen_scheduled_generic_parms_
 #ifndef DEBUG_MACINFO_SECTION_LABEL
 #define DEBUG_MACINFO_SECTION_LABEL     "Ldebug_macinfo"
 #endif
+#ifndef DEBUG_MACRO_SECTION_LABEL
+#define DEBUG_MACRO_SECTION_LABEL	"Ldebug_macro"
+#endif
 
 
 /* Definitions of defaults for formats and names of various special
@@ -4015,6 +4021,8 @@ dwarf_attr_name (unsigned int attr)
       return "DW_AT_GNU_all_call_sites";
     case DW_AT_GNU_all_source_call_sites:
       return "DW_AT_GNU_all_source_call_sites";
+    case DW_AT_GNU_macros:
+      return "DW_AT_GNU_macros";
 
     case DW_AT_GNAT_descriptive_type:
       return "DW_AT_GNAT_descriptive_type";
@@ -20291,6 +20299,15 @@ dwarf2out_define (unsigned int lineno AT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_MACRO_GNU_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_define;
       e.lineno = lineno;
       e.info = xstrdup (buffer);;
@@ -20309,58 +20326,386 @@ dwarf2out_undef (unsigned int lineno ATT
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
+      /* Insert a dummy first entry to be able to optimize the whole
+	 predefined macro block using DW_MACRO_GNU_transparent_include.  */
+      if (VEC_empty (macinfo_entry, macinfo_table) && lineno == 0)
+	{
+	  e.code = 0;
+	  e.lineno = 0;
+	  e.info = NULL;
+	  VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
+	}
       e.code = DW_MACINFO_undef;
       e.lineno = lineno;
-      e.info = xstrdup (buffer);;
+      e.info = xstrdup (buffer);
       VEC_safe_push (macinfo_entry, gc, macinfo_table, &e);
     }
 }
 
+/* Routines to manipulate the hash table of macinfo entries.  */
+
+static hashval_t
+htab_macinfo_hash (const void *of)
+{
+  const macinfo_entry *const entry =
+    (const macinfo_entry *) of;
+
+  return htab_hash_string (entry->info);
+}
+
+static int
+htab_macinfo_eq (const void *of1, const void *of2)
+{
+  const macinfo_entry *const entry1 = (const macinfo_entry *) of1;
+  const macinfo_entry *const entry2 = (const macinfo_entry *) of2;
+
+  return !strcmp (entry1->info, entry2->info);
+}
+
+/* Output a single .debug_macinfo entry.  */
+
+static void
+output_macinfo_op (macinfo_entry *ref)
+{
+  int file_num;
+  size_t len;
+  struct indirect_string_node *node;
+  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+
+  switch (ref->code)
+    {
+    case DW_MACINFO_start_file:
+      file_num = maybe_emit_file (lookup_filename (ref->info));
+      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
+      dw2_asm_output_data_uleb128 (ref->lineno,
+				   "Included from line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+      break;
+    case DW_MACINFO_end_file:
+      dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
+      break;
+    case DW_MACINFO_define:
+    case DW_MACINFO_undef:
+      len = strlen (ref->info) + 1;
+      if (!dwarf_strict
+	  && len > DWARF_OFFSET_SIZE
+	  && !DWARF2_INDIRECT_STRING_SUPPORT_MISSING_ON_TARGET
+	  && (debug_str_section->common.flags & SECTION_MERGE) != 0)
+	{
+	  ref->code = ref->code == DW_MACINFO_define
+		      ? DW_MACRO_GNU_define_indirect
+		      : DW_MACRO_GNU_undef_indirect;
+	  output_macinfo_op (ref);
+	  return;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACINFO_define
+			   ? "Define macro" : "Undefine macro");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_nstring (ref->info, -1, "The macro");
+      break;
+    case DW_MACRO_GNU_define_indirect:
+    case DW_MACRO_GNU_undef_indirect:
+      node = find_AT_string (ref->info);
+      if (node->form != DW_FORM_strp)
+	{
+	  char label[32];
+	  ASM_GENERATE_INTERNAL_LABEL (label, "LASF", dw2_string_counter);
+	  ++dw2_string_counter;
+	  node->label = xstrdup (label);
+	  node->form = DW_FORM_strp;
+	}
+      dw2_asm_output_data (1, ref->code,
+			   ref->code == DW_MACRO_GNU_define_indirect
+			   ? "Define macro indirect"
+			   : "Undefine macro indirect");
+      dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
+				   (unsigned long) ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, node->label,
+			     debug_str_section, "The macro: \"%s\"",
+			     ref->info);
+      break;
+    case DW_MACRO_GNU_transparent_include:
+      dw2_asm_output_data (1, ref->code, "Transparent include");
+      ASM_GENERATE_INTERNAL_LABEL (label,
+				   DEBUG_MACRO_SECTION_LABEL, ref->lineno);
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, label, NULL, NULL);
+      break;
+    default:
+      fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
+	       ASM_COMMENT_START, (unsigned long) ref->code);
+      break;
+    }
+}
+
+/* Attempt to make a sequence of define/undef macinfo ops shareable with
+   other compilation units' .debug_macinfo sections.  IDX is the index of
+   the first define/undef.  Return the number of ops that should be
+   emitted in a comdat .debug_macinfo section, after emitting
+   a DW_MACRO_GNU_transparent_include entry referencing it.
+   If the define/undef entry should be emitted normally, return 0.  */
+
+static unsigned
+optimize_macinfo_range (unsigned int idx, VEC (macinfo_entry, gc) *files,
+			htab_t *macinfo_htab)
+{
+  macinfo_entry *first, *second, *cur, *inc;
+  char linebuf[sizeof (HOST_WIDE_INT) * 3 + 1];
+  unsigned char checksum[16];
+  struct md5_ctx ctx;
+  char *grp_name, *tail;
+  const char *base;
+  unsigned int i, count, encoded_filename_len, linebuf_len;
+  void **slot;
+
+  first = VEC_index (macinfo_entry, macinfo_table, idx);
+  second = VEC_index (macinfo_entry, macinfo_table, idx + 1);
+
+  /* Optimize only if there are at least two consecutive define/undef ops,
+     and either all of them are before first DW_MACINFO_start_file
+     with lineno 0 (i.e. predefined macro block), or all of them are
+     in some included header file.  */
+  if (second->code != DW_MACINFO_define && second->code != DW_MACINFO_undef)
+    return 0;
+  if (VEC_empty (macinfo_entry, files))
+    {
+      if (first->lineno != 0 || second->lineno != 0)
+	return 0;
+    }
+  else if (first->lineno == 0)
+    return 0;
+
+  /* Find the last define/undef entry that can be grouped together
+     with first and at the same time compute md5 checksum of their
+     codes, linenumbers and strings.  */
+  md5_init_ctx (&ctx);
+  for (i = idx; VEC_iterate (macinfo_entry, macinfo_table, i, cur); i++)
+    if (cur->code != DW_MACINFO_define && cur->code != DW_MACINFO_undef)
+      break;
+    else if (first->lineno == 0 && cur->lineno != 0)
+      break;
+    else
+      {
+	unsigned char code = cur->code;
+	md5_process_bytes (&code, 1, &ctx);
+	checksum_uleb128 (cur->lineno, &ctx);
+	md5_process_bytes (cur->info, strlen (cur->info) + 1, &ctx);
+      }
+  md5_finish_ctx (&ctx, checksum);
+  count = i - idx;
+
+  /* From the containing include filename (if any) pick up just
+     usable characters from its basename.  */
+  if (first->lineno == 0)
+    base = "";
+  else
+    base = lbasename (VEC_last (macinfo_entry, files)->info);
+  for (encoded_filename_len = 0, i = 0; base[i]; i++)
+    if (ISIDNUM (base[i]) || base[i] == '.')
+      encoded_filename_len++;
+  /* Count . at the end.  */
+  if (encoded_filename_len)
+    encoded_filename_len++;
+
+  sprintf (linebuf, HOST_WIDE_INT_PRINT_UNSIGNED, first->lineno);
+  linebuf_len = strlen (linebuf);
+
+  /* The group name format is: wmN.[<encoded filename>.]<lineno>.<md5sum>  */
+  grp_name = XNEWVEC (char, 4 + encoded_filename_len + linebuf_len + 1
+		      + 16 * 2 + 1);
+  memcpy (grp_name, DWARF_OFFSET_SIZE == 4 ? "wm4." : "wm8.", 4);
+  tail = grp_name + 4;
+  if (encoded_filename_len)
+    {
+      for (i = 0; base[i]; i++)
+	if (ISIDNUM (base[i]) || base[i] == '.')
+	  *tail++ = base[i];
+      *tail++ = '.';
+    }
+  memcpy (tail, linebuf, linebuf_len);
+  tail += linebuf_len;
+  *tail++ = '.';
+  for (i = 0; i < 16; i++)
+    sprintf (tail + i * 2, "%02x", checksum[i] & 0xff);
+
+  /* Construct a macinfo_entry for DW_MACRO_GNU_transparent_include
+     in the empty vector entry before the first define/undef.  */
+  inc = VEC_index (macinfo_entry, macinfo_table, idx - 1);
+  inc->code = DW_MACRO_GNU_transparent_include;
+  inc->lineno = 0;
+  inc->info = grp_name;
+  if (*macinfo_htab == NULL)
+    *macinfo_htab = htab_create (10, htab_macinfo_hash, htab_macinfo_eq, NULL);
+  /* Avoid emitting duplicates.  */
+  slot = htab_find_slot (*macinfo_htab, inc, INSERT);
+  if (*slot != NULL)
+    {
+      free (CONST_CAST (char *, inc->info));
+      inc->code = 0;
+      inc->info = NULL;
+      /* If such an entry has been used before, just emit
+	 a DW_MACRO_GNU_transparent_include op.  */
+      inc = (macinfo_entry *) *slot;
+      output_macinfo_op (inc);
+      /* And clear all macinfo_entry in the range to avoid emitting them
+	 in the second pass.  */
+      for (i = idx;
+	   VEC_iterate (macinfo_entry, macinfo_table, i, cur)
+	   && i < idx + count;
+	   i++)
+	{
+	  cur->code = 0;
+	  free (CONST_CAST (char *, cur->info));
+	  cur->info = NULL;
+	}
+    }
+  else
+    {
+      *slot = inc;
+      inc->lineno = htab_elements (*macinfo_htab);
+      output_macinfo_op (inc);
+    }
+  return count;
+}
+
+/* Output macinfo section(s).  */
+
 static void
 output_macinfo (void)
 {
   unsigned i;
   unsigned long length = VEC_length (macinfo_entry, macinfo_table);
   macinfo_entry *ref;
+  VEC (macinfo_entry, gc) *files = NULL;
+  htab_t macinfo_htab = NULL;
 
   if (! length)
     return;
 
+  /* output_macinfo* uses these interchangeably.  */
+  gcc_assert ((int) DW_MACINFO_define == (int) DW_MACRO_GNU_define
+	      && (int) DW_MACINFO_undef == (int) DW_MACRO_GNU_undef
+	      && (int) DW_MACINFO_start_file == (int) DW_MACRO_GNU_start_file
+	      && (int) DW_MACINFO_end_file == (int) DW_MACRO_GNU_end_file);
+
+  /* For .debug_macro emit the section header.  */
+  if (!dwarf_strict)
+    {
+      dw2_asm_output_data (2, 4, "DWARF macro version number");
+      if (DWARF_OFFSET_SIZE == 8)
+	dw2_asm_output_data (1, 3, "Flags: 64-bit, lineptr present");
+      else
+	dw2_asm_output_data (1, 2, "Flags: 32-bit, lineptr present");
+      dw2_asm_output_offset (DWARF_OFFSET_SIZE, debug_line_section_label,
+			     debug_line_section, NULL);
+    }
+
+  /* The first loop emits the primary .debug_macinfo section and
+     clears each macinfo_entry after emitting its op.
+     If a longer range of define/undef ops can be optimized using
+     DW_MACRO_GNU_transparent_include, that op is emitted and kept
+     in the vector slot before the first define/undef in the range;
+     the define/undef ops themselves are not emitted but are kept
+     in the vector for the second loop.  */
   for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
     {
       switch (ref->code)
 	{
-	  case DW_MACINFO_start_file:
+	case DW_MACINFO_start_file:
+	  VEC_safe_push (macinfo_entry, gc, files, ref);
+	  break;
+	case DW_MACINFO_end_file:
+	  if (!VEC_empty (macinfo_entry, files))
 	    {
-	      int file_num = maybe_emit_file (lookup_filename (ref->info));
-	      dw2_asm_output_data (1, DW_MACINFO_start_file, "Start new file");
-	      dw2_asm_output_data_uleb128 
-			(ref->lineno, "Included from line number %lu", 
-			 			(unsigned long)ref->lineno);
-	      dw2_asm_output_data_uleb128 (file_num, "file %s", ref->info);
+	      macinfo_entry *file = VEC_last (macinfo_entry, files);
+	      free (CONST_CAST (char *, file->info));
+	      VEC_pop (macinfo_entry, files);
 	    }
-	    break;
-	  case DW_MACINFO_end_file:
-	    dw2_asm_output_data (1, DW_MACINFO_end_file, "End file");
-	    break;
-	  case DW_MACINFO_define:
-	    dw2_asm_output_data (1, DW_MACINFO_define, "Define macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu", 
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  case DW_MACINFO_undef:
-	    dw2_asm_output_data (1, DW_MACINFO_undef, "Undefine macro");
-	    dw2_asm_output_data_uleb128 (ref->lineno, "At line number %lu",
-			 			(unsigned long)ref->lineno);
-	    dw2_asm_output_nstring (ref->info, -1, "The macro");
-	    break;
-	  default:
-	   fprintf (asm_out_file, "%s unrecognized macinfo code %lu\n",
-	     ASM_COMMENT_START, (unsigned long)ref->code);
+	  break;
+	case DW_MACINFO_define:
+	case DW_MACINFO_undef:
+	  if (!dwarf_strict
+	      && HAVE_COMDAT_GROUP
+	      && VEC_length (macinfo_entry, files) != 1
+	      && i > 0
+	      && i + 1 < length
+	      && VEC_index (macinfo_entry, macinfo_table, i - 1)->code == 0)
+	    {
+	      unsigned count = optimize_macinfo_range (i, files, &macinfo_htab);
+	      if (count)
+		{
+		  i += count - 1;
+		  continue;
+		}
+	    }
+	  break;
+	case 0:
+	  /* A dummy entry may be inserted at the beginning to be able
+	     to optimize the whole block of predefined macros.  */
+	  if (i == 0)
+	    continue;
+	default:
 	  break;
 	}
+      output_macinfo_op (ref);
+      /* For DW_MACINFO_start_file ref->info has been copied into files
+	 vector.  */
+      if (ref->code != DW_MACINFO_start_file)
+	free (CONST_CAST (char *, ref->info));
+      ref->info = NULL;
+      ref->code = 0;
     }
+
+  if (macinfo_htab == NULL)
+    return;
+
+  htab_delete (macinfo_htab);
+
+  /* If any DW_MACRO_GNU_transparent_include ops were used, then at
+     each such entry terminate the current chain, switch to a new
+     comdat .debug_macinfo section and emit the define/undef
+     entries within it.  */
+  for (i = 0; VEC_iterate (macinfo_entry, macinfo_table, i, ref); i++)
+    switch (ref->code)
+      {
+      case 0:
+	continue;
+      case DW_MACRO_GNU_transparent_include:
+	{
+	  char label[MAX_ARTIFICIAL_LABEL_BYTES];
+	  tree comdat_key = get_identifier (ref->info);
+	  /* Terminate the previous .debug_macinfo section.  */
+	  dw2_asm_output_data (1, 0, "End compilation unit");
+	  targetm.asm_out.named_section (DEBUG_MACRO_SECTION,
+					 SECTION_DEBUG
+					 | SECTION_LINKONCE,
+					 comdat_key);
+	  ASM_GENERATE_INTERNAL_LABEL (label,
+				       DEBUG_MACRO_SECTION_LABEL,
+				       ref->lineno);
+	  ASM_OUTPUT_LABEL (asm_out_file, label);
+	  ref->code = 0;
+	  free (CONST_CAST (char *, ref->info));
+	  ref->info = NULL;
+	  dw2_asm_output_data (2, 4, "DWARF macro version number");
+	  if (DWARF_OFFSET_SIZE == 8)
+	    dw2_asm_output_data (1, 1, "Flags: 64-bit");
+	  else
+	    dw2_asm_output_data (1, 0, "Flags: 32-bit");
+	}
+	break;
+      case DW_MACINFO_define:
+      case DW_MACINFO_undef:
+	output_macinfo_op (ref);
+	ref->code = 0;
+	free (CONST_CAST (char *, ref->info));
+	ref->info = NULL;
+	break;
+      default:
+	gcc_unreachable ();
+      }
 }
 
 /* Set up for Dwarf output at the start of compilation.  */
@@ -20409,7 +20754,9 @@ dwarf2out_init (const char *filename ATT
 				      SECTION_DEBUG, NULL);
   debug_aranges_section = get_section (DEBUG_ARANGES_SECTION,
 				       SECTION_DEBUG, NULL);
-  debug_macinfo_section = get_section (DEBUG_MACINFO_SECTION,
+  debug_macinfo_section = get_section (dwarf_strict
+				       ? DEBUG_MACINFO_SECTION
+				       : DEBUG_MACRO_SECTION,
 				       SECTION_DEBUG, NULL);
   debug_line_section = get_section (DEBUG_LINE_SECTION,
 				    SECTION_DEBUG, NULL);
@@ -20441,7 +20788,9 @@ dwarf2out_init (const char *filename ATT
   ASM_GENERATE_INTERNAL_LABEL (ranges_section_label,
 			       DEBUG_RANGES_SECTION_LABEL, 0);
   ASM_GENERATE_INTERNAL_LABEL (macinfo_section_label,
-			       DEBUG_MACINFO_SECTION_LABEL, 0);
+			       dwarf_strict
+			       ? DEBUG_MACINFO_SECTION_LABEL
+			       : DEBUG_MACRO_SECTION_LABEL, 0);
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     macinfo_table = VEC_alloc (macinfo_entry, gc, 64);
@@ -21984,7 +22333,9 @@ dwarf2out_finish (const char *filename)
 		    debug_line_section_label);
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
-    add_AT_macptr (comp_unit_die (), DW_AT_macro_info, macinfo_section_label);
+    add_AT_macptr (comp_unit_die (),
+		   dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros,
+		   macinfo_section_label);
 
   if (have_location_lists)
     optimize_location_lists (comp_unit_die ());
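
The comdat group name built in optimize_macinfo_range above has the form
wmN.[<encoded filename>.]<lineno>.<md5sum>.  A standalone sketch of just
that naming step follows.  It is an illustration, not code from the patch:
make_group_name and is_idnum are hypothetical names, is_idnum only
approximates GCC's ISIDNUM (alphanumerics plus '_'), and the 16-byte
checksum is passed in rather than computed with MD5.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for GCC's ISIDNUM: identifier characters.  */
static int
is_idnum (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Build "wmN.[<encoded filename>.]<lineno>.<md5sum>".
   BASE is the header's basename, or "" for the predefined macro block.
   Returns a malloc'ed string the caller must free, or NULL.  */
char *
make_group_name (int offset_size, const char *base, unsigned long lineno,
                 const unsigned char checksum[16])
{
  char linebuf[sizeof (unsigned long) * 3 + 1];
  size_t enc_len = 0, linebuf_len, i;
  char *name, *tail;

  /* Keep only identifier characters and '.' from the basename.  */
  for (i = 0; base[i]; i++)
    if (is_idnum (base[i]) || base[i] == '.')
      enc_len++;
  /* Count the '.' that separates the filename from the line number.  */
  if (enc_len)
    enc_len++;

  linebuf_len = (size_t) sprintf (linebuf, "%lu", lineno);

  /* "wmN." + filename part + lineno + '.' + 32 hex digits + NUL.  */
  name = malloc (4 + enc_len + linebuf_len + 1 + 16 * 2 + 1);
  if (name == NULL)
    return NULL;

  memcpy (name, offset_size == 4 ? "wm4." : "wm8.", 4);
  tail = name + 4;
  if (enc_len)
    {
      for (i = 0; base[i]; i++)
        if (is_idnum (base[i]) || base[i] == '.')
          *tail++ = base[i];
      *tail++ = '.';
    }
  memcpy (tail, linebuf, linebuf_len);
  tail += linebuf_len;
  *tail++ = '.';
  /* The last sprintf also writes the terminating NUL.  */
  for (i = 0; i < 16; i++)
    sprintf (tail + i * 2, "%02x", (unsigned) checksum[i]);
  return name;
}
```

Because the name depends only on the offset size, the filtered basename,
the first line number and the checksum of the ops, two compilation units
that see the same define/undef sequence produce the same comdat key and
the linker can keep a single copy.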


	Jakub
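
For consumers, the section header written by output_macinfo in this patch
is a 2-byte version (4), a 1-byte flags field (bit 0: 64-bit offsets,
bit 1: a .debug_line offset follows), and then the optional offset.
A minimal decoding sketch, assuming little-endian data and ignoring any
other flag bits; struct macro_header, read_le and parse_macro_header are
hypothetical names, not gdb or binutils APIs:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of the header emitted by output_macinfo.  */
struct macro_header
{
  uint16_t version;
  int offset_size;            /* 4 or 8 bytes */
  int has_lineptr;            /* nonzero if a .debug_line offset follows */
  uint64_t debug_line_offset; /* valid only if has_lineptr */
  size_t header_len;          /* bytes consumed by the header */
};

/* Read an N-byte little-endian value (assumption: little-endian target).  */
static uint64_t
read_le (const uint8_t *p, int n)
{
  uint64_t v = 0;
  int i;

  for (i = n - 1; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Return 0 on success, -1 if the data is truncated or the version is
   not the one this sketch knows about.  */
int
parse_macro_header (const uint8_t *p, size_t len, struct macro_header *h)
{
  uint8_t flags;

  if (len < 3)
    return -1;
  h->version = (uint16_t) read_le (p, 2);
  if (h->version != 4)
    return -1;
  flags = p[2];
  h->offset_size = (flags & 1) ? 8 : 4;
  h->has_lineptr = (flags & 2) != 0;
  h->debug_line_offset = 0;
  h->header_len = 3;
  if (h->has_lineptr)
    {
      if (len < 3 + (size_t) h->offset_size)
        return -1;
      h->debug_line_offset = read_le (p + 3, h->offset_size);
      h->header_len += (size_t) h->offset_size;
    }
  return 0;
}
```

With this layout the primary section's header is 7 or 11 bytes, while the
comdat sections, which omit the .debug_line offset, have a 3-byte header.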

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5)
  2011-07-22 13:49             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
@ 2011-07-22 15:34               ` Tom Tromey
  2011-07-22 17:24               ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Tom Tromey @ 2011-07-22 15:34 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Richard Henderson, gcc-patches, Jason Merrill, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

>>>>> "Jakub" == Jakub Jelinek <jakub@redhat.com> writes:

Jakub> Ok, based on further discussions here, on Dwarf-discuss and on IRC
Jakub> here is a hopefully final version.

I've updated the gdb patch.

Tom

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5)
  2011-07-22 13:49             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
  2011-07-22 15:34               ` Tom Tromey
@ 2011-07-22 17:24               ` Richard Henderson
  1 sibling, 0 replies; 25+ messages in thread
From: Richard Henderson @ 2011-07-22 17:24 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: gcc-patches, Jason Merrill, Tom Tromey, Jan Kratochvil,
	Cary Coutant, Mark Wielaard

On 07/22/2011 06:20 AM, Jakub Jelinek wrote:
> Ok, based on further discussions here, on Dwarf-discuss and on IRC
> here is a hopefully final version.

Ok by me.  The dwarf edits look good too.


r~

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-21 17:25           ` Richard Henderson
  2011-07-21 18:13             ` Jakub Jelinek
  2011-07-22 13:49             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
@ 2011-07-22 20:33             ` Michael Eager
  2011-07-22 21:50               ` Richard Henderson
  2 siblings, 1 reply; 25+ messages in thread
From: Michael Eager @ 2011-07-22 20:33 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
	Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/21/2011 10:10 AM, Richard Henderson wrote:

> I've been wondering if the header shouldn't contain the opcode
> definitions, similar to .debug_line, and drop your _define_opcode.
> It does mean that you couldn't re-define opcodes within any one
> sequence, but does that actually seem useful?

The definition of opcodes in the line number table is different from
opcodes in other tables, including a modified macro table.  There
are many opcodes (essentially every possible value is used) and the
specific meaning of the opcodes may be different for different targets.

It seems unlikely that different targets would have different meanings
for the macro opcodes.

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 20:33             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Michael Eager
@ 2011-07-22 21:50               ` Richard Henderson
  2011-07-22 21:51                 ` Michael Eager
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-22 21:50 UTC (permalink / raw)
  To: Michael Eager
  Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
	Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 12:54 PM, Michael Eager wrote:
> The definition of opcodes in the line number table is different from
> opcodes in other tables, including a modified macro table.  There
> are many opcodes (essentially every possible value is used) and the
> specific meaning of the opcodes may be different for different targets.

I'm referring to the standard_opcode_lengths section of the .debug_line
header here.  We're trying to do something similar for the .debug_macro
section.


r~

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 21:50               ` Richard Henderson
@ 2011-07-22 21:51                 ` Michael Eager
  2011-07-22 22:10                   ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Eager @ 2011-07-22 21:51 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
	Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 02:08 PM, Richard Henderson wrote:
> On 07/22/2011 12:54 PM, Michael Eager wrote:
>> The definition of opcodes in the line number table is different from
>> opcodes in other tables, including a modified macro table.  There
>> are many opcodes (essentially every possible value is used) and the
>> specific meaning of the opcodes may be different for different targets.
>
> I'm referring to the standard_opcode_lengths section of the .debug_line
> header here.  We're trying to do something similar for the .debug_macro
> section.

There doesn't seem to be any need.  standard_opcode_lengths is only needed
because the opcode meanings can vary for different targets.

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 21:51                 ` Michael Eager
@ 2011-07-22 22:10                   ` Richard Henderson
  2011-07-23  0:32                     ` Michael Eager
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-22 22:10 UTC (permalink / raw)
  To: Michael Eager
  Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
	Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 02:16 PM, Michael Eager wrote:
> On 07/22/2011 02:08 PM, Richard Henderson wrote:
>> On 07/22/2011 12:54 PM, Michael Eager wrote:
>>> The definition of opcodes in the line number table is different from
>>> opcodes in other tables, including a modified macro table.  There
>>> are many opcodes (essentially every possible value is used) and the
>>> specific meaning of the opcodes may be different for different targets.
>>
>> I'm referring to the standard_opcode_lengths section of the .debug_line
>> header here.  We're trying to do something similar for the .debug_macro
>> section.
> 
> There doesn't seem to be any need.  standard_opcode_lengths is only needed
> because the opcode meanings can vary for different targets.

I beg your pardon, but no, the meanings of the *standard* opcodes
cannot vary.  Only the special opcode meanings vary.

See 6.2.4 #10:

# By increasing opcode_base, and adding elements to this array,
# new standard opcodes can be added, while allowing consumers who
# do not know about these new opcodes to be able to skip them.
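
To make the .debug_line mechanism concrete: entry i of
standard_opcode_lengths gives the number of LEB128 operands of standard
opcode i + 1, so a consumer can step over an opcode it does not know.
A rough sketch of that skipping logic follows; the function names are
hypothetical, and this shows only the existing line-table scheme, not a
proposed .debug_macro layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Advance past one LEB128 value.  Returns the byte after it, or NULL
   if the value runs past END.  */
static const uint8_t *
skip_leb128 (const uint8_t *p, const uint8_t *end)
{
  while (p < end)
    if ((*p++ & 0x80) == 0)
      return p;
  return NULL;
}

/* Skip the operands of a standard opcode the consumer does not know.
   P points just past the opcode byte.  Returns the start of the next
   opcode, or NULL if OPCODE is not a standard opcode or the data is
   truncated.  */
const uint8_t *
skip_unknown_standard_opcode (uint8_t opcode, uint8_t opcode_base,
                              const uint8_t *standard_opcode_lengths,
                              const uint8_t *p, const uint8_t *end)
{
  uint8_t n;

  /* Opcode 0 introduces extended opcodes; opcode_base and above are
     special opcodes.  Neither is described by the array.  */
  if (opcode == 0 || opcode >= opcode_base)
    return NULL;

  n = standard_opcode_lengths[opcode - 1];
  while (n-- > 0 && p != NULL)
    p = skip_leb128 (p, end);
  return p;
}
```

The consumer never needs to know what the new opcode means, only how
many operands to discard.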


r~

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-22 22:10                   ` Richard Henderson
@ 2011-07-23  0:32                     ` Michael Eager
  2011-07-23  0:36                       ` Richard Henderson
  0 siblings, 1 reply; 25+ messages in thread
From: Michael Eager @ 2011-07-23  0:32 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
	Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 02:20 PM, Richard Henderson wrote:
> On 07/22/2011 02:16 PM, Michael Eager wrote:
>> On 07/22/2011 02:08 PM, Richard Henderson wrote:
>>> On 07/22/2011 12:54 PM, Michael Eager wrote:
>>>> The definition of opcodes in the line number table is different from
>>>> opcodes in other tables, including a modified macro table.  There
>>>> are many opcodes (essentially every possible value is used) and the
>>>> specific meaning of the opcodes may be different for different targets.
>>>
>>> I'm referring to the standard_opcode_lengths section of the .debug_line
>>> header here.  We're trying to do something similar for the .debug_macro
>>> section.
>>
>> There doesn't seem to be any need.  standard_opcode_lengths is only needed
>> because the opcode meanings can vary for different targets.
>
> I beg your pardon, but no, the meanings of the *standard* opcodes
> cannot vary.  Only the special opcode meanings vary.
>
> See 6.2.4 #10:
>
> # By increasing opcode_base, and adding elements to this array,
> # new standard opcodes can be added, while allowing consumers who
> # do not know about these new opcodes to be able to skip them.

Which part of "not needed" did you misunderstand?

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-23  0:32                     ` Michael Eager
@ 2011-07-23  0:36                       ` Richard Henderson
  2011-07-26  7:34                         ` Jason Merrill
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2011-07-23  0:36 UTC (permalink / raw)
  To: Michael Eager
  Cc: Jakub Jelinek, gcc-patches, Jason Merrill, Tom Tromey,
	Jan Kratochvil, Cary Coutant, Mark Wielaard

On 07/22/2011 04:15 PM, Michael Eager wrote:
> On 07/22/2011 02:20 PM, Richard Henderson wrote:
>> On 07/22/2011 02:16 PM, Michael Eager wrote:
>>> On 07/22/2011 02:08 PM, Richard Henderson wrote:
>>>> On 07/22/2011 12:54 PM, Michael Eager wrote:
>>>>> The definition of opcodes in the line number table is different from
>>>>> opcodes in other tables, including a modified macro table.  There
>>>>> are many opcodes (essentially every possible value is used) and the
>>>>> specific meaning of the opcodes may be different for different targets.
>>>>
>>>> I'm referring to the standard_opcode_lengths section of the .debug_line
>>>> header here.  We're trying to do something similar for the .debug_macro
>>>> section.
>>>
>>> There doesn't seem to be any need.  standard_opcode_lengths is only needed
>>> because the opcode meanings can vary for different targets.
>>
>> I beg your pardon, but no, the meanings of the *standard* opcodes
>> cannot vary.  Only the special opcode meanings vary.
>>
>> See 6.2.4 #10:
>>
>> # By increasing opcode_base, and adding elements to this array,
>> # new standard opcodes can be added, while allowing consumers who
>> # do not know about these new opcodes to be able to skip them.
> 
> Which part of "not needed" did you misunderstand?

The part in which "not needed" appears.

I'm afraid I have no idea what you're talking about anymore.


r~

* Re: [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4)
  2011-07-23  0:36                       ` Richard Henderson
@ 2011-07-26  7:34                         ` Jason Merrill
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Merrill @ 2011-07-26  7:34 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Michael Eager, Jakub Jelinek, gcc-patches, Tom Tromey,
	Jan Kratochvil, Cary Coutant, Mark Wielaard

There seems to be some violent agreement going on here.  I think 
everyone agrees that we don't need to define anything about standard 
.debug_macro opcodes in the binary, that they will always mean the same 
thing.  The question was how to establish extended opcodes, whether via 
a define_opcode operation or in the header.

Jason

Thread overview: 25+ messages
2011-07-13 17:12 [RFC] More compact (100x) -g3 .debug_macinfo Jakub Jelinek
2011-07-13 19:59 ` Tom Tromey
2011-07-13 20:37   ` Jakub Jelinek
2011-07-18 15:42     ` Tom Tromey
2011-07-15 15:52 ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Jakub Jelinek
2011-07-15 17:19   ` Richard Henderson
2011-07-15 21:18     ` [RFC] More compact (100x) -g3 .debug_macinfo (take 3) Jakub Jelinek
2011-07-18 15:09       ` Tom Tromey
2011-07-20  1:17       ` Richard Henderson
2011-07-21 11:38         ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Jakub Jelinek
2011-07-21 17:25           ` Richard Henderson
2011-07-21 18:13             ` Jakub Jelinek
2011-07-22 13:49             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 5) Jakub Jelinek
2011-07-22 15:34               ` Tom Tromey
2011-07-22 17:24               ` Richard Henderson
2011-07-22 20:33             ` [RFC] More compact (100x) -g3 .debug_gnu_macro (take 4) Michael Eager
2011-07-22 21:50               ` Richard Henderson
2011-07-22 21:51                 ` Michael Eager
2011-07-22 22:10                   ` Richard Henderson
2011-07-23  0:32                     ` Michael Eager
2011-07-23  0:36                       ` Richard Henderson
2011-07-26  7:34                         ` Jason Merrill
2011-07-15 18:28   ` [RFC] More compact (100x) -g3 .debug_macinfo (take 2) Tom Tromey
2011-07-15 19:21     ` Jakub Jelinek
2011-07-15 19:30       ` Tom Tromey
