public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jason Merrill <jason@redhat.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: gcc@gcc.gnu.org, binutils@sourceware.org,
	       Roland McGrath <mcgrathr@google.com>,
	       Cary Coutant <ccoutant@google.com>,
	Tom Tromey <tromey@redhat.com>,
	       Mark Wielaard <mjw@redhat.com>,
	Nick Clifton <nickc@redhat.com>,
	       Alan Modra <amodra@gmail.com>
Subject: Re: Debug info for comdat functions
Date: Wed, 18 Apr 2012 12:44:00 -0000	[thread overview]
Message-ID: <4F8EB6F9.9080008@redhat.com> (raw)
In-Reply-To: <20120418115314.GO6148@sunsite.ms.mff.cuni.cz>

[-- Attachment #1: Type: text/plain, Size: 2653 bytes --]

On 04/18/2012 07:53 AM, Jakub Jelinek wrote:
> Consider attached testcase with comdat foo function, seems the
> current linker behavior (well, tested with 2.21.53.0.1 ld.bfd)
> is that for DW_TAG_subprogram with DW_AT_low_pc/DW_AT_high_pc
> having section relative relocs against comdat functions
> if the comdat text section has the same size in both object
> files, then DW_AT_low_pc (and DW_AT_high_pc) attributes
> in both CUs will point to the same range.

This seems clearly wrong to me.  A reference to a symbol in a discarded 
section should not resolve to an offset into a different section.  I 
thought the linker always resolved such references to 0, and I think 
that is what we want.

> When discussed on IRC recently Jason preferred to move the DW_TAG_subprogram
> describing a comdat function to a comdat .debug_info DW_TAG_partial_unit
> and just reference all DIEs that need to be referenced from it
> using DW_FORM_ref_addr back to the non-comdat .debug_info.

I played around with implementing this in the compiler yesterday; my 
initial patch is attached.  It seems that with normal DWARF 4 this can 
work well, but I ran into issues with various GNU extensions:

DW_TAG_GNU_call_site wants to refer to the called function's DIE, so the 
function die in the separate unit needs to have its own symbol.  Perhaps 
_call_site could refer to the function symbol instead?  That seems more 
correct anyway, since with COMDAT functions you might end up calling a 
different version of the function that has a different DIE.

The typed stack ops such as DW_OP_GNU_deref_type want to refer to a type 
in the same CU, so we would need to copy any referenced base types into 
the separate function CU.  Could we add variants of these ops that take 
an offset from .debug_info?

> Perhaps put its
> sole .debug_loc contributions into comdat part as well, .debug_ranges maybe
> too.

I haven't done anything with .debug_loc yet.

.debug_ranges mostly goes away with this change; the main CU becomes 
just .text and the separate CUs are just their own function.  I suppose 
.debug_ranges would still be needed with hot/cold optimizations.

> We would have DW_TAG_imported_unit with DW_AT_import
> attribute pointing to the start DW_TAG_partial_unit in the section
> (we would need to hardcode the +11 bytes offset, assuming nobody
> ever emits 64-bit DWARF) and not refer to any other DIEs from the partial
> unit.

I think it would be both better and more correct to have the 
DW_AT_imported_unit going the other way, so the function CU imports the 
main CU.  That's what DWARF4 appendix E suggests.  My patch doesn't 
implement this yet.

Jason

[-- Attachment #2: comdat-fn-debug.patch --]
[-- Type: text/x-patch, Size: 14692 bytes --]

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index abe3f1b..c113c63 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -8612,6 +8612,7 @@ ix86_code_end (void)
 				       NULL_TREE, void_type_node);
       TREE_PUBLIC (decl) = 1;
       TREE_STATIC (decl) = 1;
+      DECL_IGNORED_P (decl) = 1;
 
 #if TARGET_MACHO
       if (TARGET_MACHO)
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 7e2ce58..0c33af2 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -1007,6 +1007,7 @@ dwarf2out_begin_prologue (unsigned int line ATTRIBUTE_UNUSED,
   fde->dw_fde_current_label = dup_label;
   fde->in_std_section = (fnsec == text_section
 			 || (cold_text_section && fnsec == cold_text_section));
+  fde->comdat = DECL_ONE_ONLY (current_function_decl);
 
   /* We only want to output line number information for the genuine dwarf2
      prologue case, not the eh frame case.  */
@@ -3291,8 +3292,10 @@ static void compute_section_prefix (dw_die_ref);
 static int is_type_die (dw_die_ref);
 static int is_comdat_die (dw_die_ref);
 static int is_symbol_die (dw_die_ref);
+static int is_abstract_die (dw_die_ref);
 static void assign_symbol_names (dw_die_ref);
 static void break_out_includes (dw_die_ref);
+static void break_out_comdat_functions (dw_die_ref);
 static int is_declaration_die (dw_die_ref);
 static int should_move_die_to_comdat (dw_die_ref);
 static dw_die_ref clone_as_declaration (dw_die_ref);
@@ -4105,6 +4108,9 @@ dwarf_attr_name (unsigned int attr)
     case DW_AT_GNU_macros:
       return "DW_AT_GNU_macros";
 
+    case DW_AT_GNU_comdat:
+      return "DW_AT_GNU_comdat";
+
     case DW_AT_GNAT_descriptive_type:
       return "DW_AT_GNAT_descriptive_type";
 
@@ -6698,6 +6704,9 @@ is_symbol_die (dw_die_ref c)
 {
   return (is_type_die (c)
 	  || is_declaration_die (c)
+	  || is_abstract_die (c)
+	  /* DW_TAG_GNU_call_site can refer to subprograms.  */
+	  || c->die_tag == DW_TAG_subprogram
 	  || c->die_tag == DW_TAG_namespace
 	  || c->die_tag == DW_TAG_module);
 }
@@ -6728,6 +6737,8 @@ assign_symbol_names (dw_die_ref die)
 
   if (is_symbol_die (die))
     {
+      if (die->die_id.die_symbol)
+	return;
       if (comdat_symbol_id)
 	{
 	  char *p = XALLOCAVEC (char, strlen (comdat_symbol_id) + 64);
@@ -6900,6 +6911,65 @@ break_out_includes (dw_die_ref die)
   htab_delete (cu_hash_table);
 }
 
+static const char *
+function_die_label (const char *comdat_id)
+{
+  char *p = XALLOCAVEC (char, strlen (comdat_id) + 10);
+  sprintf (p, DIE_LABEL_PREFIX ".%s", comdat_id);
+  return xstrdup (p);
+}
+
+/* Traverse the DIE (which is always comp_unit_die), and set up additional
+   compilation units for each of the comdat functions we see.  */
+
+static void
+break_out_comdat_functions (dw_die_ref die)
+{
+  dw_die_ref c;
+  dw_die_ref unit = NULL;
+  limbo_die_node *node;
+  dw_die_ref prev;
+  bool found = false;
+
+  prev = die->die_child;
+  if (prev == NULL)
+    return;
+
+  do
+    {
+      c = prev->die_sib;
+      if (c->die_tag == DW_TAG_subprogram && get_AT (c, DW_AT_GNU_comdat))
+	{
+	  dw_die_ref unit = gen_compile_unit_die (NULL);
+
+	  /* This DIE is for a secondary CU; remove it from the main one.  */
+	  remove_child_with_prev (c, prev);
+	  add_child_die (unit, c);
+	  found = true;
+	}
+      else
+	prev = c;
+    }
+  while (c != die->die_child);
+
+  if (!found)
+    return;
+
+  assign_symbol_names (die);
+  for (node = limbo_die_list;
+       node;
+       node = node->next)
+    {
+      const char *comdat_id;
+      unit = node->die;
+      c = unit->die_child->die_sib;
+      comdat_id = get_AT_string (c, DW_AT_GNU_comdat);
+      remove_AT (c, DW_AT_GNU_comdat);
+      unit->die_id.die_symbol = comdat_id;
+      c->die_id.die_symbol = function_die_label (comdat_id);
+    }
+}
+
 /* Return non-zero if this DIE is a declaration.  */
 
 static int
@@ -6915,6 +6985,37 @@ is_declaration_die (dw_die_ref die)
   return 0;
 }
 
+/* Return non-zero if this DIE is part of an abstract inline.  That is, if
+   it's an inline function or a variable, label or parameter of an inline
+   function.  */
+
+static int
+is_abstract_die (dw_die_ref die)
+{
+  switch (die->die_tag)
+    {
+    case DW_TAG_subprogram:
+    case DW_TAG_variable:
+    case DW_TAG_label:
+    case DW_TAG_formal_parameter:
+      break;
+
+    default:
+      return 0;
+    }
+
+  for (; die->die_tag != DW_TAG_subprogram; die = die->die_parent)
+    {
+      if (die->die_tag == DW_TAG_compile_unit
+	  || die->die_tag == DW_TAG_namespace)
+	return 0;
+    }
+
+  if (get_AT (die, DW_AT_inline))
+    return 1;
+  return 0;
+}
+
 /* Return non-zero if this DIE is nested inside a subprogram.  */
 
 static int
@@ -8560,8 +8661,7 @@ output_compilation_unit_header (void)
 static void
 output_comp_unit (dw_die_ref die, int output_if_empty)
 {
-  const char *secname;
-  char *oldsym, *tmp;
+  char *oldsym;
 
   /* Unless we are outputting main CU, we may throw away empty ones.  */
   if (!output_if_empty && die->die_child == NULL)
@@ -8583,12 +8683,19 @@ output_comp_unit (dw_die_ref die, int output_if_empty)
   oldsym = die->die_id.die_symbol;
   if (oldsym)
     {
-      tmp = XALLOCAVEC (char, strlen (oldsym) + 24);
+#ifdef OBJECT_FORMAT_ELF
+      targetm.asm_out.named_section (".debug_info",
+				     SECTION_DEBUG | SECTION_LINKONCE,
+				     get_identifier (oldsym));
+#else
+      char *tmp = XALLOCAVEC (char, strlen (oldsym) + 24);
+      const char *secname;
 
       sprintf (tmp, ".gnu.linkonce.wi.%s", oldsym);
       secname = tmp;
-      die->die_id.die_symbol = NULL;
       switch_to_section (get_section (secname, SECTION_DEBUG, NULL));
+#endif
+      die->die_id.die_symbol = NULL;
     }
   else
     {
@@ -17298,13 +17405,13 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
   else if (!DECL_EXTERNAL (decl))
     {
       HOST_WIDE_INT cfa_fb_offset;
+      dw_fde_ref fde = cfun->fde;
 
       if (!old_die || !get_AT (old_die, DW_AT_inline))
 	equate_decl_number_to_die (decl, subr_die);
 
       if (!flag_reorder_blocks_and_partition)
 	{
-	  dw_fde_ref fde = cfun->fde;
 	  if (fde->dw_fde_begin)
 	    {
 	      /* We have already generated the labels.  */
@@ -17352,8 +17459,6 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
       else
 	{
 	  /* Generate pubnames entries for the split function code ranges.  */
-	  dw_fde_ref fde = cfun->fde;
-
 	  if (fde->dw_fde_second_begin)
 	    {
 	      if (dwarf_version >= 3 || !dwarf_strict)
@@ -17455,6 +17560,13 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 	    add_AT_loc (subr_die, DW_AT_frame_base, list->expr);
 	}
 
+      if (fde->comdat)
+	{
+	  gcc_assert (context_die == comp_unit_die ());
+	  add_AT_string (subr_die, DW_AT_GNU_comdat,
+			 IDENTIFIER_POINTER (DECL_COMDAT_GROUP (decl)));
+	}
+
       /* Compute a displacement from the "steady-state frame pointer" to
 	 the CFA.  The former is what all stack slots and argument slots
 	 will reference in the rtl; the later is what we've told the
@@ -18152,7 +18264,6 @@ add_high_low_attributes (tree stmt, dw_die_ref die)
 	}
 
       add_AT_range_list (die, DW_AT_ranges, add_ranges (stmt));
-
       chain = BLOCK_FRAGMENT_CHAIN (stmt);
       do
 	{
@@ -22609,6 +22720,8 @@ dwarf2out_finish (const char *filename)
       prune_unused_types ();
     }
 
+  break_out_comdat_functions (comp_unit_die ());
+
   /* Traverse the DIE's and add add sibling attributes to those DIE's
      that have children.  */
   add_sibling_attributes (comp_unit_die ());
@@ -22626,67 +22739,115 @@ dwarf2out_finish (const char *filename)
       targetm.asm_out.internal_label (asm_out_file, COLD_END_LABEL, 0);
     }
 
-  /* We can only use the low/high_pc attributes if all of the code was
-     in .text.  */
-  if (!have_multiple_function_sections 
-      || (dwarf_version < 3 && dwarf_strict))
-    {
-      /* Don't add if the CU has no associated code.  */
-      if (text_section_used)
+  /* Add range information to the main CU.  We can only use the low/high_pc
+     attributes if all of the code was in .text.  */
+  {
+    unsigned fde_idx;
+    dw_fde_ref fde;
+    bool found = false;
+
+    /* Are there any extra function sections that belong to the main CU? */
+    FOR_EACH_VEC_ELT (dw_fde_ref, fde_vec, fde_idx, fde)
+      if (!fde->comdat && !DECL_IGNORED_P (fde->decl)
+	  && (!fde->in_std_section
+	      || (fde->dw_fde_second_begin && !fde->second_in_std_section)))
 	{
-	  add_AT_lbl_id (comp_unit_die (), DW_AT_low_pc, text_section_label);
-	  add_AT_lbl_id (comp_unit_die (), DW_AT_high_pc, text_end_label);
+	  found = true;
+	  break;
 	}
-    }
-  else
+
+    if (!(found || cold_text_section_used)
+	|| (dwarf_version < 3 && dwarf_strict))
+      {
+	/* Don't add if the CU has no associated code.  */
+	if (text_section_used)
+	  {
+	    add_AT_lbl_id (comp_unit_die (), DW_AT_low_pc, text_section_label);
+	    add_AT_lbl_id (comp_unit_die (), DW_AT_high_pc, text_end_label);
+	  }
+      }
+    else
+      {
+	bool range_list_added = false;
+
+	if (text_section_used)
+	  add_ranges_by_labels (comp_unit_die (), text_section_label,
+				text_end_label, &range_list_added);
+	if (cold_text_section_used)
+	  add_ranges_by_labels (comp_unit_die (), cold_text_section_label,
+				cold_end_label, &range_list_added);
+
+	FOR_EACH_VEC_ELT (dw_fde_ref, fde_vec, fde_idx, fde)
+	  {
+	    if (fde->comdat || DECL_IGNORED_P (fde->decl))
+	      continue;
+	    if (!fde->in_std_section)
+	      add_ranges_by_labels (comp_unit_die (), fde->dw_fde_begin,
+				    fde->dw_fde_end, &range_list_added);
+	    if (fde->dw_fde_second_begin && !fde->second_in_std_section)
+	      add_ranges_by_labels (comp_unit_die (), fde->dw_fde_second_begin,
+				    fde->dw_fde_second_end, &range_list_added);
+	  }
+
+	if (range_list_added)
+	  {
+	    /* We need to give .debug_loc and .debug_ranges an appropriate
+	       "base address".  Use zero so that these addresses become
+	       absolute.  Historically, we've emitted the unexpected
+	       DW_AT_entry_pc instead of DW_AT_low_pc for this purpose.
+	       Emit both to give time for other tools to adapt.  */
+	    add_AT_addr (comp_unit_die (), DW_AT_low_pc, const0_rtx);
+	    if (! dwarf_strict && dwarf_version < 4)
+	      add_AT_addr (comp_unit_die (), DW_AT_entry_pc, const0_rtx);
+
+	    add_ranges (NULL);
+	  }
+      }
+  }
+
+  /* Also add ranges to the CUs for comdat functions.  */
+  for (node = limbo_die_list; node; node = node->next)
     {
-      unsigned fde_idx;
-      dw_fde_ref fde;
-      bool range_list_added = false;
-
-      if (text_section_used)
-	add_ranges_by_labels (comp_unit_die (), text_section_label,
-			      text_end_label, &range_list_added);
-      if (cold_text_section_used)
-	add_ranges_by_labels (comp_unit_die (), cold_text_section_label,
-			      cold_end_label, &range_list_added);
-
-      FOR_EACH_VEC_ELT (dw_fde_ref, fde_vec, fde_idx, fde)
-	{
-	  if (!fde->in_std_section)
-	    add_ranges_by_labels (comp_unit_die (), fde->dw_fde_begin,
-				  fde->dw_fde_end, &range_list_added);
-	  if (fde->dw_fde_second_begin && !fde->second_in_std_section)
-	    add_ranges_by_labels (comp_unit_die (), fde->dw_fde_second_begin,
-				  fde->dw_fde_second_end, &range_list_added);
-	}
+      dw_die_ref c = node->die->die_child->die_sib;
+      dw_attr_ref a;
+
+      if (c->die_tag != DW_TAG_subprogram)
+	continue;
 
-      if (range_list_added)
+      if ((a = get_AT (c, DW_AT_ranges)))
+	add_AT_range_list (node->die, DW_AT_ranges,
+			   a->dw_attr_val.v.val_offset);
+      else
 	{
-	  /* We need to give .debug_loc and .debug_ranges an appropriate
-	     "base address".  Use zero so that these addresses become
-	     absolute.  Historically, we've emitted the unexpected
-	     DW_AT_entry_pc instead of DW_AT_low_pc for this purpose.
-	     Emit both to give time for other tools to adapt.  */
-	  add_AT_addr (comp_unit_die (), DW_AT_low_pc, const0_rtx);
-	  if (! dwarf_strict && dwarf_version < 4)
-	    add_AT_addr (comp_unit_die (), DW_AT_entry_pc, const0_rtx);
-
-	  add_ranges (NULL);
+	  add_AT_lbl_id (node->die, DW_AT_low_pc, get_AT_low_pc (c));
+	  add_AT_lbl_id (node->die, DW_AT_high_pc, get_AT_hi_pc (c));
 	}
     }
 
   if (debug_info_level >= DINFO_LEVEL_NORMAL)
-    add_AT_lineptr (comp_unit_die (), DW_AT_stmt_list,
-		    debug_line_section_label);
+    {
+      add_AT_lineptr (comp_unit_die (), DW_AT_stmt_list,
+		      debug_line_section_label);
+      /* ??? comdat line info for comdat functions?  */
+      for (node = limbo_die_list; node; node = node->next)
+	add_AT_lineptr (node->die, DW_AT_stmt_list,
+			debug_line_section_label);
+    }
 
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
-    add_AT_macptr (comp_unit_die (),
-		   dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros,
-		   macinfo_section_label);
+    {
+      add_AT_macptr (comp_unit_die (),
+		     dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros,
+		     macinfo_section_label);
+      for (node = limbo_die_list; node; node = node->next)
+	add_AT_macptr (node->die,
+		       dwarf_strict ? DW_AT_macro_info : DW_AT_GNU_macros,
+		       macinfo_section_label);
+    }
 
   if (have_location_lists)
-    optimize_location_lists (comp_unit_die ());
+    for (node = limbo_die_list; node; node = node->next)
+      optimize_location_lists (node->die);
 
   /* Output all of the compilation units.  We put the main one last so that
      the offsets are available to output_pubnames.  */
@@ -22735,6 +22896,8 @@ dwarf2out_finish (const char *filename)
 				   DEBUG_LOC_SECTION_LABEL, 0);
       ASM_OUTPUT_LABEL (asm_out_file, loc_section_label);
       output_location_lists (comp_unit_die ());
+      for (node = limbo_die_list; node; node = node->next)
+	output_location_lists (node->die);
     }
 
   /* Output public names table if necessary.  */
diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h
index 711e8ab..bee757a 100644
--- a/gcc/dwarf2out.h
+++ b/gcc/dwarf2out.h
@@ -109,6 +109,8 @@ typedef struct GTY(()) dw_fde_struct {
   /* True iff dw_fde_second_begin label is in text_section or
      cold_text_section.  */
   unsigned second_in_std_section : 1;
+  /* True iff the function is part of a comdat group.  */
+  unsigned comdat : 1;
 }
 dw_fde_node;
 
diff --git a/include/dwarf2.h b/include/dwarf2.h
index 8c0c9ed..d3c8cb3 100644
--- a/include/dwarf2.h
+++ b/include/dwarf2.h
@@ -379,6 +379,8 @@ enum dwarf_attribute
     DW_AT_GNU_addr_base = 0x2133,
     DW_AT_GNU_pubnames = 0x2134,
     DW_AT_GNU_pubtypes = 0x2135,
+    /* Never actually emitted.  */
+    DW_AT_GNU_comdat = 0x2136,
     /* VMS extensions.  */
     DW_AT_VMS_rtnbeg_pd_address = 0x2201,
     /* GNAT extensions.  */

  reply	other threads:[~2012-04-18 12:44 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-18 11:17 Jakub Jelinek
2012-04-18 11:53 ` Jakub Jelinek
2012-04-18 12:44   ` Jason Merrill [this message]
2012-04-18 13:24     ` Jakub Jelinek
2012-04-19  5:29       ` Jakub Jelinek
2012-04-18 23:40     ` Cary Coutant
2012-04-19  4:44       ` Jason Merrill

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F8EB6F9.9080008@redhat.com \
    --to=jason@redhat.com \
    --cc=amodra@gmail.com \
    --cc=binutils@sourceware.org \
    --cc=ccoutant@google.com \
    --cc=gcc@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=mcgrathr@google.com \
    --cc=mjw@redhat.com \
    --cc=nickc@redhat.com \
    --cc=tromey@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).