[PATCH 0/2] Clean up language handling in the DWARF reader

* [PATCH 0/2] Clean up language handling in the DWARF reader
@ 2021-06-08 15:26 Tom Tromey
  2021-06-08 15:26 ` [PATCH 1/2] Consolidate CU language setting Tom Tromey
  2021-06-08 15:26 ` [PATCH 2/2] Remove dwarf2_cu::language Tom Tromey
  0 siblings, 2 replies; 16+ messages in thread
From: Tom Tromey @ 2021-06-08 15:26 UTC (permalink / raw)
  To: gdb-patches

In my series to rewrite the DWARF psymtab reader, I needed to use the
dwarf2_per_cu_data 'lang' field.  However, I found that this field is
not normally set before the scanning is done, only afterward.

I fixed this with a hack on my WIP branch, but I decided to pull this
out into a separate patch.  Looking at it more deeply, I found that
the CU language was set inconsistently (fixed in patch #1), and also
that dwarf2_cu::language duplicates it (fixed in patch #2).

Regression tested on x86-64 Fedora 32.

Tom

* [PATCH 1/2] Consolidate CU language setting
  2021-06-08 15:26 [PATCH 0/2] Clean up language handling in the DWARF reader Tom Tromey
@ 2021-06-08 15:26 ` Tom Tromey
  2021-06-09  1:32   ` Simon Marchi
  2021-06-08 15:26 ` [PATCH 2/2] Remove dwarf2_cu::language Tom Tromey
  1 sibling, 1 reply; 16+ messages in thread
From: Tom Tromey @ 2021-06-08 15:26 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

The DWARF reader currently sets the CU's language in two different
spots.  It is primarily done in prepare_one_comp_unit, but
read_file_scope also checks the producer and may change the language
based on the result.

This patch consolidates all language-setting into
prepare_one_comp_unit.  set_cu_language is changed not to set
language_defn; instead that is done in prepare_one_comp_unit after the
correct language enum value is chosen.

This fixes a minor latent bug, which is that read_file_scope could set
the language enum value to language_opencl, but then neglected to
reset language_defn in this case.
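
As a rough illustration of the precedence described above, the
selection can be sketched in standalone C++.  This is a minimal sketch,
not the code in dwarf2/read.c: lang, lang_defn, cu_sketch,
language_def_sketch, set_language_sketch and choose_language are
hypothetical stand-ins for GDB's enum language, language_defn,
dwarf2_cu, language_def, set_cu_language and prepare_one_comp_unit, and
only two DW_LANG_* constants are mapped.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for GDB's types; not the real declarations.
enum class lang { unknown, minimal, c, opencl, go };

struct lang_defn
{
  lang la;
  const char *name;
};

struct cu_sketch
{
  const char *producer = nullptr;
  lang language = lang::unknown;
  const lang_defn *defn = nullptr;
};

// Stand-in for language_def: look up the definition for an enum value.
static const lang_defn *
language_def_sketch (lang la)
{
  static const lang_defn table[] = {
    { lang::unknown, "unknown" },
    { lang::minimal, "minimal" },
    { lang::c, "c" },
    { lang::opencl, "opencl" },
    { lang::go, "go" },
  };
  for (const lang_defn &d : table)
    if (d.la == la)
      return &d;
  return &table[0];
}

// Stand-in for set_cu_language: map a DW_AT_language constant to the
// enum only.  Like the patched function, it does not touch defn.
static void
set_language_sketch (unsigned dw_lang, cu_sketch &cu)
{
  switch (dw_lang)
    {
    case 0x0c:  // DW_LANG_C99
      cu.language = lang::c;
      break;
    case 0x16:  // DW_LANG_Go
      cu.language = lang::go;
      break;
    default:
      cu.language = lang::minimal;
      break;
    }
}

// Stand-in for prepare_one_comp_unit.  DW_LANG is null when the CU
// has no DW_AT_language attribute.
static void
choose_language (cu_sketch &cu, const unsigned *dw_lang,
                 const char *producer, lang pretend_language)
{
  cu.producer = producer;

  if (dw_lang != nullptr)
    set_language_sketch (*dw_lang, cu);
  else if (producer != nullptr
           && std::strstr (producer, "IBM XL C for OpenCL") != nullptr)
    cu.language = lang::opencl;
  else if (producer != nullptr
           && std::strstr (producer, "GNU Go ") != nullptr)
    cu.language = lang::go;
  else
    cu.language = pretend_language;

  // Derived exactly once, after the enum value is final.
  cu.defn = language_def_sketch (cu.language);
}
```

With no DW_AT_language and an "IBM XL C for OpenCL" producer, the
sketch yields the OpenCL enum value together with the matching
definition, which is the pairing that read_file_scope previously left
out of sync.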

2021-06-06  Tom Tromey  <tom@tromey.com>

	* dwarf2/read.c (read_file_scope): Don't call set_cu_language.
	(set_cu_language): Don't set language_defn.
	(prepare_one_comp_unit): Check producer and set language_defn.
---
 gdb/ChangeLog     |  6 ++++++
 gdb/dwarf2/read.c | 34 ++++++++++++++++++----------------
 2 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 96009f1418f..b1a0b8bce88 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -10509,16 +10509,6 @@ read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
 
   file_and_directory fnd = find_file_and_directory (die, cu);
 
-  /* The XLCL doesn't generate DW_LANG_OpenCL because this attribute is not
-     standardised yet.  As a workaround for the language detection we fall
-     back to the DW_AT_producer string.  */
-  if (cu->producer && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
-    cu->language = language_opencl;
-
-  /* Similar hack for Go.  */
-  if (cu->producer && strstr (cu->producer, "GNU Go ") != NULL)
-    set_cu_language (DW_LANG_Go, cu);
-
   cu->start_symtab (fnd.name, fnd.comp_dir, lowpc);
 
   /* Decode line number information if present.  We do this before
@@ -20407,7 +20397,6 @@ set_cu_language (unsigned int lang, struct dwarf2_cu *cu)
       cu->language = language_minimal;
       break;
     }
-  cu->language_defn = language_def (cu->language);
 }
 
 /* Return the named attribute or NULL if not there.  */
@@ -24413,17 +24402,30 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
 {
   struct attribute *attr;
 
+  cu->producer = dwarf2_string_attr (comp_unit_die, DW_AT_producer, cu);
+
   /* Set the language we're debugging.  */
   attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
   if (attr != nullptr)
     set_cu_language (attr->constant_value (0), cu);
-  else
+  else if (cu->producer != nullptr
+	   && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
     {
-      cu->language = pretend_language;
-      cu->language_defn = language_def (cu->language);
+      /* The XLCL doesn't generate DW_LANG_OpenCL because this
+	 attribute is not standardised yet.  As a workaround for the
+	 language detection we fall back to the DW_AT_producer
+	 string.  */
+      cu->language = language_opencl;
     }
-
-  cu->producer = dwarf2_string_attr (comp_unit_die, DW_AT_producer, cu);
+  else if (cu->producer != nullptr
+	   && strstr (cu->producer, "GNU Go ") != NULL)
+    {
+      /* Similar hack for Go.  */
+      cu->language = language_go;
+    }
+  else
+    cu->language = pretend_language;
+  cu->language_defn = language_def (cu->language);
 }
 
 /* See read.h.  */
-- 
2.26.3


* [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-08 15:26 [PATCH 0/2] Clean up language handling in the DWARF reader Tom Tromey
  2021-06-08 15:26 ` [PATCH 1/2] Consolidate CU language setting Tom Tromey
@ 2021-06-08 15:26 ` Tom Tromey
  2021-06-09  1:58   ` Simon Marchi
  1 sibling, 1 reply; 16+ messages in thread
From: Tom Tromey @ 2021-06-08 15:26 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

dwarf2_cu has a 'language' value, but dwarf2_per_cu_data also holds a
value of this same type.  There doesn't seem to be any reason to keep
two copies of this value.  This patch removes the field from
dwarf2_cu, and arranges to set the value in the per-CU object instead.

Note that the value must still be set when expanding the full CU.
This is needed because the CUs will not be scanned when a DWARF index
is in use.
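
The ownership change can be pictured with a small standalone sketch.
Everything here is hypothetical (per_cu_sketch, cu_sketch,
prepare_sketch, scan_sketch and expand_sketch are illustrative names,
not GDB's declarations); it only shows the language living in the
per-CU object, with the expansion path writing it even when no scan
ran first.

```cpp
#include <cassert>

enum class lang { unknown, c, cplus };

// Stand-in for dwarf2_per_cu_data: long-lived, owns the language.
struct per_cu_sketch
{
  lang lang_value = lang::unknown;
};

// Stand-in for dwarf2_cu: short-lived reader state, no language copy.
struct cu_sketch
{
  explicit cu_sketch (per_cu_sketch *p) : per_cu (p) {}
  per_cu_sketch *per_cu;
};

// Stand-in for prepare_one_comp_unit: records the language in per_cu.
void
prepare_sketch (cu_sketch &cu, lang detected)
{
  cu.per_cu->lang_value = detected;
}

// Stand-in for the partial-symbol scan; skipped when an index is used.
void
scan_sketch (per_cu_sketch &per_cu, lang detected)
{
  cu_sketch cu (&per_cu);
  prepare_sketch (cu, detected);
}

// Stand-in for full CU expansion: it must set the language itself,
// because the scan may never have run for this unit.
lang
expand_sketch (per_cu_sketch &per_cu, lang detected)
{
  cu_sketch cu (&per_cu);
  prepare_sketch (cu, detected);
  return cu.per_cu->lang_value;
}
```

A unit reached only through expand_sketch starts out as lang::unknown
and ends up with the detected language, which models the index case
described above.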

2021-06-06  Tom Tromey  <tom@tromey.com>

	* dwarf2/read.c (process_psymtab_comp_unit): Don't set 'lang'.
	(scan_partial_symbols, partial_die_parent_scope)
	(add_partial_symbol, add_partial_subprogram)
	(compute_delayed_physnames, rust_union_quirks)
	(process_full_comp_unit, process_full_type_unit)
	(process_imported_unit_die, process_die, dw2_linkage_name)
	(dwarf2_compute_name, dwarf2_physname, read_import_statement)
	(read_file_scope, queue_and_load_dwo_tu, read_func_scope)
	(read_variable, dwarf2_get_subprogram_pc_bounds)
	(dwarf2_attach_fields_to_type, dwarf2_add_member_fn)
	(dwarf2_attach_fn_fields_to_type)
	(quirk_ada_thick_pointer_struct, read_structure_type)
	(handle_struct_member_die, process_structure_scope)
	(read_array_type, read_array_order, prototyped_function_p)
	(read_subroutine_type, dwarf2_init_complex_target_type)
	(read_base_type, read_subrange_type, read_unspecified_type)
	(load_partial_dies, partial_die_info::fixup, set_cu_language)
	(new_symbol, need_gnat_info, determine_prefix, typename_concat)
	(dwarf2_canonicalize_name, follow_die_offset)
	(prepare_one_comp_unit): Update.
	* dwarf2/cu.c (dwarf2_cu::start_symtab): Update.
---
 gdb/ChangeLog     |  24 ++++
 gdb/dwarf2/cu.c   |   2 +-
 gdb/dwarf2/cu.h   |   1 -
 gdb/dwarf2/read.c | 319 +++++++++++++++++++++++-----------------------
 4 files changed, 187 insertions(+), 159 deletions(-)

diff --git a/gdb/dwarf2/cu.c b/gdb/dwarf2/cu.c
index 1031ed3aa00..fffce264ebb 100644
--- a/gdb/dwarf2/cu.c
+++ b/gdb/dwarf2/cu.c
@@ -60,7 +60,7 @@ dwarf2_cu::start_symtab (const char *name, const char *comp_dir,
 
   m_builder.reset (new struct buildsym_compunit
 		   (this->per_objfile->objfile,
-		    name, comp_dir, language, low_pc));
+		    name, comp_dir, per_cu->lang, low_pc));
 
   list_in_scope = get_builder ()->get_file_symbols ();
 
diff --git a/gdb/dwarf2/cu.h b/gdb/dwarf2/cu.h
index b4a5b08d5a6..f4d093b6d3a 100644
--- a/gdb/dwarf2/cu.h
+++ b/gdb/dwarf2/cu.h
@@ -103,7 +103,6 @@ struct dwarf2_cu
   gdb::optional<CORE_ADDR> base_address;
 
   /* The language we are debugging.  */
-  enum language language = language_unknown;
   const struct language_defn *language_defn = nullptr;
 
   const char *producer = nullptr;
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index b1a0b8bce88..74fdf086f3a 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -7046,8 +7046,6 @@ process_psymtab_comp_unit (dwarf2_per_cu_data *this_cu,
 				      reader.comp_unit_die,
 				      pretend_language);
 
-  this_cu->lang = reader.cu->language;
-
   /* Age out any secondary CUs.  */
   per_objfile->age_comp_units ();
 }
@@ -7563,7 +7561,7 @@ scan_partial_symbols (struct partial_die_info *first_die, CORE_ADDR *lowpc,
 	    case DW_TAG_subprogram:
 	    case DW_TAG_inlined_subroutine:
 	      add_partial_subprogram (pdi, lowpc, highpc, set_addrmap, cu);
-	      if (cu->language == language_cplus)
+	      if (cu->per_cu->lang == language_cplus)
 		scan_partial_symbols (pdi->die_child, lowpc, highpc,
 				      set_addrmap, cu);
 	      break;
@@ -7584,8 +7582,9 @@ scan_partial_symbols (struct partial_die_info *first_die, CORE_ADDR *lowpc,
 		{
 		  add_partial_symbol (pdi, cu);
 		}
-	      if ((cu->language == language_rust
-		   || cu->language == language_cplus) && pdi->has_children)
+	      if ((cu->per_cu->lang == language_rust
+		   || cu->per_cu->lang == language_cplus)
+		  && pdi->has_children)
 		scan_partial_symbols (pdi->die_child, lowpc, highpc,
 				      set_addrmap, cu);
 	      break;
@@ -7624,7 +7623,7 @@ scan_partial_symbols (struct partial_die_info *first_die, CORE_ADDR *lowpc,
 		/* Go read the partial unit, if needed.  */
 		if (per_cu->v.psymtab == NULL)
 		  process_psymtab_comp_unit (per_cu, cu->per_objfile, true,
-					     cu->language);
+					     cu->per_cu->lang);
 
 		cu->per_cu->imported_symtabs_push (per_cu);
 	      }
@@ -7699,7 +7698,7 @@ partial_die_parent_scope (struct partial_die_info *pdi,
   /* GCC 4.0 and 4.1 had a bug (PR c++/28460) where they generated bogus
      DW_TAG_namespace DIEs with a name of "::" for the global namespace.
      Work around this problem here.  */
-  if (cu->language == language_cplus
+  if (cu->per_cu->lang == language_cplus
       && parent->tag == DW_TAG_namespace
       && strcmp (parent->name (cu), "::") == 0
       && grandparent_scope == NULL)
@@ -7720,7 +7719,7 @@ partial_die_parent_scope (struct partial_die_info *pdi,
       || parent->tag == DW_TAG_interface_type
       || parent->tag == DW_TAG_union_type
       || parent->tag == DW_TAG_enumeration_type
-      || (cu->language == language_fortran
+      || (cu->per_cu->lang == language_fortran
 	  && parent->tag == DW_TAG_subprogram
 	  && pdi->tag == DW_TAG_subprogram))
     {
@@ -7810,7 +7809,8 @@ add_partial_symbol (struct partial_die_info *pdi, struct dwarf2_cu *cu)
 
   partial_symbol psymbol;
   memset (&psymbol, 0, sizeof (psymbol));
-  psymbol.ginfo.set_language (cu->language, &objfile->objfile_obstack);
+  psymbol.ginfo.set_language (cu->per_cu->lang,
+			      &objfile->objfile_obstack);
   psymbol.ginfo.set_section_index (-1);
 
   /* The code below indicates that the psymbol should be installed by
@@ -7824,8 +7824,8 @@ add_partial_symbol (struct partial_die_info *pdi, struct dwarf2_cu *cu)
       addr = (gdbarch_adjust_dwarf2_addr (gdbarch, pdi->lowpc + baseaddr)
 	      - baseaddr);
       if (pdi->is_external
-	  || cu->language == language_ada
-	  || (cu->language == language_fortran
+	  || cu->per_cu->lang == language_ada
+	  || (cu->per_cu->lang == language_fortran
 	      && pdi->die_parent != NULL
 	      && pdi->die_parent->tag == DW_TAG_subprogram))
 	{
@@ -7844,7 +7844,7 @@ add_partial_symbol (struct partial_die_info *pdi, struct dwarf2_cu *cu)
       psymbol.ginfo.value.address = addr;
 
       if (pdi->main_subprogram && actual_name != NULL)
-	set_objfile_main_name (objfile, actual_name, cu->language);
+	set_objfile_main_name (objfile, actual_name, cu->per_cu->lang);
       break;
     case DW_TAG_constant:
       psymbol.domain = VAR_DOMAIN;
@@ -7949,14 +7949,14 @@ add_partial_symbol (struct partial_die_info *pdi, struct dwarf2_cu *cu)
 	 static vs. global.  */
       psymbol.domain = STRUCT_DOMAIN;
       psymbol.aclass = LOC_TYPEDEF;
-      where = (cu->language == language_cplus
+      where = (cu->per_cu->lang == language_cplus
 	       ? psymbol_placement::GLOBAL
 	       : psymbol_placement::STATIC);
       break;
     case DW_TAG_enumerator:
       psymbol.domain = VAR_DOMAIN;
       psymbol.aclass = LOC_CONST;
-      where = (cu->language == language_cplus
+      where = (cu->per_cu->lang == language_cplus
 	       ? psymbol_placement::GLOBAL
 	       : psymbol_placement::STATIC);
       break;
@@ -7968,7 +7968,8 @@ add_partial_symbol (struct partial_die_info *pdi, struct dwarf2_cu *cu)
     {
       if (built_actual_name != nullptr)
 	actual_name = objfile->intern (actual_name);
-      if (pdi->linkage_name == nullptr || cu->language == language_ada)
+      if (pdi->linkage_name == nullptr
+	  || cu->per_cu->lang == language_ada)
 	psymbol.ginfo.set_linkage_name (actual_name);
       else
 	{
@@ -8080,7 +8081,8 @@ add_partial_subprogram (struct partial_die_info *pdi,
   if (! pdi->has_children)
     return;
 
-  if (cu->language == language_ada || cu->language == language_fortran)
+  if (cu->per_cu->lang == language_ada
+      || cu->per_cu->lang == language_fortran)
     {
       pdi = pdi->die_child;
       while (pdi != NULL)
@@ -8688,7 +8690,7 @@ compute_delayed_physnames (struct dwarf2_cu *cu)
   /* Only C++ delays computing physnames.  */
   if (cu->method_list.empty ())
     return;
-  gdb_assert (cu->language == language_cplus);
+  gdb_assert (cu->per_cu->lang == language_cplus);
 
   for (const delayed_method_info &mi : cu->method_list)
     {
@@ -9117,7 +9119,7 @@ quirk_rust_enum (struct type *type, struct objfile *objfile)
 static void
 rust_union_quirks (struct dwarf2_cu *cu)
 {
-  gdb_assert (cu->language == language_rust);
+  gdb_assert (cu->per_cu->lang == language_rust);
   for (type *type_ : cu->rust_unions)
     quirk_rust_enum (type_, cu->per_objfile->objfile);
   /* We don't need this any more.  */
@@ -9290,9 +9292,6 @@ process_full_comp_unit (dwarf2_cu *cu, enum language pretend_language)
   /* Clear the list here in case something was left over.  */
   cu->method_list.clear ();
 
-  cu->language = pretend_language;
-  cu->language_defn = language_def (cu->language);
-
   dwarf2_find_base_address (cu->dies, cu);
 
   /* Before we start reading the top-level DIE, ensure it has a valid tag
@@ -9314,7 +9313,7 @@ process_full_comp_unit (dwarf2_cu *cu, enum language pretend_language)
   process_die (cu->dies, cu);
 
   /* For now fudge the Go package.  */
-  if (cu->language == language_go)
+  if (cu->per_cu->lang == language_go)
     fixup_go_packaging (cu);
 
   /* Now that we have processed all the DIEs in the CU, all the types
@@ -9322,7 +9321,7 @@ process_full_comp_unit (dwarf2_cu *cu, enum language pretend_language)
      physnames.  */
   compute_delayed_physnames (cu);
 
-  if (cu->language == language_rust)
+  if (cu->per_cu->lang == language_rust)
     rust_union_quirks (cu);
 
   /* Some compilers don't define a DW_AT_high_pc attribute for the
@@ -9351,9 +9350,9 @@ process_full_comp_unit (dwarf2_cu *cu, enum language pretend_language)
       /* Set symtab language to language from DW_AT_language.  If the
 	 compilation is from a C file generated by language preprocessors, do
 	 not set the language if it was already deduced by start_subfile.  */
-      if (!(cu->language == language_c
+      if (!(cu->per_cu->lang == language_c
 	    && COMPUNIT_FILETABS (cust)->language != language_unknown))
-	COMPUNIT_FILETABS (cust)->language = cu->language;
+	COMPUNIT_FILETABS (cust)->language = cu->per_cu->lang;
 
       /* GCC-4.0 has started to support -fvar-tracking.  GCC-3.x still can
 	 produce DW_AT_location with location lists but it can be possibly
@@ -9403,14 +9402,11 @@ process_full_type_unit (dwarf2_cu *cu,
   /* Clear the list here in case something was left over.  */
   cu->method_list.clear ();
 
-  cu->language = pretend_language;
-  cu->language_defn = language_def (cu->language);
-
   /* The symbol tables are set up in read_type_unit_scope.  */
   process_die (cu->dies, cu);
 
   /* For now fudge the Go package.  */
-  if (cu->language == language_go)
+  if (cu->per_cu->lang == language_go)
     fixup_go_packaging (cu);
 
   /* Now that we have processed all the DIEs in the CU, all the types
@@ -9418,7 +9414,7 @@ process_full_type_unit (dwarf2_cu *cu,
      physnames.  */
   compute_delayed_physnames (cu);
 
-  if (cu->language == language_rust)
+  if (cu->per_cu->lang == language_rust)
     rust_union_quirks (cu);
 
   /* TUs share symbol tables.
@@ -9439,9 +9435,9 @@ process_full_type_unit (dwarf2_cu *cu,
 	     compilation is from a C file generated by language preprocessors,
 	     do not set the language if it was already deduced by
 	     start_subfile.  */
-	  if (!(cu->language == language_c
+	  if (!(cu->per_cu->lang == language_c
 		&& COMPUNIT_FILETABS (cust)->language != language_c))
-	    COMPUNIT_FILETABS (cust)->language = cu->language;
+	    COMPUNIT_FILETABS (cust)->language = cu->per_cu->lang;
 	}
     }
   else
@@ -9489,9 +9485,10 @@ process_imported_unit_die (struct die_info *die, struct dwarf2_cu *cu)
 	return;
 
       /* If necessary, add it to the queue and load its DIEs.  */
-      if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language))
+      if (maybe_queue_comp_unit (cu, per_cu, per_objfile,
+				 cu->per_cu->lang))
 	load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
-			     false, cu->language);
+			     false, cu->per_cu->lang);
 
       cu->per_cu->imported_symtabs_push (per_cu);
     }
@@ -9549,7 +9546,7 @@ process_die (struct die_info *die, struct dwarf2_cu *cu)
       break;
     case DW_TAG_subprogram:
       /* Nested subprograms in Fortran get a prefix.  */
-      if (cu->language == language_fortran
+      if (cu->per_cu->lang == language_fortran
 	  && die->parent != NULL
 	  && die->parent->tag == DW_TAG_subprogram)
 	cu->processing_has_namespace_info = true;
@@ -9592,7 +9589,7 @@ process_die (struct die_info *die, struct dwarf2_cu *cu)
       /* We only need to handle this case for Ada -- in other
 	 languages, it's normal for the compiler to emit a typedef
 	 instead.  */
-      if (cu->language != language_ada)
+      if (cu->per_cu->lang != language_ada)
 	break;
       /* FALLTHROUGH */
     case DW_TAG_base_type:
@@ -9624,7 +9621,7 @@ process_die (struct die_info *die, struct dwarf2_cu *cu)
     case DW_TAG_imported_module:
       cu->processing_has_namespace_info = true;
       if (die->child != NULL && (die->tag == DW_TAG_imported_declaration
-				 || cu->language != language_fortran))
+				 || cu->per_cu->lang != language_fortran))
 	complaint (_("Tag '%s' has unexpected children"),
 		   dwarf_tag_name (die->tag));
       read_import_statement (die, cu);
@@ -9736,7 +9733,7 @@ dw2_linkage_name (struct die_info *die, struct dwarf2_cu *cu)
 
   /* rustc emits invalid values for DW_AT_linkage_name.  Ignore these.
      See https://github.com/rust-lang/rust/issues/32925.  */
-  if (cu->language == language_rust && linkage_name != NULL
+  if (cu->per_cu->lang == language_rust && linkage_name != NULL
       && strchr (linkage_name, '{') != NULL)
     linkage_name = NULL;
 
@@ -9768,6 +9765,8 @@ dwarf2_compute_name (const char *name,
   if (name == NULL)
     name = dwarf2_name (die, cu);
 
+  enum language lang = cu->per_cu->lang;
+
   /* For Fortran GDB prefers DW_AT_*linkage_name for the physname if present
      but otherwise compute it by typename_concat inside GDB.
      FIXME: Actually this is not really true, or at least not always true.
@@ -9775,8 +9774,8 @@ dwarf2_compute_name (const char *name,
      Fortran names because there is no mangling standard.  So new_symbol
      will set the demangled name to the result of dwarf2_full_name, and it is
      the demangled name that GDB uses if it exists.  */
-  if (cu->language == language_ada
-      || (cu->language == language_fortran && physname))
+  if (lang == language_ada
+      || (lang == language_fortran && physname))
     {
       /* For Ada unit, we prefer the linkage name over the name, as
 	 the former contains the exported name, which the user expects
@@ -9791,9 +9790,9 @@ dwarf2_compute_name (const char *name,
 
   /* These are the only languages we know how to qualify names in.  */
   if (name != NULL
-      && (cu->language == language_cplus
-	  || cu->language == language_fortran || cu->language == language_d
-	  || cu->language == language_rust))
+      && (lang == language_cplus
+	  || lang == language_fortran || lang == language_d
+	  || lang == language_rust))
     {
       if (die_needs_namespace (die, cu))
 	{
@@ -9834,12 +9833,11 @@ dwarf2_compute_name (const char *name,
 	     templates; two instantiated function templates are allowed to
 	     differ only by their return types, which we do not add here.  */
 
-	  if (cu->language == language_cplus && strchr (name, '<') == NULL)
+	  if (lang == language_cplus && strchr (name, '<') == NULL)
 	    {
 	      struct attribute *attr;
 	      struct die_info *child;
 	      int first = 1;
-	      const language_defn *cplus_lang = language_def (cu->language);
 
 	      die->building_fullname = 1;
 
@@ -9874,8 +9872,8 @@ dwarf2_compute_name (const char *name,
 
 		  if (child->tag == DW_TAG_template_type_param)
 		    {
-		      cplus_lang->print_type (type, "", &buf, -1, 0,
-					      &type_print_raw_options);
+		      cu->language_defn->print_type (type, "", &buf, -1, 0,
+						     &type_print_raw_options);
 		      continue;
 		    }
 
@@ -9895,7 +9893,7 @@ dwarf2_compute_name (const char *name,
 		  if (type->has_no_signedness ())
 		    /* GDB prints characters as NUMBER 'CHAR'.  If that's
 		       changed, this can use value_print instead.  */
-		    cplus_lang->printchar (value, type, &buf);
+		    cu->language_defn->printchar (value, type, &buf);
 		  else
 		    {
 		      struct value_print_options opts;
@@ -9941,14 +9939,14 @@ dwarf2_compute_name (const char *name,
 	     information, if PHYSNAME.  */
 
 	  if (physname && die->tag == DW_TAG_subprogram
-	      && cu->language == language_cplus)
+	      && lang == language_cplus)
 	    {
 	      struct type *type = read_type_die (die, cu);
 
-	      c_type_print_args (type, &buf, 1, cu->language,
+	      c_type_print_args (type, &buf, 1, lang,
 				 &type_print_raw_options);
 
-	      if (cu->language == language_cplus)
+	      if (lang == language_cplus)
 		{
 		  /* Assume that an artificial first parameter is
 		     "this", but do not crash if it is not.  RealView
@@ -9965,7 +9963,7 @@ dwarf2_compute_name (const char *name,
 
 	  const std::string &intermediate_name = buf.string ();
 
-	  if (cu->language == language_cplus)
+	  if (lang == language_cplus)
 	    canonical_name
 	      = dwarf2_canonicalize_name (intermediate_name.c_str (), cu,
 					  objfile);
@@ -10016,7 +10014,7 @@ dwarf2_physname (const char *name, struct die_info *die, struct dwarf2_cu *cu)
   if (!die_needs_namespace (die, cu))
     return dwarf2_compute_name (name, die, cu, 1);
 
-  if (cu->language != language_rust)
+  if (cu->per_cu->lang != language_rust)
     mangled = dw2_linkage_name (die, cu);
 
   /* DW_AT_linkage_name is missing in some cases - depend on what GDB
@@ -10024,8 +10022,7 @@ dwarf2_physname (const char *name, struct die_info *die, struct dwarf2_cu *cu)
   gdb::unique_xmalloc_ptr<char> demangled;
   if (mangled != NULL)
     {
-
-      if (language_def (cu->language)->store_sym_names_in_linkage_form_p ())
+      if (cu->language_defn->store_sym_names_in_linkage_form_p ())
 	{
 	  /* Do nothing (do not demangle the symbol name).  */
 	}
@@ -10159,7 +10156,7 @@ read_namespace_alias (struct die_info *die, struct dwarf2_cu *cu)
 static struct using_direct **
 using_directives (struct dwarf2_cu *cu)
 {
-  if (cu->language == language_ada
+  if (cu->per_cu->lang == language_ada
       && cu->get_builder ()->outermost_context_p ())
     return cu->get_builder ()->get_global_using_directives ();
   else
@@ -10250,12 +10247,15 @@ read_import_statement (struct die_info *die, struct dwarf2_cu *cu)
   else if (strlen (imported_name_prefix) > 0)
     canonical_name = obconcat (&objfile->objfile_obstack,
 			       imported_name_prefix,
-			       (cu->language == language_d ? "." : "::"),
+			       (cu->per_cu->lang == language_d
+				? "."
+				: "::"),
 			       imported_name, (char *) NULL);
   else
     canonical_name = imported_name;
 
-  if (die->tag == DW_TAG_imported_module && cu->language == language_fortran)
+  if (die->tag == DW_TAG_imported_module
+      && cu->per_cu->lang == language_fortran)
     for (child_die = die->child; child_die && child_die->tag;
 	 child_die = child_die->sibling)
       {
@@ -10496,7 +10496,7 @@ read_file_scope (struct die_info *die, struct dwarf2_cu *cu)
   struct die_info *child_die;
   CORE_ADDR baseaddr;
 
-  prepare_one_comp_unit (cu, die, cu->language);
+  prepare_one_comp_unit (cu, die, cu->per_cu->lang);
   baseaddr = objfile->text_section_offset ();
 
   get_scope_pc_bounds (die, &lowpc, &highpc, cu);
@@ -12698,7 +12698,7 @@ queue_and_load_dwo_tu (void **slot, void *info)
 	 a real dependency of PER_CU on SIG_TYPE.  That is detected later
 	 while processing PER_CU.  */
       if (maybe_queue_comp_unit (NULL, sig_type, cu->per_objfile,
-				 cu->language))
+				 cu->per_cu->lang))
 	load_full_type_unit (sig_type, cu->per_objfile);
       cu->per_cu->imported_symtabs_push (sig_type);
     }
@@ -12986,7 +12986,7 @@ read_func_scope (struct die_info *die, struct dwarf2_cu *cu)
 
   if (dwarf2_flag_true_p (die, DW_AT_main_subprogram, cu))
     set_objfile_main_name (objfile, newobj->name->linkage_name (),
-			   cu->language);
+			   cu->per_cu->lang);
 
   /* If there is a location expression for DW_AT_frame_base, record
      it.  */
@@ -13031,7 +13031,7 @@ read_func_scope (struct die_info *die, struct dwarf2_cu *cu)
   /* If we have a DW_AT_specification, we might need to import using
      directives from the context of the specification DIE.  See the
      comment in determine_prefix.  */
-  if (cu->language == language_cplus
+  if (cu->per_cu->lang == language_cplus
       && dwarf2_attr (die, DW_AT_specification, cu))
     {
       struct dwarf2_cu *spec_cu = cu;
@@ -13059,10 +13059,10 @@ read_func_scope (struct die_info *die, struct dwarf2_cu *cu)
 				     cstk.static_link, lowpc, highpc);
 
   /* For C++, set the block's scope.  */
-  if ((cu->language == language_cplus
-       || cu->language == language_fortran
-       || cu->language == language_d
-       || cu->language == language_rust)
+  if ((cu->per_cu->lang == language_cplus
+       || cu->per_cu->lang == language_fortran
+       || cu->per_cu->lang == language_d
+       || cu->per_cu->lang == language_rust)
       && cu->processing_has_namespace_info)
     block_set_scope (block, determine_prefix (die, cu),
 		     &objfile->objfile_obstack);
@@ -13538,7 +13538,7 @@ read_variable (struct die_info *die, struct dwarf2_cu *cu)
 {
   struct rust_vtable_symbol *storage = NULL;
 
-  if (cu->language == language_rust)
+  if (cu->per_cu->lang == language_rust)
     {
       struct type *containing_type = rust_containing_type (die, cu);
 
@@ -14044,7 +14044,7 @@ dwarf2_get_subprogram_pc_bounds (struct die_info *die,
 
   /* If the language does not allow nested subprograms (either inside
      subprograms or lexical blocks), we're done.  */
-  if (cu->language != language_ada)
+  if (cu->per_cu->lang != language_ada)
     return;
 
   /* Check all the children of the given DIE.  If it contains nested
@@ -14811,7 +14811,7 @@ dwarf2_attach_fields_to_type (struct field_info *fip, struct type *type,
   type->set_fields
     ((struct field *) TYPE_ZALLOC (type, sizeof (struct field) * nfields));
 
-  if (fip->non_public_fields && cu->language != language_ada)
+  if (fip->non_public_fields && cu->per_cu->lang != language_ada)
     {
       ALLOCATE_CPLUS_STRUCT_TYPE (type);
 
@@ -14830,7 +14830,7 @@ dwarf2_attach_fields_to_type (struct field_info *fip, struct type *type,
 
   /* If the type has baseclasses, allocate and clear a bit vector for
      TYPE_FIELD_VIRTUAL_BITS.  */
-  if (!fip->baseclasses.empty () && cu->language != language_ada)
+  if (!fip->baseclasses.empty () && cu->per_cu->lang != language_ada)
     {
       int num_bytes = B_BYTES (fip->baseclasses.size ());
       unsigned char *pointer;
@@ -14856,12 +14856,12 @@ dwarf2_attach_fields_to_type (struct field_info *fip, struct type *type,
       switch (field.accessibility)
 	{
 	case DW_ACCESS_private:
-	  if (cu->language != language_ada)
+	  if (cu->per_cu->lang != language_ada)
 	    SET_TYPE_FIELD_PRIVATE (type, i);
 	  break;
 
 	case DW_ACCESS_protected:
-	  if (cu->language != language_ada)
+	  if (cu->per_cu->lang != language_ada)
 	    SET_TYPE_FIELD_PROTECTED (type, i);
 	  break;
 
@@ -14882,7 +14882,7 @@ dwarf2_attach_fields_to_type (struct field_info *fip, struct type *type,
 	    {
 	    case DW_VIRTUALITY_virtual:
 	    case DW_VIRTUALITY_pure_virtual:
-	      if (cu->language == language_ada)
+	      if (cu->per_cu->lang == language_ada)
 		error (_("unexpected virtuality in component of Ada type"));
 	      SET_TYPE_FIELD_VIRTUAL (type, i);
 	      break;
@@ -14933,7 +14933,7 @@ dwarf2_add_member_fn (struct field_info *fip, struct die_info *die,
   const char *fieldname;
   struct type *this_type;
 
-  if (cu->language == language_ada)
+  if (cu->per_cu->lang == language_ada)
     error (_("unexpected member function in Ada type"));
 
   /* Get name of member function.  */
@@ -14966,7 +14966,7 @@ dwarf2_add_member_fn (struct field_info *fip, struct die_info *die,
   fnp = &flp->fnfields.back ();
 
   /* Delay processing of the physname until later.  */
-  if (cu->language == language_cplus)
+  if (cu->per_cu->lang == language_cplus)
     add_to_method_list (type, i, flp->fnfields.size () - 1, fieldname,
 			die, cu);
   else
@@ -15121,7 +15121,7 @@ static void
 dwarf2_attach_fn_fields_to_type (struct field_info *fip, struct type *type,
 				 struct dwarf2_cu *cu)
 {
-  if (cu->language == language_ada)
+  if (cu->per_cu->lang == language_ada)
     error (_("unexpected member functions in Ada type"));
 
   ALLOCATE_CPLUS_STRUCT_TYPE (type);
@@ -15264,7 +15264,7 @@ static void
 quirk_ada_thick_pointer_struct (struct die_info *die, struct dwarf2_cu *cu,
 				struct type *type)
 {
-  gdb_assert (cu->language == language_ada);
+  gdb_assert (cu->per_cu->lang == language_ada);
 
   /* Check for a structure with two children.  */
   if (type->code () != TYPE_CODE_STRUCT || type->num_fields () != 2)
@@ -15443,9 +15443,9 @@ read_structure_type (struct die_info *die, struct dwarf2_cu *cu)
   name = dwarf2_name (die, cu);
   if (name != NULL)
     {
-      if (cu->language == language_cplus
-	  || cu->language == language_d
-	  || cu->language == language_rust)
+      if (cu->per_cu->lang == language_cplus
+	  || cu->per_cu->lang == language_d
+	  || cu->per_cu->lang == language_rust)
 	{
 	  const char *full_name = dwarf2_full_name (name, die, cu);
 
@@ -15477,7 +15477,7 @@ read_structure_type (struct die_info *die, struct dwarf2_cu *cu)
       type->set_code (TYPE_CODE_STRUCT);
     }
 
-  if (cu->language == language_cplus && die->tag == DW_TAG_class_type)
+  if (cu->per_cu->lang == language_cplus && die->tag == DW_TAG_class_type)
     type->set_is_declared_class (true);
 
   /* Store the calling convention in the type if it's available in
@@ -15688,7 +15688,7 @@ handle_struct_member_die (struct die_info *child_die, struct type *type,
       /* Rust doesn't have member functions in the C++ sense.
 	 However, it does emit ordinary functions as children
 	 of a struct DIE.  */
-      if (cu->language == language_rust)
+      if (cu->per_cu->lang == language_rust)
 	read_func_scope (child_die, cu);
       else
 	{
@@ -15849,7 +15849,8 @@ process_structure_scope (struct die_info *die, struct dwarf2_cu *cu)
 
       /* Copy fi.nested_types_list linked list elements content into the
 	 allocated array TYPE_NESTED_TYPES_ARRAY (type).  */
-      if (!fi.nested_types_list.empty () && cu->language != language_ada)
+      if (!fi.nested_types_list.empty ()
+	  && cu->per_cu->lang != language_ada)
 	{
 	  int count = fi.nested_types_list.size ();
 
@@ -15865,9 +15866,9 @@ process_structure_scope (struct die_info *die, struct dwarf2_cu *cu)
     }
 
   quirk_gcc_member_function_pointer (type, objfile);
-  if (cu->language == language_rust && die->tag == DW_TAG_union_type)
+  if (cu->per_cu->lang == language_rust && die->tag == DW_TAG_union_type)
     cu->rust_unions.push_back (type);
-  else if (cu->language == language_ada)
+  else if (cu->per_cu->lang == language_ada)
     quirk_ada_thick_pointer_struct (die, cu, type);
 
   /* NOTE: carlton/2004-03-16: GCC 3.4 (or at least one of its
@@ -16570,7 +16571,7 @@ read_array_type (struct die_info *die, struct dwarf2_cu *cu)
   maybe_set_alignment (cu, die, type);
 
   struct type *replacement_type = nullptr;
-  if (cu->language == language_ada)
+  if (cu->per_cu->lang == language_ada)
     {
       replacement_type = quirk_ada_thick_pointer (die, cu, type);
       if (replacement_type != nullptr)
@@ -16607,7 +16608,7 @@ read_array_order (struct die_info *die, struct dwarf2_cu *cu)
      FIXME: dsl/2004-8-20: If G77 is ever fixed, this will also need
      version checking.  */
 
-  if (cu->language == language_fortran
+  if (cu->per_cu->lang == language_fortran
       && cu->producer && strstr (cu->producer, "GNU F77"))
     {
       return DW_ORD_row_major;
@@ -17346,9 +17347,9 @@ prototyped_function_p (struct die_info *die, struct dwarf2_cu *cu)
      languages that allow unprototyped functions (Eg: Objective C).
      For all other languages, assume that functions are always
      prototyped.  */
-  if (cu->language != language_c
-      && cu->language != language_objc
-      && cu->language != language_opencl)
+  if (cu->per_cu->lang != language_c
+      && cu->per_cu->lang != language_objc
+      && cu->per_cu->lang != language_opencl)
     return 1;
 
   /* RealView does not emit DW_AT_prototyped.  We can not distinguish
@@ -17473,7 +17474,8 @@ read_subroutine_type (struct die_info *die, struct dwarf2_cu *cu)
 	      /* RealView does not mark THIS as const, which the testsuite
 		 expects.  GCC marks THIS as const in method definitions,
 		 but not in the class specifications (GCC PR 43053).  */
-	      if (cu->language == language_cplus && !TYPE_CONST (arg_type)
+	      if (cu->per_cu->lang == language_cplus
+		  && !TYPE_CONST (arg_type)
 		  && TYPE_FIELD_ARTIFICIAL (ftype, iparams))
 		{
 		  int is_this = 0;
@@ -17908,7 +17910,7 @@ dwarf2_init_complex_target_type (struct dwarf2_cu *cu,
   /* Try to find a suitable floating point builtin type of size BITS.
      We're going to use the name of this type as the name for the complex
      target type that we are about to create.  */
-  switch (cu->language)
+  switch (cu->per_cu->lang)
     {
     case language_fortran:
       switch (bits)
@@ -17998,7 +18000,7 @@ read_base_type (struct die_info *die, struct dwarf2_cu *cu)
     }
 
   if ((encoding == DW_ATE_signed_fixed || encoding == DW_ATE_unsigned_fixed)
-      && cu->language == language_ada
+      && cu->per_cu->lang == language_ada
       && has_zero_over_zero_small_attribute (die, cu))
     {
       /* brobecker/2018-02-24: This is a fixed point type for which
@@ -18020,7 +18022,7 @@ read_base_type (struct die_info *die, struct dwarf2_cu *cu)
      than an "else if".  */
   const char *gnat_encoding_suffix = nullptr;
   if ((encoding == DW_ATE_signed || encoding == DW_ATE_unsigned)
-      && cu->language == language_ada
+      && cu->per_cu->lang == language_ada
       && name != nullptr)
     {
       gnat_encoding_suffix = gnat_encoded_fixed_point_type_info (name);
@@ -18077,7 +18079,7 @@ read_base_type (struct die_info *die, struct dwarf2_cu *cu)
 	type = dwarf2_init_integer_type (cu, objfile, bits, 0, name);
 	break;
       case DW_ATE_unsigned:
-	if (cu->language == language_fortran
+	if (cu->per_cu->lang == language_fortran
 	    && name
 	    && startswith (name, "character("))
 	  type = init_character_type (objfile, bits, 1, name);
@@ -18085,18 +18087,20 @@ read_base_type (struct die_info *die, struct dwarf2_cu *cu)
 	  type = dwarf2_init_integer_type (cu, objfile, bits, 1, name);
 	break;
       case DW_ATE_signed_char:
-	if (cu->language == language_ada || cu->language == language_m2
-	    || cu->language == language_pascal
-	    || cu->language == language_fortran)
+	if (cu->per_cu->lang == language_ada
+	    || cu->per_cu->lang == language_m2
+	    || cu->per_cu->lang == language_pascal
+	    || cu->per_cu->lang == language_fortran)
 	  type = init_character_type (objfile, bits, 0, name);
 	else
 	  type = dwarf2_init_integer_type (cu, objfile, bits, 0, name);
 	break;
       case DW_ATE_unsigned_char:
-	if (cu->language == language_ada || cu->language == language_m2
-	    || cu->language == language_pascal
-	    || cu->language == language_fortran
-	    || cu->language == language_rust)
+	if (cu->per_cu->lang == language_ada
+	    || cu->per_cu->lang == language_m2
+	    || cu->per_cu->lang == language_pascal
+	    || cu->per_cu->lang == language_fortran
+	    || cu->per_cu->lang == language_rust)
 	  type = init_character_type (objfile, bits, 1, name);
 	else
 	  type = dwarf2_init_integer_type (cu, objfile, bits, 1, name);
@@ -18391,7 +18395,7 @@ read_subrange_type (struct die_info *die, struct dwarf2_cu *cu)
 
   /* Set LOW_DEFAULT_IS_VALID if current language and DWARF version allow
      omitting DW_AT_lower_bound.  */
-  switch (cu->language)
+  switch (cu->per_cu->lang)
     {
     case language_c:
     case language_cplus:
@@ -18527,7 +18531,7 @@ read_subrange_type (struct die_info *die, struct dwarf2_cu *cu)
     range_type->bounds ()->flag_upper_bound_is_count = 1;
 
   /* Ada expects an empty array on no boundary attributes.  */
-  if (attr == NULL && cu->language != language_ada)
+  if (attr == NULL && cu->per_cu->lang != language_ada)
     range_type->bounds ()->high.set_undefined ();
 
   name = dwarf2_name (die, cu);
@@ -18560,7 +18564,7 @@ read_unspecified_type (struct die_info *die, struct dwarf2_cu *cu)
      of the type is deferred to a different unit.  When encountering
      such a type, we treat it as a stub, and try to resolve it later on,
      when needed.  */
-  if (cu->language == language_ada)
+  if (cu->per_cu->lang == language_ada)
     type->set_is_stub (true);
 
   return set_die_type (die, type, cu);
@@ -18860,7 +18864,7 @@ load_partial_dies (const struct die_reader_specs *reader,
       /* Check for template arguments.  We never save these; if
 	 they're seen, we just mark the parent, and go on our way.  */
       if (parent_die != NULL
-	  && cu->language == language_cplus
+	  && cu->per_cu->lang == language_cplus
 	  && (abbrev->tag == DW_TAG_template_type_param
 	      || abbrev->tag == DW_TAG_template_value_param))
 	{
@@ -18877,7 +18881,7 @@ load_partial_dies (const struct die_reader_specs *reader,
       /* We only recurse into c++ subprograms looking for template arguments.
 	 Skip their other children.  */
       if (!load_all
-	  && cu->language == language_cplus
+	  && cu->per_cu->lang == language_cplus
 	  && parent_die != NULL
 	  && parent_die->tag == DW_TAG_subprogram
 	  && abbrev->tag != DW_TAG_inlined_subroutine)
@@ -18891,7 +18895,7 @@ load_partial_dies (const struct die_reader_specs *reader,
 	 later variables referencing them via DW_AT_specification (for
 	 static members).  */
       if (!load_all
-	  && !is_type_tag_for_partial (abbrev->tag, cu->language)
+	  && !is_type_tag_for_partial (abbrev->tag, cu->per_cu->lang)
 	  && abbrev->tag != DW_TAG_constant
 	  && abbrev->tag != DW_TAG_enumerator
 	  && abbrev->tag != DW_TAG_subprogram
@@ -19048,17 +19052,17 @@ load_partial_dies (const struct die_reader_specs *reader,
 	      || last_die->tag == DW_TAG_namespace
 	      || last_die->tag == DW_TAG_module
 	      || last_die->tag == DW_TAG_enumeration_type
-	      || (cu->language == language_cplus
+	      || (cu->per_cu->lang == language_cplus
 		  && last_die->tag == DW_TAG_subprogram
 		  && (last_die->raw_name == NULL
 		      || strchr (last_die->raw_name, '<') == NULL))
-	      || (cu->language != language_c
+	      || (cu->per_cu->lang != language_c
 		  && (last_die->tag == DW_TAG_class_type
 		      || last_die->tag == DW_TAG_interface_type
 		      || last_die->tag == DW_TAG_structure_type
 		      || last_die->tag == DW_TAG_union_type))
-	      || ((cu->language == language_ada
-		   || cu->language == language_fortran)
+	      || ((cu->per_cu->lang == language_ada
+		   || cu->per_cu->lang == language_fortran)
 		  && (last_die->tag == DW_TAG_subprogram
 		      || last_die->tag == DW_TAG_lexical_block))))
 	{
@@ -19234,7 +19238,7 @@ partial_die_info::read (const struct die_reader_specs *reader,
 	     information, we support this practice for backward
 	     compatibility.  */
 	  if (attr.constant_value (0) == DW_CC_program
-	      && cu->language == language_fortran)
+	      && cu->per_cu->lang == language_fortran)
 	    main_subprogram = 1;
 	  break;
 	case DW_AT_inline:
@@ -19286,7 +19290,7 @@ partial_die_info::read (const struct die_reader_specs *reader,
      of the order in which the name and linkage name were emitted.
      Really, though, this is just a workaround for the fact that gdb
      doesn't store both the name and the linkage name.  */
-  if (cu->language == language_ada && linkage_name != nullptr)
+  if (cu->per_cu->lang == language_ada && linkage_name != nullptr)
     raw_name = linkage_name;
 
   if (high_pc_relative)
@@ -19552,7 +19556,7 @@ partial_die_info::fixup (struct dwarf2_cu *cu)
   /* If there is no parent die to provide a namespace, and there are
      children, see if we can determine the namespace from their linkage
      name.  */
-  if (cu->language == language_cplus
+  if (cu->per_cu->lang == language_cplus
       && !cu->per_objfile->per_bfd->types.empty ()
       && die_parent == NULL
       && has_children
@@ -20350,51 +20354,51 @@ set_cu_language (unsigned int lang, struct dwarf2_cu *cu)
     case DW_LANG_C11:
     case DW_LANG_C:
     case DW_LANG_UPC:
-      cu->language = language_c;
+      cu->per_cu->lang = language_c;
       break;
     case DW_LANG_Java:
     case DW_LANG_C_plus_plus:
     case DW_LANG_C_plus_plus_11:
     case DW_LANG_C_plus_plus_14:
-      cu->language = language_cplus;
+      cu->per_cu->lang = language_cplus;
       break;
     case DW_LANG_D:
-      cu->language = language_d;
+      cu->per_cu->lang = language_d;
       break;
     case DW_LANG_Fortran77:
     case DW_LANG_Fortran90:
     case DW_LANG_Fortran95:
     case DW_LANG_Fortran03:
     case DW_LANG_Fortran08:
-      cu->language = language_fortran;
+      cu->per_cu->lang = language_fortran;
       break;
     case DW_LANG_Go:
-      cu->language = language_go;
+      cu->per_cu->lang = language_go;
       break;
     case DW_LANG_Mips_Assembler:
-      cu->language = language_asm;
+      cu->per_cu->lang = language_asm;
       break;
     case DW_LANG_Ada83:
     case DW_LANG_Ada95:
-      cu->language = language_ada;
+      cu->per_cu->lang = language_ada;
       break;
     case DW_LANG_Modula2:
-      cu->language = language_m2;
+      cu->per_cu->lang = language_m2;
       break;
     case DW_LANG_Pascal83:
-      cu->language = language_pascal;
+      cu->per_cu->lang = language_pascal;
       break;
     case DW_LANG_ObjC:
-      cu->language = language_objc;
+      cu->per_cu->lang = language_objc;
       break;
     case DW_LANG_Rust:
     case DW_LANG_Rust_old:
-      cu->language = language_rust;
+      cu->per_cu->lang = language_rust;
       break;
     case DW_LANG_Cobol74:
     case DW_LANG_Cobol85:
     default:
-      cu->language = language_minimal;
+      cu->per_cu->lang = language_minimal;
       break;
     }
 }
@@ -21531,16 +21535,16 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
       OBJSTAT (objfile, n_syms++);
 
       /* Cache this symbol's name and the name's demangled form (if any).  */
-      sym->set_language (cu->language, &objfile->objfile_obstack);
+      sym->set_language (cu->per_cu->lang, &objfile->objfile_obstack);
       /* Fortran does not have mangling standard and the mangling does differ
 	 between gfortran, iFort etc.  */
       const char *physname
-	= (cu->language == language_fortran
+	= (cu->per_cu->lang == language_fortran
 	   ? dwarf2_full_name (name, die, cu)
 	   : dwarf2_physname (name, die, cu));
       const char *linkagename = dw2_linkage_name (die, cu);
 
-      if (linkagename == nullptr || cu->language == language_ada)
+      if (linkagename == nullptr || cu->per_cu->lang == language_ada)
 	sym->set_linkage_name (physname);
       else
 	{
@@ -21607,8 +21611,8 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
 	  SYMBOL_ACLASS_INDEX (sym) = LOC_BLOCK;
 	  attr2 = dwarf2_attr (die, DW_AT_external, cu);
 	  if ((attr2 != nullptr && attr2->as_boolean ())
-	      || cu->language == language_ada
-	      || cu->language == language_fortran)
+	      || cu->per_cu->lang == language_ada
+	      || cu->per_cu->lang == language_fortran)
 	    {
 	      /* Subprograms marked external are stored as a global symbol.
 		 Ada and Fortran subprograms, whether marked external or
@@ -21673,7 +21677,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
 
 	      /* Fortran explicitly imports any global symbols to the local
 		 scope by DW_TAG_common_block.  */
-	      if (cu->language == language_fortran && die->parent
+	      if (cu->per_cu->lang == language_fortran && die->parent
 		  && die->parent->tag == DW_TAG_common_block)
 		attr2 = NULL;
 
@@ -21727,7 +21731,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
 
 	      /* Fortran explicitly imports any global symbols to the local
 		 scope by DW_TAG_common_block.  */
-	      if (cu->language == language_fortran && die->parent
+	      if (cu->per_cu->lang == language_fortran && die->parent
 		  && die->parent->tag == DW_TAG_common_block)
 		{
 		  /* SYMBOL_CLASS doesn't matter here because
@@ -21813,16 +21817,16 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
 		buildsym_compunit *builder = cu->get_builder ();
 		list_to_add
 		  = (cu->list_in_scope == builder->get_file_symbols ()
-		     && cu->language == language_cplus
+		     && cu->per_cu->lang == language_cplus
 		     ? builder->get_global_symbols ()
 		     : cu->list_in_scope);
 
 		/* The semantics of C++ state that "struct foo {
 		   ... }" also defines a typedef for "foo".  */
-		if (cu->language == language_cplus
-		    || cu->language == language_ada
-		    || cu->language == language_d
-		    || cu->language == language_rust)
+		if (cu->per_cu->lang == language_cplus
+		    || cu->per_cu->lang == language_ada
+		    || cu->per_cu->lang == language_d
+		    || cu->per_cu->lang == language_rust)
 		  {
 		    /* The symbol's name is already allocated along
 		       with this objfile, so we don't need to
@@ -21857,7 +21861,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
 
 	    list_to_add
 	      = (cu->list_in_scope == cu->get_builder ()->get_file_symbols ()
-		 && cu->language == language_cplus
+		 && cu->per_cu->lang == language_cplus
 		 ? cu->get_builder ()->get_global_symbols ()
 		 : cu->list_in_scope);
 	  }
@@ -21900,7 +21904,7 @@ new_symbol (struct die_info *die, struct type *type, struct dwarf2_cu *cu,
       /* For the benefit of old versions of GCC, check for anonymous
 	 namespaces based on the demangled name.  */
       if (!cu->processing_has_namespace_info
-	  && cu->language == language_cplus)
+	  && cu->per_cu->lang == language_cplus)
 	cp_scan_for_anonymous_namespaces (cu->get_builder (), sym, objfile);
     }
   return (sym);
@@ -22112,7 +22116,7 @@ need_gnat_info (struct dwarf2_cu *cu)
 {
   /* Assume that the Ada compiler was GNAT, which always produces
      the auxiliary information.  */
-  return (cu->language == language_ada);
+  return (cu->per_cu->lang == language_ada);
 }
 
 /* Return the auxiliary type of the die in question using its
@@ -22487,9 +22491,10 @@ determine_prefix (struct die_info *die, struct dwarf2_cu *cu)
   struct type *parent_type;
   const char *retval;
 
-  if (cu->language != language_cplus
-      && cu->language != language_fortran && cu->language != language_d
-      && cu->language != language_rust)
+  if (cu->per_cu->lang != language_cplus
+      && cu->per_cu->lang != language_fortran
+      && cu->per_cu->lang != language_d
+      && cu->per_cu->lang != language_rust)
     return "";
 
   retval = anonymous_struct_prefix (die, cu);
@@ -22577,7 +22582,7 @@ determine_prefix (struct die_info *die, struct dwarf2_cu *cu)
 	/* GCC 4.0 and 4.1 had a bug (PR c++/28460) where they generated bogus
 	   DW_TAG_namespace DIEs with a name of "::" for the global namespace.
 	   Work around this problem here.  */
-	if (cu->language == language_cplus
+	if (cu->per_cu->lang == language_cplus
 	    && strcmp (parent_type->name (), "::") == 0)
 	  return "";
 	/* We give a name to even anonymous namespaces.  */
@@ -22598,7 +22603,7 @@ determine_prefix (struct die_info *die, struct dwarf2_cu *cu)
       case DW_TAG_compile_unit:
       case DW_TAG_partial_unit:
 	/* gcc-4.5 -gdwarf-4 can drop the enclosing namespace.  Cope.  */
-	if (cu->language == language_cplus
+	if (cu->per_cu->lang == language_cplus
 	    && !per_objfile->per_bfd->types.empty ()
 	    && die->child != NULL
 	    && (die->tag == DW_TAG_class_type
@@ -22613,7 +22618,7 @@ determine_prefix (struct die_info *die, struct dwarf2_cu *cu)
       case DW_TAG_subprogram:
 	/* Nested subroutines in Fortran get a prefix with the name
 	   of the parent's subroutine.  */
-	if (cu->language == language_fortran)
+	if (cu->per_cu->lang == language_fortran)
 	  {
 	    if ((die->tag ==  DW_TAG_subprogram)
 		&& (dwarf2_name (parent, cu) != NULL))
@@ -22652,7 +22657,7 @@ typename_concat (struct obstack *obs, const char *prefix, const char *suffix,
   if (suffix == NULL || suffix[0] == '\0'
       || prefix == NULL || prefix[0] == '\0')
     sep = "";
-  else if (cu->language == language_d)
+  else if (cu->per_cu->lang == language_d)
     {
       /* For D, the 'main' function could be defined in any module, but it
 	 should never be prefixed.  */
@@ -22664,7 +22669,7 @@ typename_concat (struct obstack *obs, const char *prefix, const char *suffix,
       else
 	sep = ".";
     }
-  else if (cu->language == language_fortran && physname)
+  else if (cu->per_cu->lang == language_fortran && physname)
     {
       /* This is gfortran specific mangling.  Normally DW_AT_linkage_name or
 	 DW_AT_MIPS_linkage_name is preferred and used instead.  */
@@ -22705,7 +22710,7 @@ static const char *
 dwarf2_canonicalize_name (const char *name, struct dwarf2_cu *cu,
 			  struct objfile *objfile)
 {
-  if (name && cu->language == language_cplus)
+  if (name && cu->per_cu->lang == language_cplus)
     {
       gdb::unique_xmalloc_ptr<char> canon_name
 	= cp_canonicalize_string (name);
@@ -23084,10 +23089,10 @@ follow_die_offset (sect_offset sect_off, int offset_in_dwz,
 	 Even if maybe_queue_comp_unit doesn't require us to load the CU's DIEs,
 	 it doesn't mean they are currently loaded.  Since we require them
 	 to be loaded, we must check for ourselves.  */
-      if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->language)
+      if (maybe_queue_comp_unit (cu, per_cu, per_objfile, cu->per_cu->lang)
 	  || per_objfile->get_cu (per_cu) == nullptr)
 	load_full_comp_unit (per_cu, per_objfile, per_objfile->get_cu (per_cu),
-			     false, cu->language);
+			     false, cu->per_cu->lang);
 
       target_cu = per_objfile->get_cu (per_cu);
       gdb_assert (target_cu != nullptr);
@@ -24415,17 +24420,17 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
 	 attribute is not standardised yet.  As a workaround for the
 	 language detection we fall back to the DW_AT_producer
 	 string.  */
-      cu->language = language_opencl;
+      cu->per_cu->lang = language_opencl;
     }
   else if (cu->producer != nullptr
 	   && strstr (cu->producer, "GNU Go ") != NULL)
     {
       /* Similar hack for Go.  */
-      cu->language = language_go;
+      cu->per_cu->lang = language_go;
     }
   else
-    cu->language = pretend_language;
-  cu->language_defn = language_def (cu->language);
+    cu->per_cu->lang = pretend_language;
+  cu->language_defn = language_def (cu->per_cu->lang);
 }
 
 /* See read.h.  */
-- 
2.26.3


* Re: [PATCH 1/2] Consolidate CU language setting
  2021-06-08 15:26 ` [PATCH 1/2] Consolidate CU language setting Tom Tromey
@ 2021-06-09  1:32   ` Simon Marchi
  2021-06-09 20:22     ` Tom Tromey
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-06-09  1:32 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 2021-06-08 11:26 a.m., Tom Tromey wrote:
> @@ -24413,17 +24402,30 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
>  {
>    struct attribute *attr;
>  
> +  cu->producer = dwarf2_string_attr (comp_unit_die, DW_AT_producer, cu);
> +
>    /* Set the language we're debugging.  */
>    attr = dwarf2_attr (comp_unit_die, DW_AT_language, cu);
>    if (attr != nullptr)
>      set_cu_language (attr->constant_value (0), cu);

Not very important, but I'd even change set_cu_language to be a pure
DWARF lang value -> enum language version function, and write:

  cu->language = dwarf_language_to_gdb_language (attr->constant_value (0));

That would be consistent with the rest of the function which assigns
cu->language.
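
A minimal sketch of what such a pure mapping could look like.  The
enum and constant names below are small stand-ins invented for
illustration, not GDB's real `enum language` or the DW_LANG_* macros,
and only a handful of cases are modelled; the numeric values are the
DWARF language codes as far as I know them.

```cpp
// Stand-in for GDB's enum language; only a few values are modelled.
enum class gdb_lang { minimal, c, cplus, fortran, go, opencl };

// Stand-ins for a few DW_LANG_* codes (hypothetical names).
enum : unsigned int
{
  LANG_C = 0x02,
  LANG_C_plus_plus = 0x04,
  LANG_Fortran90 = 0x08,
  LANG_OpenCL = 0x15,
  LANG_Go = 0x16,
};

// Pure mapping: takes no dwarf2_cu and has no side effects, so the
// caller decides where the result gets stored.
gdb_lang
dwarf_language_to_gdb_language (unsigned int dw_lang)
{
  switch (dw_lang)
    {
    case LANG_C:
      return gdb_lang::c;
    case LANG_C_plus_plus:
      return gdb_lang::cplus;
    case LANG_Fortran90:
      return gdb_lang::fortran;
    case LANG_OpenCL:
      return gdb_lang::opencl;
    case LANG_Go:
      return gdb_lang::go;
    default:
      // Unknown or unsupported codes fall back to the minimal language.
      return gdb_lang::minimal;
    }
}
```

With this shape, every branch in the caller becomes a plain
assignment of the returned value.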

> -  else
> +  else if (cu->producer != nullptr
> +	   && strstr (cu->producer, "IBM XL C for OpenCL") != NULL)
>      {

I don't have build artifacts to test, but I'm curious whether that
compiler still puts a DW_AT_language value in the CU's DIE, just another
language than OpenCL.  Because DW_AT_language existed before, it's just
DW_LANG_OpenCL that didn't exist.  If so, I think the behavior would
change, because that branch of the condition would not be taken - the
first one would.

I also noticed that DW_LANG_OpenCL isn't handled in set_cu_language...
it probably should, so that a producer using the right value
(DW_LANG_OpenCL) in DW_AT_language will get the language set correctly.
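
To make the ordering concern concrete, here is a rough sketch of the
two orderings.  All names are hypothetical stand-ins, the mapping only
knows about C, and "producer_first" only approximates the effective
pre-patch behavior as described in the commit message.

```cpp
#include <cstring>

// Stand-in for GDB's enum language.
enum class lang_kind { minimal, c, opencl };

// Hypothetical stand-in for the DW_AT_language -> language mapping.
lang_kind
map_dw_lang (unsigned int dw_lang)
{
  return dw_lang == 0x02 /* DW_LANG_C */ ? lang_kind::c : lang_kind::minimal;
}

// True if the producer string names the XL OpenCL compiler.
bool
is_xlcl (const char *producer)
{
  return (producer != nullptr
	  && std::strstr (producer, "IBM XL C for OpenCL") != nullptr);
}

// Ordering in the patch as posted: the attribute wins whenever it is
// present (DW_LANG is null when the CU has no DW_AT_language).
lang_kind
attr_first (const unsigned int *dw_lang, const char *producer,
	    lang_kind pretend)
{
  if (dw_lang != nullptr)
    return map_dw_lang (*dw_lang);
  if (is_xlcl (producer))
    return lang_kind::opencl;
  return pretend;
}

// Alternative ordering: the producer check can override the attribute.
lang_kind
producer_first (const unsigned int *dw_lang, const char *producer,
		lang_kind pretend)
{
  if (is_xlcl (producer))
    return lang_kind::opencl;
  if (dw_lang != nullptr)
    return map_dw_lang (*dw_lang);
  return pretend;
}
```

If the compiler emits DW_LANG_C together with the XL OpenCL producer
string, the first function yields C and the second yields OpenCL;
with no DW_AT_language at all, both agree.
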

> -      cu->language = pretend_language;
> -      cu->language_defn = language_def (cu->language);
> +      /* The XLCL doesn't generate DW_LANG_OpenCL because this
> +	 attribute is not standardised yet.  As a workaround for the
> +	 language detection we fall back to the DW_AT_producer
> +	 string.  */
> +      cu->language = language_opencl;
>      }
> -
> -  cu->producer = dwarf2_string_attr (comp_unit_die, DW_AT_producer, cu);
> +  else if (cu->producer != nullptr
> +	   && strstr (cu->producer, "GNU Go ") != NULL)

Same here, did some version of gccgo put some other language in
DW_AT_language?

Simon

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-08 15:26 ` [PATCH 2/2] Remove dwarf2_cu::language Tom Tromey
@ 2021-06-09  1:58   ` Simon Marchi
  2021-06-09 20:26     ` Tom Tromey
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-06-09  1:58 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 2021-06-08 11:26 a.m., Tom Tromey wrote:
> @@ -20350,51 +20354,51 @@ set_cu_language (unsigned int lang, struct dwarf2_cu *cu)

If this function stays as-is in the previous patch, I think it should
take the per_cu as a parameter, instead of the cu.  And maybe be renamed
set_per_cu_language (and maybe a candidate to be a method).

> @@ -24415,17 +24420,17 @@ prepare_one_comp_unit (struct dwarf2_cu *cu, struct die_info *comp_unit_die,
>  	 attribute is not standardised yet.  As a workaround for the
>  	 language detection we fall back to the DW_AT_producer
>  	 string.  */
> -      cu->language = language_opencl;
> +      cu->per_cu->lang = language_opencl;
>      }
>    else if (cu->producer != nullptr
>  	   && strstr (cu->producer, "GNU Go ") != NULL)
>      {
>        /* Similar hack for Go.  */
> -      cu->language = language_go;
> +      cu->per_cu->lang = language_go;
>      }
>    else
> -    cu->language = pretend_language;
> -  cu->language_defn = language_def (cu->language);
> +    cu->per_cu->lang = pretend_language;
> +  cu->language_defn = language_def (cu->per_cu->lang);

It's not totally clear to me why it's better to go this route, eliminate
dwarf2_cu::language and keep dwarf2_per_cu_data::lang.  Because I find
it a bit hard to follow that a dwarf2_per_cu_data is initialized by a
dwarf2_cu being constructed.  That suggests to me that the field would
belong in dwarf2_cu.

From what I see, the only possible way for dwarf2_per_cu_data::lang to
be set is if a dwarf2_cu exists (or has existed) for this per_cu.

And the only place where dwarf2_per_cu_data::lang is read (prior to this
patch) is in process_imported_unit_die:

      /* We're importing a C++ compilation unit with tag DW_TAG_compile_unit
	 into another compilation unit, at root level.  Regard this as a hint,
	 and ignore it.  */
      if (die->parent && die->parent->parent == NULL
	  && per_cu->unit_type == DW_UT_compile
	  && per_cu->lang == language_cplus)
	return;

So here, instead of referring to per_cu->lang, we could perhaps look up
to see if a dwarf2_cu exists for `per_cu`, and lookup the language
there?  The downside would be if a dwarf2_cu has existed but was then
freed, then we won't see the language.  Whereas by storing the language
in dwarf2_per_cu_data::lang, it will persist even if the dwarf2_cu gets
freed.

I'm not sure what my point is here, I think the patch looks good, but I
just wanted to share these thoughts in case it lights a light bulb in
your head.

Simon

* Re: [PATCH 1/2] Consolidate CU language setting
  2021-06-09  1:32   ` Simon Marchi
@ 2021-06-09 20:22     ` Tom Tromey
  0 siblings, 0 replies; 16+ messages in thread
From: Tom Tromey @ 2021-06-09 20:22 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

Simon> Not very important, but I'd even change set_cu_language to be a pure
Simon> DWARF lang value -> enum language version function, and write:

Yeah, I'll do that.

Simon> I don't have build artifacts to test, but I'm curious whether that
Simon> compiler still puts a DW_AT_language value in the CU's DIE, just another
Simon> language than OpenCL.  Because DW_AT_language existed before, it's just
Simon> DW_LANG_OpenCL that didn't exist.  If so, I think the behavior would
Simon> change, because that branch of the condition would not be taken - the
Simon> first one would.

Thanks for noticing that.  I'll fix it.

Simon> I also noticed that DW_LANG_OpenCL isn't handled in set_cu_language...
Simon> it probably should, so that a producer using the right value
Simon> (DW_LANG_OpenCL) in DW_AT_language will get the language set correctly.

I'll add this too.

Tom

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-09  1:58   ` Simon Marchi
@ 2021-06-09 20:26     ` Tom Tromey
  2021-06-09 21:00       ` Tom Tromey
                         ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Tom Tromey @ 2021-06-09 20:26 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

Simon> It's not totally clear to me why it's better to go this route, eliminate
Simon> dwarf2_cu::language and keep dwarf2_per_cu_data::lang.  Because I find
Simon> it a bit hard to follow that a dwarf2_per_cu_data is initialized by a
Simon> dwarf2_cu being constructed.

I didn't give it much thought TBH - I was using dwarf2_per_cu_data::lang
in my series, and I assumed that this was there for some other important
reason.  However, I see that this member was added in commit 589902954d
("[gdb] Skip imports of c++ CUs"), and from what I can tell, it should
be fine to switch that code to using dwarf2_cu::language instead.

Simon> So here, instead of referring to per_cu->lang, we could perhaps look up
Simon> to see if a dwarf2_cu exists for `per_cu`, and lookup the language
Simon> there?  The downside would be if a dwarf2_cu has existed but was then
Simon> freed, then we won't see the language.  Whereas by storing the language
Simon> in dwarf2_per_cu_data::lang, it will persist even if the dwarf2_cu gets
Simon> freed.

Yeah, I arrived at this same conclusion...

But actually, I wonder whether that code even needs a language check.
My thought is that an import of a DW_UT_compile / DW_TAG_compile_unit CU
can always be skipped on the grounds that the CU is being scanned
separately anyway.

Tom

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-09 20:26     ` Tom Tromey
@ 2021-06-09 21:00       ` Tom Tromey
  2021-06-10  2:28       ` Tom Tromey
  2021-06-10 22:35       ` Tom de Vries
  2 siblings, 0 replies; 16+ messages in thread
From: Tom Tromey @ 2021-06-09 21:00 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Marchi, gdb-patches

Tom> But actually, I wonder whether that code even needs a language check.
Tom> My thought is that an import of a DW_UT_compile / DW_TAG_compile_unit CU
Tom> can always be skipped on the grounds that the CU is being scanned
Tom> separately anyway.

Removing the check causes gdb.dwarf2/imported-unit-bp.exp to fail.
However, I think this is probably a bug in the test case.
Or, if it isn't, then it points out that this import special case is
wrong -- because the test case fails if I change it to think the
sources are C++.

Tom

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-09 20:26     ` Tom Tromey
  2021-06-09 21:00       ` Tom Tromey
@ 2021-06-10  2:28       ` Tom Tromey
  2021-06-10 13:50         ` Simon Marchi
  2021-06-10 22:35       ` Tom de Vries
  2 siblings, 1 reply; 16+ messages in thread
From: Tom Tromey @ 2021-06-10  2:28 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Marchi, gdb-patches

Tom> I didn't give it much thought TBH - I was using dwarf2_per_cu_data::lang
Tom> in my series, and I assumed that this was there for some other important
Tom> reason.  However, I see that this member was added in commit 589902954d
Tom> ("[gdb] Skip imports of c++ CUs"), and from what I can tell, it should
Tom> be fine to switch that code to using dwarf2_cu::language instead.

I went back and looked at all the uses in my new code, and I think
preserving this on the dwarf2_per_cu_data is important.  The new code
works by indexing the DIEs, in a way that's similar to how .debug_names
is supposed to work.  Some code may need the language of an entry for
some kind of processing, but at times where the dwarf2_cu no longer
exists.

Tom

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-10  2:28       ` Tom Tromey
@ 2021-06-10 13:50         ` Simon Marchi
  2021-06-10 18:24           ` Tom Tromey
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Marchi @ 2021-06-10 13:50 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 2021-06-09 10:28 p.m., Tom Tromey wrote:
> Tom> I didn't give it much thought TBH - I was using dwarf2_per_cu_data::lang
> Tom> in my series, and I assumed that this was there for some other important
> Tom> reason.  However, I see that this member was added in commit 589902954d
> Tom> ("[gdb] Skip imports of c++ CUs"), and from what I can tell, it should
> Tom> be fine to switch that code to using dwarf2_cu::language instead.
> 
> I went back and looked at all the uses in my new code, and I think
> preserving this on the dwarf2_per_cu_data is important.  The new code
> works by indexing the DIEs, in a way that's similar to how .debug_names
> is supposed to work.  Some code may need the language of an entry for
> some kind of processing, but at times where the dwarf2_cu no longer
> exists.

Ok, and the code is ok with the fact that dwarf2_per_cu_data::lang may
not be set yet, if a dwarf2_cu was never created yet for that CU?

Simon

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-10 13:50         ` Simon Marchi
@ 2021-06-10 18:24           ` Tom Tromey
  0 siblings, 0 replies; 16+ messages in thread
From: Tom Tromey @ 2021-06-10 18:24 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

Simon> Ok, and the code is ok with the fact that dwarf2_per_cu_data::lang may
Simon> not be set yet, if a dwarf2_cu was never created yet for that CU?

A dwarf2_cu is created during indexing, and so the language is set at
that time.  It's not possible to make an entry in the index without this
happening.  This value then persists, so it's fine.
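
As a rough illustration of the lifetime argument above, here is a
minimal sketch.  The types per_cu_sketch, cu_sketch and index_entry are
hypothetical stand-ins invented for this example, not GDB's real
dwarf2_per_cu_data or dwarf2_cu; the only point is that the language is
written to the long-lived object while the short-lived one exists, and
stays readable afterward.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; GDB's real structures are much larger.
enum class lang { unknown, c, cplus };

// Long-lived per-CU record (plays the role of dwarf2_per_cu_data).
struct per_cu_sketch
{
  lang language = lang::unknown;
};

// Short-lived reader state (plays the role of dwarf2_cu).
struct cu_sketch
{
  cu_sketch (per_cu_sketch *per_cu, lang l)
    : per_cu (per_cu)
  {
    // Constructing the reader state records the language on the
    // long-lived record.
    per_cu->language = l;
  }

  per_cu_sketch *per_cu;
};

// An index entry only points at the long-lived record.
struct index_entry
{
  std::string name;
  const per_cu_sketch *per_cu;
};

// Index one CU.  Entries are only created while a cu_sketch exists,
// so every entry's per_cu already has its language set.
inline void
index_cu (per_cu_sketch &per_cu, lang l,
	  const std::vector<std::string> &names,
	  std::vector<index_entry> &index)
{
  auto cu = std::make_unique<cu_sketch> (&per_cu, l);
  assert (cu->per_cu->language == l);
  for (const std::string &name : names)
    index.push_back ({name, cu->per_cu});
}  // The cu_sketch is freed here; per_cu.language persists.
```

After index_cu returns, any entry in the index can still answer "which
language?" through its per_cu pointer, even though the reader state is
gone.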

Tom

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-09 20:26     ` Tom Tromey
  2021-06-09 21:00       ` Tom Tromey
  2021-06-10  2:28       ` Tom Tromey
@ 2021-06-10 22:35       ` Tom de Vries
  2021-06-11 15:00         ` Tom Tromey
  2 siblings, 1 reply; 16+ messages in thread
From: Tom de Vries @ 2021-06-10 22:35 UTC (permalink / raw)
  To: Tom Tromey, Simon Marchi; +Cc: gdb-patches

On 6/9/21 10:26 PM, Tom Tromey wrote:
> Simon> It's not totally clear to me why it's better to go this route, eliminate
> Simon> dwarf2_cu::language and keep dwarf2_per_cu_data::lang.  Because I find
> Simon> it a bit hard to follow that a dwarf2_per_cu_data is initialized by a
> Simon> dwarf2_cu being constructed.
> 
> I didn't give it much thought TBH - I was using dwarf2_per_cu_data::lang
> in my series, and I assumed that this was there for some other important
> reason.  However, I see that this member was added in commit 589902954d
> ("[gdb] Skip imports of c++ CUs"), and from what I can tell, it should
> be fine to switch that code to using dwarf2_cu::language instead.
> 
> Simon> So here, instead of referring to per_cu->lang, we could perhaps look up
> Simon> to see if a dwarf2_cu exists for `per_cu`, and lookup the language
> Simon> there?  The downside would be if a dwarf2_cu has existed but was then
> Simon> freed, then we won't see the language.  Whereas by storing the language
> Simon> in dwarf2_per_cu_data::lang, it will persist even if the dwarf2_cu gets
> Simon> freed.
> 
> Yeah, I arrived at this same conclusion...
> 
> But actually, I wonder whether that code even needs a language check.
> My thought is that an import of a DW_UT_compile / DW_TAG_compile_unit CU
> can always be skipped on the grounds that the CU is being scanned
> separately anyway.

It's about whether the language has global namespace or not.

In c++, it has, so, take an example CU A, and a CU B, each with two
function entries.

Now if CU A doesn't import CU B, the global namespace has four entries.
And if CU A does import CU B, the global namespace still has four
entries.  So, it's safe to ignore the import because semantically it
doesn't make a difference.

But with C, there's no global namespace so each CU declares its own
namespace.

Now if CU A doesn't import CU B, the namespace for CU A has two entries,
and the namespace for CU B has two entries.
And if CU A does import CU B, then the namespace for CU A has four
entries, and the CU B has two entries.  It's not safe to ignore the
import because semantically there is a difference.
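
The counting argument above can be checked with a small model.  This is
only a sketch of the rule as described in this message: cu_model,
global_view and per_cu_view are hypothetical names made up for the
example, and imports are followed one level deep only.  Whether the
per-CU rule really applies to C functions is disputed later in the
thread.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a compilation unit: the names it declares
// and the names of the CUs it imports.
struct cu_model
{
  std::set<std::string> own;
  std::vector<std::string> imports;
};

using program = std::map<std::string, cu_model>;

// "Global namespace" view: every CU contributes to one shared scope,
// so import edges do not change the result.
inline std::set<std::string>
global_view (const program &prog)
{
  std::set<std::string> result;
  for (const auto &kv : prog)
    result.insert (kv.second.own.begin (), kv.second.own.end ());
  return result;
}

// "Per-CU namespace" view: a CU sees its own names plus the names of
// the CUs it imports directly, so import edges do matter.
inline std::set<std::string>
per_cu_view (const program &prog, const std::string &cu_name)
{
  const cu_model &cu = prog.at (cu_name);
  std::set<std::string> result = cu.own;
  for (const std::string &imported : cu.imports)
    {
      const cu_model &other = prog.at (imported);
      result.insert (other.own.begin (), other.own.end ());
    }
  return result;
}
```

With CU A and CU B holding two names each, global_view yields four
names whether or not A imports B, while per_cu_view for A grows from
two names to four once the import is added.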

Thanks,
- Tom

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-10 22:35       ` Tom de Vries
@ 2021-06-11 15:00         ` Tom Tromey
  2021-06-14 22:16           ` Tom de Vries
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Tromey @ 2021-06-11 15:00 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom Tromey, Simon Marchi, gdb-patches

>> But actually, I wonder whether that code even needs a language check.
>> My thought is that an import of a DW_UT_compile / DW_TAG_compile_unit CU
>> can always be skipped on the grounds that the CU is being scanned
>> separately anyway.

Tom> It's about whether the language has global namespace or not.

Tom> In c++, it has, so, take an example CU A, and a CU B, each with two
Tom> function entries.

Tom> Now if CU A doesn't import CU B, the global namespace has four entries.
Tom> And if CU A does import CU B, the global namespace still has four
Tom> entries.  So, it's safe to ignore the import because semantically it
Tom> doesn't make a difference.

Tom> But with C, there's no global namespace so each CU declares its own
Tom> namespace.

I don't think that's the case, though.  In C a function may have
external or static linkage, and in effect the external linkage is a
global namespace.

Tom> Now if CU A doesn't import CU B, the namespace for CU A has two entries,
Tom> and the namespace for CU B has two entries.
Tom> And if CU A does import CU B, then the namespace for CU A has four
Tom> entries, and the CU B has two entries.  It's not safe to ignore the
Tom> import because semantically there is a difference.

What sort of test case would this affect?

Tom

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-11 15:00         ` Tom Tromey
@ 2021-06-14 22:16           ` Tom de Vries
  2021-06-16 11:09             ` Tom de Vries
  2021-06-28 20:11             ` Tom Tromey
  0 siblings, 2 replies; 16+ messages in thread
From: Tom de Vries @ 2021-06-14 22:16 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Marchi, gdb-patches

[-- Attachment #1: Type: text/plain, Size: 1743 bytes --]

On 6/11/21 5:00 PM, Tom Tromey wrote:
>>> But actually, I wonder whether that code even needs a language check.
>>> My thought is that an import of a DW_UT_compile / DW_TAG_compile_unit CU
>>> can always be skipped on the grounds that the CU is being scanned
>>> separately anyway.
> 
> Tom> It's about whether the language has global namespace or not.
> 
> Tom> In c++, it has, so, take an example CU A, and a CU B, each with two
> Tom> function entries.
> 
> Tom> Now if CU A doesn't import CU B, the global namespace has four entries.
> Tom> And if CU A does import CU B, the global namespace still has four
> Tom> entries.  So, it's safe to ignore the import because semantically it
> Tom> doesn't make a difference.
> 
> Tom> But with C, there's no global namespace so each CU declares its own
> Tom> namespace.
> 
> I don't think that's the case, though.  In C a function may have
> external or static linkage, and in effect the external linkage is a
> global namespace.
> 

OK, perhaps not the best example then.  Then substitute type for
function in the story above.  [ FWIW, I have these notions from reading
dwarf standard appendix E, where explicitly this distinction is made
between c and c++ and whether import is necessary or not. So perhaps
you'll get a better explanation there. ]

> Tom> Now if CU A doesn't import CU B, the namespace for CU A has two entries,
> Tom> and the namespace for CU B has two entries.
> Tom> And if CU A does import CU B, then the namespace for CU A has four
> Tom> entries, and the CU B has two entries.  It's not safe to ignore the
> Tom> import because semantically there is a difference.
> 
> What sort of test case would this affect?

I've written a dwarf assembly test-case.
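
For reference, the check the test-case targets (the condition from
process_imported_unit_die quoted in the attached patch) can be isolated
as a predicate.  The sketch below uses hypothetical stand-in types,
die_sketch and per_cu_sketch, rather than GDB's real die_info and
dwarf2_per_cu_data; only the shape of the condition comes from the
quoted source.

```cpp
#include <cassert>

// Hypothetical stand-ins for the pieces the condition looks at.
enum class unit_type { compile, partial, type };
enum class lang { c, cplus };

struct die_sketch
{
  // Null for the root DIE of a unit.
  const die_sketch *parent = nullptr;
};

struct per_cu_sketch
{
  unit_type type;
  lang language;
};

// Return true if importing PER_CU through DIE can be treated as a
// hint and skipped: the import sits directly under the root DIE, the
// imported unit is a full compile unit, and its language is C++.
inline bool
import_is_only_a_hint (const die_sketch &die, const per_cu_sketch &per_cu)
{
  return (die.parent != nullptr
	  && die.parent->parent == nullptr
	  && per_cu.type == unit_type::compile
	  && per_cu.language == lang::cplus);
}
```

Dropping the last clause would make the predicate return true for a C
unit as well, which is the situation the test-case is meant to catch.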

Thanks,
- Tom



[-- Attachment #2: 0001-gdb-testsuite-Add-gdb.dwarf2-imported-unit-c.exp.patch --]
[-- Type: text/x-patch, Size: 3344 bytes --]

[gdb/testsuite] Add gdb.dwarf2/imported-unit-c.exp

This test-case is intended to exercise this code in process_imported_unit_die:
...
      /* We're importing a C++ compilation unit with tag DW_TAG_compile_unit
	 into another compilation unit, at root level.  Regard this as a hint,
	 and ignore it.  */
      if (die->parent && die->parent->parent == NULL
	  && per_cu->unit_type == DW_UT_compile
	  && per_cu->lang == language_cplus)
	return;
...
in the sense that the test-case should fail if the
"per_cu->lang == language_cplus" clause is removed.

Tested on x86_64-linux.

gdb/testsuite/ChangeLog:

2021-06-14  Tom de Vries  <tdevries@suse.de>

	* gdb.dwarf2/imported-unit-c.exp: New file.

---
 gdb/testsuite/gdb.dwarf2/imported-unit-c.exp | 110 +++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)

diff --git a/gdb/testsuite/gdb.dwarf2/imported-unit-c.exp b/gdb/testsuite/gdb.dwarf2/imported-unit-c.exp
new file mode 100644
index 00000000000..14047ab8eb9
--- /dev/null
+++ b/gdb/testsuite/gdb.dwarf2/imported-unit-c.exp
@@ -0,0 +1,110 @@
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2 and use gas.
+if {![dwarf2_support]} {
+    return 0
+};
+
+standard_testfile main-foo.c .S
+
+set executable ${testfile}
+set asm_file [standard_output_file ${srcfile2}]
+
+# We need to know the size of integer and address types in order
+# to write some of the debugging info we'd like to generate.
+if [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} {debug}] {
+    return -1
+}
+
+# Create the DWARF.
+Dwarf::assemble $asm_file {
+    declare_labels cu_label cu2_label int_label int2_label
+    set int_size [get_sizeof "int" 4]
+
+    # imported CU 1: inty unsigned
+    cu {} {
+	cu_label: compile_unit {
+	    {language @DW_LANG_C}
+	    {name "<artificial>"}
+	} {
+	    int_label: base_type {
+		{byte_size $int_size sdata}
+		{encoding @DW_ATE_unsigned}
+		{name {unsigned int}}
+	    }
+            DW_TAG_typedef {
+                {DW_AT_name inty}
+                {DW_AT_type :$int_label}
+            }
+	}
+    }
+
+    # imported CU 2: inty signed
+    cu {} {
+	cu2_label: compile_unit {
+	    {language @DW_LANG_C}
+	    {name "<artificial>"}
+	} {
+	    int2_label: base_type {
+		{byte_size $int_size sdata}
+		{encoding @DW_ATE_signed}
+		{name {int}}
+	    }
+            DW_TAG_typedef {
+                {DW_AT_name inty}
+                {DW_AT_type :$int2_label}
+            }
+	}
+    }
+
+    # main CU
+    cu {} {
+	compile_unit {
+	    {language @DW_LANG_C}
+	    {name "<artificial>"}
+	} {
+	    imported_unit {
+		{import %$cu2_label}
+	    }
+
+	    subprogram {
+		{MACRO_AT_func {main}}
+		{external 1 flag}
+	    }
+	}
+    }
+
+    # foo CU
+    cu {} {
+	compile_unit {
+	    {language @DW_LANG_C}
+	    {name "<artificial>"}
+	} {
+	    imported_unit {
+		{import %$cu_label}
+	    }
+
+	    subprogram {
+		{MACRO_AT_func {foo}}
+		{external 1 flag}
+	    }
+	}
+    }
+
+}
+
+if { [prepare_for_testing "failed to prepare" ${testfile} \
+	  [list $srcfile $asm_file] {nodebug}] } {
+    return -1
+}
+
+if ![runto_main] {
+    return -1
+}
+
+gdb_test "ptype inty" "type = int" "ptype in main"
+
+gdb_breakpoint "foo"
+gdb_continue_to_breakpoint "continue to breakpoint for foo"
+
+gdb_test "ptype inty" "type = unsigned int" "ptype in foo"

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-14 22:16           ` Tom de Vries
@ 2021-06-16 11:09             ` Tom de Vries
  2021-06-28 20:11             ` Tom Tromey
  1 sibling, 0 replies; 16+ messages in thread
From: Tom de Vries @ 2021-06-16 11:09 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Simon Marchi, gdb-patches

On 6/15/21 12:16 AM, Tom de Vries wrote:
>> What sort of test case would this affect?
> I've written a dwarf assembly test-case.
> 

I'd like to commit this.  Any objections?

> Thanks,
> - Tom
> 
> 
> 
> 0001-gdb-testsuite-Add-gdb.dwarf2-imported-unit-c.exp.patch
> 
> [gdb/testsuite] Add gdb.dwarf2/imported-unit-c.exp
> 
> This test-case is intended to exercise this code in process_imported_unit_die:
> ...
>       /* We're importing a C++ compilation unit with tag DW_TAG_compile_unit
> 	 into another compilation unit, at root level.  Regard this as a hint,
> 	 and ignore it.  */
>       if (die->parent && die->parent->parent == NULL
> 	  && per_cu->unit_type == DW_UT_compile
> 	  && per_cu->lang == language_cplus)
> 	return;
> ...
> in the sense that the test-case should fail if the
> "per_cu->lang == language_cplus" clause is removed.
> 
> Tested on x86_64-linux.
> 
> gdb/testsuite/ChangeLog:
> 
> 2021-06-14  Tom de Vries  <tdevries@suse.de>
> 
> 	* gdb.dwarf2/imported-unit-c.exp: New file.
> 
> ---
>  gdb/testsuite/gdb.dwarf2/imported-unit-c.exp | 110 +++++++++++++++++++++++++++
>  1 file changed, 110 insertions(+)
> 
> diff --git a/gdb/testsuite/gdb.dwarf2/imported-unit-c.exp b/gdb/testsuite/gdb.dwarf2/imported-unit-c.exp
> new file mode 100644
> index 00000000000..14047ab8eb9
> --- /dev/null
> +++ b/gdb/testsuite/gdb.dwarf2/imported-unit-c.exp
> @@ -0,0 +1,110 @@
> +load_lib dwarf.exp
> +
> +# This test can only be run on targets which support DWARF-2 and use gas.
> +if {![dwarf2_support]} {
> +    return 0
> +};
> +
> +standard_testfile main-foo.c .S
> +
> +set executable ${testfile}
> +set asm_file [standard_output_file ${srcfile2}]
> +
> +# We need to know the size of integer and address types in order
> +# to write some of the debugging info we'd like to generate.
> +if [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} {debug}] {
> +    return -1
> +}
> +
> +# Create the DWARF.
> +Dwarf::assemble $asm_file {
> +    declare_labels cu_label cu2_label int_label int2_label
> +    set int_size [get_sizeof "int" 4]
> +
> +    # imported CU 1: inty unsigned
> +    cu {} {
> +	cu_label: compile_unit {
> +	    {language @DW_LANG_C}
> +	    {name "<artificial>"}
> +	} {
> +	    int_label: base_type {
> +		{byte_size $int_size sdata}
> +		{encoding @DW_ATE_unsigned}
> +		{name {unsigned int}}
> +	    }
> +            DW_TAG_typedef {
> +                {DW_AT_name inty}
> +                {DW_AT_type :$int_label}
> +            }
> +	}
> +    }
> +
> +    # imported CU 2: inty signed
> +    cu {} {
> +	cu2_label: compile_unit {
> +	    {language @DW_LANG_C}
> +	    {name "<artificial>"}
> +	} {
> +	    int2_label: base_type {
> +		{byte_size $int_size sdata}
> +		{encoding @DW_ATE_signed}
> +		{name {int}}
> +	    }
> +            DW_TAG_typedef {
> +                {DW_AT_name inty}
> +                {DW_AT_type :$int2_label}
> +            }
> +	}
> +    }
> +
> +    # main CU
> +    cu {} {
> +	compile_unit {
> +	    {language @DW_LANG_C}
> +	    {name "<artificial>"}
> +	} {
> +	    imported_unit {
> +		{import %$cu2_label}
> +	    }
> +
> +	    subprogram {
> +		{MACRO_AT_func {main}}
> +		{external 1 flag}
> +	    }
> +	}
> +    }
> +
> +    # foo CU
> +    cu {} {
> +	compile_unit {
> +	    {language @DW_LANG_C}
> +	    {name "<artificial>"}
> +	} {
> +	    imported_unit {
> +		{import %$cu_label}
> +	    }
> +
> +	    subprogram {
> +		{MACRO_AT_func {foo}}
> +		{external 1 flag}
> +	    }
> +	}
> +    }
> +
> +}
> +
> +if { [prepare_for_testing "failed to prepare" ${testfile} \
> +	  [list $srcfile $asm_file] {nodebug}] } {
> +    return -1
> +}
> +
> +if ![runto_main] {
> +    return -1
> +}
> +
> +gdb_test "ptype inty" "type = int" "ptype in main"
> +
> +gdb_breakpoint "foo"
> +gdb_continue_to_breakpoint "continue to breakpoint for foo"
> +
> +gdb_test "ptype inty" "type = unsigned int" "ptype in foo"
> 

* Re: [PATCH 2/2] Remove dwarf2_cu::language
  2021-06-14 22:16           ` Tom de Vries
  2021-06-16 11:09             ` Tom de Vries
@ 2021-06-28 20:11             ` Tom Tromey
  1 sibling, 0 replies; 16+ messages in thread
From: Tom Tromey @ 2021-06-28 20:11 UTC (permalink / raw)
  To: Tom de Vries; +Cc: Tom Tromey, gdb-patches

>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

Tom> Now if CU A doesn't import CU B, the namespace for CU A has two entries,
Tom> and the namespace for CU B has two entries.
Tom> And if CU A does import CU B, then the namespace for CU A has four
Tom> entries, and the CU B has two entries.  It's not safe to ignore the
Tom> import because semantically there is a difference.

>> What sort of test case would this affect?

Tom> I've written a dwarf assembly test-case.

I still don't understand why this test should differ between C and C++.

IIUC -- it declares the 'inty' typedef two different ways in two
different scopes, then tests that the expected output depends on the
scope:

Tom> +gdb_test "ptype inty" "type = int" "ptype in main"
Tom> +
Tom> +gdb_breakpoint "foo"
Tom> +gdb_continue_to_breakpoint "continue to breakpoint for foo"
Tom> +
Tom> +gdb_test "ptype inty" "type = unsigned int" "ptype in foo"

... if this is correct for C, why wouldn't it be correct for C++?

Tom

Thread overview: 16+ messages
2021-06-08 15:26 [PATCH 0/2] Clean up language handling in the DWARF reader Tom Tromey
2021-06-08 15:26 ` [PATCH 1/2] Consolidate CU language setting Tom Tromey
2021-06-09  1:32   ` Simon Marchi
2021-06-09 20:22     ` Tom Tromey
2021-06-08 15:26 ` [PATCH 2/2] Remove dwarf2_cu::language Tom Tromey
2021-06-09  1:58   ` Simon Marchi
2021-06-09 20:26     ` Tom Tromey
2021-06-09 21:00       ` Tom Tromey
2021-06-10  2:28       ` Tom Tromey
2021-06-10 13:50         ` Simon Marchi
2021-06-10 18:24           ` Tom Tromey
2021-06-10 22:35       ` Tom de Vries
2021-06-11 15:00         ` Tom Tromey
2021-06-14 22:16           ` Tom de Vries
2021-06-16 11:09             ` Tom de Vries
2021-06-28 20:11             ` Tom Tromey