[PATCH 00/10] Fix some C++ name canonicalizer problems

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH 00/10] Fix some C++ name canonicalizer problems
@ 2024-04-21 17:00 Tom Tromey
  2024-04-21 17:00 ` [PATCH 01/10] Remove test code from cp-name-parser.y Tom Tromey
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

This series fixes a few bug in the C++ name canonicalizer.

It doesn't fix all of them; in particular, there's a bug open about
ABI tags that I didn't try to fix.  I think there may be other bugs as
well, because C++ has continued to evolve but the canonicalizer, by
and large, has not.

Regression tested on x86-64 Fedora 38.

---
Tom Tromey (10):
      Remove test code from cp-name-parser.y
      Allow initialization functions in .y files
      Clean up demangle_parse_info
      Change storage of demangle_component
      Fix C++ name canonicalizations of character literals
      Remove some unnecessary allocations from cpname_state::parse_number
      Fix C++ canonicalization of hex literals
      Implement C++14 numeric separators
      Allow function types as template parameters in name canonicalizer
      Add spaceship operator to cp-name-parser.y

 gdb/Makefile.in               |  19 +--
 gdb/c-exp.y                   |  28 +++-
 gdb/cp-name-parser.y          | 374 +++++++++++++++++-------------------------
 gdb/cp-support.c              |   9 +-
 gdb/cp-support.h              |  19 +--
 gdb/testsuite/gdb.cp/misc.exp |   4 +
 gdb/yy-remap.h                |   4 -
 7 files changed, 186 insertions(+), 271 deletions(-)
---
base-commit: 7e9ef24e4a72d8d174932c7dd6be44226328ab88
change-id: 20240421-canon-fixes-28ad0802b238

Best regards,
-- 
Tom Tromey <tom@tromey.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 01/10] Remove test code from cp-name-parser.y
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:11   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 02/10] Allow initialization functions in .y files Tom Tromey
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

This removes the current test 'main' from cp-name-parser.y.  There
aren't any tests using this, and nowadays it would be better as a unit
test.
---
 gdb/Makefile.in      |  15 +-----
 gdb/cp-name-parser.y | 135 ---------------------------------------------------
 gdb/yy-remap.h       |   4 --
 3 files changed, 1 insertion(+), 153 deletions(-)

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 23894ea4a4d..2f7fc1ca249 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -2147,16 +2147,6 @@ uninstall: force $(CONFIG_UNINSTALL)
 	rm -f $(DESTDIR)$(bindir)/$$transformed_name
 	@$(MAKE) DO=uninstall "DODIRS=$(SUBDIRS)" $(FLAGS_TO_PASS) subdir_do
 
-# The C++ name parser can be built standalone for testing.
-test-cp-name-parser.o: cp-name-parser.c
-	$(COMPILE) -DTEST_CPNAMES cp-name-parser.c
-	$(POSTCOMPILE)
-
-test-cp-name-parser$(EXEEXT): test-cp-name-parser.o $(LIBIBERTY)
-	$(ECHO_CXXLD) $(CC_LD) $(INTERNAL_LDFLAGS) \
-		-o test-cp-name-parser$(EXEEXT) test-cp-name-parser.o \
-		$(LIBIBERTY)
-
 # We do this by grepping through sources.  If that turns out to be too slow,
 # maybe we could just require every .o file to have an initialization routine
 # of a given name (top.o -> _initialize_top, etc.).
@@ -2168,7 +2158,6 @@ test-cp-name-parser$(EXEEXT): test-cp-name-parser.o $(LIBIBERTY)
 # computing the list of source files from the list of object files.
 
 INIT_FILES_FILTER_OUT = \
-	cp-name-parser.o \
 	init.o \
 	version.o \
 	xml-builtin.o \
@@ -2244,7 +2233,6 @@ clean mostlyclean: $(CONFIG_CLEAN)
 	rm -f init.c stamp-init version.c stamp-version
 	rm -f gdb$(EXEEXT) core make.log
 	rm -f gdb[0-9]$(EXEEXT)
-	rm -f test-cp-name-parser$(EXEEXT)
 	rm -f xml-builtin.c stamp-xml
 	rm -f $(DEPDIR)/*
 	for i in $(CONFIG_SRC_SUBDIR); do \
@@ -2677,8 +2665,7 @@ endif
 
 # A list of all the objects we might care about in this build, for
 # dependency tracking.
-all_object_files = gdb.o $(LIBGDB_OBS) gdbtk-main.o \
-	test-cp-name-parser.o
+all_object_files = gdb.o $(LIBGDB_OBS) gdbtk-main.o
 
 # All the .deps files to include.
 all_deps_files = $(foreach dep,$(patsubst %.o,%.Po,$(all_object_files)),\
diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index 87f13445bba..9f4561a36b4 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -171,11 +171,6 @@ cpname_state::d_grab ()
 #define INT_SIGNED	(1 << 4)
 #define INT_UNSIGNED	(1 << 5)
 
-/* Enable yydebug for the stand-alone parser.  */
-#ifdef TEST_CPNAMES
-# define YYDEBUG	1
-#endif
-
 /* Helper functions.  These wrap the demangler tree interface, handle
    allocation from our global store, and return the allocated component.  */
 
@@ -2051,133 +2046,3 @@ cp_demangled_name_to_comp (const char *demangled_name,
 
   return result;
 }
-
-#ifdef TEST_CPNAMES
-
-static void
-cp_print (struct demangle_component *result)
-{
-  char *str;
-  size_t err = 0;
-
-  str = gdb_cplus_demangle_print (DMGL_PARAMS | DMGL_ANSI, result, 64, &err);
-  if (str == NULL)
-    return;
-
-  fputs (str, stdout);
-
-  free (str);
-}
-
-static char
-trim_chars (char *lexptr, char **extra_chars)
-{
-  char *p = (char *) symbol_end (lexptr);
-  char c = 0;
-
-  if (*p)
-    {
-      c = *p;
-      *p = 0;
-      *extra_chars = p + 1;
-    }
-
-  return c;
-}
-
-/* When this file is built as a standalone program, xmalloc comes from
-   libiberty --- in which case we have to provide xfree ourselves.  */
-
-void
-xfree (void *ptr)
-{
-  if (ptr != NULL)
-    {
-      /* Literal `free' would get translated back to xfree again.  */
-      CONCAT2 (fr,ee) (ptr);
-    }
-}
-
-/* GDB normally defines internal_error itself, but when this file is built
-   as a standalone program, we must also provide an implementation.  */
-
-void
-internal_error (const char *file, int line, const char *fmt, ...)
-{
-  va_list ap;
-
-  va_start (ap, fmt);
-  fprintf (stderr, "%s:%d: internal error: ", file, line);
-  vfprintf (stderr, fmt, ap);
-  exit (1);
-}
-
-int
-main (int argc, char **argv)
-{
-  char *str2, *extra_chars, c;
-  char buf[65536];
-  int arg;
-
-  arg = 1;
-  if (argv[arg] && strcmp (argv[arg], "--debug") == 0)
-    {
-      yydebug = 1;
-      arg++;
-    }
-
-  if (argv[arg] == NULL)
-    while (fgets (buf, 65536, stdin) != NULL)
-      {
-	buf[strlen (buf) - 1] = 0;
-	/* Use DMGL_VERBOSE to get expanded standard substitutions.  */
-	c = trim_chars (buf, &extra_chars);
-	str2 = cplus_demangle (buf, DMGL_PARAMS | DMGL_ANSI | DMGL_VERBOSE);
-	if (str2 == NULL)
-	  {
-	    printf ("Demangling error\n");
-	    if (c)
-	      printf ("%s%c%s\n", buf, c, extra_chars);
-	    else
-	      printf ("%s\n", buf);
-	    continue;
-	  }
-
-	std::string errmsg;
-	std::unique_ptr<demangle_parse_info> result
-	  = cp_demangled_name_to_comp (str2, &errmsg);
-	if (result == NULL)
-	  {
-	    fputs (errmsg.c_str (), stderr);
-	    fputc ('\n', stderr);
-	    continue;
-	  }
-
-	cp_print (result->tree);
-
-	free (str2);
-	if (c)
-	  {
-	    putchar (c);
-	    fputs (extra_chars, stdout);
-	  }
-	putchar ('\n');
-      }
-  else
-    {
-      std::string errmsg;
-      std::unique_ptr<demangle_parse_info> result
-	= cp_demangled_name_to_comp (argv[arg], &errmsg);
-      if (result == NULL)
-	{
-	  fputs (errmsg.c_str (), stderr);
-	  fputc ('\n', stderr);
-	  return 0;
-	}
-      cp_print (result->tree);
-      putchar ('\n');
-    }
-  return 0;
-}
-
-#endif
diff --git a/gdb/yy-remap.h b/gdb/yy-remap.h
index d52a59d11b9..e32ce394339 100644
--- a/gdb/yy-remap.h
+++ b/gdb/yy-remap.h
@@ -93,8 +93,4 @@
 # define YYDEBUG 1  /* Default to yydebug support */
 #endif
 
-#ifndef TEST_CPNAMES
-# define YYFPRINTF parser_fprintf
-#endif
-
 #endif /* YY_REMAP_H */

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 02/10] Allow initialization functions in .y files
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
  2024-04-21 17:00 ` [PATCH 01/10] Remove test code from cp-name-parser.y Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-21 17:00 ` [PATCH 03/10] Clean up demangle_parse_info Tom Tromey
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

If you add an initialization function to a .y file, it will not show
up in init.c, because if the yacc output is in the build tree, it
won't be found.

This patch changes the Makefile to be more robust in this situation.
---
 gdb/Makefile.in | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index 2f7fc1ca249..b51a294ed88 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -2172,7 +2172,9 @@ INIT_FILES = \
 init.c: stamp-init; @true
 stamp-init: $(INIT_FILES) config.status $(srcdir)/make-init-c
 	$(ECHO_INIT_C)
-	$(SILENCE) $(srcdir)/make-init-c $(addprefix $(srcdir)/,$(INIT_FILES)) > init.c-tmp
+	$(SILENCE) $(srcdir)/make-init-c \
+		$(filter-out config.status $(srcdir)/make-init-c,$^) \
+		> init.c-tmp
 	$(SILENCE) $(SHELL) $(srcdir)/../move-if-change init.c-tmp init.c
 	$(SILENCE) echo stamp > stamp-init
 

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 03/10] Clean up demangle_parse_info
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
  2024-04-21 17:00 ` [PATCH 01/10] Remove test code from cp-name-parser.y Tom Tromey
  2024-04-21 17:00 ` [PATCH 02/10] Allow initialization functions in .y files Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:12   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 04/10] Change storage of demangle_component Tom Tromey
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

This changes demangle_parse_info to use inline initializers and to
remove some manual memory management.
---
 gdb/cp-name-parser.y | 12 ------------
 gdb/cp-support.h     |  8 ++++----
 2 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index 9f4561a36b4..2f73be35f37 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -1959,15 +1959,6 @@ cp_comp_to_string (struct demangle_component *result, int estimated_len)
   return gdb::unique_xmalloc_ptr<char> (res);
 }
 
-/* Constructor for demangle_parse_info.  */
-
-demangle_parse_info::demangle_parse_info ()
-: info (NULL),
-  tree (NULL)
-{
-  obstack_init (&obstack);
-}
-
 /* Destructor for demangle_parse_info.  */
 
 demangle_parse_info::~demangle_parse_info ()
@@ -1980,9 +1971,6 @@ demangle_parse_info::~demangle_parse_info ()
       free (info);
       info = next;
     }
-
-  /* Free any memory allocated during typedef replacement.  */
-  obstack_free (&obstack, NULL);
 }
 
 /* Merge the two parse trees given by DEST and SRC.  The parse tree
diff --git a/gdb/cp-support.h b/gdb/cp-support.h
index 4015126154b..d0bedcd7b80 100644
--- a/gdb/cp-support.h
+++ b/gdb/cp-support.h
@@ -58,18 +58,18 @@ struct using_direct;
 
 struct demangle_parse_info
 {
-  demangle_parse_info ();
+  demangle_parse_info () = default;
 
   ~demangle_parse_info ();
 
   /* The memory used during the parse.  */
-  struct demangle_info *info;
+  struct demangle_info *info = nullptr;
 
   /* The result of the parse.  */
-  struct demangle_component *tree;
+  struct demangle_component *tree = nullptr;
 
   /* Any temporary memory used during typedef replacement.  */
-  struct obstack obstack;
+  auto_obstack obstack;
 };
 
 

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 04/10] Change storage of demangle_component
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
                   ` (2 preceding siblings ...)
  2024-04-21 17:00 ` [PATCH 03/10] Clean up demangle_parse_info Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:17   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 05/10] Fix C++ name canonicalizations of character literals Tom Tromey
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

This changes demangle_component objects to be stored on the obstack
that is part of demangle_info.  It also arranges for a demangle_info
object to be kept alive by cp_merge_demangle_parse_infos.  This way,
other data on the obstack can be kept while an "outer" demangle_info
needs it.
---
 gdb/cp-name-parser.y | 78 +++++-----------------------------------------------
 gdb/cp-support.c     |  2 +-
 gdb/cp-support.h     | 15 ++++------
 3 files changed, 14 insertions(+), 81 deletions(-)

diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index 2f73be35f37..6003ed0b01e 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -48,17 +48,6 @@
 #define GDB_YY_REMAP_PREFIX cpname
 #include "yy-remap.h"
 
-/* The components built by the parser are allocated ahead of time,
-   and cached in this structure.  */
-
-#define ALLOC_CHUNK 100
-
-struct demangle_info {
-  int used;
-  struct demangle_info *next;
-  struct demangle_component comps[ALLOC_CHUNK];
-};
-
 %}
 
 %union
@@ -92,7 +81,7 @@ struct cpname_state
 
   const char *lexptr, *prev_lexptr, *error_lexptr, *global_errmsg;
 
-  struct demangle_info *demangle_info;
+  demangle_parse_info *demangle_info;
 
   /* The parse tree created by the parser is stored here after a
      successful parse.  */
@@ -136,23 +125,7 @@ struct cpname_state
 struct demangle_component *
 cpname_state::d_grab ()
 {
-  struct demangle_info *more;
-
-  if (demangle_info->used >= ALLOC_CHUNK)
-    {
-      if (demangle_info->next == NULL)
-	{
-	  more = XNEW (struct demangle_info);
-	  more->next = NULL;
-	  demangle_info->next = more;
-	}
-      else
-	more = demangle_info->next;
-
-      more->used = 0;
-      demangle_info = more;
-    }
-  return &demangle_info->comps[demangle_info->used++];
+  return obstack_new<demangle_component> (&demangle_info->obstack);
 }
 
 /* Flags passed to d_qualify.  */
@@ -1933,20 +1906,6 @@ yyerror (cpname_state *state, const char *msg)
   state->global_errmsg = msg ? msg : "parse error";
 }
 
-/* Allocate a chunk of the components we'll need to build a tree.  We
-   generally allocate too many components, but the extra memory usage
-   doesn't hurt because the trees are temporary and the storage is
-   reused.  More may be allocated later, by d_grab.  */
-static struct demangle_info *
-allocate_info (void)
-{
-  struct demangle_info *info = XNEW (struct demangle_info);
-
-  info->next = NULL;
-  info->used = 0;
-  return info;
-}
-
 /* See cp-support.h.  */
 
 gdb::unique_xmalloc_ptr<char>
@@ -1959,20 +1918,6 @@ cp_comp_to_string (struct demangle_component *result, int estimated_len)
   return gdb::unique_xmalloc_ptr<char> (res);
 }
 
-/* Destructor for demangle_parse_info.  */
-
-demangle_parse_info::~demangle_parse_info ()
-{
-  /* Free any allocated chunks of memory for the parse.  */
-  while (info != NULL)
-    {
-      struct demangle_info *next = info->next;
-
-      free (info);
-      info = next;
-    }
-}
-
 /* Merge the two parse trees given by DEST and SRC.  The parse tree
    in SRC is attached to DEST at the node represented by TARGET.
 
@@ -1986,21 +1931,14 @@ demangle_parse_info::~demangle_parse_info ()
 void
 cp_merge_demangle_parse_infos (struct demangle_parse_info *dest,
 			       struct demangle_component *target,
-			       struct demangle_parse_info *src)
+			       std::unique_ptr<demangle_parse_info> src)
 
 {
-  struct demangle_info *di;
-
   /* Copy the SRC's parse data into DEST.  */
   *target = *src->tree;
-  di = dest->info;
-  while (di->next != NULL)
-    di = di->next;
-  di->next = src->info;
-
-  /* Clear the (pointer to) SRC's parse data so that it is not freed when
-     cp_demangled_parse_info_free is called.  */
-  src->info = NULL;
+
+  /* Make sure SRC is owned by DEST.  */
+  dest->infos.push_back (std::move (src));
 }
 
 /* Convert a demangled name to a demangle_component tree.  On success,
@@ -2018,10 +1956,8 @@ cp_demangled_name_to_comp (const char *demangled_name,
   state.error_lexptr = NULL;
   state.global_errmsg = NULL;
 
-  state.demangle_info = allocate_info ();
-
   auto result = std::make_unique<demangle_parse_info> ();
-  result->info = state.demangle_info;
+  state.demangle_info = result.get ();
 
   if (yyparse (&state))
     {
diff --git a/gdb/cp-support.c b/gdb/cp-support.c
index e6e811ddf50..4c64e4963e4 100644
--- a/gdb/cp-support.c
+++ b/gdb/cp-support.c
@@ -258,7 +258,7 @@ inspect_type (struct demangle_parse_info *info,
 	  if (i != NULL)
 	    {
 	      /* Merge the two trees.  */
-	      cp_merge_demangle_parse_infos (info, ret_comp, i.get ());
+	      cp_merge_demangle_parse_infos (info, ret_comp, std::move (i));
 
 	      /* Replace any newly introduced typedefs -- but not
 		 if the type is anonymous (that would lead to infinite
diff --git a/gdb/cp-support.h b/gdb/cp-support.h
index d0bedcd7b80..765c4435f41 100644
--- a/gdb/cp-support.h
+++ b/gdb/cp-support.h
@@ -58,18 +58,15 @@ struct using_direct;
 
 struct demangle_parse_info
 {
-  demangle_parse_info () = default;
-
-  ~demangle_parse_info ();
-
-  /* The memory used during the parse.  */
-  struct demangle_info *info = nullptr;
-
   /* The result of the parse.  */
   struct demangle_component *tree = nullptr;
 
-  /* Any temporary memory used during typedef replacement.  */
+  /* Any memory used during processing.  */
   auto_obstack obstack;
+
+  /* Any other objects referred to by this object, and whose storage
+     lifetime must be linked.  */
+  std::vector<std::unique_ptr<demangle_parse_info>> infos;
 };
 
 
@@ -182,7 +179,7 @@ extern gdb::unique_xmalloc_ptr<char> cp_comp_to_string
 
 extern void cp_merge_demangle_parse_infos (struct demangle_parse_info *,
 					   struct demangle_component *,
-					   struct demangle_parse_info *);
+					   std::unique_ptr<demangle_parse_info>);
 
 /* The list of "maint cplus" commands.  */
 

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 05/10] Fix C++ name canonicalizations of character literals
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
                   ` (3 preceding siblings ...)
  2024-04-21 17:00 ` [PATCH 04/10] Change storage of demangle_component Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:19   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 06/10] Remove some unnecessary allocations from cpname_state::parse_number Tom Tromey
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

The names "void C<(char)1>::m()" and "void C<'\001'>::m()" should
canonicalize to the same string, but currently they do not -- the
former remains unchanged and the latter is transformed to
"void C<(char)'\001'>::m()".

This patch fixes the bug and also adds some unit tests.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16843
---
 gdb/cp-name-parser.y | 49 +++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 43 insertions(+), 6 deletions(-)

diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index 6003ed0b01e..6590194545f 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -44,6 +44,7 @@
 #include "cp-support.h"
 #include "c-support.h"
 #include "parser-defs.h"
+#include "gdbsupport/selftest.h"
 
 #define GDB_YY_REMAP_PREFIX cpname
 #include "yy-remap.h"
@@ -1514,6 +1515,7 @@ yylex (YYSTYPE *lvalp, cpname_state *state)
   int c;
   int namelen;
   const char *tokstart;
+  char *copy;
 
  retry:
   state->prev_lexptr = state->lexptr;
@@ -1544,6 +1546,10 @@ yylex (YYSTYPE *lvalp, cpname_state *state)
 	  return ERROR;
 	}
 
+      /* We over-allocate here, but it doesn't really matter . */
+      copy = (char *) obstack_alloc (&state->demangle_info->obstack, 30);
+      xsnprintf (copy, 30, "%d", c);
+
       c = *state->lexptr++;
       if (c != '\'')
 	{
@@ -1551,15 +1557,10 @@ yylex (YYSTYPE *lvalp, cpname_state *state)
 	  return ERROR;
 	}
 
-      /* FIXME: We should refer to a canonical form of the character,
-	 presumably the same one that appears in manglings - the decimal
-	 representation.  But if that isn't in our input then we have to
-	 allocate memory for it somewhere.  */
       lvalp->comp
 	= state->fill_comp (DEMANGLE_COMPONENT_LITERAL,
 			    state->make_builtin_type ("char"),
-			    state->make_name (tokstart,
-					      state->lexptr - tokstart));
+			    state->make_name (copy, strlen (copy)));
 
       return INT;
 
@@ -1970,3 +1971,39 @@ cp_demangled_name_to_comp (const char *demangled_name,
 
   return result;
 }
+
+#if GDB_SELF_TEST
+
+static void
+should_be_the_same (const char *one, const char *two)
+{
+  gdb::unique_xmalloc_ptr<char> cpone = cp_canonicalize_string (one);
+  gdb::unique_xmalloc_ptr<char> cptwo = cp_canonicalize_string (two);
+
+  if (cpone != nullptr)
+    one = cpone.get ();
+  if (cptwo != nullptr)
+    two = cptwo.get ();
+
+  SELF_CHECK (strcmp (one, two) == 0);
+}
+
+static void
+canonicalize_tests ()
+{
+  should_be_the_same ("short int", "short");
+  should_be_the_same ("int short", "short");
+
+  should_be_the_same ("C<(char) 1>::m()", "C<(char) '\\001'>::m()");
+}
+
+#endif
+
+void _initialize_cp_name_parser ();
+void
+_initialize_cp_name_parser ()
+{
+#if GDB_SELF_TEST
+  selftests::register_test ("canonicalize", canonicalize_tests);
+#endif
+}

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 06/10] Remove some unnecessary allocations from cpname_state::parse_number
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
                   ` (4 preceding siblings ...)
  2024-04-21 17:00 ` [PATCH 05/10] Fix C++ name canonicalizations of character literals Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:20   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 07/10] Fix C++ canonicalization of hex literals Tom Tromey
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

cpname_state::parse_number allocates nodes for various types and then
only uses one of them.  This patch reduces the number of allocations
by not performing the unnecessary ones.
---
 gdb/cp-name-parser.y | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index 6590194545f..692fe84071c 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -1290,8 +1290,6 @@ cpname_state::parse_number (const char *p, int len, int parsed_float,
   /* Number of "L" suffixes encountered.  */
   int long_p = 0;
 
-  struct demangle_component *signed_type;
-  struct demangle_component *unsigned_type;
   struct demangle_component *type, *name;
   enum demangle_component_type literal_type;
 
@@ -1362,25 +1360,26 @@ cpname_state::parse_number (const char *p, int len, int parsed_float,
 
   if (long_p == 0)
     {
-      unsigned_type = make_builtin_type ("unsigned int");
-      signed_type = make_builtin_type ("int");
+      if (unsigned_p)
+	type = make_builtin_type ("unsigned int");
+      else
+	type = make_builtin_type ("int");
     }
   else if (long_p == 1)
     {
-      unsigned_type = make_builtin_type ("unsigned long");
-      signed_type = make_builtin_type ("long");
+      if (unsigned_p)
+	type = make_builtin_type ("unsigned long");
+      else
+	type = make_builtin_type ("long");
     }
   else
     {
-      unsigned_type = make_builtin_type ("unsigned long long");
-      signed_type = make_builtin_type ("long long");
+      if (unsigned_p)
+	type = make_builtin_type ("unsigned long long");
+      else
+	type = make_builtin_type ("long long");
     }
 
-   if (unsigned_p)
-     type = unsigned_type;
-   else
-     type = signed_type;
-
    name = make_name (p, len);
    lvalp->comp = fill_comp (literal_type, type, name);
 

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 07/10] Fix C++ canonicalization of hex literals
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
                   ` (5 preceding siblings ...)
  2024-04-21 17:00 ` [PATCH 06/10] Remove some unnecessary allocations from cpname_state::parse_number Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:22   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 08/10] Implement C++14 numeric separators Tom Tromey
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

Currently names like "x::y::z<1>" and "x::y::z<0x01>" canonicalize to
different things.  I think it's nicer for them to be the same.
Differences between types can be done using suffixes like "ll" and "u"
-- it's not really possible to implement C++ rules in the
canoncalizer, because no gdbarch is available.  Possibly gdb should
even drop the type here and just represent all integers the same way
in names.
---
 gdb/cp-name-parser.y | 61 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 5 deletions(-)

diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index 692fe84071c..de08f9c0728 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -1336,8 +1336,35 @@ cpname_state::parse_number (const char *p, int len, int parsed_float,
       return FLOAT;
     }
 
-  /* This treats 0x1 and 1 as different literals.  We also do not
-     automatically generate unsigned types.  */
+  /* Note that we do not automatically generate unsigned types.  This
+     can't be done because we don't have access to the gdbarch
+     here.  */
+
+  int base = 10;
+  if (len > 1 && p[0] == '0')
+    {
+      if (p[1] == 'x' || p[1] == 'X')
+	{
+	  base = 16;
+	  p += 2;
+	  len -= 2;
+	}
+      else if (p[1] == 'b' || p[1] == 'B')
+	{
+	  base = 2;
+	  p += 2;
+	  len -= 2;
+	}
+      else if (p[1] == 'd' || p[1] == 'D' || p[1] == 't' || p[1] == 'T')
+	{
+	  /* Apparently gdb extensions.  */
+	  base = 10;
+	  p += 2;
+	  len -= 2;
+	}
+      else
+	base = 8;
+    }
 
   long_p = 0;
   unsigned_p = 0;
@@ -1358,6 +1385,24 @@ cpname_state::parse_number (const char *p, int len, int parsed_float,
       break;
     }
 
+  /* Use gdb_mpz here in case a 128-bit value appears.  */
+  gdb_mpz value (0);
+  for (int off = 0; off < len; ++off)
+    {
+      int dig;
+      if (ISDIGIT (p[off]))
+	dig = p[off] - '0';
+      else
+	dig = TOLOWER (p[off]) - 'a' + 10;
+      if (dig >= base)
+	return ERROR;
+      value *= base;
+      value += dig;
+    }
+
+  std::string printed = value.str ();
+  const char *copy = obstack_strdup (&demangle_info->obstack, printed);
+
   if (long_p == 0)
     {
       if (unsigned_p)
@@ -1380,10 +1425,10 @@ cpname_state::parse_number (const char *p, int len, int parsed_float,
 	type = make_builtin_type ("long long");
     }
 
-   name = make_name (p, len);
-   lvalp->comp = fill_comp (literal_type, type, name);
+  name = make_name (copy, strlen (copy));
+  lvalp->comp = fill_comp (literal_type, type, name);
 
-   return INT;
+  return INT;
 }
 
 static const char backslashable[] = "abefnrtv";
@@ -1994,6 +2039,12 @@ canonicalize_tests ()
   should_be_the_same ("int short", "short");
 
   should_be_the_same ("C<(char) 1>::m()", "C<(char) '\\001'>::m()");
+  should_be_the_same ("x::y::z<1>", "x::y::z<0x01>");
+  should_be_the_same ("x::y::z<1>", "x::y::z<01>");
+  should_be_the_same ("x::y::z<(unsigned long long) 1>", "x::y::z<01ull>");
+  should_be_the_same ("x::y::z<0b111>", "x::y::z<7>");
+  should_be_the_same ("x::y::z<0b111>", "x::y::z<0t7>");
+  should_be_the_same ("x::y::z<0b111>", "x::y::z<0D7>");
 }
 
 #endif

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 08/10] Implement C++14 numeric separators
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
                   ` (6 preceding siblings ...)
  2024-04-21 17:00 ` [PATCH 07/10] Fix C++ canonicalization of hex literals Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:29   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 09/10] Allow function types as template parameters in name canonicalizer Tom Tromey
  2024-04-21 17:00 ` [PATCH 10/10] Add spaceship operator to cp-name-parser.y Tom Tromey
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

C++14 allows the use of the apostrophe as a numeric separator; that
is, "23000" and "23'000" represent the same number.  This patch
implements this for gdb's C++ parser and the C++ name canonicalizer.

I did this unconditionally for all C variants because I think it's
unambiguous.

For the name canonicalizer, there's at least one compiler that can
emit constants with this form, see bug 30845.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=23457
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30845
---
 gdb/c-exp.y                   | 28 +++++++++++++++++++++++-----
 gdb/cp-name-parser.y          | 31 ++++++++++++++++++++++++++-----
 gdb/testsuite/gdb.cp/misc.exp |  4 ++++
 3 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/gdb/c-exp.y b/gdb/c-exp.y
index 663c30f9517..69cea523abd 100644
--- a/gdb/c-exp.y
+++ b/gdb/c-exp.y
@@ -2755,6 +2755,10 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name)
 	    hex = 0;
 	  }
 
+	/* If the token includes the C++14 digits separator, we make a
+	   copy so that we don't have to handle the separator in
+	   parse_number.  */
+	std::optional<std::string> no_tick;
 	for (;; ++p)
 	  {
 	    /* This test includes !hex because 'e' is a valid hex digit
@@ -2771,18 +2775,32 @@ lex_one_token (struct parser_state *par_state, bool *is_quoted_name)
 	    else if (((got_e && (p[-1] == 'e' || p[-1] == 'E'))
 		      || (got_p && (p[-1] == 'p' || p[-1] == 'P')))
 		     && (*p == '-' || *p == '+'))
-	      /* This is the sign of the exponent, not the end of the
-		 number.  */
-	      continue;
+	      {
+		/* This is the sign of the exponent, not the end of
+		   the number.  */
+	      }
+	    else if (*p == '\'')
+	      {
+		if (!no_tick.has_value ())
+		  no_tick.emplace (tokstart, p);
+		continue;
+	      }
 	    /* We will take any letters or digits.  parse_number will
 	       complain if past the radix, or if L or U are not final.  */
 	    else if ((*p < '0' || *p > '9')
 		     && ((*p < 'a' || *p > 'z')
 				  && (*p < 'A' || *p > 'Z')))
 	      break;
+	    if (no_tick.has_value ())
+	      no_tick->push_back (*p);
 	  }
-	toktype = parse_number (par_state, tokstart, p - tokstart,
-				got_dot | got_e | got_p, &yylval);
+	if (no_tick.has_value ())
+	  toktype = parse_number (par_state, no_tick->c_str (),
+				  no_tick->length (),
+				  got_dot | got_e | got_p, &yylval);
+	else
+	  toktype = parse_number (par_state, tokstart, p - tokstart,
+				  got_dot | got_e | got_p, &yylval);
 	if (toktype == ERROR)
 	  {
 	    char *err_copy = (char *) alloca (p - tokstart + 1);
diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index de08f9c0728..3c5dea2de1c 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -1686,6 +1686,10 @@ yylex (YYSTYPE *lvalp, cpname_state *state)
 	    hex = 0;
 	  }
 
+	/* If the token includes the C++14 digits separator, we make a
+	   copy so that we don't have to handle the separator in
+	   parse_number.  */
+	std::optional<std::string> no_tick;
 	for (;; ++p)
 	  {
 	    /* This test includes !hex because 'e' is a valid hex digit
@@ -1703,16 +1707,31 @@ yylex (YYSTYPE *lvalp, cpname_state *state)
 	      got_dot = 1;
 	    else if (got_e && (p[-1] == 'e' || p[-1] == 'E')
 		     && (*p == '-' || *p == '+'))
-	      /* This is the sign of the exponent, not the end of the
-		 number.  */
-	      continue;
+	      {
+		/* This is the sign of the exponent, not the end of
+		   the number.  */
+	      }
+	    /* C++14 allows a separator.  */
+	    else if (*p == '\'')
+	      {
+		if (!no_tick.has_value ())
+		  no_tick.emplace (tokstart, p);
+		continue;
+	      }
 	    /* We will take any letters or digits.  parse_number will
 	       complain if past the radix, or if L or U are not final.  */
 	    else if (! ISALNUM (*p))
 	      break;
+	    if (no_tick.has_value ())
+	      no_tick->push_back (*p);
 	  }
-	toktype = state->parse_number (tokstart, p - tokstart, got_dot|got_e,
-				       lvalp);
+	if (no_tick.has_value ())
+	  toktype = state->parse_number (no_tick->c_str (),
+					 no_tick->length (),
+					 got_dot|got_e, lvalp);
+	else
+	  toktype = state->parse_number (tokstart, p - tokstart,
+					 got_dot|got_e, lvalp);
 	if (toktype == ERROR)
 	  {
 	    char *err_copy = (char *) alloca (p - tokstart + 1);
@@ -2045,6 +2064,8 @@ canonicalize_tests ()
   should_be_the_same ("x::y::z<0b111>", "x::y::z<7>");
   should_be_the_same ("x::y::z<0b111>", "x::y::z<0t7>");
   should_be_the_same ("x::y::z<0b111>", "x::y::z<0D7>");
+
+  should_be_the_same ("x::y::z<0xff'ff>", "x::y::z<65535>");
 }
 
 #endif
diff --git a/gdb/testsuite/gdb.cp/misc.exp b/gdb/testsuite/gdb.cp/misc.exp
index 264294f857d..bcb20f85eee 100644
--- a/gdb/testsuite/gdb.cp/misc.exp
+++ b/gdb/testsuite/gdb.cp/misc.exp
@@ -114,3 +114,7 @@ gdb_test "print *(number_ref + v_bool_array)" "\\$\[0-9\]* = false" \
     "integer reference addition with pointer"
 gdb_test "print *(v_bool_array - number_ref)" "\\$\[0-9\]* = false" \
     "pointer subtraction with integer reference"
+
+# C++14 digit separator.
+gdb_test "print 23'23" " = 2323"
+gdb_test "print 2'3.5" " = 23.5"

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 09/10] Allow function types as template parameters in name canonicalizer
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
                   ` (7 preceding siblings ...)
  2024-04-21 17:00 ` [PATCH 08/10] Implement C++14 numeric separators Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:30   ` John Baldwin
  2024-04-21 17:00 ` [PATCH 10/10] Add spaceship operator to cp-name-parser.y Tom Tromey
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

This adds function types as template parameters in the C++ name
canonicalizer.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=11907
---
 gdb/cp-name-parser.y | 4 ++++
 gdb/cp-support.c     | 7 -------
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index 3c5dea2de1c..e820faa2db9 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -596,6 +596,7 @@ template_arg	:	typespec_2
 		|	'&' '(' start ')'
 			{ $$ = state->fill_comp (DEMANGLE_COMPONENT_UNARY, state->make_operator ("&", 1), $3); }
 		|	exp
+		|	function
 		;
 
 function_args	:	typespec_2
@@ -2066,6 +2067,9 @@ canonicalize_tests ()
   should_be_the_same ("x::y::z<0b111>", "x::y::z<0D7>");
 
   should_be_the_same ("x::y::z<0xff'ff>", "x::y::z<65535>");
+
+  should_be_the_same ("something<void ()>", "something<  void()  >");
+  should_be_the_same ("something<void ()>", "something<void (void)>");
 }
 
 #endif
diff --git a/gdb/cp-support.c b/gdb/cp-support.c
index 4c64e4963e4..74a1b61ed68 100644
--- a/gdb/cp-support.c
+++ b/gdb/cp-support.c
@@ -2186,15 +2186,8 @@ test_cp_remove_params ()
   CHECK_INCOMPL ("A::foo<(anonymous namespace)::B",
 		 "A::foo");
 
-  /* Shouldn't this parse?  Looks like a bug in
-     cp_demangled_name_to_comp.  See PR c++/22411.  */
-#if 0
   CHECK ("A::foo<void(int)>::func(int)",
 	 "A::foo<void(int)>::func");
-#else
-  CHECK_INCOMPL ("A::foo<void(int)>::func(int)",
-		 "A::foo");
-#endif
 
   CHECK_INCOMPL ("A::foo<void(int",
 		 "A::foo");

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 10/10] Add spaceship operator to cp-name-parser.y
  2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
                   ` (8 preceding siblings ...)
  2024-04-21 17:00 ` [PATCH 09/10] Allow function types as template parameters in name canonicalizer Tom Tromey
@ 2024-04-21 17:00 ` Tom Tromey
  2024-04-22 17:31   ` John Baldwin
  9 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-21 17:00 UTC (permalink / raw)
  To: gdb-patches

While debugging gdb, I saw this:

During symbol reading: unexpected demangled name 'operator<=><std::chrono::_V2::system_clock, std::chrono::duration<long int>, std::chrono::duration<long int> >'

This happens because cp-name-parser.y does not handle the spaceship
operator.  This patch implements this.
---
 gdb/cp-name-parser.y | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/gdb/cp-name-parser.y b/gdb/cp-name-parser.y
index e820faa2db9..db5c349d212 100644
--- a/gdb/cp-name-parser.y
+++ b/gdb/cp-name-parser.y
@@ -297,7 +297,7 @@ static void yyerror (cpname_state *, const char *);
 %left '^'
 %left '&'
 %left EQUAL NOTEQUAL
-%left '<' '>' LEQ GEQ
+%left '<' '>' LEQ GEQ SPACESHIP
 %left LSH RSH
 %left '@'
 %left '+' '-'
@@ -451,6 +451,8 @@ oper	:	OPERATOR NEW
 			{ $$ = state->make_operator ("<=", 2); }
 		|	OPERATOR GEQ
 			{ $$ = state->make_operator (">=", 2); }
+		|	OPERATOR SPACESHIP
+			{ $$ = state->make_operator ("<=>", 2); }
 		|	OPERATOR ANDAND
 			{ $$ = state->make_operator ("&&", 2); }
 		|	OPERATOR OROR
@@ -1077,6 +1079,10 @@ exp	:	exp GEQ exp
 		{ $$ = state->d_binary (">=", $1, $3); }
 	;
 
+exp	:	exp SPACESHIP exp
+		{ $$ = state->d_binary ("<=>", $1, $3); }
+	;
+
 exp	:	exp '<' exp
 		{ $$ = state->d_binary ("<", $1, $3); }
 	;
@@ -1783,6 +1789,7 @@ yylex (YYSTYPE *lvalp, cpname_state *state)
       return c;
     case '<':
       HANDLE_TOKEN3 ("<<=", ASSIGN_MODIFY);
+      HANDLE_TOKEN3 ("<=>", SPACESHIP);
       HANDLE_TOKEN2 ("<=", LEQ);
       HANDLE_TOKEN2 ("<<", LSH);
       state->lexptr++;
@@ -2052,6 +2059,14 @@ should_be_the_same (const char *one, const char *two)
   SELF_CHECK (strcmp (one, two) == 0);
 }
 
+static void
+should_parse (const char *name)
+{
+  std::string err;
+  auto parsed = cp_demangled_name_to_comp (name, &err);
+  SELF_CHECK (parsed != nullptr);
+}
+
 static void
 canonicalize_tests ()
 {
@@ -2070,6 +2085,8 @@ canonicalize_tests ()
 
   should_be_the_same ("something<void ()>", "something<  void()  >");
   should_be_the_same ("something<void ()>", "something<void (void)>");
+
+  should_parse ("void whatever::operator<=><int, int>");
 }
 
 #endif

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 01/10] Remove test code from cp-name-parser.y
  2024-04-21 17:00 ` [PATCH 01/10] Remove test code from cp-name-parser.y Tom Tromey
@ 2024-04-22 17:11   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:11 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> This removes the current test 'main' from cp-name-parser.y.  There
> aren't any tests using this, and nowadays it would be better as a unit
> test.

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 03/10] Clean up demangle_parse_info
  2024-04-21 17:00 ` [PATCH 03/10] Clean up demangle_parse_info Tom Tromey
@ 2024-04-22 17:12   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:12 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> This changes demangle_parse_info to use inline initializers and to
> remove some manual memory management.

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 04/10] Change storage of demangle_component
  2024-04-21 17:00 ` [PATCH 04/10] Change storage of demangle_component Tom Tromey
@ 2024-04-22 17:17   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:17 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> This changes demangle_component objects to be stored on the obstack
> that is part of demangle_info.  It also arranges for a demangle_info
> object to be kept alive by cp_merge_demangle_parse_infos.  This way,
> other data on the obstack can be kept while an "outer" demangle_info
> needs it.

The mechanical changes to switch to using better memory management look
ok to me.  I did not do a detailed review of the objects' relationships
to understand the "outer" comment in the log.

(Not quite sure where on the line this falls for Acked-By vs
Approved-By)

Acked-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/10] Fix C++ name canonicalizations of character literals
  2024-04-21 17:00 ` [PATCH 05/10] Fix C++ name canonicalizations of character literals Tom Tromey
@ 2024-04-22 17:19   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:19 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> The names "void C<(char)1>::m()" and "void C<'\001'>::m()" should
> canonicalize to the same string, but currently they do not -- the
> former remains unchanged and the latter is transformed to
> "void C<(char)'\001'>::m()".
> 
> This patch fixes the bug and also adds some unit tests.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16843

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 06/10] Remove some unnecessary allocations from cpname_state::parse_number
  2024-04-21 17:00 ` [PATCH 06/10] Remove some unnecessary allocations from cpname_state::parse_number Tom Tromey
@ 2024-04-22 17:20   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:20 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> cpname_state::parse_number allocates nodes for various types and then
> only uses one of them.  This patch reduces the number of allocations
> by not performing the unnecessary ones.

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 07/10] Fix C++ canonicalization of hex literals
  2024-04-21 17:00 ` [PATCH 07/10] Fix C++ canonicalization of hex literals Tom Tromey
@ 2024-04-22 17:22   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:22 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> Currently names like "x::y::z<1>" and "x::y::z<0x01>" canonicalize to
> different things.  I think it's nicer for them to be the same.
> Differences between types can be done using suffixes like "ll" and "u"
> -- it's not really possible to implement C++ rules in the
> canoncalizer, because no gdbarch is available.  Possibly gdb should
> even drop the type here and just represent all integers the same way
> in names.

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 08/10] Implement C++14 numeric separators
  2024-04-21 17:00 ` [PATCH 08/10] Implement C++14 numeric separators Tom Tromey
@ 2024-04-22 17:29   ` John Baldwin
  2024-04-24 21:42     ` Tom Tromey
  0 siblings, 1 reply; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:29 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> C++14 allows the use of the apostrophe as a numeric separator; that
> is, "23000" and "23'000" represent the same number.  This patch
> implements this for gdb's C++ parser and the C++ name canonicalizer.
> 
> I did this unconditionally for all C variants because I think it's
> unambiguous.
> 
> For the name canonicalizer, there's at least one compiler that can
> emit constants with this form, see bug 30845.

I guess this is the only separator allowed so far?  Longer term I
do think it might be easier to understand the code if you always
allocate a new string and copy valid characters into just using a
'continue' to skip characters you want to drop.  Perhaps though
this is a hot path where the extra allocation would be noticable?

I also wonder if there might be benefit in a commit prior to
this that pulls out the shared code here into a single routine
that these files share?  (Or is that not quite as doable with
yacc?)

-- 
John Baldwin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 09/10] Allow function types as template parameters in name canonicalizer
  2024-04-21 17:00 ` [PATCH 09/10] Allow function types as template parameters in name canonicalizer Tom Tromey
@ 2024-04-22 17:30   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:30 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> This adds function types as template parameters in the C++ name
> canonicalizer.
> 
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=11907

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 10/10] Add spaceship operator to cp-name-parser.y
  2024-04-21 17:00 ` [PATCH 10/10] Add spaceship operator to cp-name-parser.y Tom Tromey
@ 2024-04-22 17:31   ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-22 17:31 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches

On 4/21/24 10:00 AM, Tom Tromey wrote:
> While debugging gdb, I saw this:
> 
> During symbol reading: unexpected demangled name 'operator<=><std::chrono::_V2::system_clock, std::chrono::duration<long int>, std::chrono::duration<long int> >'
> 
> This happens because cp-name-parser.y does not handle the spaceship
> operator.  This patch implements this.

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 08/10] Implement C++14 numeric separators
  2024-04-22 17:29   ` John Baldwin
@ 2024-04-24 21:42     ` Tom Tromey
  2024-04-30 16:33       ` John Baldwin
  0 siblings, 1 reply; 22+ messages in thread
From: Tom Tromey @ 2024-04-24 21:42 UTC (permalink / raw)
  To: John Baldwin; +Cc: Tom Tromey, gdb-patches

>>>>> "John" == John Baldwin <jhb@FreeBSD.org> writes:

>> For the name canonicalizer, there's at least one compiler that can
>> emit constants with this form, see bug 30845.

John> I guess this is the only separator allowed so far?

Yeah.  It was added in C++14.

John> Longer term I
John> do think it might be easier to understand the code if you always
John> allocate a new string and copy valid characters into just using a
John> 'continue' to skip characters you want to drop.  Perhaps though
John> this is a hot path where the extra allocation would be noticable?

In c-exp.y, probably not.

In cp-name-parser.y -- maybe.  That code is run when demangling, and
sometimes that means it is run quite a bit.

John> I also wonder if there might be benefit in a commit prior to
John> this that pulls out the shared code here into a single routine
John> that these files share?  (Or is that not quite as doable with
John> yacc?)

I tried that and it was pretty messy.  It could be done but I just took
an easier route out of laziness I guess.

Tom

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 08/10] Implement C++14 numeric separators
  2024-04-24 21:42     ` Tom Tromey
@ 2024-04-30 16:33       ` John Baldwin
  0 siblings, 0 replies; 22+ messages in thread
From: John Baldwin @ 2024-04-30 16:33 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

On 4/24/24 2:42 PM, Tom Tromey wrote:
>>>>>> "John" == John Baldwin <jhb@FreeBSD.org> writes:
> 
>>> For the name canonicalizer, there's at least one compiler that can
>>> emit constants with this form, see bug 30845.
> 
> John> I guess this is the only separator allowed so far?
> 
> Yeah.  It was added in C++14.
> 
> John> Longer term I
> John> do think it might be easier to understand the code if you always
> John> allocate a new string and copy valid characters into just using a
> John> 'continue' to skip characters you want to drop.  Perhaps though
> John> this is a hot path where the extra allocation would be noticable?
> 
> In c-exp.y, probably not.
> 
> In cp-name-parser.y -- maybe.  That code is run when demangling, and
> sometimes that means it is run quite a bit.
> 
> John> I also wonder if there might be benefit in a commit prior to
> John> this that pulls out the shared code here into a single routine
> John> that these files share?  (Or is that not quite as doable with
> John> yacc?)
> 
> I tried that and it was pretty messy.  It could be done but I just took
> an easier route out of laziness I guess.

Ok, I think your current patch is fine then.

Approved-By: John Baldwin <jhb@FreeBSD.org>

-- 
John Baldwin


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-04-30 16:33 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-21 17:00 [PATCH 00/10] Fix some C++ name canonicalizer problems Tom Tromey
2024-04-21 17:00 ` [PATCH 01/10] Remove test code from cp-name-parser.y Tom Tromey
2024-04-22 17:11   ` John Baldwin
2024-04-21 17:00 ` [PATCH 02/10] Allow initialization functions in .y files Tom Tromey
2024-04-21 17:00 ` [PATCH 03/10] Clean up demangle_parse_info Tom Tromey
2024-04-22 17:12   ` John Baldwin
2024-04-21 17:00 ` [PATCH 04/10] Change storage of demangle_component Tom Tromey
2024-04-22 17:17   ` John Baldwin
2024-04-21 17:00 ` [PATCH 05/10] Fix C++ name canonicalizations of character literals Tom Tromey
2024-04-22 17:19   ` John Baldwin
2024-04-21 17:00 ` [PATCH 06/10] Remove some unnecessary allocations from cpname_state::parse_number Tom Tromey
2024-04-22 17:20   ` John Baldwin
2024-04-21 17:00 ` [PATCH 07/10] Fix C++ canonicalization of hex literals Tom Tromey
2024-04-22 17:22   ` John Baldwin
2024-04-21 17:00 ` [PATCH 08/10] Implement C++14 numeric separators Tom Tromey
2024-04-22 17:29   ` John Baldwin
2024-04-24 21:42     ` Tom Tromey
2024-04-30 16:33       ` John Baldwin
2024-04-21 17:00 ` [PATCH 09/10] Allow function types as template parameters in name canonicalizer Tom Tromey
2024-04-22 17:30   ` John Baldwin
2024-04-21 17:00 ` [PATCH 10/10] Add spaceship operator to cp-name-parser.y Tom Tromey
2024-04-22 17:31   ` John Baldwin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).