public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Fix comparator bug in cooked index
@ 2023-01-27 19:56 Tom Tromey
  2023-01-28  4:20 ` Simon Marchi
  2023-01-30 10:33 ` Andrew Burgess
  0 siblings, 2 replies; 5+ messages in thread
From: Tom Tromey @ 2023-01-27 19:56 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

Simon pointed out that the cooked index template-matching patch
introduced a failure in libstdc++ debug mode.  In particular, the new
code violates the assumption of std::lower_bound and std::upper_bound
that the range is sorted with respect to the comparison.

When I first debugged this, I thought the problem was unfixable as-is
and that a second layer of filtering would have to be done.  However,
on irc, Simon pointed out that it could perhaps be solved if the
comparison function were assured that one operand always came from the
index, with the other always being the search string.

This patch implements this idea.

First, a new mode is introduced: a sorting mode for
cooked_index_entry::compare.  In this mode, strings are compared
case-insensitively, but we're careful to always sort '<' before any
other printable character.  This way, two names like "func" and
"func<param>" will be sorted next to each other -- i.e., "func1" will
not be seen between them.  This is important when searching.

Second, the compare function is changed to work in a strcmp-like way.
This makes it easier to test and (IMO) understand.

Third, the compare function is modified so that in non-sorting modes,
the index entry is always the first argument.  This allows consistency
in compares.

I regression tested this in libstdc++ debug mode on x86-64 Fedora 36.
It fixes the crash that Simon saw.
---
 gdb/dwarf2/cooked-index.c | 166 ++++++++++++++++++++------------------
 gdb/dwarf2/cooked-index.h |  42 ++++++++--
 2 files changed, 123 insertions(+), 85 deletions(-)

diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
index 09b3fd70b26..f6b1df6e529 100644
--- a/gdb/dwarf2/cooked-index.c
+++ b/gdb/dwarf2/cooked-index.c
@@ -30,56 +30,38 @@
 
 /* See cooked-index.h.  */
 
-bool
+int
 cooked_index_entry::compare (const char *stra, const char *strb,
-			     bool completing)
+			     comparison_mode mode)
 {
-  /* If we've ever matched "<" in both strings, then we disable the
-     special template parameter handling.  */
-  bool seen_lt = false;
+  /* We want to sort '<' before any other printable character.  So,
+     rewrite '<' to something just before ' '.  */
+#define MUNGE(c) (c == '<' ? '\x1f' : TOLOWER ((unsigned char) c))
 
   while (*stra != '\0'
 	 && *strb != '\0'
-	 && (TOLOWER ((unsigned char) *stra)
-	     == TOLOWER ((unsigned char ) *strb)))
+	 && (MUNGE (*stra) == MUNGE (*strb)))
     {
-      if (*stra == '<')
-	seen_lt = true;
       ++stra;
       ++strb;
     }
 
-  unsigned c1 = TOLOWER ((unsigned char) *stra);
-  unsigned c2 = TOLOWER ((unsigned char) *strb);
+  unsigned c1 = MUNGE (*stra);
+  unsigned c2 = MUNGE (*strb);
 
-  if (completing)
-    {
-      /* When completing, if one string ends earlier than the other,
-	 consider them as equal.  Also, completion mode ignores the
-	 special '<' handling.  */
-      if (c1 == '\0' || c2 == '\0')
-	return false;
-      /* Fall through to the generic case.  */
-    }
-  else if (seen_lt)
-    {
-      /* Fall through to the generic case.  */
-    }
-  else if (c1 == '\0' || c1 == '<')
-    {
-      /* Maybe they both end at the same spot.  */
-      if (c2 == '\0' || c2 == '<')
-	return false;
-      /* First string ended earlier.  */
-      return true;
-    }
-  else if (c2 == '\0' || c2 == '<')
+  if (c1 == c2)
+    return 0;
+
+  /* When completing, if STRB ends earlier than STRA, consider them as
+     equal.  When comparing, if STRB ends earlier and STRA ends with
+     '<', consider them as equal.  */
+  if (mode == COMPLETE || (mode == COMPARE && c1 == MUNGE ('<')))
     {
-      /* Second string ended earlier.  */
-      return false;
+      if (c2 == '\0')
+	return 0;
     }
 
-  return c1 < c2;
+  return c1 < c2 ? -1 : 1;
 }
 
 #if GDB_SELF_TEST
@@ -89,45 +71,69 @@ namespace {
 void
 test_compare ()
 {
-  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false));
-  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false));
-  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true));
-  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true));
-
-  SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE", false));
-  SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", false));
-  SELF_CHECK (!cooked_index_entry::compare ("abcd", "ABCDE", true));
-  SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", true));
-
-  SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", false));
-  SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", false));
-  SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", true));
-  SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", true));
-
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<arg>", false));
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<arg>", false));
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<arg>", true));
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<ag>", true));
-
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>",
-					    "name<arg<more>>", false));
-
-  SELF_CHECK (!cooked_index_entry::compare ("name", "name<arg<more>>", false));
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>", "name", false));
-  SELF_CHECK (cooked_index_entry::compare ("name<arg<", "name<arg<more>>",
-					   false));
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg<",
-					    "name<arg<more>>",
-					    true));
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
-					    false));
-  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
-					    true));
-
-  SELF_CHECK (cooked_index_entry::compare ("", "abcd", false));
-  SELF_CHECK (!cooked_index_entry::compare ("", "abcd", true));
-  SELF_CHECK (!cooked_index_entry::compare ("abcd", "", false));
-  SELF_CHECK (!cooked_index_entry::compare ("abcd", "", true));
+  /* Convenience aliases.  */
+  const auto mode_compare = cooked_index_entry::COMPARE;
+  const auto mode_sort = cooked_index_entry::SORT;
+  const auto mode_complete = cooked_index_entry::COMPLETE;
+
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
+					   mode_compare) == 0);
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
+					   mode_compare) == 0);
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
+					   mode_complete) == 0);
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
+					   mode_complete) == 0);
+
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE",
+					   mode_compare) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("ABCDE", "abcd",
+					   mode_compare) > 0);
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE",
+					   mode_complete) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("ABCDE", "abcd",
+					   mode_complete) == 0);
+
+  SELF_CHECK (cooked_index_entry::compare ("name", "name<>",
+					   mode_compare) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<>", "name",
+					   mode_compare) == 0);
+  SELF_CHECK (cooked_index_entry::compare ("name", "name<>",
+					   mode_complete) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<>", "name",
+					   mode_complete) == 0);
+
+  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<arg>",
+					   mode_compare) == 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<ag>",
+					   mode_compare) > 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<arg>",
+					   mode_complete) == 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<ag>",
+					   mode_complete) > 0);
+
+  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>",
+					   "name<arg<more>>",
+					   mode_compare) == 0);
+
+  SELF_CHECK (cooked_index_entry::compare ("name", "name<arg<more>>",
+					   mode_compare) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>", "name",
+					   mode_compare) == 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
+					   mode_compare) > 0);
+  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
+					   mode_complete) == 0);
+
+  SELF_CHECK (cooked_index_entry::compare ("", "abcd", mode_compare) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("", "abcd", mode_complete) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "", mode_compare) > 0);
+  SELF_CHECK (cooked_index_entry::compare ("abcd", "", mode_complete) == 0);
+
+  SELF_CHECK (cooked_index_entry::compare ("func", "func<type>",
+					   mode_sort) < 0);
+  SELF_CHECK (cooked_index_entry::compare ("func<type>", "func1",
+					   mode_sort) < 0);
 }
 
 } /* anonymous namespace */
@@ -359,20 +365,22 @@ cooked_index::find (const std::string &name, bool completing)
 {
   wait ();
 
+  cooked_index_entry::comparison_mode mode = (completing
+					      ? cooked_index_entry::COMPLETE
+					      : cooked_index_entry::COMPARE);
+
   auto lower = std::lower_bound (m_entries.begin (), m_entries.end (), name,
 				 [=] (const cooked_index_entry *entry,
 				      const std::string &n)
   {
-    return cooked_index_entry::compare (entry->canonical, n.c_str (),
-					completing);
+    return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) < 0;
   });
 
   auto upper = std::upper_bound (m_entries.begin (), m_entries.end (), name,
 				 [=] (const std::string &n,
 				      const cooked_index_entry *entry)
   {
-    return cooked_index_entry::compare (n.c_str (), entry->canonical,
-					completing);
+    return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) > 0;
   });
 
   return range (lower, upper);
diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
index 55eaf9955ab..1c291ba5694 100644
--- a/gdb/dwarf2/cooked-index.h
+++ b/gdb/dwarf2/cooked-index.h
@@ -143,16 +143,46 @@ struct cooked_index_entry : public allocate_on_obstack
      STORAGE.  */
   const char *full_name (struct obstack *storage) const;
 
-  /* Compare two strings, case-insensitively.  Return true if STRA is
-     less than STRB.  If one string has template parameters, but the
-     other does not, then they are considered to be equal; so for
-     example "t<x>" == "t<x>", "t<x>" < "t<y>", but "t" == "t<x>".  */
-  static bool compare (const char *stra, const char *strb, bool completing);
+  /* Comparison modes for the 'compare' function.  See the function
+     for a description.  */
+  enum comparison_mode
+  {
+    COMPARE,
+    SORT,
+    COMPLETE,
+  };
+
+  /* Compare two strings, case-insensitively.  Return -1 if STRA is
+     less than STRB, 0 if they are equal, and 1 if STRA is greater.
+
+     When comparing, '<' is considered to be less than all other
+     printable characters.  This ensures that "t<x>" sorts before
+     "t1", which is necessary when looking up "t".
+
+     MODE controls how the comparison proceeds.
+
+     MODE==SORT is used when sorting and does no special template
+     handling.
+
+     MODE==COMPARE is used when searching for a symbol.  In this case,
+     STRB must always be the search name, and STRA must be the name in
+     the index that is under consideration.  In compare mode, early
+     termination of STRB may match STRA -- for example, "t<int>" and
+     "t" will be considered to be equal.  (However, if A=="t" and
+     B=="t<int>", then this will not consider them as equal.)
+
+     MODE==COMPLETE is used when searching for a symbol for
+     completion.  In this case, STRB must always be the search name,
+     and STRA must be the name in the index that is under
+     consideration.  In completion mode, early termination of STRB
+     always results in a match.  */
+  static int compare (const char *stra, const char *strb,
+		      comparison_mode mode);
 
   /* Compare two entries by canonical name.  */
   bool operator< (const cooked_index_entry &other) const
   {
-    return compare (canonical, other.canonical, false);
+    return compare (canonical, other.canonical, SORT) < 0;
   }
 
   /* The name as it appears in DWARF.  This always points into one of
-- 
2.38.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Fix comparator bug in cooked index
  2023-01-27 19:56 [PATCH] Fix comparator bug in cooked index Tom Tromey
@ 2023-01-28  4:20 ` Simon Marchi
  2023-01-30 14:30   ` Tom Tromey
  2023-01-30 10:33 ` Andrew Burgess
  1 sibling, 1 reply; 5+ messages in thread
From: Simon Marchi @ 2023-01-28  4:20 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches



On 1/27/23 14:56, Tom Tromey via Gdb-patches wrote:
> Simon pointed out that the cooked index template-matching patch
> introduced a failure in libstdc++ debug mode.  In particular, the new
> code violates the assumption of std::lower_bound and std::upper_bound
> that the range is sorted with respect to the comparison.
> 
> When I first debugged this, I thought the problem was unfixable as-is
> and that a second layer of filtering would have to be done.  However,
> on irc, Simon pointed out that it could perhaps be solved if the
> comparison function were assured that one operand always came from the
> index, with the other always being the search string.
> 
> This patch implements this idea.
> 
> First, a new mode is introduced: a sorting mode for
> cooked_index_entry::compare.  In this mode, strings are compared
> case-insensitively, but we're careful to always sort '<' before any
> other printable character.  This way, two names like "func" and
> "func<param>" will be sorted next to each other -- i.e., "func1" will
> not be seen between them.  This is important when searching.
> 
> Second, the compare function is changed to work in a strcmp-like way.
> This makes it easier to test and (IMO) understand.
> 
> Third, the compare function is modified so that in non-sorting modes,
> the index entry is always the first argument.  This allows consistency
> in compares.
> 
> I regression tested this in libstdc++ debug mode on x86-64 Fedora 36.
> It fixes the crash that Simon saw.

I'm glad to know this idea works.

Just some minor comments below, otherwise the patch LGTM.

> ---
>  gdb/dwarf2/cooked-index.c | 166 ++++++++++++++++++++------------------
>  gdb/dwarf2/cooked-index.h |  42 ++++++++--
>  2 files changed, 123 insertions(+), 85 deletions(-)
> 
> diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
> index 09b3fd70b26..f6b1df6e529 100644
> --- a/gdb/dwarf2/cooked-index.c
> +++ b/gdb/dwarf2/cooked-index.c
> @@ -30,56 +30,38 @@
>  
>  /* See cooked-index.h.  */
>  
> -bool
> +int
>  cooked_index_entry::compare (const char *stra, const char *strb,
> -			     bool completing)
> +			     comparison_mode mode)
>  {
> -  /* If we've ever matched "<" in both strings, then we disable the
> -     special template parameter handling.  */
> -  bool seen_lt = false;
> +  /* We want to sort '<' before any other printable character.  So,
> +     rewrite '<' to something just before ' '.  */
> +#define MUNGE(c) (c == '<' ? '\x1f' : TOLOWER ((unsigned char) c))

Can you make it a local function, like

  auto munge = [] (char c)
    {
      ...
    };

?  Macros are harder to debug.

> +  /* Convenience aliases.  */
> +  const auto mode_compare = cooked_index_entry::COMPARE;
> +  const auto mode_sort = cooked_index_entry::SORT;
> +  const auto mode_complete = cooked_index_entry::COMPLETE;
> +
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_compare) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_compare) == 0);

Maybe I'm missing something, but the two checks above seem identical.

> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_complete) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_complete) == 0);

And these two too.

> diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
> index 55eaf9955ab..1c291ba5694 100644
> --- a/gdb/dwarf2/cooked-index.h
> +++ b/gdb/dwarf2/cooked-index.h
> @@ -143,16 +143,46 @@ struct cooked_index_entry : public allocate_on_obstack
>       STORAGE.  */
>    const char *full_name (struct obstack *storage) const;
>  
> -  /* Compare two strings, case-insensitively.  Return true if STRA is
> -     less than STRB.  If one string has template parameters, but the
> -     other does not, then they are considered to be equal; so for
> -     example "t<x>" == "t<x>", "t<x>" < "t<y>", but "t" == "t<x>".  */
> -  static bool compare (const char *stra, const char *strb, bool completing);
> +  /* Comparison modes for the 'compare' function.  See the function
> +     for a description.  */
> +  enum comparison_mode

Just a nit, but I like to use enum class.  For the simple reason that
users do "comparison_mode::SORT" instead of just "SORT", and I find that
this little bit of context helps when reading.

> +  {
> +    COMPARE,
> +    SORT,
> +    COMPLETE,

It's slightly confusing to have a more called COMPARE, since the
function itself is called compare.  Perhaps SEARCH or MATCH?

Simon

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Fix comparator bug in cooked index
  2023-01-27 19:56 [PATCH] Fix comparator bug in cooked index Tom Tromey
  2023-01-28  4:20 ` Simon Marchi
@ 2023-01-30 10:33 ` Andrew Burgess
  2023-01-30 15:15   ` Simon Marchi
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Burgess @ 2023-01-30 10:33 UTC (permalink / raw)
  To: Tom Tromey via Gdb-patches, gdb-patches; +Cc: Tom Tromey

Tom Tromey via Gdb-patches <gdb-patches@sourceware.org> writes:

> Simon pointed out that the cooked index template-matching patch
> introduced a failure in libstdc++ debug mode.  In particular, the new
> code violates the assumption of std::lower_bound and std::upper_bound
> that the range is sorted with respect to the comparison.
>
> When I first debugged this, I thought the problem was unfixable as-is
> and that a second layer of filtering would have to be done.  However,
> on irc, Simon pointed out that it could perhaps be solved if the
> comparison function were assured that one operand always came from the
> index, with the other always being the search string.
>
> This patch implements this idea.
>
> First, a new mode is introduced: a sorting mode for
> cooked_index_entry::compare.  In this mode, strings are compared
> case-insensitively, but we're careful to always sort '<' before any
> other printable character.  This way, two names like "func" and
> "func<param>" will be sorted next to each other -- i.e., "func1" will
> not be seen between them.  This is important when searching.
>
> Second, the compare function is changed to work in a strcmp-like way.
> This makes it easier to test and (IMO) understand.
>
> Third, the compare function is modified so that in non-sorting modes,
> the index entry is always the first argument.  This allows consistency
> in compares.
>
> I regression tested this in libstdc++ debug mode on x86-64 Fedora 36.
> It fixes the crash that Simon saw.
> ---
>  gdb/dwarf2/cooked-index.c | 166 ++++++++++++++++++++------------------
>  gdb/dwarf2/cooked-index.h |  42 ++++++++--
>  2 files changed, 123 insertions(+), 85 deletions(-)
>
> diff --git a/gdb/dwarf2/cooked-index.c b/gdb/dwarf2/cooked-index.c
> index 09b3fd70b26..f6b1df6e529 100644
> --- a/gdb/dwarf2/cooked-index.c
> +++ b/gdb/dwarf2/cooked-index.c
> @@ -30,56 +30,38 @@
>  
>  /* See cooked-index.h.  */
>  
> -bool
> +int
>  cooked_index_entry::compare (const char *stra, const char *strb,
> -			     bool completing)
> +			     comparison_mode mode)
>  {
> -  /* If we've ever matched "<" in both strings, then we disable the
> -     special template parameter handling.  */
> -  bool seen_lt = false;
> +  /* We want to sort '<' before any other printable character.  So,
> +     rewrite '<' to something just before ' '.  */
> +#define MUNGE(c) (c == '<' ? '\x1f' : TOLOWER ((unsigned char) c))
>  
>    while (*stra != '\0'
>  	 && *strb != '\0'
> -	 && (TOLOWER ((unsigned char) *stra)
> -	     == TOLOWER ((unsigned char ) *strb)))
> +	 && (MUNGE (*stra) == MUNGE (*strb)))
>      {
> -      if (*stra == '<')
> -	seen_lt = true;
>        ++stra;
>        ++strb;
>      }
>  
> -  unsigned c1 = TOLOWER ((unsigned char) *stra);
> -  unsigned c2 = TOLOWER ((unsigned char) *strb);
> +  unsigned c1 = MUNGE (*stra);
> +  unsigned c2 = MUNGE (*strb);
>  
> -  if (completing)
> -    {
> -      /* When completing, if one string ends earlier than the other,
> -	 consider them as equal.  Also, completion mode ignores the
> -	 special '<' handling.  */
> -      if (c1 == '\0' || c2 == '\0')
> -	return false;
> -      /* Fall through to the generic case.  */
> -    }
> -  else if (seen_lt)
> -    {
> -      /* Fall through to the generic case.  */
> -    }
> -  else if (c1 == '\0' || c1 == '<')
> -    {
> -      /* Maybe they both end at the same spot.  */
> -      if (c2 == '\0' || c2 == '<')
> -	return false;
> -      /* First string ended earlier.  */
> -      return true;
> -    }
> -  else if (c2 == '\0' || c2 == '<')
> +  if (c1 == c2)
> +    return 0;
> +
> +  /* When completing, if STRB ends earlier than STRA, consider them as
> +     equal.  When comparing, if STRB ends earlier and STRA ends with
> +     '<', consider them as equal.  */
> +  if (mode == COMPLETE || (mode == COMPARE && c1 == MUNGE ('<')))
>      {
> -      /* Second string ended earlier.  */
> -      return false;
> +      if (c2 == '\0')
> +	return 0;
>      }
>  
> -  return c1 < c2;
> +  return c1 < c2 ? -1 : 1;
>  }
>  
>  #if GDB_SELF_TEST
> @@ -89,45 +71,69 @@ namespace {
>  void
>  test_compare ()
>  {
> -  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true));
> -  SELF_CHECK (!cooked_index_entry::compare ("abcd", "abcd", true));
> -
> -  SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("abcd", "ABCDE", true));
> -  SELF_CHECK (!cooked_index_entry::compare ("ABCDE", "abcd", true));
> -
> -  SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("name", "name<>", true));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<>", "name", true));
> -
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<arg>", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<arg>", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<arg>", true));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg>", "name<ag>", true));
> -
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>",
> -					    "name<arg<more>>", false));
> -
> -  SELF_CHECK (!cooked_index_entry::compare ("name", "name<arg<more>>", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>", "name", false));
> -  SELF_CHECK (cooked_index_entry::compare ("name<arg<", "name<arg<more>>",
> -					   false));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg<",
> -					    "name<arg<more>>",
> -					    true));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
> -					    false));
> -  SELF_CHECK (!cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
> -					    true));
> -
> -  SELF_CHECK (cooked_index_entry::compare ("", "abcd", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("", "abcd", true));
> -  SELF_CHECK (!cooked_index_entry::compare ("abcd", "", false));
> -  SELF_CHECK (!cooked_index_entry::compare ("abcd", "", true));
> +  /* Convenience aliases.  */
> +  const auto mode_compare = cooked_index_entry::COMPARE;
> +  const auto mode_sort = cooked_index_entry::SORT;
> +  const auto mode_complete = cooked_index_entry::COMPLETE;
> +
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_compare) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_compare) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_complete) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "abcd",
> +					   mode_complete) == 0);
> +
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE",
> +					   mode_compare) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("ABCDE", "abcd",
> +					   mode_compare) > 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "ABCDE",
> +					   mode_complete) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("ABCDE", "abcd",
> +					   mode_complete) == 0);
> +
> +  SELF_CHECK (cooked_index_entry::compare ("name", "name<>",
> +					   mode_compare) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<>", "name",
> +					   mode_compare) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name", "name<>",
> +					   mode_complete) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<>", "name",
> +					   mode_complete) == 0);
> +
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<arg>",
> +					   mode_compare) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<ag>",
> +					   mode_compare) > 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<arg>",
> +					   mode_complete) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg>", "name<ag>",
> +					   mode_complete) > 0);
> +
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>",
> +					   "name<arg<more>>",
> +					   mode_compare) == 0);
> +
> +  SELF_CHECK (cooked_index_entry::compare ("name", "name<arg<more>>",
> +					   mode_compare) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>", "name",
> +					   mode_compare) == 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
> +					   mode_compare) > 0);
> +  SELF_CHECK (cooked_index_entry::compare ("name<arg<more>>", "name<arg<",
> +					   mode_complete) == 0);
> +
> +  SELF_CHECK (cooked_index_entry::compare ("", "abcd", mode_compare) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("", "abcd", mode_complete) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "", mode_compare) > 0);
> +  SELF_CHECK (cooked_index_entry::compare ("abcd", "", mode_complete) == 0);
> +
> +  SELF_CHECK (cooked_index_entry::compare ("func", "func<type>",
> +					   mode_sort) < 0);
> +  SELF_CHECK (cooked_index_entry::compare ("func<type>", "func1",
> +					   mode_sort) < 0);
>  }
>  
>  } /* anonymous namespace */
> @@ -359,20 +365,22 @@ cooked_index::find (const std::string &name, bool completing)
>  {
>    wait ();
>  
> +  cooked_index_entry::comparison_mode mode = (completing
> +					      ? cooked_index_entry::COMPLETE
> +					      : cooked_index_entry::COMPARE);
> +
>    auto lower = std::lower_bound (m_entries.begin (), m_entries.end (), name,
>  				 [=] (const cooked_index_entry *entry,
>  				      const std::string &n)
>    {
> -    return cooked_index_entry::compare (entry->canonical, n.c_str (),
> -					completing);
> +    return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) < 0;
>    });
>  
>    auto upper = std::upper_bound (m_entries.begin (), m_entries.end (), name,
>  				 [=] (const std::string &n,
>  				      const cooked_index_entry *entry)
>    {
> -    return cooked_index_entry::compare (n.c_str (), entry->canonical,
> -					completing);
> +    return cooked_index_entry::compare (entry->canonical, n.c_str (), mode) > 0;
>    });
>  
>    return range (lower, upper);
> diff --git a/gdb/dwarf2/cooked-index.h b/gdb/dwarf2/cooked-index.h
> index 55eaf9955ab..1c291ba5694 100644
> --- a/gdb/dwarf2/cooked-index.h
> +++ b/gdb/dwarf2/cooked-index.h
> @@ -143,16 +143,46 @@ struct cooked_index_entry : public allocate_on_obstack
>       STORAGE.  */
>    const char *full_name (struct obstack *storage) const;
>  
> -  /* Compare two strings, case-insensitively.  Return true if STRA is
> -     less than STRB.  If one string has template parameters, but the
> -     other does not, then they are considered to be equal; so for
> -     example "t<x>" == "t<x>", "t<x>" < "t<y>", but "t" == "t<x>".  */
> -  static bool compare (const char *stra, const char *strb, bool completing);
> +  /* Comparison modes for the 'compare' function.  See the function
> +     for a description.  */
> +  enum comparison_mode
> +  {
> +    COMPARE,
> +    SORT,
> +    COMPLETE,
> +  };
> +
> +  /* Compare two strings, case-insensitively.  Return -1 if STRA is
> +     less than STRB, 0 if they are equal, and 1 if STRA is greater.
> +
> +     When comparing, '<' is considered to be less than all other
> +     printable characters.  This ensures that "t<x>" sorts before
> +     "t1", which is necessary when looking up "t".
> +
> +     MODE controls how the comparison proceeds.
> +
> +     MODE==SORT is used when sorting and does no special template
> +     handling.

I think you should replace "template" with "'<'" here.  The handling is
not limited to template argument blocks, but to anything after a '<'.
So languages that don't have templates will be impacted, as will things
like 'operator<', 'operator<<', etc in C++.

This is all fine, returning false positives from the index is clearly
OK, we do the exact matching back in core GDB once the symbols are
expanded, but I don't think we should give the impression that this code
is actually checking for / skipping template arguments, when all it's
really doing is giving up on matching at a '<' (which, again, is fine).

I don't know if it's worth explicitly stating that the '<' handling is
just a rough heuristic?  All the examples you give are template based,
so the comment does leave the impression that the compare function does
some smart detection/handling of template blocks.

Also, it might be worth saying why its fine that we do this C++ specific
'<' handling for all languages.

On a similar theme, it might be worth saying why a case-insensitive
match is fine, even for languages where symbols are case-sensitive.

For me, when I started looking at this code, these were all the
questions that I had.  After some thought I figured out the answers, but
I had to dig into other parts of GDB to confirm them.  It would have
been useful for me if the comments had covered all this stuff.

Thanks,
Andrew

> +
> +     MODE==COMPARE is used when searching for a symbol.  In this case,
> +     STRB must always be the search name, and STRA must be the name in
> +     the index that is under consideration.  In compare mode, early
> +     termination of STRB may match STRA -- for example, "t<int>" and
> +     "t" will be considered to be equal.  (However, if A=="t" and
> +     B=="t<int>", then this will not consider them as equal.)
> +
> +     MODE==COMPLETE is used when searching for a symbol for
> +     completion.  In this case, STRB must always be the search name,
> +     and STRA must be the name in the index that is under
> +     consideration.  In completion mode, early termination of STRB
> +     always results in a match.  */
> +  static int compare (const char *stra, const char *strb,
> +		      comparison_mode mode);
>  
>    /* Compare two entries by canonical name.  */
>    bool operator< (const cooked_index_entry &other) const
>    {
> -    return compare (canonical, other.canonical, false);
> +    return compare (canonical, other.canonical, SORT) < 0;
>    }
>  
>    /* The name as it appears in DWARF.  This always points into one of
> -- 
> 2.38.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Fix comparator bug in cooked index
  2023-01-28  4:20 ` Simon Marchi
@ 2023-01-30 14:30   ` Tom Tromey
  0 siblings, 0 replies; 5+ messages in thread
From: Tom Tromey @ 2023-01-30 14:30 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

>>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:

>> +  /* Comparison modes for the 'compare' function.  See the function
>> +     for a description.  */
>> +  enum comparison_mode

Simon> Just a nit, but I like to use enum class.  For the simple reason that
Simon> users do "comparison_mode::SORT" instead of just "SORT", and I find that
Simon> this little bit of context helps when reading.

I normally do too, but because it's in a class, this would lead to
outside users having to write cooked_index_entry::comparison_mode::SORT
which seemed excessively wordy.

Tom

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Fix comparator bug in cooked index
  2023-01-30 10:33 ` Andrew Burgess
@ 2023-01-30 15:15   ` Simon Marchi
  0 siblings, 0 replies; 5+ messages in thread
From: Simon Marchi @ 2023-01-30 15:15 UTC (permalink / raw)
  To: Andrew Burgess, Tom Tromey via Gdb-patches; +Cc: Tom Tromey

> I think you should replace "template" with "'<'" here.  The handling is
> not limited to template argument blocks, but to anything after a '<'.
> So languages that don't have templates will be impacted, as will things
> like 'operator<', 'operator<<', etc in C++.
> 
> This is all fine, returning false positives from the index is clearly
> OK, we do the exact matching back in core GDB once the symbols are
> expanded, but I don't think we should give the impression that this code
> is actually checking for / skipping template arguments, when all it's
> really doing is giving up on matching at a '<' (which, again, is fine).
> 
> I don't know if it's worth explicitly stating that the '<' handling is
> just a rough heuristic?  All the examples you give are template based,
> so the comment does leave the impression that the compare function does
> some smart detection/handling of template blocks.
> 
> Also, it might be worth saying why its fine that we do this C++ specific
> '<' handling for all languages.
> 
> On a similar theme, it might be worth saying why a case-insensitive
> match is fine, even for languages where symbols are case-sensitive.
> 
> For me, when I started looking at this code, these were all the
> questions that I had.  After some thought I figured out the answers, but
> I had to dig into other parts of GDB to confirm them.  It would have
> been useful for me if the comments had covered all this stuff.

Thanks for the precisions, I wasn't aware of all of that.

Simon


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-01-30 15:15 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-27 19:56 [PATCH] Fix comparator bug in cooked index Tom Tromey
2023-01-28  4:20 ` Simon Marchi
2023-01-30 14:30   ` Tom Tromey
2023-01-30 10:33 ` Andrew Burgess
2023-01-30 15:15   ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).