From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <palves@sourceware.org>
Received: by sourceware.org (Postfix, from userid 1551)
 id 528033836655; Tue, 31 May 2022 12:57:06 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 528033836655
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
From: Pedro Alves <palves@sourceware.org>
To: gdb-cvs@sourceware.org
Subject: [binutils-gdb] Clarify why we unit test matching symbol names with
 0xff characters
X-Act-Checkin: binutils-gdb
X-Git-Author: Pedro Alves <pedro@palves.net>
X-Git-Refname: refs/heads/master
X-Git-Oldrev: e595ad4cc20a9b34fbda044b161cc7daccdfcf66
X-Git-Newrev: 102a644eaaa8b258f021da71028c32e0744d73ce
Message-Id: <20220531125706.528033836655@sourceware.org>
Date: Tue, 31 May 2022 12:57:06 +0000 (GMT)
X-BeenThere: gdb-cvs@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gdb-cvs mailing list <gdb-cvs.sourceware.org>
List-Archive: <https://sourceware.org/pipermail/gdb-cvs/>
X-List-Received-Date: Tue, 31 May 2022 12:57:06 -0000

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=102a644eaaa8b258f021da71028c32e0744d73ce

commit 102a644eaaa8b258f021da71028c32e0744d73ce
Author: Pedro Alves <pedro@palves.net>
Date:   Tue May 31 13:36:32 2022 +0100

    Clarify why we unit test matching symbol names with 0xff characters

    In the name matching unit tests in gdb/dwarf2/read.c, explain better
    why we test symbols with \377 / 0xff characters (Latin1 'ÿ').

    Change-Id: I517f13adfff2e4d3cd783fec1d744e2b26e18b8e
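
The comment added by this commit says that completion finds the upper
bound of matching symbols by looking up the insertion point of the
prefix with its last character incremented ("func" becomes "fund"), and
that incrementing 0xff has to carry into the previous character. The
following is a minimal sketch of that idea under stated assumptions, not
the code in gdb/dwarf2/read.c: the helper names sort_after_prefix and
prefix_bounds are hypothetical, and the symbol table is modeled as a
std::vector<std::string> sorted by unsigned byte value. Dropping a
trailing 0xff byte and incrementing the byte before it is used here as
an equivalent of the wrap-around-and-carry.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: return the smallest string that sorts after
// every string starting with PREFIX.  An empty result means there is
// no upper bound (PREFIX is empty or consists only of 0xff bytes).
std::string
sort_after_prefix (std::string prefix)
{
  // A trailing 0xff cannot be incremented, so drop it; the increment
  // then lands on (carries into) the previous byte.
  while (!prefix.empty () && (unsigned char) prefix.back () == 0xff)
    prefix.pop_back ();

  if (!prefix.empty ())
    prefix.back () = (char) ((unsigned char) prefix.back () + 1);

  return prefix;
}

// Hypothetical helper: return the [first, last) index range of the
// entries in SORTED that start with PREFIX.  SORTED must be ordered
// with std::string's operator<, which compares bytes as unsigned char.
std::pair<std::size_t, std::size_t>
prefix_bounds (const std::vector<std::string> &sorted,
               const std::string &prefix)
{
  // Lower bound: insertion point of the prefix itself.
  auto lo = std::lower_bound (sorted.begin (), sorted.end (), prefix);

  // Upper bound: insertion point of the incremented prefix, or the
  // end of the table when no such string exists.
  std::string after = sort_after_prefix (prefix);
  auto hi = (after.empty ()
             ? sorted.end ()
             : std::lower_bound (lo, sorted.end (), after));

  return { (std::size_t) (lo - sorted.begin ()),
           (std::size_t) (hi - sorted.begin ()) };
}
```

With a sorted table that starts with "func", "funcx" and "fund",
prefix_bounds (table, "func") covers only the first two entries, because
the upper bound is the insertion point of "fund". For the prefix
"yfunc\377" the incremented name is "yfund", and for "\377" alone there
is no upper bound, so the range runs to the end of the table.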

Diff:
---
 gdb/dwarf2/read.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index c4578c687d2..848fd5627b8 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -3628,10 +3628,17 @@ static const char *test_symbols[] = {
      is "function" in PT).  */
  u8"u8função",
 
-  /* \377 (0xff) is Latin1 'ÿ'.  */
+  /* Test a symbol name that ends with a 0xff character, which is a
+     valid character in non-UTF-8 source character sets (e.g. Latin1
+     'ÿ'), and we can't rule out compilers allowing it in identifiers.
+     We test this because the completion algorithm finds the upper
+     bound of symbols by looking for the insertion point of
+     "func"-with-last-character-incremented, i.e. "fund", and adding 1
+     to 0xff should wraparound and carry to the previous character.
+     See comments in make_sort_after_prefix_name.  */
   "yfunc\377",
 
-  /* \377 (0xff) is Latin1 'ÿ'.  */
+  /* Some more symbols with \377 (0xff).  See above.  */
   "\377",
   "\377\377123",
 
@@ -3701,7 +3708,8 @@ test_mapped_index_find_name_component_bounds ()
   }
 
   /* Check that the increment-last-char in the name matching algorithm
-     for completion doesn't get confused with Ansi1 'ÿ' / 0xff.  */
+     for completion doesn't get confused with Ansi1 'ÿ' / 0xff.  See
+     make_sort_after_prefix_name.  */
   {
     static const char *expected_syms1[] = {
       "\377",
@@ -3770,7 +3778,8 @@ test_dw2_expand_symtabs_matching_symbol ()
     }
 
   /* Check that the name matching algorithm for completion doesn't get
-     confused with Latin1 'ÿ' / 0xff.  */
+     confused with Latin1 'ÿ' / 0xff.  See
+     make_sort_after_prefix_name.  */
   {
     static const char str[] = "\377";
     CHECK_MATCH (str, symbol_name_match_type::FULL, true,