[PATCH 0/5] Handle non-ASCII identifiers in Ada

public inbox for gdb-patches@sourceware.org

* [PATCH 0/5] Handle non-ASCII identifiers in Ada
@ 2022-02-28 18:32 Tom Tromey
  2022-02-28 18:33 ` [PATCH 1/5] Simplify a regular expression in ada-lex.l Tom Tromey
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Tom Tromey @ 2022-02-28 18:32 UTC (permalink / raw)
  To: gdb-patches

Ada supports non-ASCII identifiers, but gdb cannot access them.  This
series adds the missing support.

The first few patches are simple refactorings.  Probably only #3
requires any examination; the others are either trivial or
Ada-specific.

Patch #5 introduces a new setting and documents the new feature.
This one may come through a little strangely, as different tests
use different encodings.

Regression tested on x86-64 Fedora 34.

Tom

* [PATCH 1/5] Simplify a regular expression in ada-lex.l
  2022-02-28 18:32 [PATCH 0/5] Handle non-ASCII identifiers in Ada Tom Tromey
@ 2022-02-28 18:33 ` Tom Tromey
  2022-02-28 18:33 ` [PATCH 2/5] Don't pre-size result string in ada_decode Tom Tromey
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Tom Tromey @ 2022-02-28 18:33 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

ada-lex.l uses "%option case-insensitive", so there is no need for
regular expressions to match upper case.
---
 gdb/ada-lex.l | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gdb/ada-lex.l b/gdb/ada-lex.l
index f61efba81a9..a1e19423691 100644
--- a/gdb/ada-lex.l
+++ b/gdb/ada-lex.l
@@ -220,7 +220,7 @@ false		{ return FALSEKEYWORD; }
 
         /* ATTRIBUTES */
 
-{TICK}[a-zA-Z][a-zA-Z_]+ { BEGIN INITIAL; return processAttribute (yytext+1); }
+{TICK}[a-z][a-z_]+ { BEGIN INITIAL; return processAttribute (yytext+1); }
 
 	/* PUNCTUATION */
 
-- 
2.31.1


* [PATCH 2/5] Don't pre-size result string in ada_decode
  2022-02-28 18:32 [PATCH 0/5] Handle non-ASCII identifiers in Ada Tom Tromey
  2022-02-28 18:33 ` [PATCH 1/5] Simplify a regular expression in ada-lex.l Tom Tromey
@ 2022-02-28 18:33 ` Tom Tromey
  2022-02-28 18:33 ` [PATCH 3/5] Let phex and phex_nz handle sizeof_l==1 Tom Tromey
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Tom Tromey @ 2022-02-28 18:33 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

Currently, ada_decode pre-sizes the output string, filling it with 'X'
characters.  However, it's a bit simpler and more flexible to let
std::string do the work here, and simply append characters to the
string as we go.  This turns out to be useful for a subsequent patch.
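
As a rough illustration of the append-as-you-go style (this is not
ada_decode itself, whose rules are far richer), the sketch below
handles only the "__" to "." replacement.  The name decode_separators
is hypothetical, and unlike the real code it does not special-case a
trailing "__".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

/* Hypothetical helper, for illustration only: replace each "__" in
   ENCODED with '.', appending to the result as we go instead of
   pre-sizing it and tracking a separate output index.  */
std::string
decode_separators (const std::string &encoded)
{
  std::string decoded;
  std::size_t i = 0;

  while (i < encoded.size ())
    {
      if (i + 1 < encoded.size ()
          && encoded[i] == '_' && encoded[i + 1] == '_')
        {
          /* Replace '__' by '.'.  */
          decoded.push_back ('.');
          i += 2;
        }
      else
        {
          /* Any other character is copied over unchanged.  */
          decoded.push_back (encoded[i]);
          i += 1;
        }
    }

  /* No final resize is needed: the string is exactly as long as
     what was appended.  */
  return decoded;
}
```

Because the output grows on demand, a later step that emits more than
one byte per input character does not need a new size estimate.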
---
 gdb/ada-lang.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index d44b0906e6d..9a7ab72f0e5 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -1004,7 +1004,7 @@ remove_compiler_suffix (const char *encoded, int *len)
 std::string
 ada_decode (const char *encoded, bool wrap)
 {
-  int i, j;
+  int i;
   int len0;
   const char *p;
   int at_start_name;
@@ -1068,10 +1068,6 @@ ada_decode (const char *encoded, bool wrap)
   if (len0 > 1 && startswith (encoded + len0 - 1, "B"))
     len0 -= 1;
 
-  /* Make decoded big enough for possible expansion by operator name.  */
-
-  decoded.resize (2 * len0 + 1, 'X');
-
   /* Remove trailing __{digit}+ or trailing ${digit}+.  */
 
   if (len0 > 1 && isdigit (encoded[len0 - 1]))
@@ -1089,8 +1085,8 @@ ada_decode (const char *encoded, bool wrap)
   /* The first few characters that are not alphabetic are not part
      of any encoding we use, so we can copy them over verbatim.  */
 
-  for (i = 0, j = 0; i < len0 && !isalpha (encoded[i]); i += 1, j += 1)
-    decoded[j] = encoded[i];
+  for (i = 0; i < len0 && !isalpha (encoded[i]); i += 1)
+    decoded.push_back (encoded[i]);
 
   at_start_name = 1;
   while (i < len0)
@@ -1107,10 +1103,9 @@ ada_decode (const char *encoded, bool wrap)
 			    op_len - 1) == 0)
 		  && !isalnum (encoded[i + op_len]))
 		{
-		  strcpy (&decoded.front() + j, ada_opname_table[k].decoded);
+		  decoded.append (ada_opname_table[k].decoded);
 		  at_start_name = 0;
 		  i += op_len;
-		  j += strlen (ada_opname_table[k].decoded);
 		  break;
 		}
 	    }
@@ -1214,21 +1209,18 @@ ada_decode (const char *encoded, bool wrap)
       else if (i < len0 - 2 && encoded[i] == '_' && encoded[i + 1] == '_')
 	{
 	 /* Replace '__' by '.'.  */
-	  decoded[j] = '.';
+	  decoded.push_back ('.');
 	  at_start_name = 1;
 	  i += 2;
-	  j += 1;
 	}
       else
 	{
 	  /* It's a character part of the decoded name, so just copy it
 	     over.  */
-	  decoded[j] = encoded[i];
+	  decoded.push_back (encoded[i]);
 	  i += 1;
-	  j += 1;
 	}
     }
-  decoded.resize (j);
 
   /* Decoded names should never contain any uppercase character.
      Double-check this, and abort the decoding if we find one.  */
-- 
2.31.1


* [PATCH 3/5] Let phex and phex_nz handle sizeof_l==1
  2022-02-28 18:32 [PATCH 0/5] Handle non-ASCII identifiers in Ada Tom Tromey
  2022-02-28 18:33 ` [PATCH 1/5] Simplify a regular expression in ada-lex.l Tom Tromey
  2022-02-28 18:33 ` [PATCH 2/5] Don't pre-size result string in ada_decode Tom Tromey
@ 2022-02-28 18:33 ` Tom Tromey
  2022-03-01 14:26   ` Simon Marchi
  2022-02-28 18:33 ` [PATCH 4/5] Define HOST_UTF32 in charset.h Tom Tromey
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Tom Tromey @ 2022-02-28 18:33 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

Currently, neither phex nor phex_nz handles sizeof_l==1 -- they let
this case fall through to the default case.  However, a subsequent
patch in this series needs this case to work correctly.

I looked at all calls to these functions that pass a 1 for the
sizeof_l parameter.  The only such case seems to be correct with this
change.
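
To make the difference concrete, here is a minimal standalone sketch
of the zero-padded variant.  hex_fixed is a hypothetical stand-in, not
the gdbsupport function: it returns a std::string rather than using
print cells, and takes unsigned long long where gdb uses ULONGEST.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

/* Hypothetical stand-in for phex: format the low SIZE bytes of VALUE
   as zero-padded hex.  Unknown sizes fall back to the full width of
   the value, which is what a size of 1 did before this patch.  */
std::string
hex_fixed (unsigned long long value, int size)
{
  char buf[32];

  switch (size)
    {
    case 8:
      std::snprintf (buf, sizeof buf, "%016llx", value);
      break;
    case 4:
      std::snprintf (buf, sizeof buf, "%08lx",
                     (unsigned long) (value & 0xffffffffUL));
      break;
    case 2:
      std::snprintf (buf, sizeof buf, "%04x", (unsigned) (value & 0xffff));
      break;
    case 1:
      /* The new case: exactly two hex digits.  */
      std::snprintf (buf, sizeof buf, "%02x", (unsigned) (value & 0xff));
      break;
    default:
      return hex_fixed (value, sizeof (value));
    }

  return std::string (buf);
}
```

In this sketch hex_fixed (0xfe, 1) gives "fe", whereas an unhandled
size such as 3 gives the 16-digit "00000000000000fe".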
---
 gdbsupport/print-utils.cc | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/gdbsupport/print-utils.cc b/gdbsupport/print-utils.cc
index 0ef8cb829a1..73ff1afda30 100644
--- a/gdbsupport/print-utils.cc
+++ b/gdbsupport/print-utils.cc
@@ -168,6 +168,10 @@ phex (ULONGEST l, int sizeof_l)
       str = get_print_cell ();
       xsnprintf (str, PRINT_CELL_SIZE, "%04x", (unsigned short) (l & 0xffff));
       break;
+    case 1:
+      str = get_print_cell ();
+      xsnprintf (str, PRINT_CELL_SIZE, "%02x", (unsigned short) (l & 0xff));
+      break;
     default:
       str = phex (l, sizeof (l));
       break;
@@ -206,6 +210,10 @@ phex_nz (ULONGEST l, int sizeof_l)
       str = get_print_cell ();
       xsnprintf (str, PRINT_CELL_SIZE, "%x", (unsigned short) (l & 0xffff));
       break;
+    case 1:
+      str = get_print_cell ();
+      xsnprintf (str, PRINT_CELL_SIZE, "%x", (unsigned short) (l & 0xff));
+      break;
     default:
       str = phex_nz (l, sizeof (l));
       break;
-- 
2.31.1


* [PATCH 4/5] Define HOST_UTF32 in charset.h
  2022-02-28 18:32 [PATCH 0/5] Handle non-ASCII identifiers in Ada Tom Tromey
                   ` (2 preceding siblings ...)
  2022-02-28 18:33 ` [PATCH 3/5] Let phex and phex_nz handle sizeof_l==1 Tom Tromey
@ 2022-02-28 18:33 ` Tom Tromey
  2022-02-28 18:33 ` [PATCH 5/5] Handle non-ASCII identifiers in Ada Tom Tromey
  2022-03-07 14:52 ` [PATCH 0/5] " Tom Tromey
  5 siblings, 0 replies; 15+ messages in thread
From: Tom Tromey @ 2022-02-28 18:33 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

rust-parse.c has a #define for the host-specific UTF-32 charset name.
A later patch needs the same thing, so this patch moves the definition
to charset.h for easier reuse.
---
 gdb/charset.h    |  6 ++++++
 gdb/rust-parse.c | 12 ++++--------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/gdb/charset.h b/gdb/charset.h
index 7a7041f10f2..2daa9a25060 100644
--- a/gdb/charset.h
+++ b/gdb/charset.h
@@ -159,4 +159,10 @@ class wchar_iterator
    character.  */
 char host_letter_to_control_character (char c);
 
+#if WORDS_BIGENDIAN
+#define HOST_UTF32 "UTF-32BE"
+#else
+#define HOST_UTF32 "UTF-32LE"
+#endif
+
 #endif /* CHARSET_H */
diff --git a/gdb/rust-parse.c b/gdb/rust-parse.c
index 1f75b4290c2..4006df7086b 100644
--- a/gdb/rust-parse.c
+++ b/gdb/rust-parse.c
@@ -33,12 +33,6 @@
 
 using namespace expr;
 
-#if WORDS_BIGENDIAN
-#define UTF32 "UTF-32BE"
-#else
-#define UTF32 "UTF-32LE"
-#endif
-
 /* A regular expression for matching Rust numbers.  This is split up
    since it is very long and this gives us a way to comment the
    sections.  */
@@ -601,7 +595,8 @@ lex_multibyte_char (const char *text, int *len)
     return 0;
 
   auto_obstack result;
-  convert_between_encodings (host_charset (), UTF32, (const gdb_byte *) text,
+  convert_between_encodings (host_charset (), HOST_UTF32,
+			     (const gdb_byte *) text,
 			     quote, 1, &result, translit_none);
 
   int size = obstack_object_size (&result);
@@ -732,7 +727,8 @@ rust_parser::lex_string ()
 	  if (is_byte)
 	    obstack_1grow (&obstack, value);
 	  else
-	    convert_between_encodings (UTF32, "UTF-8", (gdb_byte *) &value,
+	    convert_between_encodings (HOST_UTF32, "UTF-8",
+				       (gdb_byte *) &value,
 				       sizeof (value), sizeof (value),
 				       &obstack, translit_none);
 	}
-- 
2.31.1


* [PATCH 5/5] Handle non-ASCII identifiers in Ada
  2022-02-28 18:32 [PATCH 0/5] Handle non-ASCII identifiers in Ada Tom Tromey
                   ` (3 preceding siblings ...)
  2022-02-28 18:33 ` [PATCH 4/5] Define HOST_UTF32 in charset.h Tom Tromey
@ 2022-02-28 18:33 ` Tom Tromey
  2022-02-28 18:59   ` Eli Zaretskii
  2022-03-01 15:33   ` Tom Tromey
  2022-03-07 14:52 ` [PATCH 0/5] " Tom Tromey
  5 siblings, 2 replies; 15+ messages in thread
From: Tom Tromey @ 2022-02-28 18:33 UTC (permalink / raw)
  To: gdb-patches; +Cc: Tom Tromey

Ada allows non-ASCII identifiers, and GNAT supports several such
encodings.  This patch adds the corresponding support to gdb.

GNAT encodes non-ASCII characters using special symbol names.

For character sets like Latin-1, where all characters are a single
byte, it uses a "U" followed by the hex for the character.  So, for
example, thorn would be encoded as "Ufe" (0xFE being lower case
thorn).

For wider characters, despite what the manual says (it claims
Shift-JIS and EUC can be used), in practice recent versions only
support Unicode.  Here, characters in the base plane are represented
using "Wxxxx" and characters outside the base plane using
"WWxxxxxxxx".
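
The three escape forms can be recognized with a few lines of code.
What follows is only a sketch under stated assumptions, not the code
in this patch: decode_escape and parse_hex are hypothetical names, and
accepting hex digits in either case is an assumption made here for
simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

/* Hypothetical helper: parse exactly NDIGITS hex digits of S starting
   at POS.  Return true and set *OUT on success.  */
static bool
parse_hex (const std::string &s, std::size_t pos, std::size_t ndigits,
           uint32_t *out)
{
  if (pos + ndigits > s.size ())
    return false;

  uint32_t value = 0;
  for (std::size_t k = 0; k < ndigits; ++k)
    {
      char c = s[pos + k];
      uint32_t digit;
      if (c >= '0' && c <= '9')
        digit = c - '0';
      else if (c >= 'a' && c <= 'f')
        digit = c - 'a' + 10;
      else if (c >= 'A' && c <= 'F')
        digit = c - 'A' + 10;
      else
        return false;
      value = value * 16 + digit;
    }

  *out = value;
  return true;
}

/* Hypothetical helper: try to decode one "Uxx", "Wxxxx" or
   "WWxxxxxxxx" escape at S[POS].  On success, store the character
   value in *VALUE and the number of bytes consumed in *LEN.  */
bool
decode_escape (const std::string &s, std::size_t pos, uint32_t *value,
               std::size_t *len)
{
  if (pos >= s.size ())
    return false;

  /* Check the longest form first so "WW" is not taken for "W".  */
  if (s.compare (pos, 2, "WW") == 0 && parse_hex (s, pos + 2, 8, value))
    {
      *len = 10;
      return true;
    }
  if (s[pos] == 'W' && parse_hex (s, pos + 1, 4, value))
    {
      *len = 5;
      return true;
    }
  if (s[pos] == 'U' && parse_hex (s, pos + 1, 2, value))
    {
      *len = 3;
      return true;
    }
  return false;
}
```

For the thorn example, decode_escape on "Ufe" yields the value 0xfe
and a length of 3.  The sketch only extracts the number; what a "U"
value means still depends on the character set in use.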

GNAT has some further quirks here.  Ada is case-insensitive, and GNAT
emits symbols that have been case-folded.  For characters in ASCII,
and for all characters in non-Unicode character sets, lower case is
used.  For Unicode, however, characters that fit in a single byte are
converted to lower case, but all others are converted to upper case.

Furthermore, there is a bug in GNAT where two symbols that differ only
in the case of "Y WITH DIAERESIS" (and potentially others, I did not
check exhaustively) can be used in one program.  I chose to omit
handling this case from gdb, on the theory that it is hard to figure
out the logic, and anyway if the bug is ever fixed, we'll regret
having a heuristic.

This patch introduces a new "ada source-charset" setting.  It defaults
to Latin-1, as that is GNAT's default.  This setting controls how "U"
characters are decoded -- W/WW are always handled as UTF-32.

The ada_tag_name_from_tsd change is needed because this function will
read memory from the inferior and interpret it -- and this caused an
encoding failure on PPC when running a test that tries to read
uninitialized memory.

This patch implements its own UTF-32-based case folder.  This avoids
host platform quirks, and is relatively simple.  A short Python
program to generate the case-folding table is included.  It simply
relies on whatever version of Unicode is used by the host Python,
which seems basically acceptable.
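
For readers skimming the generated table, here is a sketch of how rows
of that shape could be consulted.  The struct layout and the names
fold_entry, find_entry, fold_upper and fold_lower are assumptions
inferred from the rows themselves (an inclusive code point range, then
the deltas to reach the upper-case and lower-case forms); only the
first few rows are reproduced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

/* Hypothetical entry layout, inferred from the generated rows.  */
struct fold_entry
{
  uint32_t start, end;          /* Inclusive code point range.  */
  int upper_delta, lower_delta; /* Add to get upper / lower case.  */
};

/* The first rows of the generated table, sorted by START.  */
static const fold_entry fold_table[] =
{
  {65, 90, 0, 32},
  {97, 122, -32, 0},
  {181, 181, 743, 0},
  {192, 214, 0, 32},
  {216, 222, 0, 32},
  {224, 246, -32, 0},
  {248, 254, -32, 0},
  {255, 255, 121, 0},
};

/* Binary-search the table for the entry covering C, if any.  */
static const fold_entry *
find_entry (uint32_t c)
{
  const fold_entry *it
    = std::upper_bound (std::begin (fold_table), std::end (fold_table), c,
                        [] (uint32_t value, const fold_entry &e)
                        { return value < e.start; });
  if (it == std::begin (fold_table))
    return nullptr;
  --it;
  return c <= it->end ? it : nullptr;
}

uint32_t
fold_upper (uint32_t c)
{
  const fold_entry *e = find_entry (c);
  if (e == nullptr)
    return c;
  return static_cast<uint32_t> (static_cast<int64_t> (c) + e->upper_delta);
}

uint32_t
fold_lower (uint32_t c)
{
  const fold_entry *e = find_entry (c);
  if (e == nullptr)
    return c;
  return static_cast<uint32_t> (static_cast<int64_t> (c) + e->lower_delta);
}
```

With these rows, fold_upper (0xff) gives 0x178, that is, Y WITH
DIAERESIS in upper case, and code points not covered by any range
come back unchanged.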

Test cases for UTF-8, Latin-1, and Latin-3 are included.  These
exercise most of the new code paths, aside from Y WITH DIAERESIS as
noted above.
---
 gdb/NEWS                                      |    6 +
 gdb/ada-casefold.h                            | 1343 +++++++++++++++++
 gdb/ada-exp.y                                 |   10 +-
 gdb/ada-lang.c                                |  397 ++++-
 gdb/ada-lex.l                                 |    2 +-
 gdb/ada-unicode.py                            |  114 ++
 gdb/copyright.py                              |    1 +
 gdb/doc/gdb.texinfo                           |   23 +
 gdb/testsuite/gdb.ada/non-ascii-latin-1.exp   |   50 +
 .../gdb.ada/non-ascii-latin-1/pack.adb        |   28 +
 .../gdb.ada/non-ascii-latin-1/pack.ads        |   21 +
 .../gdb.ada/non-ascii-latin-1/prog.adb        |   23 +
 gdb/testsuite/gdb.ada/non-ascii-latin-3.exp   |   50 +
 .../gdb.ada/non-ascii-latin-3/pack.adb        |   28 +
 .../gdb.ada/non-ascii-latin-3/pack.ads        |   21 +
 .../gdb.ada/non-ascii-latin-3/prog.adb        |   24 +
 gdb/testsuite/gdb.ada/non-ascii-utf-8.exp     |   57 +
 .../gdb.ada/non-ascii-utf-8/pack.adb          |   43 +
 .../gdb.ada/non-ascii-utf-8/pack.ads          |   24 +
 .../gdb.ada/non-ascii-utf-8/prog.adb          |   36 +
 20 files changed, 2280 insertions(+), 21 deletions(-)
 create mode 100644 gdb/ada-casefold.h
 create mode 100755 gdb/ada-unicode.py
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-1.exp
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.adb
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.ads
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-1/prog.adb
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-3.exp
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.adb
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.ads
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-latin-3/prog.adb
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-utf-8.exp
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.adb
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.ads
 create mode 100644 gdb/testsuite/gdb.ada/non-ascii-utf-8/prog.adb

diff --git a/gdb/NEWS b/gdb/NEWS
index dc2cac1871b..068a6c46bc8 100644
--- a/gdb/NEWS
+++ b/gdb/NEWS
@@ -111,6 +111,12 @@ show style disassembler enabled
   package is available, then, when this setting is on, disassembler
   output will have styling applied.
 
+set ada source-charset
+show ada source-charset
+  Set the character set encoding that is assumed for Ada symbols.  Valid
+  values for this follow the values that can be passed to the GNAT
+  compiler via the '-gnati' option.  The default is ISO-8859-1.
+
 * Changed commands
 
 maint packet
diff --git a/gdb/ada-casefold.h b/gdb/ada-casefold.h
new file mode 100644
index 00000000000..fcee6e07f7d
--- /dev/null
+++ b/gdb/ada-casefold.h
@@ -0,0 +1,1343 @@
+/* *INDENT-OFF* */ /* THIS FILE IS GENERATED -*- buffer-read-only: t -*- */
+/* vi:set ro: */
+
+/* UTF-32 case-folding for GDB
+
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* This file was created with the aid of ``ada-unicode.py''.  */
+
+   {65, 90, 0, 32},
+   {97, 122, -32, 0},
+   {181, 181, 743, 0},
+   {192, 214, 0, 32},
+   {216, 222, 0, 32},
+   {224, 246, -32, 0},
+   {248, 254, -32, 0},
+   {255, 255, 121, 0},
+   {256, 256, 0, 1},
+   {257, 257, -1, 0},
+   {258, 258, 0, 1},
+   {259, 259, -1, 0},
+   {260, 260, 0, 1},
+   {261, 261, -1, 0},
+   {262, 262, 0, 1},
+   {263, 263, -1, 0},
+   {264, 264, 0, 1},
+   {265, 265, -1, 0},
+   {266, 266, 0, 1},
+   {267, 267, -1, 0},
+   {268, 268, 0, 1},
+   {269, 269, -1, 0},
+   {270, 270, 0, 1},
+   {271, 271, -1, 0},
+   {272, 272, 0, 1},
+   {273, 273, -1, 0},
+   {274, 274, 0, 1},
+   {275, 275, -1, 0},
+   {276, 276, 0, 1},
+   {277, 277, -1, 0},
+   {278, 278, 0, 1},
+   {279, 279, -1, 0},
+   {280, 280, 0, 1},
+   {281, 281, -1, 0},
+   {282, 282, 0, 1},
+   {283, 283, -1, 0},
+   {284, 284, 0, 1},
+   {285, 285, -1, 0},
+   {286, 286, 0, 1},
+   {287, 287, -1, 0},
+   {288, 288, 0, 1},
+   {289, 289, -1, 0},
+   {290, 290, 0, 1},
+   {291, 291, -1, 0},
+   {292, 292, 0, 1},
+   {293, 293, -1, 0},
+   {294, 294, 0, 1},
+   {295, 295, -1, 0},
+   {296, 296, 0, 1},
+   {297, 297, -1, 0},
+   {298, 298, 0, 1},
+   {299, 299, -1, 0},
+   {300, 300, 0, 1},
+   {301, 301, -1, 0},
+   {302, 302, 0, 1},
+   {303, 303, -1, 0},
+   {305, 305, -232, 0},
+   {306, 306, 0, 1},
+   {307, 307, -1, 0},
+   {308, 308, 0, 1},
+   {309, 309, -1, 0},
+   {310, 310, 0, 1},
+   {311, 311, -1, 0},
+   {313, 313, 0, 1},
+   {314, 314, -1, 0},
+   {315, 315, 0, 1},
+   {316, 316, -1, 0},
+   {317, 317, 0, 1},
+   {318, 318, -1, 0},
+   {319, 319, 0, 1},
+   {320, 320, -1, 0},
+   {321, 321, 0, 1},
+   {322, 322, -1, 0},
+   {323, 323, 0, 1},
+   {324, 324, -1, 0},
+   {325, 325, 0, 1},
+   {326, 326, -1, 0},
+   {327, 327, 0, 1},
+   {328, 328, -1, 0},
+   {330, 330, 0, 1},
+   {331, 331, -1, 0},
+   {332, 332, 0, 1},
+   {333, 333, -1, 0},
+   {334, 334, 0, 1},
+   {335, 335, -1, 0},
+   {336, 336, 0, 1},
+   {337, 337, -1, 0},
+   {338, 338, 0, 1},
+   {339, 339, -1, 0},
+   {340, 340, 0, 1},
+   {341, 341, -1, 0},
+   {342, 342, 0, 1},
+   {343, 343, -1, 0},
+   {344, 344, 0, 1},
+   {345, 345, -1, 0},
+   {346, 346, 0, 1},
+   {347, 347, -1, 0},
+   {348, 348, 0, 1},
+   {349, 349, -1, 0},
+   {350, 350, 0, 1},
+   {351, 351, -1, 0},
+   {352, 352, 0, 1},
+   {353, 353, -1, 0},
+   {354, 354, 0, 1},
+   {355, 355, -1, 0},
+   {356, 356, 0, 1},
+   {357, 357, -1, 0},
+   {358, 358, 0, 1},
+   {359, 359, -1, 0},
+   {360, 360, 0, 1},
+   {361, 361, -1, 0},
+   {362, 362, 0, 1},
+   {363, 363, -1, 0},
+   {364, 364, 0, 1},
+   {365, 365, -1, 0},
+   {366, 366, 0, 1},
+   {367, 367, -1, 0},
+   {368, 368, 0, 1},
+   {369, 369, -1, 0},
+   {370, 370, 0, 1},
+   {371, 371, -1, 0},
+   {372, 372, 0, 1},
+   {373, 373, -1, 0},
+   {374, 374, 0, 1},
+   {375, 375, -1, 0},
+   {376, 376, 0, -121},
+   {377, 377, 0, 1},
+   {378, 378, -1, 0},
+   {379, 379, 0, 1},
+   {380, 380, -1, 0},
+   {381, 381, 0, 1},
+   {382, 382, -1, 0},
+   {383, 383, -300, 0},
+   {384, 384, 195, 0},
+   {385, 385, 0, 210},
+   {386, 386, 0, 1},
+   {387, 387, -1, 0},
+   {388, 388, 0, 1},
+   {389, 389, -1, 0},
+   {390, 390, 0, 206},
+   {391, 391, 0, 1},
+   {392, 392, -1, 0},
+   {393, 394, 0, 205},
+   {395, 395, 0, 1},
+   {396, 396, -1, 0},
+   {398, 398, 0, 79},
+   {399, 399, 0, 202},
+   {400, 400, 0, 203},
+   {401, 401, 0, 1},
+   {402, 402, -1, 0},
+   {403, 403, 0, 205},
+   {404, 404, 0, 207},
+   {405, 405, 97, 0},
+   {406, 406, 0, 211},
+   {407, 407, 0, 209},
+   {408, 408, 0, 1},
+   {409, 409, -1, 0},
+   {410, 410, 163, 0},
+   {412, 412, 0, 211},
+   {413, 413, 0, 213},
+   {414, 414, 130, 0},
+   {415, 415, 0, 214},
+   {416, 416, 0, 1},
+   {417, 417, -1, 0},
+   {418, 418, 0, 1},
+   {419, 419, -1, 0},
+   {420, 420, 0, 1},
+   {421, 421, -1, 0},
+   {422, 422, 0, 218},
+   {423, 423, 0, 1},
+   {424, 424, -1, 0},
+   {425, 425, 0, 218},
+   {428, 428, 0, 1},
+   {429, 429, -1, 0},
+   {430, 430, 0, 218},
+   {431, 431, 0, 1},
+   {432, 432, -1, 0},
+   {433, 434, 0, 217},
+   {435, 435, 0, 1},
+   {436, 436, -1, 0},
+   {437, 437, 0, 1},
+   {438, 438, -1, 0},
+   {439, 439, 0, 219},
+   {440, 440, 0, 1},
+   {441, 441, -1, 0},
+   {444, 444, 0, 1},
+   {445, 445, -1, 0},
+   {447, 447, 56, 0},
+   {452, 452, 0, 2},
+   {453, 453, -1, 1},
+   {454, 454, -2, 0},
+   {455, 455, 0, 2},
+   {456, 456, -1, 1},
+   {457, 457, -2, 0},
+   {458, 458, 0, 2},
+   {459, 459, -1, 1},
+   {460, 460, -2, 0},
+   {461, 461, 0, 1},
+   {462, 462, -1, 0},
+   {463, 463, 0, 1},
+   {464, 464, -1, 0},
+   {465, 465, 0, 1},
+   {466, 466, -1, 0},
+   {467, 467, 0, 1},
+   {468, 468, -1, 0},
+   {469, 469, 0, 1},
+   {470, 470, -1, 0},
+   {471, 471, 0, 1},
+   {472, 472, -1, 0},
+   {473, 473, 0, 1},
+   {474, 474, -1, 0},
+   {475, 475, 0, 1},
+   {476, 476, -1, 0},
+   {477, 477, -79, 0},
+   {478, 478, 0, 1},
+   {479, 479, -1, 0},
+   {480, 480, 0, 1},
+   {481, 481, -1, 0},
+   {482, 482, 0, 1},
+   {483, 483, -1, 0},
+   {484, 484, 0, 1},
+   {485, 485, -1, 0},
+   {486, 486, 0, 1},
+   {487, 487, -1, 0},
+   {488, 488, 0, 1},
+   {489, 489, -1, 0},
+   {490, 490, 0, 1},
+   {491, 491, -1, 0},
+   {492, 492, 0, 1},
+   {493, 493, -1, 0},
+   {494, 494, 0, 1},
+   {495, 495, -1, 0},
+   {497, 497, 0, 2},
+   {498, 498, -1, 1},
+   {499, 499, -2, 0},
+   {500, 500, 0, 1},
+   {501, 501, -1, 0},
+   {502, 502, 0, -97},
+   {503, 503, 0, -56},
+   {504, 504, 0, 1},
+   {505, 505, -1, 0},
+   {506, 506, 0, 1},
+   {507, 507, -1, 0},
+   {508, 508, 0, 1},
+   {509, 509, -1, 0},
+   {510, 510, 0, 1},
+   {511, 511, -1, 0},
+   {512, 512, 0, 1},
+   {513, 513, -1, 0},
+   {514, 514, 0, 1},
+   {515, 515, -1, 0},
+   {516, 516, 0, 1},
+   {517, 517, -1, 0},
+   {518, 518, 0, 1},
+   {519, 519, -1, 0},
+   {520, 520, 0, 1},
+   {521, 521, -1, 0},
+   {522, 522, 0, 1},
+   {523, 523, -1, 0},
+   {524, 524, 0, 1},
+   {525, 525, -1, 0},
+   {526, 526, 0, 1},
+   {527, 527, -1, 0},
+   {528, 528, 0, 1},
+   {529, 529, -1, 0},
+   {530, 530, 0, 1},
+   {531, 531, -1, 0},
+   {532, 532, 0, 1},
+   {533, 533, -1, 0},
+   {534, 534, 0, 1},
+   {535, 535, -1, 0},
+   {536, 536, 0, 1},
+   {537, 537, -1, 0},
+   {538, 538, 0, 1},
+   {539, 539, -1, 0},
+   {540, 540, 0, 1},
+   {541, 541, -1, 0},
+   {542, 542, 0, 1},
+   {543, 543, -1, 0},
+   {544, 544, 0, -130},
+   {546, 546, 0, 1},
+   {547, 547, -1, 0},
+   {548, 548, 0, 1},
+   {549, 549, -1, 0},
+   {550, 550, 0, 1},
+   {551, 551, -1, 0},
+   {552, 552, 0, 1},
+   {553, 553, -1, 0},
+   {554, 554, 0, 1},
+   {555, 555, -1, 0},
+   {556, 556, 0, 1},
+   {557, 557, -1, 0},
+   {558, 558, 0, 1},
+   {559, 559, -1, 0},
+   {560, 560, 0, 1},
+   {561, 561, -1, 0},
+   {562, 562, 0, 1},
+   {563, 563, -1, 0},
+   {570, 570, 0, 10795},
+   {571, 571, 0, 1},
+   {572, 572, -1, 0},
+   {573, 573, 0, -163},
+   {574, 574, 0, 10792},
+   {575, 576, 10815, 0},
+   {577, 577, 0, 1},
+   {578, 578, -1, 0},
+   {579, 579, 0, -195},
+   {580, 580, 0, 69},
+   {581, 581, 0, 71},
+   {582, 582, 0, 1},
+   {583, 583, -1, 0},
+   {584, 584, 0, 1},
+   {585, 585, -1, 0},
+   {586, 586, 0, 1},
+   {587, 587, -1, 0},
+   {588, 588, 0, 1},
+   {589, 589, -1, 0},
+   {590, 590, 0, 1},
+   {591, 591, -1, 0},
+   {592, 592, 10783, 0},
+   {593, 593, 10780, 0},
+   {594, 594, 10782, 0},
+   {595, 595, -210, 0},
+   {596, 596, -206, 0},
+   {598, 599, -205, 0},
+   {601, 601, -202, 0},
+   {603, 603, -203, 0},
+   {604, 604, 42319, 0},
+   {608, 608, -205, 0},
+   {609, 609, 42315, 0},
+   {611, 611, -207, 0},
+   {613, 613, 42280, 0},
+   {614, 614, 42308, 0},
+   {616, 616, -209, 0},
+   {617, 617, -211, 0},
+   {618, 618, 42308, 0},
+   {619, 619, 10743, 0},
+   {620, 620, 42305, 0},
+   {623, 623, -211, 0},
+   {625, 625, 10749, 0},
+   {626, 626, -213, 0},
+   {629, 629, -214, 0},
+   {637, 637, 10727, 0},
+   {640, 640, -218, 0},
+   {642, 642, 42307, 0},
+   {643, 643, -218, 0},
+   {647, 647, 42282, 0},
+   {648, 648, -218, 0},
+   {649, 649, -69, 0},
+   {650, 651, -217, 0},
+   {652, 652, -71, 0},
+   {658, 658, -219, 0},
+   {669, 669, 42261, 0},
+   {670, 670, 42258, 0},
+   {837, 837, 84, 0},
+   {880, 880, 0, 1},
+   {881, 881, -1, 0},
+   {882, 882, 0, 1},
+   {883, 883, -1, 0},
+   {886, 886, 0, 1},
+   {887, 887, -1, 0},
+   {891, 893, 130, 0},
+   {895, 895, 0, 116},
+   {902, 902, 0, 38},
+   {904, 906, 0, 37},
+   {908, 908, 0, 64},
+   {910, 911, 0, 63},
+   {913, 929, 0, 32},
+   {931, 939, 0, 32},
+   {940, 940, -38, 0},
+   {941, 943, -37, 0},
+   {945, 961, -32, 0},
+   {962, 962, -31, 0},
+   {963, 971, -32, 0},
+   {972, 972, -64, 0},
+   {973, 974, -63, 0},
+   {975, 975, 0, 8},
+   {976, 976, -62, 0},
+   {977, 977, -57, 0},
+   {981, 981, -47, 0},
+   {982, 982, -54, 0},
+   {983, 983, -8, 0},
+   {984, 984, 0, 1},
+   {985, 985, -1, 0},
+   {986, 986, 0, 1},
+   {987, 987, -1, 0},
+   {988, 988, 0, 1},
+   {989, 989, -1, 0},
+   {990, 990, 0, 1},
+   {991, 991, -1, 0},
+   {992, 992, 0, 1},
+   {993, 993, -1, 0},
+   {994, 994, 0, 1},
+   {995, 995, -1, 0},
+   {996, 996, 0, 1},
+   {997, 997, -1, 0},
+   {998, 998, 0, 1},
+   {999, 999, -1, 0},
+   {1000, 1000, 0, 1},
+   {1001, 1001, -1, 0},
+   {1002, 1002, 0, 1},
+   {1003, 1003, -1, 0},
+   {1004, 1004, 0, 1},
+   {1005, 1005, -1, 0},
+   {1006, 1006, 0, 1},
+   {1007, 1007, -1, 0},
+   {1008, 1008, -86, 0},
+   {1009, 1009, -80, 0},
+   {1010, 1010, 7, 0},
+   {1011, 1011, -116, 0},
+   {1012, 1012, 0, -60},
+   {1013, 1013, -96, 0},
+   {1015, 1015, 0, 1},
+   {1016, 1016, -1, 0},
+   {1017, 1017, 0, -7},
+   {1018, 1018, 0, 1},
+   {1019, 1019, -1, 0},
+   {1021, 1023, 0, -130},
+   {1024, 1039, 0, 80},
+   {1040, 1071, 0, 32},
+   {1072, 1103, -32, 0},
+   {1104, 1119, -80, 0},
+   {1120, 1120, 0, 1},
+   {1121, 1121, -1, 0},
+   {1122, 1122, 0, 1},
+   {1123, 1123, -1, 0},
+   {1124, 1124, 0, 1},
+   {1125, 1125, -1, 0},
+   {1126, 1126, 0, 1},
+   {1127, 1127, -1, 0},
+   {1128, 1128, 0, 1},
+   {1129, 1129, -1, 0},
+   {1130, 1130, 0, 1},
+   {1131, 1131, -1, 0},
+   {1132, 1132, 0, 1},
+   {1133, 1133, -1, 0},
+   {1134, 1134, 0, 1},
+   {1135, 1135, -1, 0},
+   {1136, 1136, 0, 1},
+   {1137, 1137, -1, 0},
+   {1138, 1138, 0, 1},
+   {1139, 1139, -1, 0},
+   {1140, 1140, 0, 1},
+   {1141, 1141, -1, 0},
+   {1142, 1142, 0, 1},
+   {1143, 1143, -1, 0},
+   {1144, 1144, 0, 1},
+   {1145, 1145, -1, 0},
+   {1146, 1146, 0, 1},
+   {1147, 1147, -1, 0},
+   {1148, 1148, 0, 1},
+   {1149, 1149, -1, 0},
+   {1150, 1150, 0, 1},
+   {1151, 1151, -1, 0},
+   {1152, 1152, 0, 1},
+   {1153, 1153, -1, 0},
+   {1162, 1162, 0, 1},
+   {1163, 1163, -1, 0},
+   {1164, 1164, 0, 1},
+   {1165, 1165, -1, 0},
+   {1166, 1166, 0, 1},
+   {1167, 1167, -1, 0},
+   {1168, 1168, 0, 1},
+   {1169, 1169, -1, 0},
+   {1170, 1170, 0, 1},
+   {1171, 1171, -1, 0},
+   {1172, 1172, 0, 1},
+   {1173, 1173, -1, 0},
+   {1174, 1174, 0, 1},
+   {1175, 1175, -1, 0},
+   {1176, 1176, 0, 1},
+   {1177, 1177, -1, 0},
+   {1178, 1178, 0, 1},
+   {1179, 1179, -1, 0},
+   {1180, 1180, 0, 1},
+   {1181, 1181, -1, 0},
+   {1182, 1182, 0, 1},
+   {1183, 1183, -1, 0},
+   {1184, 1184, 0, 1},
+   {1185, 1185, -1, 0},
+   {1186, 1186, 0, 1},
+   {1187, 1187, -1, 0},
+   {1188, 1188, 0, 1},
+   {1189, 1189, -1, 0},
+   {1190, 1190, 0, 1},
+   {1191, 1191, -1, 0},
+   {1192, 1192, 0, 1},
+   {1193, 1193, -1, 0},
+   {1194, 1194, 0, 1},
+   {1195, 1195, -1, 0},
+   {1196, 1196, 0, 1},
+   {1197, 1197, -1, 0},
+   {1198, 1198, 0, 1},
+   {1199, 1199, -1, 0},
+   {1200, 1200, 0, 1},
+   {1201, 1201, -1, 0},
+   {1202, 1202, 0, 1},
+   {1203, 1203, -1, 0},
+   {1204, 1204, 0, 1},
+   {1205, 1205, -1, 0},
+   {1206, 1206, 0, 1},
+   {1207, 1207, -1, 0},
+   {1208, 1208, 0, 1},
+   {1209, 1209, -1, 0},
+   {1210, 1210, 0, 1},
+   {1211, 1211, -1, 0},
+   {1212, 1212, 0, 1},
+   {1213, 1213, -1, 0},
+   {1214, 1214, 0, 1},
+   {1215, 1215, -1, 0},
+   {1216, 1216, 0, 15},
+   {1217, 1217, 0, 1},
+   {1218, 1218, -1, 0},
+   {1219, 1219, 0, 1},
+   {1220, 1220, -1, 0},
+   {1221, 1221, 0, 1},
+   {1222, 1222, -1, 0},
+   {1223, 1223, 0, 1},
+   {1224, 1224, -1, 0},
+   {1225, 1225, 0, 1},
+   {1226, 1226, -1, 0},
+   {1227, 1227, 0, 1},
+   {1228, 1228, -1, 0},
+   {1229, 1229, 0, 1},
+   {1230, 1230, -1, 0},
+   {1231, 1231, -15, 0},
+   {1232, 1232, 0, 1},
+   {1233, 1233, -1, 0},
+   {1234, 1234, 0, 1},
+   {1235, 1235, -1, 0},
+   {1236, 1236, 0, 1},
+   {1237, 1237, -1, 0},
+   {1238, 1238, 0, 1},
+   {1239, 1239, -1, 0},
+   {1240, 1240, 0, 1},
+   {1241, 1241, -1, 0},
+   {1242, 1242, 0, 1},
+   {1243, 1243, -1, 0},
+   {1244, 1244, 0, 1},
+   {1245, 1245, -1, 0},
+   {1246, 1246, 0, 1},
+   {1247, 1247, -1, 0},
+   {1248, 1248, 0, 1},
+   {1249, 1249, -1, 0},
+   {1250, 1250, 0, 1},
+   {1251, 1251, -1, 0},
+   {1252, 1252, 0, 1},
+   {1253, 1253, -1, 0},
+   {1254, 1254, 0, 1},
+   {1255, 1255, -1, 0},
+   {1256, 1256, 0, 1},
+   {1257, 1257, -1, 0},
+   {1258, 1258, 0, 1},
+   {1259, 1259, -1, 0},
+   {1260, 1260, 0, 1},
+   {1261, 1261, -1, 0},
+   {1262, 1262, 0, 1},
+   {1263, 1263, -1, 0},
+   {1264, 1264, 0, 1},
+   {1265, 1265, -1, 0},
+   {1266, 1266, 0, 1},
+   {1267, 1267, -1, 0},
+   {1268, 1268, 0, 1},
+   {1269, 1269, -1, 0},
+   {1270, 1270, 0, 1},
+   {1271, 1271, -1, 0},
+   {1272, 1272, 0, 1},
+   {1273, 1273, -1, 0},
+   {1274, 1274, 0, 1},
+   {1275, 1275, -1, 0},
+   {1276, 1276, 0, 1},
+   {1277, 1277, -1, 0},
+   {1278, 1278, 0, 1},
+   {1279, 1279, -1, 0},
+   {1280, 1280, 0, 1},
+   {1281, 1281, -1, 0},
+   {1282, 1282, 0, 1},
+   {1283, 1283, -1, 0},
+   {1284, 1284, 0, 1},
+   {1285, 1285, -1, 0},
+   {1286, 1286, 0, 1},
+   {1287, 1287, -1, 0},
+   {1288, 1288, 0, 1},
+   {1289, 1289, -1, 0},
+   {1290, 1290, 0, 1},
+   {1291, 1291, -1, 0},
+   {1292, 1292, 0, 1},
+   {1293, 1293, -1, 0},
+   {1294, 1294, 0, 1},
+   {1295, 1295, -1, 0},
+   {1296, 1296, 0, 1},
+   {1297, 1297, -1, 0},
+   {1298, 1298, 0, 1},
+   {1299, 1299, -1, 0},
+   {1300, 1300, 0, 1},
+   {1301, 1301, -1, 0},
+   {1302, 1302, 0, 1},
+   {1303, 1303, -1, 0},
+   {1304, 1304, 0, 1},
+   {1305, 1305, -1, 0},
+   {1306, 1306, 0, 1},
+   {1307, 1307, -1, 0},
+   {1308, 1308, 0, 1},
+   {1309, 1309, -1, 0},
+   {1310, 1310, 0, 1},
+   {1311, 1311, -1, 0},
+   {1312, 1312, 0, 1},
+   {1313, 1313, -1, 0},
+   {1314, 1314, 0, 1},
+   {1315, 1315, -1, 0},
+   {1316, 1316, 0, 1},
+   {1317, 1317, -1, 0},
+   {1318, 1318, 0, 1},
+   {1319, 1319, -1, 0},
+   {1320, 1320, 0, 1},
+   {1321, 1321, -1, 0},
+   {1322, 1322, 0, 1},
+   {1323, 1323, -1, 0},
+   {1324, 1324, 0, 1},
+   {1325, 1325, -1, 0},
+   {1326, 1326, 0, 1},
+   {1327, 1327, -1, 0},
+   {1329, 1366, 0, 48},
+   {1377, 1414, -48, 0},
+   {4256, 4293, 0, 7264},
+   {4295, 4295, 0, 7264},
+   {4301, 4301, 0, 7264},
+   {4304, 4346, 3008, 0},
+   {4349, 4351, 3008, 0},
+   {5024, 5103, 0, 38864},
+   {5104, 5109, 0, 8},
+   {5112, 5117, -8, 0},
+   {7296, 7296, -6254, 0},
+   {7297, 7297, -6253, 0},
+   {7298, 7298, -6244, 0},
+   {7299, 7300, -6242, 0},
+   {7301, 7301, -6243, 0},
+   {7302, 7302, -6236, 0},
+   {7303, 7303, -6181, 0},
+   {7304, 7304, 35266, 0},
+   {7312, 7354, 0, -3008},
+   {7357, 7359, 0, -3008},
+   {7545, 7545, 35332, 0},
+   {7549, 7549, 3814, 0},
+   {7566, 7566, 35384, 0},
+   {7680, 7680, 0, 1},
+   {7681, 7681, -1, 0},
+   {7682, 7682, 0, 1},
+   {7683, 7683, -1, 0},
+   {7684, 7684, 0, 1},
+   {7685, 7685, -1, 0},
+   {7686, 7686, 0, 1},
+   {7687, 7687, -1, 0},
+   {7688, 7688, 0, 1},
+   {7689, 7689, -1, 0},
+   {7690, 7690, 0, 1},
+   {7691, 7691, -1, 0},
+   {7692, 7692, 0, 1},
+   {7693, 7693, -1, 0},
+   {7694, 7694, 0, 1},
+   {7695, 7695, -1, 0},
+   {7696, 7696, 0, 1},
+   {7697, 7697, -1, 0},
+   {7698, 7698, 0, 1},
+   {7699, 7699, -1, 0},
+   {7700, 7700, 0, 1},
+   {7701, 7701, -1, 0},
+   {7702, 7702, 0, 1},
+   {7703, 7703, -1, 0},
+   {7704, 7704, 0, 1},
+   {7705, 7705, -1, 0},
+   {7706, 7706, 0, 1},
+   {7707, 7707, -1, 0},
+   {7708, 7708, 0, 1},
+   {7709, 7709, -1, 0},
+   {7710, 7710, 0, 1},
+   {7711, 7711, -1, 0},
+   {7712, 7712, 0, 1},
+   {7713, 7713, -1, 0},
+   {7714, 7714, 0, 1},
+   {7715, 7715, -1, 0},
+   {7716, 7716, 0, 1},
+   {7717, 7717, -1, 0},
+   {7718, 7718, 0, 1},
+   {7719, 7719, -1, 0},
+   {7720, 7720, 0, 1},
+   {7721, 7721, -1, 0},
+   {7722, 7722, 0, 1},
+   {7723, 7723, -1, 0},
+   {7724, 7724, 0, 1},
+   {7725, 7725, -1, 0},
+   {7726, 7726, 0, 1},
+   {7727, 7727, -1, 0},
+   {7728, 7728, 0, 1},
+   {7729, 7729, -1, 0},
+   {7730, 7730, 0, 1},
+   {7731, 7731, -1, 0},
+   {7732, 7732, 0, 1},
+   {7733, 7733, -1, 0},
+   {7734, 7734, 0, 1},
+   {7735, 7735, -1, 0},
+   {7736, 7736, 0, 1},
+   {7737, 7737, -1, 0},
+   {7738, 7738, 0, 1},
+   {7739, 7739, -1, 0},
+   {7740, 7740, 0, 1},
+   {7741, 7741, -1, 0},
+   {7742, 7742, 0, 1},
+   {7743, 7743, -1, 0},
+   {7744, 7744, 0, 1},
+   {7745, 7745, -1, 0},
+   {7746, 7746, 0, 1},
+   {7747, 7747, -1, 0},
+   {7748, 7748, 0, 1},
+   {7749, 7749, -1, 0},
+   {7750, 7750, 0, 1},
+   {7751, 7751, -1, 0},
+   {7752, 7752, 0, 1},
+   {7753, 7753, -1, 0},
+   {7754, 7754, 0, 1},
+   {7755, 7755, -1, 0},
+   {7756, 7756, 0, 1},
+   {7757, 7757, -1, 0},
+   {7758, 7758, 0, 1},
+   {7759, 7759, -1, 0},
+   {7760, 7760, 0, 1},
+   {7761, 7761, -1, 0},
+   {7762, 7762, 0, 1},
+   {7763, 7763, -1, 0},
+   {7764, 7764, 0, 1},
+   {7765, 7765, -1, 0},
+   {7766, 7766, 0, 1},
+   {7767, 7767, -1, 0},
+   {7768, 7768, 0, 1},
+   {7769, 7769, -1, 0},
+   {7770, 7770, 0, 1},
+   {7771, 7771, -1, 0},
+   {7772, 7772, 0, 1},
+   {7773, 7773, -1, 0},
+   {7774, 7774, 0, 1},
+   {7775, 7775, -1, 0},
+   {7776, 7776, 0, 1},
+   {7777, 7777, -1, 0},
+   {7778, 7778, 0, 1},
+   {7779, 7779, -1, 0},
+   {7780, 7780, 0, 1},
+   {7781, 7781, -1, 0},
+   {7782, 7782, 0, 1},
+   {7783, 7783, -1, 0},
+   {7784, 7784, 0, 1},
+   {7785, 7785, -1, 0},
+   {7786, 7786, 0, 1},
+   {7787, 7787, -1, 0},
+   {7788, 7788, 0, 1},
+   {7789, 7789, -1, 0},
+   {7790, 7790, 0, 1},
+   {7791, 7791, -1, 0},
+   {7792, 7792, 0, 1},
+   {7793, 7793, -1, 0},
+   {7794, 7794, 0, 1},
+   {7795, 7795, -1, 0},
+   {7796, 7796, 0, 1},
+   {7797, 7797, -1, 0},
+   {7798, 7798, 0, 1},
+   {7799, 7799, -1, 0},
+   {7800, 7800, 0, 1},
+   {7801, 7801, -1, 0},
+   {7802, 7802, 0, 1},
+   {7803, 7803, -1, 0},
+   {7804, 7804, 0, 1},
+   {7805, 7805, -1, 0},
+   {7806, 7806, 0, 1},
+   {7807, 7807, -1, 0},
+   {7808, 7808, 0, 1},
+   {7809, 7809, -1, 0},
+   {7810, 7810, 0, 1},
+   {7811, 7811, -1, 0},
+   {7812, 7812, 0, 1},
+   {7813, 7813, -1, 0},
+   {7814, 7814, 0, 1},
+   {7815, 7815, -1, 0},
+   {7816, 7816, 0, 1},
+   {7817, 7817, -1, 0},
+   {7818, 7818, 0, 1},
+   {7819, 7819, -1, 0},
+   {7820, 7820, 0, 1},
+   {7821, 7821, -1, 0},
+   {7822, 7822, 0, 1},
+   {7823, 7823, -1, 0},
+   {7824, 7824, 0, 1},
+   {7825, 7825, -1, 0},
+   {7826, 7826, 0, 1},
+   {7827, 7827, -1, 0},
+   {7828, 7828, 0, 1},
+   {7829, 7829, -1, 0},
+   {7835, 7835, -59, 0},
+   {7838, 7838, 0, -7615},
+   {7840, 7840, 0, 1},
+   {7841, 7841, -1, 0},
+   {7842, 7842, 0, 1},
+   {7843, 7843, -1, 0},
+   {7844, 7844, 0, 1},
+   {7845, 7845, -1, 0},
+   {7846, 7846, 0, 1},
+   {7847, 7847, -1, 0},
+   {7848, 7848, 0, 1},
+   {7849, 7849, -1, 0},
+   {7850, 7850, 0, 1},
+   {7851, 7851, -1, 0},
+   {7852, 7852, 0, 1},
+   {7853, 7853, -1, 0},
+   {7854, 7854, 0, 1},
+   {7855, 7855, -1, 0},
+   {7856, 7856, 0, 1},
+   {7857, 7857, -1, 0},
+   {7858, 7858, 0, 1},
+   {7859, 7859, -1, 0},
+   {7860, 7860, 0, 1},
+   {7861, 7861, -1, 0},
+   {7862, 7862, 0, 1},
+   {7863, 7863, -1, 0},
+   {7864, 7864, 0, 1},
+   {7865, 7865, -1, 0},
+   {7866, 7866, 0, 1},
+   {7867, 7867, -1, 0},
+   {7868, 7868, 0, 1},
+   {7869, 7869, -1, 0},
+   {7870, 7870, 0, 1},
+   {7871, 7871, -1, 0},
+   {7872, 7872, 0, 1},
+   {7873, 7873, -1, 0},
+   {7874, 7874, 0, 1},
+   {7875, 7875, -1, 0},
+   {7876, 7876, 0, 1},
+   {7877, 7877, -1, 0},
+   {7878, 7878, 0, 1},
+   {7879, 7879, -1, 0},
+   {7880, 7880, 0, 1},
+   {7881, 7881, -1, 0},
+   {7882, 7882, 0, 1},
+   {7883, 7883, -1, 0},
+   {7884, 7884, 0, 1},
+   {7885, 7885, -1, 0},
+   {7886, 7886, 0, 1},
+   {7887, 7887, -1, 0},
+   {7888, 7888, 0, 1},
+   {7889, 7889, -1, 0},
+   {7890, 7890, 0, 1},
+   {7891, 7891, -1, 0},
+   {7892, 7892, 0, 1},
+   {7893, 7893, -1, 0},
+   {7894, 7894, 0, 1},
+   {7895, 7895, -1, 0},
+   {7896, 7896, 0, 1},
+   {7897, 7897, -1, 0},
+   {7898, 7898, 0, 1},
+   {7899, 7899, -1, 0},
+   {7900, 7900, 0, 1},
+   {7901, 7901, -1, 0},
+   {7902, 7902, 0, 1},
+   {7903, 7903, -1, 0},
+   {7904, 7904, 0, 1},
+   {7905, 7905, -1, 0},
+   {7906, 7906, 0, 1},
+   {7907, 7907, -1, 0},
+   {7908, 7908, 0, 1},
+   {7909, 7909, -1, 0},
+   {7910, 7910, 0, 1},
+   {7911, 7911, -1, 0},
+   {7912, 7912, 0, 1},
+   {7913, 7913, -1, 0},
+   {7914, 7914, 0, 1},
+   {7915, 7915, -1, 0},
+   {7916, 7916, 0, 1},
+   {7917, 7917, -1, 0},
+   {7918, 7918, 0, 1},
+   {7919, 7919, -1, 0},
+   {7920, 7920, 0, 1},
+   {7921, 7921, -1, 0},
+   {7922, 7922, 0, 1},
+   {7923, 7923, -1, 0},
+   {7924, 7924, 0, 1},
+   {7925, 7925, -1, 0},
+   {7926, 7926, 0, 1},
+   {7927, 7927, -1, 0},
+   {7928, 7928, 0, 1},
+   {7929, 7929, -1, 0},
+   {7930, 7930, 0, 1},
+   {7931, 7931, -1, 0},
+   {7932, 7932, 0, 1},
+   {7933, 7933, -1, 0},
+   {7934, 7934, 0, 1},
+   {7935, 7935, -1, 0},
+   {7936, 7943, 8, 0},
+   {7944, 7951, 0, -8},
+   {7952, 7957, 8, 0},
+   {7960, 7965, 0, -8},
+   {7968, 7975, 8, 0},
+   {7976, 7983, 0, -8},
+   {7984, 7991, 8, 0},
+   {7992, 7999, 0, -8},
+   {8000, 8005, 8, 0},
+   {8008, 8013, 0, -8},
+   {8017, 8017, 8, 0},
+   {8019, 8019, 8, 0},
+   {8021, 8021, 8, 0},
+   {8023, 8023, 8, 0},
+   {8025, 8025, 0, -8},
+   {8027, 8027, 0, -8},
+   {8029, 8029, 0, -8},
+   {8031, 8031, 0, -8},
+   {8032, 8039, 8, 0},
+   {8040, 8047, 0, -8},
+   {8048, 8049, 74, 0},
+   {8050, 8053, 86, 0},
+   {8054, 8055, 100, 0},
+   {8056, 8057, 128, 0},
+   {8058, 8059, 112, 0},
+   {8060, 8061, 126, 0},
+   {8112, 8113, 8, 0},
+   {8120, 8121, 0, -8},
+   {8122, 8123, 0, -74},
+   {8126, 8126, -7205, 0},
+   {8136, 8139, 0, -86},
+   {8144, 8145, 8, 0},
+   {8152, 8153, 0, -8},
+   {8154, 8155, 0, -100},
+   {8160, 8161, 8, 0},
+   {8165, 8165, 7, 0},
+   {8168, 8169, 0, -8},
+   {8170, 8171, 0, -112},
+   {8172, 8172, 0, -7},
+   {8184, 8185, 0, -128},
+   {8186, 8187, 0, -126},
+   {8486, 8486, 0, -7517},
+   {8490, 8490, 0, -8383},
+   {8491, 8491, 0, -8262},
+   {8498, 8498, 0, 28},
+   {8526, 8526, -28, 0},
+   {8544, 8559, 0, 16},
+   {8560, 8575, -16, 0},
+   {8579, 8579, 0, 1},
+   {8580, 8580, -1, 0},
+   {9398, 9423, 0, 26},
+   {9424, 9449, -26, 0},
+   {11264, 11310, 0, 48},
+   {11312, 11358, -48, 0},
+   {11360, 11360, 0, 1},
+   {11361, 11361, -1, 0},
+   {11362, 11362, 0, -10743},
+   {11363, 11363, 0, -3814},
+   {11364, 11364, 0, -10727},
+   {11365, 11365, -10795, 0},
+   {11366, 11366, -10792, 0},
+   {11367, 11367, 0, 1},
+   {11368, 11368, -1, 0},
+   {11369, 11369, 0, 1},
+   {11370, 11370, -1, 0},
+   {11371, 11371, 0, 1},
+   {11372, 11372, -1, 0},
+   {11373, 11373, 0, -10780},
+   {11374, 11374, 0, -10749},
+   {11375, 11375, 0, -10783},
+   {11376, 11376, 0, -10782},
+   {11378, 11378, 0, 1},
+   {11379, 11379, -1, 0},
+   {11381, 11381, 0, 1},
+   {11382, 11382, -1, 0},
+   {11390, 11391, 0, -10815},
+   {11392, 11392, 0, 1},
+   {11393, 11393, -1, 0},
+   {11394, 11394, 0, 1},
+   {11395, 11395, -1, 0},
+   {11396, 11396, 0, 1},
+   {11397, 11397, -1, 0},
+   {11398, 11398, 0, 1},
+   {11399, 11399, -1, 0},
+   {11400, 11400, 0, 1},
+   {11401, 11401, -1, 0},
+   {11402, 11402, 0, 1},
+   {11403, 11403, -1, 0},
+   {11404, 11404, 0, 1},
+   {11405, 11405, -1, 0},
+   {11406, 11406, 0, 1},
+   {11407, 11407, -1, 0},
+   {11408, 11408, 0, 1},
+   {11409, 11409, -1, 0},
+   {11410, 11410, 0, 1},
+   {11411, 11411, -1, 0},
+   {11412, 11412, 0, 1},
+   {11413, 11413, -1, 0},
+   {11414, 11414, 0, 1},
+   {11415, 11415, -1, 0},
+   {11416, 11416, 0, 1},
+   {11417, 11417, -1, 0},
+   {11418, 11418, 0, 1},
+   {11419, 11419, -1, 0},
+   {11420, 11420, 0, 1},
+   {11421, 11421, -1, 0},
+   {11422, 11422, 0, 1},
+   {11423, 11423, -1, 0},
+   {11424, 11424, 0, 1},
+   {11425, 11425, -1, 0},
+   {11426, 11426, 0, 1},
+   {11427, 11427, -1, 0},
+   {11428, 11428, 0, 1},
+   {11429, 11429, -1, 0},
+   {11430, 11430, 0, 1},
+   {11431, 11431, -1, 0},
+   {11432, 11432, 0, 1},
+   {11433, 11433, -1, 0},
+   {11434, 11434, 0, 1},
+   {11435, 11435, -1, 0},
+   {11436, 11436, 0, 1},
+   {11437, 11437, -1, 0},
+   {11438, 11438, 0, 1},
+   {11439, 11439, -1, 0},
+   {11440, 11440, 0, 1},
+   {11441, 11441, -1, 0},
+   {11442, 11442, 0, 1},
+   {11443, 11443, -1, 0},
+   {11444, 11444, 0, 1},
+   {11445, 11445, -1, 0},
+   {11446, 11446, 0, 1},
+   {11447, 11447, -1, 0},
+   {11448, 11448, 0, 1},
+   {11449, 11449, -1, 0},
+   {11450, 11450, 0, 1},
+   {11451, 11451, -1, 0},
+   {11452, 11452, 0, 1},
+   {11453, 11453, -1, 0},
+   {11454, 11454, 0, 1},
+   {11455, 11455, -1, 0},
+   {11456, 11456, 0, 1},
+   {11457, 11457, -1, 0},
+   {11458, 11458, 0, 1},
+   {11459, 11459, -1, 0},
+   {11460, 11460, 0, 1},
+   {11461, 11461, -1, 0},
+   {11462, 11462, 0, 1},
+   {11463, 11463, -1, 0},
+   {11464, 11464, 0, 1},
+   {11465, 11465, -1, 0},
+   {11466, 11466, 0, 1},
+   {11467, 11467, -1, 0},
+   {11468, 11468, 0, 1},
+   {11469, 11469, -1, 0},
+   {11470, 11470, 0, 1},
+   {11471, 11471, -1, 0},
+   {11472, 11472, 0, 1},
+   {11473, 11473, -1, 0},
+   {11474, 11474, 0, 1},
+   {11475, 11475, -1, 0},
+   {11476, 11476, 0, 1},
+   {11477, 11477, -1, 0},
+   {11478, 11478, 0, 1},
+   {11479, 11479, -1, 0},
+   {11480, 11480, 0, 1},
+   {11481, 11481, -1, 0},
+   {11482, 11482, 0, 1},
+   {11483, 11483, -1, 0},
+   {11484, 11484, 0, 1},
+   {11485, 11485, -1, 0},
+   {11486, 11486, 0, 1},
+   {11487, 11487, -1, 0},
+   {11488, 11488, 0, 1},
+   {11489, 11489, -1, 0},
+   {11490, 11490, 0, 1},
+   {11491, 11491, -1, 0},
+   {11499, 11499, 0, 1},
+   {11500, 11500, -1, 0},
+   {11501, 11501, 0, 1},
+   {11502, 11502, -1, 0},
+   {11506, 11506, 0, 1},
+   {11507, 11507, -1, 0},
+   {11520, 11557, -7264, 0},
+   {11559, 11559, -7264, 0},
+   {11565, 11565, -7264, 0},
+   {42560, 42560, 0, 1},
+   {42561, 42561, -1, 0},
+   {42562, 42562, 0, 1},
+   {42563, 42563, -1, 0},
+   {42564, 42564, 0, 1},
+   {42565, 42565, -1, 0},
+   {42566, 42566, 0, 1},
+   {42567, 42567, -1, 0},
+   {42568, 42568, 0, 1},
+   {42569, 42569, -1, 0},
+   {42570, 42570, 0, 1},
+   {42571, 42571, -1, 0},
+   {42572, 42572, 0, 1},
+   {42573, 42573, -1, 0},
+   {42574, 42574, 0, 1},
+   {42575, 42575, -1, 0},
+   {42576, 42576, 0, 1},
+   {42577, 42577, -1, 0},
+   {42578, 42578, 0, 1},
+   {42579, 42579, -1, 0},
+   {42580, 42580, 0, 1},
+   {42581, 42581, -1, 0},
+   {42582, 42582, 0, 1},
+   {42583, 42583, -1, 0},
+   {42584, 42584, 0, 1},
+   {42585, 42585, -1, 0},
+   {42586, 42586, 0, 1},
+   {42587, 42587, -1, 0},
+   {42588, 42588, 0, 1},
+   {42589, 42589, -1, 0},
+   {42590, 42590, 0, 1},
+   {42591, 42591, -1, 0},
+   {42592, 42592, 0, 1},
+   {42593, 42593, -1, 0},
+   {42594, 42594, 0, 1},
+   {42595, 42595, -1, 0},
+   {42596, 42596, 0, 1},
+   {42597, 42597, -1, 0},
+   {42598, 42598, 0, 1},
+   {42599, 42599, -1, 0},
+   {42600, 42600, 0, 1},
+   {42601, 42601, -1, 0},
+   {42602, 42602, 0, 1},
+   {42603, 42603, -1, 0},
+   {42604, 42604, 0, 1},
+   {42605, 42605, -1, 0},
+   {42624, 42624, 0, 1},
+   {42625, 42625, -1, 0},
+   {42626, 42626, 0, 1},
+   {42627, 42627, -1, 0},
+   {42628, 42628, 0, 1},
+   {42629, 42629, -1, 0},
+   {42630, 42630, 0, 1},
+   {42631, 42631, -1, 0},
+   {42632, 42632, 0, 1},
+   {42633, 42633, -1, 0},
+   {42634, 42634, 0, 1},
+   {42635, 42635, -1, 0},
+   {42636, 42636, 0, 1},
+   {42637, 42637, -1, 0},
+   {42638, 42638, 0, 1},
+   {42639, 42639, -1, 0},
+   {42640, 42640, 0, 1},
+   {42641, 42641, -1, 0},
+   {42642, 42642, 0, 1},
+   {42643, 42643, -1, 0},
+   {42644, 42644, 0, 1},
+   {42645, 42645, -1, 0},
+   {42646, 42646, 0, 1},
+   {42647, 42647, -1, 0},
+   {42648, 42648, 0, 1},
+   {42649, 42649, -1, 0},
+   {42650, 42650, 0, 1},
+   {42651, 42651, -1, 0},
+   {42786, 42786, 0, 1},
+   {42787, 42787, -1, 0},
+   {42788, 42788, 0, 1},
+   {42789, 42789, -1, 0},
+   {42790, 42790, 0, 1},
+   {42791, 42791, -1, 0},
+   {42792, 42792, 0, 1},
+   {42793, 42793, -1, 0},
+   {42794, 42794, 0, 1},
+   {42795, 42795, -1, 0},
+   {42796, 42796, 0, 1},
+   {42797, 42797, -1, 0},
+   {42798, 42798, 0, 1},
+   {42799, 42799, -1, 0},
+   {42802, 42802, 0, 1},
+   {42803, 42803, -1, 0},
+   {42804, 42804, 0, 1},
+   {42805, 42805, -1, 0},
+   {42806, 42806, 0, 1},
+   {42807, 42807, -1, 0},
+   {42808, 42808, 0, 1},
+   {42809, 42809, -1, 0},
+   {42810, 42810, 0, 1},
+   {42811, 42811, -1, 0},
+   {42812, 42812, 0, 1},
+   {42813, 42813, -1, 0},
+   {42814, 42814, 0, 1},
+   {42815, 42815, -1, 0},
+   {42816, 42816, 0, 1},
+   {42817, 42817, -1, 0},
+   {42818, 42818, 0, 1},
+   {42819, 42819, -1, 0},
+   {42820, 42820, 0, 1},
+   {42821, 42821, -1, 0},
+   {42822, 42822, 0, 1},
+   {42823, 42823, -1, 0},
+   {42824, 42824, 0, 1},
+   {42825, 42825, -1, 0},
+   {42826, 42826, 0, 1},
+   {42827, 42827, -1, 0},
+   {42828, 42828, 0, 1},
+   {42829, 42829, -1, 0},
+   {42830, 42830, 0, 1},
+   {42831, 42831, -1, 0},
+   {42832, 42832, 0, 1},
+   {42833, 42833, -1, 0},
+   {42834, 42834, 0, 1},
+   {42835, 42835, -1, 0},
+   {42836, 42836, 0, 1},
+   {42837, 42837, -1, 0},
+   {42838, 42838, 0, 1},
+   {42839, 42839, -1, 0},
+   {42840, 42840, 0, 1},
+   {42841, 42841, -1, 0},
+   {42842, 42842, 0, 1},
+   {42843, 42843, -1, 0},
+   {42844, 42844, 0, 1},
+   {42845, 42845, -1, 0},
+   {42846, 42846, 0, 1},
+   {42847, 42847, -1, 0},
+   {42848, 42848, 0, 1},
+   {42849, 42849, -1, 0},
+   {42850, 42850, 0, 1},
+   {42851, 42851, -1, 0},
+   {42852, 42852, 0, 1},
+   {42853, 42853, -1, 0},
+   {42854, 42854, 0, 1},
+   {42855, 42855, -1, 0},
+   {42856, 42856, 0, 1},
+   {42857, 42857, -1, 0},
+   {42858, 42858, 0, 1},
+   {42859, 42859, -1, 0},
+   {42860, 42860, 0, 1},
+   {42861, 42861, -1, 0},
+   {42862, 42862, 0, 1},
+   {42863, 42863, -1, 0},
+   {42873, 42873, 0, 1},
+   {42874, 42874, -1, 0},
+   {42875, 42875, 0, 1},
+   {42876, 42876, -1, 0},
+   {42877, 42877, 0, -35332},
+   {42878, 42878, 0, 1},
+   {42879, 42879, -1, 0},
+   {42880, 42880, 0, 1},
+   {42881, 42881, -1, 0},
+   {42882, 42882, 0, 1},
+   {42883, 42883, -1, 0},
+   {42884, 42884, 0, 1},
+   {42885, 42885, -1, 0},
+   {42886, 42886, 0, 1},
+   {42887, 42887, -1, 0},
+   {42891, 42891, 0, 1},
+   {42892, 42892, -1, 0},
+   {42893, 42893, 0, -42280},
+   {42896, 42896, 0, 1},
+   {42897, 42897, -1, 0},
+   {42898, 42898, 0, 1},
+   {42899, 42899, -1, 0},
+   {42900, 42900, 48, 0},
+   {42902, 42902, 0, 1},
+   {42903, 42903, -1, 0},
+   {42904, 42904, 0, 1},
+   {42905, 42905, -1, 0},
+   {42906, 42906, 0, 1},
+   {42907, 42907, -1, 0},
+   {42908, 42908, 0, 1},
+   {42909, 42909, -1, 0},
+   {42910, 42910, 0, 1},
+   {42911, 42911, -1, 0},
+   {42912, 42912, 0, 1},
+   {42913, 42913, -1, 0},
+   {42914, 42914, 0, 1},
+   {42915, 42915, -1, 0},
+   {42916, 42916, 0, 1},
+   {42917, 42917, -1, 0},
+   {42918, 42918, 0, 1},
+   {42919, 42919, -1, 0},
+   {42920, 42920, 0, 1},
+   {42921, 42921, -1, 0},
+   {42922, 42922, 0, -42308},
+   {42923, 42923, 0, -42319},
+   {42924, 42924, 0, -42315},
+   {42925, 42925, 0, -42305},
+   {42926, 42926, 0, -42308},
+   {42928, 42928, 0, -42258},
+   {42929, 42929, 0, -42282},
+   {42930, 42930, 0, -42261},
+   {42931, 42931, 0, 928},
+   {42932, 42932, 0, 1},
+   {42933, 42933, -1, 0},
+   {42934, 42934, 0, 1},
+   {42935, 42935, -1, 0},
+   {42936, 42936, 0, 1},
+   {42937, 42937, -1, 0},
+   {42938, 42938, 0, 1},
+   {42939, 42939, -1, 0},
+   {42940, 42940, 0, 1},
+   {42941, 42941, -1, 0},
+   {42942, 42942, 0, 1},
+   {42943, 42943, -1, 0},
+   {42946, 42946, 0, 1},
+   {42947, 42947, -1, 0},
+   {42948, 42948, 0, -48},
+   {42949, 42949, 0, -42307},
+   {42950, 42950, 0, -35384},
+   {42951, 42951, 0, 1},
+   {42952, 42952, -1, 0},
+   {42953, 42953, 0, 1},
+   {42954, 42954, -1, 0},
+   {42997, 42997, 0, 1},
+   {42998, 42998, -1, 0},
+   {43859, 43859, -928, 0},
+   {43888, 43967, -38864, 0},
+   {65313, 65338, 0, 32},
+   {65345, 65370, -32, 0},
+   {66560, 66599, 0, 40},
+   {66600, 66639, -40, 0},
+   {66736, 66771, 0, 40},
+   {66776, 66811, -40, 0},
+   {68736, 68786, 0, 64},
+   {68800, 68850, -64, 0},
+   {71840, 71871, 0, 32},
+   {71872, 71903, -32, 0},
+   {93760, 93791, 0, 32},
+   {93792, 93823, -32, 0},
+   {125184, 125217, 0, 34},
+   {125218, 125251, -34, 0},
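
Reviewer's aside, not part of the patch: each row above appears to be {start, end, upper_delta, lower_delta}, matching the struct added to ada-lang.c further down. The following is only a minimal standalone sketch of how such a range table can be searched and how the GNAT folding rule described in ada_fold_name could be applied. The names fold_entry, sample_table, find_entry and gnat_fold are hypothetical, and the two ASCII rows are assumed to exist earlier in the real table rather than copied from the part shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Hypothetical stand-in for the table entry type; same field order as
// the rows above: {start, end, upper_delta, lower_delta}.
struct fold_entry
{
  uint32_t start, end;   // inclusive range of code points
  int upper_delta;       // add to get upper case; 0 if already upper
  int lower_delta;       // add to get lower case; 0 if already lower
};

// A few rows, sorted by code point.  The ASCII rows are an assumption;
// the others are copied from the table above.
static const fold_entry sample_table[] =
{
  {65, 90, 0, 32},
  {97, 122, -32, 0},
  {1329, 1366, 0, 48},
  {1377, 1414, -48, 0},
  {7680, 7680, 0, 1},
  {7681, 7681, -1, 0},
  {65313, 65338, 0, 32},
  {65345, 65370, -32, 0},
};

// Binary search for the range containing C; nullptr if none does.
static const fold_entry *
find_entry (uint32_t c)
{
  auto iter = std::lower_bound (std::begin (sample_table),
                                std::end (sample_table), c,
                                [] (const fold_entry &e, uint32_t val)
                                { return e.end < val; });
  if (iter == std::end (sample_table) || c < iter->start)
    return nullptr;
  return &*iter;
}

// Fold C the way the patch describes GNAT's rule: lower case when the
// source charset is not UTF-8 or the lower-case form fits in one byte,
// upper case otherwise.
static uint32_t
gnat_fold (uint32_t c, bool is_utf8)
{
  const fold_entry *entry = find_entry (c);
  if (entry == nullptr)
    return c;
  assert (entry->start <= c && c <= entry->end);
  uint32_t low = c + entry->lower_delta;
  if (!is_utf8 || low <= 0xff)
    return low;
  return c + entry->upper_delta;
}
```

With these rows, an ASCII 'A' folds to 'a', while an Armenian small letter (1377) folds to its capital (1329) when the source charset is UTF-8.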
diff --git a/gdb/ada-exp.y b/gdb/ada-exp.y
index d3fce8d05e3..c974657dbcd 100644
--- a/gdb/ada-exp.y
+++ b/gdb/ada-exp.y
@@ -1549,10 +1549,14 @@ write_var_or_type (struct parser_state *par_state,
 	  int terminator = encoded_name[tail_index];
 
 	  encoded_name[tail_index] = '\0';
-	  std::vector<struct block_symbol> syms
-	    = ada_lookup_symbol_list (encoded_name, block, VAR_DOMAIN);
+	  /* In order to avoid double-encoding, we want to only pass
+	     the decoded form to lookup functions.  */
+	  std::string decoded_name = ada_decode (encoded_name);
 	  encoded_name[tail_index] = terminator;
 
+	  std::vector<struct block_symbol> syms
+	    = ada_lookup_symbol_list (decoded_name.c_str (), block, VAR_DOMAIN);
+
 	  type_sym = select_possible_type_sym (syms);
 
 	  if (type_sym != NULL)
@@ -1626,7 +1630,7 @@ write_var_or_type (struct parser_state *par_state,
 	  else if (syms.empty ())
 	    {
 	      struct bound_minimal_symbol msym
-		= ada_lookup_simple_minsym (encoded_name);
+		= ada_lookup_simple_minsym (decoded_name.c_str ());
 	      if (msym.minsym != NULL)
 		{
 		  par_state->push_new<ada_var_msym_value_operation> (msym);
diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index 9a7ab72f0e5..12ff0353829 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -59,6 +59,7 @@
 #include "gdbsupport/byte-vector.h"
 #include <algorithm>
 #include "ada-exp.h"
+#include "charset.h"
 
 /* Define whether or not the C operator '/' truncates towards zero for
    differently signed operands (truncation direction is undefined in C).
@@ -209,6 +210,38 @@ static symbol_name_matcher_ftype *ada_get_symbol_name_matcher
 
 \f
 
+/* The character set used for source files.  */
+static const char *ada_source_charset;
+
+/* The string "UTF-8".  This is here so we can check for the UTF-8
+   charset using == rather than strcmp.  */
+static const char ada_utf8[] = "UTF-8";
+
+/* Each entry in the UTF-32 case-folding table is of this form.  */
+struct utf8_entry
+{
+  /* The start and end, inclusive, of this range of codepoints.  */
+  uint32_t start, end;
+  /* The delta to apply to get the upper-case form.  0 if this is
+     already upper-case.  */
+  int upper_delta;
+  /* The delta to apply to get the lower-case form.  0 if this is
+     already lower-case.  */
+  int lower_delta;
+
+  bool operator< (uint32_t val) const
+  {
+    return end < val;
+  }
+};
+
+static const utf8_entry ada_case_fold[] =
+{
+#include "ada-casefold.h"
+};
+
+\f
+
 /* The result of a symbol lookup to be stored in our symbol cache.  */
 
 struct cache_entry
@@ -843,6 +876,52 @@ is_compiler_suffix (const char *str)
   return *str == '\0' || (str[0] == ']' && str[1] == '\0');
 }
 
+/* Append a non-ASCII character to RESULT.  */
+static void
+append_hex_encoded (std::string &result, uint32_t one_char)
+{
+  if (one_char <= 0xff)
+    {
+      result.append ("U");
+      result.append (phex (one_char, 1));
+    }
+  else if (one_char <= 0xffff)
+    {
+      result.append ("W");
+      result.append (phex (one_char, 2));
+    }
+  else
+    {
+      result.append ("WW");
+      result.append (phex (one_char, 4));
+    }
+}
+
+/* Return a string that is a copy of the data in STORAGE, with
+   non-ASCII characters replaced by the appropriate hex encoding.  A
+   template is used because, for UTF-8, we actually want to work with
+   UTF-32 codepoints.  */
+template<typename T>
+std::string
+copy_and_hex_encode (struct obstack *storage)
+{
+  const T *chars = (T *) obstack_base (storage);
+  int num_chars = obstack_object_size (storage) / sizeof (T);
+  std::string result;
+  for (int i = 0; i < num_chars; ++i)
+    {
+      if (chars[i] <= 0x7f)
+	{
+	  /* The host character set has to be a superset of ASCII, as
+	     are all the other character sets we can use.  */
+	  result.push_back (chars[i]);
+	}
+      else
+	append_hex_encoded (result, chars[i]);
+    }
+  return result;
+}
+
 /* The "encoded" form of DECODED, according to GNAT conventions.  If
    THROW_ERRORS, throw an error if invalid operator name is found.
    Otherwise, return the empty string in that case.  */
@@ -854,8 +933,12 @@ ada_encode_1 (const char *decoded, bool throw_errors)
     return {};
 
   std::string encoding_buffer;
+  bool saw_non_ascii = false;
   for (const char *p = decoded; *p != '\0'; p += 1)
     {
+      if ((*p & 0x80) != 0)
+	saw_non_ascii = true;
+
       if (*p == '.')
 	encoding_buffer.append ("__");
       else if (*p == '[' && is_compiler_suffix (p))
@@ -887,23 +970,70 @@ ada_encode_1 (const char *decoded, bool throw_errors)
 	encoding_buffer.push_back (*p);
     }
 
+  /* If a non-ASCII character is seen, we must convert it to the
+     appropriate hex form.  As this is more expensive, we keep track
+     of whether it is even necessary.  */
+  if (saw_non_ascii)
+    {
+      auto_obstack storage;
+      bool is_utf8 = ada_source_charset == ada_utf8;
+      try
+	{
+	  convert_between_encodings
+	    (host_charset (),
+	     is_utf8 ? HOST_UTF32 : ada_source_charset,
+	     (const gdb_byte *) encoding_buffer.c_str (),
+	     encoding_buffer.length (), 1,
+	     &storage, translit_none);
+	}
+      catch (const gdb_exception &)
+	{
+	  static bool warned = false;
+
+	  /* Converting to UTF-32 shouldn't fail, so if it does, we
+	     might like to know why.  */
+	  if (!warned)
+	    {
+	      warned = true;
+	      warning (_("charset conversion failure for '%s'.\n"
+			 "You may have the wrong value for 'set ada source-charset'."),
+		       encoding_buffer.c_str ());
+	    }
+
+	  /* We don't try to recover from errors.  */
+	  return encoding_buffer;
+	}
+
+      if (is_utf8)
+	return copy_and_hex_encode<uint32_t> (&storage);
+      return copy_and_hex_encode<gdb_byte> (&storage);
+    }
+
   return encoding_buffer;
 }
 
-/* The "encoded" form of DECODED, according to GNAT conventions.  */
-
-std::string
-ada_encode (const char *decoded)
+/* Find the entry for C in the case-folding table.  Return nullptr if
+   no entry covers C.  */
+static const utf8_entry *
+find_case_fold_entry (uint32_t c)
 {
-  return ada_encode_1 (decoded, true);
+  auto iter = std::lower_bound (std::begin (ada_case_fold),
+				std::end (ada_case_fold),
+				c);
+  if (iter == std::end (ada_case_fold)
+      || c < iter->start
+      || c > iter->end)
+    return nullptr;
+  return &*iter;
 }
 
 /* Return NAME folded to lower case, or, if surrounded by single
-   quotes, unfolded, but with the quotes stripped away.  Result good
-   to next call.  */
+   quotes, unfolded, but with the quotes stripped away.  If
+   THROW_ON_ERROR is true, encoding failures will throw an exception
+   rather than emitting a warning.  Result good to next call.  */
 
 static const char *
-ada_fold_name (gdb::string_view name)
+ada_fold_name (gdb::string_view name, bool throw_on_error = false)
 {
   static std::string fold_storage;
 
@@ -911,14 +1041,120 @@ ada_fold_name (gdb::string_view name)
     fold_storage = gdb::to_string (name.substr (1, name.size () - 2));
   else
     {
-      fold_storage = gdb::to_string (name);
-      for (int i = 0; i < name.size (); i += 1)
-	fold_storage[i] = tolower (fold_storage[i]);
+      /* Why convert to UTF-32 and implement our own case-folding,
+	 rather than convert to wchar_t and use the platform's
+	 functions?  I'm glad you asked.
+
+	 The main problem is that GNAT implements an unusual rule for
+	 case folding.  For ASCII letters, letters in single-byte
+	 encodings (such as ISO-8859-*), and Unicode letters that fit
+	 in a single byte (i.e., code point is <= 0xff), the letter is
+	 folded to lower case.  Other Unicode letters are folded to
+	 upper case.
+
+	 This rule means that the code must be able to examine the
+	 value of the character.  And, some hosts do not use Unicode
+	 for wchar_t, so examining the value of such characters is
+	 forbidden.  */
+      auto_obstack storage;
+      try
+	{
+	  convert_between_encodings
+	    (host_charset (), HOST_UTF32,
+	     (const gdb_byte *) name.data (),
+	     name.length (), 1,
+	     &storage, translit_none);
+	}
+      catch (const gdb_exception &)
+	{
+	  if (throw_on_error)
+	    throw;
+
+	  static bool warned = false;
+
+	  /* Converting to UTF-32 shouldn't fail, so if it does, we
+	     might like to know why.  */
+	  if (!warned)
+	    {
+	      warned = true;
+	      warning (_("could not convert '%s' from the host encoding (%s) to UTF-32.\n"
+			 "This normally should not happen, please file a bug report."),
+		       gdb::to_string (name).c_str (), host_charset ());
+	    }
+
+	  /* We don't try to recover from errors; just return the
+	     original string.  */
+	  fold_storage = gdb::to_string (name);
+	  return fold_storage.c_str ();
+	}
+
+      bool is_utf8 = ada_source_charset == ada_utf8;
+      uint32_t *chars = (uint32_t *) obstack_base (&storage);
+      int num_chars = obstack_object_size (&storage) / sizeof (uint32_t);
+      for (int i = 0; i < num_chars; ++i)
+	{
+	  const struct utf8_entry *entry = find_case_fold_entry (chars[i]);
+	  if (entry != nullptr)
+	    {
+	      uint32_t low = chars[i] + entry->lower_delta;
+	      if (!is_utf8 || low <= 0xff)
+		chars[i] = low;
+	      else
+		chars[i] = chars[i] + entry->upper_delta;
+	    }
+	}
+
+      /* Now convert back to ordinary characters.  */
+      auto_obstack reconverted;
+      try
+	{
+	  convert_between_encodings (HOST_UTF32,
+				     host_charset (),
+				     (const gdb_byte *) chars,
+				     num_chars * sizeof (uint32_t),
+				     sizeof (uint32_t),
+				     &reconverted,
+				     translit_none);
+	  obstack_1grow (&reconverted, '\0');
+	  fold_storage = std::string ((const char *) obstack_base (&reconverted));
+	}
+      catch (const gdb_exception &)
+	{
+	  if (throw_on_error)
+	    throw;
+
+	  static bool warned = false;
+
+	  /* Converting back from UTF-32 shouldn't normally fail, but
+	     there are some host encodings without upper/lower
+	     equivalence.  */
+	  if (!warned)
+	    {
+	      warned = true;
+	      warning (_("could not convert the lower-cased variant of '%s'\n"
+			 "from UTF-32 to the host encoding (%s)."),
+		       gdb::to_string (name).c_str (), host_charset ());
+	    }
+
+	  /* We don't try to recover from errors; just return the
+	     original string.  */
+	  fold_storage = gdb::to_string (name);
+	}
     }
 
   return fold_storage.c_str ();
 }
 
+/* The "encoded" form of DECODED, according to GNAT conventions.  */
+
+std::string
+ada_encode (const char *decoded)
+{
+  if (decoded[0] != '<')
+    decoded = ada_fold_name (decoded);
+  return ada_encode_1 (decoded, true);
+}
+
 /* Return nonzero if C is either a digit or a lowercase alphabet character.  */
 
 static int
@@ -999,6 +1235,72 @@ remove_compiler_suffix (const char *encoded, int *len)
   return -1;
 }
 
+/* Convert an ASCII hex string to a number.  Reads exactly N
+   characters from STR.  Returns true on success, false if one of the
+   digits was not a hex digit.  */
+static bool
+convert_hex (const char *str, int n, uint32_t *out)
+{
+  uint32_t result = 0;
+
+  for (int i = 0; i < n; ++i)
+    {
+      if (!isxdigit (str[i]))
+	return false;
+      result <<= 4;
+      result |= fromhex (str[i]);
+    }
+
+  *out = result;
+  return true;
+}
+
+/* Convert a wide character from its ASCII hex representation in STR
+   (consisting of exactly N characters) to the host encoding,
+   appending the resulting bytes to OUT.  If N==2 and the Ada source
+   charset is not UTF-8, then hex refers to an encoding in the
+   ADA_SOURCE_CHARSET; otherwise, use UTF-32.  Return true on success.
+   Return false and do not modify OUT on conversion failure.  */
+static bool
+convert_from_hex_encoded (std::string &out, const char *str, int n)
+{
+  uint32_t value;
+
+  if (!convert_hex (str, n, &value))
+    return false;
+  try
+    {
+      auto_obstack bytes;
+      /* In the 'U' case, the hex digits encode the character in the
+	 Ada source charset.  However, if the source charset is UTF-8,
+	 the digits are a code point <= 0xff, converted as UTF-32.  */
+      if (n == 2 && ada_source_charset != ada_utf8)
+	{
+	  gdb_byte one_char = (gdb_byte) value;
+
+	  convert_between_encodings (ada_source_charset, host_charset (),
+				     &one_char,
+				     sizeof (one_char), sizeof (one_char),
+				     &bytes, translit_none);
+	}
+      else
+	convert_between_encodings (HOST_UTF32, host_charset (),
+				   (const gdb_byte *) &value,
+				   sizeof (value), sizeof (value),
+				   &bytes, translit_none);
+      obstack_1grow (&bytes, '\0');
+      out.append ((const char *) obstack_base (&bytes));
+    }
+  catch (const gdb_exception &)
+    {
+      /* On failure, the caller will just let the encoded form
+	 through, which seems basically reasonable.  */
+      return false;
+    }
+
+  return true;
+}
+
 /* See ada-lang.h.  */
 
 std::string
@@ -1191,6 +1493,32 @@ ada_decode (const char *encoded, bool wrap)
 	    i++;
 	}
 
+      if (i < len0 + 3 && encoded[i] == 'U' && isxdigit (encoded[i + 1]))
+	{
+	  if (convert_from_hex_encoded (decoded, &encoded[i + 1], 2))
+	    {
+	      i += 3;
+	      continue;
+	    }
+	}
+      else if (i < len0 + 5 && encoded[i] == 'W' && isxdigit (encoded[i + 1]))
+	{
+	  if (convert_from_hex_encoded (decoded, &encoded[i + 1], 4))
+	    {
+	      i += 5;
+	      continue;
+	    }
+	}
+      else if (i < len0 + 10 && encoded[i] == 'W' && encoded[i + 1] == 'W'
+	       && isxdigit (encoded[i + 2]))
+	{
+	  if (convert_from_hex_encoded (decoded, &encoded[i + 2], 8))
+	    {
+	      i += 10;
+	      continue;
+	    }
+	}
+
       if (encoded[i] == 'X' && i != 0 && isalnum (encoded[i - 1]))
 	{
 	  /* This is a X[bn]* sequence not separated from the previous
@@ -6212,7 +6540,6 @@ ada_get_tsd_from_tag (struct value *tag)
 static gdb::unique_xmalloc_ptr<char>
 ada_tag_name_from_tsd (struct value *tsd)
 {
-  char *p;
   struct value *val;
 
   val = ada_value_struct_elt (tsd, "expanded_name", 1);
@@ -6223,13 +6550,18 @@ ada_tag_name_from_tsd (struct value *tsd)
   if (buffer == nullptr)
     return nullptr;
 
-  for (p = buffer.get (); *p != '\0'; ++p)
+  try
     {
-      if (isalpha (*p))
-	*p = tolower (*p);
+      /* Let this throw an exception on error.  If the data is
+	 uninitialized, we'd rather not have the user see a
+	 warning.  */
+      const char *folded = ada_fold_name (buffer.get (), true);
+      return make_unique_xstrdup (folded);
+    }
+  catch (const gdb_exception &)
+    {
+      return nullptr;
     }
-
-  return buffer;
 }
 
 /* The type name of the dynamic type denoted by the 'tag value TAG, as
@@ -13435,6 +13767,26 @@ ada_free_objfile_observer (struct objfile *objfile)
   ada_clear_symbol_cache ();
 }
 
+/* Charsets known to GNAT.  */
+static const char * const gnat_source_charsets[] =
+{
+  /* Note that code below assumes that the default comes first.
+     Latin-1 is the default here, because that is also GNAT's
+     default.  */
+  "ISO-8859-1",
+  "ISO-8859-2",
+  "ISO-8859-3",
+  "ISO-8859-4",
+  "ISO-8859-5",
+  "ISO-8859-15",
+  "CP437",
+  "CP850",
+  /* Note that this value is special-cased in the encoder and
+     decoder.  */
+  ada_utf8,
+  nullptr
+};
+
 void _initialize_ada_language ();
 void
 _initialize_ada_language ()
@@ -13470,6 +13822,17 @@ Show whether the output of formal and return types for functions in the \
 overloads selection menu is activated."),
 			   NULL, NULL, NULL, &set_ada_list, &show_ada_list);
 
+  ada_source_charset = gnat_source_charsets[0];
+  add_setshow_enum_cmd ("source-charset", class_files,
+			gnat_source_charsets,
+			&ada_source_charset,  _("\
+Set the Ada source character set."), _("\
+Show the Ada source character set."), _("\
+The character set used for Ada source files.\n\
+This must correspond to the '-gnati' or '-gnatW' option passed to GNAT."),
+			nullptr, nullptr,
+			&set_ada_list, &show_ada_list);
+
   add_catch_command ("exception", _("\
 Catch Ada exceptions, when raised.\n\
 Usage: catch exception [ARG] [if CONDITION]\n\
diff --git a/gdb/ada-lex.l b/gdb/ada-lex.l
index a1e19423691..f698ad4dd57 100644
--- a/gdb/ada-lex.l
+++ b/gdb/ada-lex.l
@@ -30,7 +30,7 @@ HEXDIG	[0-9a-f]
 NUM16	({HEXDIG}({HEXDIG}|_)*)
 OCTDIG	[0-7]
 LETTER	[a-z_]
-ID	({LETTER}({LETTER}|{DIG})*|"<"{LETTER}({LETTER}|{DIG})*">")
+ID	({LETTER}({LETTER}|{DIG}|[\x80-\xff])*|"<"{LETTER}({LETTER}|{DIG})*">")
 WHITE	[ \t\n]
 TICK	("'"{WHITE}*)
 GRAPHIC [a-z0-9 #&'()*+,-./:;<>=_|!$%?@\[\]\\^`{}~]
diff --git a/gdb/ada-unicode.py b/gdb/ada-unicode.py
new file mode 100755
index 00000000000..31fa2f11520
--- /dev/null
+++ b/gdb/ada-unicode.py
@@ -0,0 +1,114 @@
+# Generate Unicode case-folding table for Ada.
+
+# Copyright (C) 2022 Free Software Foundation, Inc.
+
+# This file is part of GDB.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This generates the ada-casefold.h header.
+# Usage:
+#   python ada-unicode.py
+
+# The start of the current range of case-conversions we are
+# processing.  If RANGE_START is None, then we're outside of a range.
+range_start = None
+# End of the current range.
+range_end = None
+# The delta between RANGE_START and the upper-case variant of that
+# character.
+upper_delta = None
+# The delta between RANGE_START and the lower-case variant of that
+# character.
+lower_delta = None
+
+# All the ranges found and completed so far.
+# Each entry is a tuple of the form (START, END, UPPER_DELTA, LOWER_DELTA).
+all_ranges = []
+
+
+def finish_range():
+    global range_start
+    global range_end
+    global upper_delta
+    global lower_delta
+    if range_start is not None:
+        all_ranges.append((range_start, range_end, upper_delta, lower_delta))
+        range_start = None
+        range_end = None
+        upper_delta = None
+        lower_delta = None
+
+
+def process_codepoint(val):
+    global range_start
+    global range_end
+    global upper_delta
+    global lower_delta
+    c = chr(val)
+    low = c.lower()
+    up = c.upper()
+    # U+00DF ("LATIN SMALL LETTER SHARP S", aka eszett) traditionally
+    # upper-cases to the two-character string "SS" (the capital form
+    # is a relatively recent addition -- 2017).  Our simple scheme
+    # can't handle this, so we skip it.  Also, because our approach
+    # just represents runs of characters with identical folding
+    # deltas, this change must terminate the current run.
+    if (c == low and c == up) or len(low) != 1 or len(up) != 1:
+        finish_range()
+        return
+    updelta = ord(up) - val
+    lowdelta = ord(low) - val
+    if range_start is not None and (updelta != upper_delta
+                                    or lowdelta != lower_delta):
+        finish_range()
+    if range_start is None:
+        range_start = val
+        upper_delta = updelta
+        lower_delta = lowdelta
+    range_end = val
+
+
+for c in range(0, 0x10FFFF):
+    process_codepoint(c)
+
+copyright = """/* *INDENT-OFF* */ /* THIS FILE IS GENERATED -*- buffer-read-only: t -*- */
+/* vi:set ro: */
+
+/* UTF-32 case-folding for GDB
+
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+/* This file was created with the aid of ``ada-unicode.py''.  */
+"""
+
+with open('ada-casefold.h', 'w') as f:
+    print(copyright, file=f)
+    for r in all_ranges:
+        print(f"   {{{r[0]}, {r[1]}, {r[2]}, {r[3]}}},", file=f)
diff --git a/gdb/copyright.py b/gdb/copyright.py
index 8ae9ffff65b..3d5e7e5f205 100644
--- a/gdb/copyright.py
+++ b/gdb/copyright.py
@@ -247,6 +247,7 @@ MULTIPLE_COPYRIGHT_HEADERS = (
     "gdb/doc/refcard.tex",
     "gdb/gdbarch.sh",
     "gdb/syscalls/update-netbsd.sh",
+    "gdb/ada-unicode.py",
 )
 
 # The list of file which have a copyright, but not held by the FSF.
diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
index f7f5f7a6158..cb9570f9b13 100644
--- a/gdb/doc/gdb.texinfo
+++ b/gdb/doc/gdb.texinfo
@@ -18012,6 +18012,7 @@ to be difficult.
 * Ravenscar Profile::           Tasking Support when using the Ravenscar
                                    Profile
 * Ada Settings::                New settable GDB parameters for Ada.
+* Ada Source Character Set::    Character set of Ada source files.
 * Ada Glitches::                Known peculiarities of Ada mode.
 @end menu
 
@@ -18762,6 +18763,28 @@ size is less than @var{size}.
 Show the limit on types whose size is determined by run-time quantities.
 @end table
 
+@node Ada Source Character Set
+@subsubsection Ada Source Character Set
+@cindex Ada, source character set
+
+The GNAT compiler supports a number of character sets for source
+files.  @xref{Character Set Control, , Character Set Control,
+gnat_ugn}.  @value{GDBN} includes support for this as well.
+
+@table @code
+@item set ada source-charset @var{charset}
+@kindex set ada source-charset
+Set the source character set for Ada.  The character set must be one
+of the ones supported by GNAT.  Because this setting affects the
+decoding of symbols coming from the debug information in your program,
+the setting should be set as early as possible.  The default is
+@code{ISO-8859-1}, because that is also GNAT's default.
+
+@item show ada source-charset
+@kindex show ada source-charset
+Show the current source character set for Ada.
+@end table
+
 @node Ada Glitches
 @subsubsection Known Peculiarities of Ada Mode
 @cindex Ada, problems
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-1.exp b/gdb/testsuite/gdb.ada/non-ascii-latin-1.exp
new file mode 100644
index 00000000000..5ff55d66a68
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-1.exp
@@ -0,0 +1,50 @@
+# Copyright 2022 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test Latin-1 identifiers.
+
+load_lib "ada.exp"
+
+if { [skip_ada_tests] } { return -1 }
+
+# Enable basic use of UTF-8.  LC_ALL gets reset for each testfile.  We
+# want this despite the program itself using Latin-1, as this test is
+# written using UTF-8.
+setenv LC_ALL C.UTF-8
+
+standard_ada_testfile prog
+
+set flags [list debug additional_flags=-gnati1]
+if {[gdb_compile_ada "${srcfile}" "${binfile}" executable $flags] != ""} {
+    return -1
+}
+
+# Restart without an executable so that we can set the encoding early.
+clean_restart
+
+# The default is Latin-1, but set this explicitly just in case we get
+# to change the default someday.
+gdb_test_no_output "set ada source-charset ISO-8859-1"
+
+gdb_load ${binfile}
+
+set bp_location [gdb_get_line_number "BREAK" ${testdir}/prog.adb]
+runto "prog.adb:$bp_location"
+
+gdb_test "print VAR_Þ" " = 23"
+gdb_test "print var_þ" " = 23"
+
+gdb_breakpoint "FUNC_Þ" message
+gdb_breakpoint "func_þ" message
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.adb b/gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.adb
new file mode 100644
index 00000000000..b82a0e25184
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.adb
@@ -0,0 +1,28 @@
+--  Copyright 2022 Free Software Foundation, Inc. -*- coding: iso-latin-1 -*-
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+package body Pack is
+
+   function FUNC_Þ (x : Integer) return Integer is
+   begin
+      return x;
+   end FUNC_Þ;
+
+   procedure Do_Nothing (A : System.Address) is
+   begin
+      null;
+   end Do_Nothing;
+
+end Pack;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.ads b/gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.ads
new file mode 100644
index 00000000000..93180b4a4e9
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-1/pack.ads
@@ -0,0 +1,21 @@
+--  Copyright 2022 Free Software Foundation, Inc. -*- coding: iso-latin-1 -*-
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+with System;
+package Pack is
+   function FUNC_Þ (x : Integer) return Integer;
+
+   procedure Do_Nothing (A : System.Address);
+end Pack;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-1/prog.adb b/gdb/testsuite/gdb.ada/non-ascii-latin-1/prog.adb
new file mode 100644
index 00000000000..f0bd5abaa8b
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-1/prog.adb
@@ -0,0 +1,23 @@
+--  Copyright 2022 Free Software Foundation, Inc. -*- coding: iso-latin-1 -*-
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+with Pack; use Pack;
+
+procedure Prog is
+   -- This should be var_Ufe.
+   VAR_Þ : Integer := FUNC_Þ (23);
+begin
+   Do_Nothing (var_þ'Address); --  BREAK
+end Prog;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-3.exp b/gdb/testsuite/gdb.ada/non-ascii-latin-3.exp
new file mode 100644
index 00000000000..bafcdeb13e7
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-3.exp
@@ -0,0 +1,50 @@
+# Copyright 2022 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test Latin-3 identifiers.
+
+load_lib "ada.exp"
+
+if { [skip_ada_tests] } { return -1 }
+
+# Enable basic use of UTF-8.  LC_ALL gets reset for each testfile.  We
+# want this despite the program itself using Latin-3, as this test is
+# written using UTF-8.
+setenv LC_ALL C.UTF-8
+
+standard_ada_testfile prog
+
+set flags [list debug additional_flags=-gnati3]
+if {[gdb_compile_ada "${srcfile}" "${binfile}" executable $flags] != ""} {
+    return -1
+}
+
+# Restart without an executable so that we can set the encoding early.
+clean_restart
+
+gdb_test_no_output "set ada source-charset ISO-8859-3"
+
+gdb_load ${binfile}
+
+set bp_location [gdb_get_line_number "BREAK" ${testdir}/prog.adb]
+runto "prog.adb:$bp_location"
+
+gdb_test "print VAR_Ż" " = 23"
+gdb_test "print var_ż" " = 23"
+
+gdb_breakpoint "FUNC_Ż" message
+gdb_breakpoint "func_ż" message
+
+gdb_test "print var_𝕯" "warning: charset conversion failure.*"
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.adb b/gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.adb
new file mode 100644
index 00000000000..b639b1c19a7
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.adb
@@ -0,0 +1,28 @@
+--  Copyright 2022 Free Software Foundation, Inc. -*- coding: iso-latin-3 -*-
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+package body Pack is
+
+   function FUNC_Ż (x : Integer) return Integer is
+   begin
+      return x;
+   end FUNC_Ż;
+
+   procedure Do_Nothing (A : System.Address) is
+   begin
+      null;
+   end Do_Nothing;
+
+end Pack;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.ads b/gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.ads
new file mode 100644
index 00000000000..d030d73003f
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-3/pack.ads
@@ -0,0 +1,21 @@
+--  Copyright 2022 Free Software Foundation, Inc. -*- coding: iso-latin-3 -*-
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+with System;
+package Pack is
+   function FUNC_Ż (x : Integer) return Integer;
+
+   procedure Do_Nothing (A : System.Address);
+end Pack;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-latin-3/prog.adb b/gdb/testsuite/gdb.ada/non-ascii-latin-3/prog.adb
new file mode 100644
index 00000000000..c029dd93d02
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-latin-3/prog.adb
@@ -0,0 +1,24 @@
+--  Copyright 2022 Free Software Foundation, Inc. -*- coding: iso-latin-3 -*-
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+with Pack; use Pack;
+
+procedure Prog is
+   -- The name is chosen to use a character that is not in Latin-1.
+   -- This should be var_Ubf.
+   VAR_Ż : Integer := FUNC_Ż (23);
+begin
+   Do_Nothing (var_ż'Address); --  BREAK
+end Prog;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-utf-8.exp b/gdb/testsuite/gdb.ada/non-ascii-utf-8.exp
new file mode 100644
index 00000000000..4ab0ca54c63
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-utf-8.exp
@@ -0,0 +1,57 @@
+# Copyright 2022 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test UTF-8 identifiers.
+
+load_lib "ada.exp"
+
+if { [skip_ada_tests] } { return -1 }
+
+# Enable basic use of UTF-8.  LC_ALL gets reset for each testfile.
+setenv LC_ALL C.UTF-8
+
+standard_ada_testfile prog
+
+set flags [list debug additional_flags=-gnatW8]
+if {[gdb_compile_ada "${srcfile}" "${binfile}" executable $flags] != ""} {
+    return -1
+}
+
+# Restart without an executable so that we can set the encoding early.
+clean_restart
+
+gdb_test_no_output "set ada source-charset UTF-8"
+
+gdb_load ${binfile}
+
+set bp_location [gdb_get_line_number "BREAK" ${testdir}/prog.adb]
+runto "prog.adb:$bp_location"
+
+gdb_test "print VAR_Ü" " = 23"
+gdb_test "print var_ü" " = 23"
+gdb_test "print VAR_Ƹ" " = 24"
+gdb_test "print var_ƹ" " = 24"
+gdb_test "print VAR_𐐁" " = 25"
+gdb_test "print var_𐐩" " = 25"
+gdb_test "print VAR_Ż" " = 26"
+gdb_test "print var_ż" " = 26"
+
+gdb_breakpoint "FUNC_Ü" message
+gdb_breakpoint "func_ü" message
+gdb_breakpoint "FUNC_Ƹ" message
+gdb_breakpoint "func_ƹ" message
+gdb_breakpoint "FUNC_Ż" message
+gdb_breakpoint "func_ż" message
+gdb_breakpoint "FUNC_𐐁" message
diff --git a/gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.adb b/gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.adb
new file mode 100644
index 00000000000..f7893f20976
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.adb
@@ -0,0 +1,43 @@
+--  Copyright 2022 Free Software Foundation, Inc.
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+package body Pack is
+
+   function FUNC_Ü (x : Integer) return Integer is
+   begin
+      return x;
+   end FUNC_Ü;
+
+   function FUNC_Ƹ (x : Integer) return Integer is
+   begin
+      return x;
+   end FUNC_Ƹ;
+
+   function FUNC_𐐁 (x : Integer) return Integer is
+   begin
+      return x;
+   end FUNC_𐐁;
+
+   function FUNC_Ż (x : Integer) return Integer is
+   begin
+      return x;
+   end FUNC_Ż;
+
+   procedure Do_Nothing (A : System.Address) is
+   begin
+      null;
+   end Do_Nothing;
+
+end Pack;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.ads b/gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.ads
new file mode 100644
index 00000000000..f44c487295f
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-utf-8/pack.ads
@@ -0,0 +1,24 @@
+--  Copyright 2022 Free Software Foundation, Inc.
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+with System;
+package Pack is
+   function FUNC_Ü (x : Integer) return Integer;
+   function FUNC_Ƹ (x : Integer) return Integer;
+   function FUNC_𐐁 (x : Integer) return Integer;
+   function FUNC_Ż (x : Integer) return Integer;
+
+   procedure Do_Nothing (A : System.Address);
+end Pack;
diff --git a/gdb/testsuite/gdb.ada/non-ascii-utf-8/prog.adb b/gdb/testsuite/gdb.ada/non-ascii-utf-8/prog.adb
new file mode 100644
index 00000000000..b9c1b17d448
--- /dev/null
+++ b/gdb/testsuite/gdb.ada/non-ascii-utf-8/prog.adb
@@ -0,0 +1,36 @@
+--  Copyright 2022 Free Software Foundation, Inc.
+--
+--  This program is free software; you can redistribute it and/or modify
+--  it under the terms of the GNU General Public License as published by
+--  the Free Software Foundation; either version 3 of the License, or
+--  (at your option) any later version.
+--
+--  This program is distributed in the hope that it will be useful,
+--  but WITHOUT ANY WARRANTY; without even the implied warranty of
+--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+--  GNU General Public License for more details.
+--
+--  You should have received a copy of the GNU General Public License
+--  along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+with Pack; use Pack;
+
+procedure Prog is
+   -- This should be var_Ufc.
+   VAR_Ü : Integer := FUNC_Ü (23);
+   -- This should be var_W01b8, because with UTF-8, non-ASCII
+   -- letters are upper-cased.
+   VAR_Ƹ : Integer := FUNC_Ƹ (24);
+   -- This should be var_WW00010401, because with UTF-8, non-ASCII
+   -- letters are upper-cased.
+   VAR_𐐁 : Integer := FUNC_𐐁 (25);
+   -- This is the same name as the corresponding Latin 3 test,
+   -- and helps show the peculiarity of the case folding rule.
+   -- This winds up as var_W017b, the upper-case variant.
+   VAR_Ż : Integer := FUNC_Ż (26);
+begin
+   Do_Nothing (var_ü'Address); --  BREAK
+   Do_Nothing (var_ƹ'Address);
+   Do_Nothing (var_𐐩'Address);
+   Do_Nothing (var_ż'Address);
+end Prog;
-- 
2.31.1
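
The test comments in this patch name the encoded forms GNAT is expected
to emit (var_Ufe, var_Ubf, var_Ufc, var_W01b8, var_WW00010401), and the
ada_decode hunk consumes "U" plus 2 hex digits, "W" plus 4, and "WW"
plus 8.  A minimal Python sketch of that decoding follows.  It is not
GDB's implementation: the function name is hypothetical, and it assumes
that a "U" escape is a byte in the Ada source charset unless that
charset is UTF-8, in which case it is taken as a code point, which is
what the Latin-3 and UTF-8 test comments suggest.

```python
import re

# Longest prefix first, so that "WW" is never mistaken for "W".
# Assumption: only lower-case hex digits appear in encoded names.
_HEX_ESCAPE = re.compile(r"WW([0-9a-f]{8})|W([0-9a-f]{4})|U([0-9a-f]{2})")


def decode_hex_escapes(encoded, source_charset="ISO-8859-1"):
    """Expand U/W/WW hex escapes in an encoded Ada symbol name.

    Hypothetical helper for illustration only.
    """

    def replace(match):
        wide8, wide4, byte2 = match.groups()
        if byte2 is not None:
            value = int(byte2, 16)
            if source_charset.upper() == "UTF-8":
                # Assumed: with UTF-8 sources, Uxx is a code point.
                return chr(value)
            # Otherwise Uxx is a single byte in the source charset.
            return bytes([value]).decode(source_charset)
        # Wxxxx and WWxxxxxxxx carry the code point directly.
        return chr(int(wide8 or wide4, 16))

    return _HEX_ESCAPE.sub(replace, encoded)
```

Decoding the same escape with the wrong charset gives a different
letter: "var_Ubf" is var_ż under ISO-8859-3 but var_¿ under ISO-8859-1,
which is presumably why the documentation asks for source-charset to be
set as early as possible.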



* Re: [PATCH 5/5] Handle non-ASCII identifiers in Ada
  2022-02-28 18:33 ` [PATCH 5/5] Handle non-ASCII identifiers in Ada Tom Tromey
@ 2022-02-28 18:59   ` Eli Zaretskii
  2022-02-28 20:59     ` Tom Tromey
  2022-03-01 15:33   ` Tom Tromey
  1 sibling, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2022-02-28 18:59 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

> Date: Mon, 28 Feb 2022 11:33:04 -0700
> From: Tom Tromey via Gdb-patches <gdb-patches@sourceware.org>
> Cc: Tom Tromey <tromey@adacore.com>
> 
> +for c in range(0, 0x10FFFF):
> +    process_codepoint(c)

This script assumes that the version of Python which will run it is
up-to-date with the latest Unicode Character Database (UCD), right?
Is that a good assumption?  Wouldn't it be better to process the UCD
from the latest Unicode Standard directly?

> +@kindex set ada source-charset
> +Set the source character set for Ada.  The character set must be one
> +of the ones supported by GNAT.  Because this setting affects the

"must be one of the ones supported" sounds awkward.  Can you reword
it?

Other than that, the documentation parts are okay.  Thanks.


* Re: [PATCH 5/5] Handle non-ASCII identifiers in Ada
  2022-02-28 18:59   ` Eli Zaretskii
@ 2022-02-28 20:59     ` Tom Tromey
  2022-03-01  3:28       ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Tromey @ 2022-02-28 20:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tom Tromey, gdb-patches

>>>>> "Eli" == Eli Zaretskii <eliz@gnu.org> writes:

>> Date: Mon, 28 Feb 2022 11:33:04 -0700
>> From: Tom Tromey via Gdb-patches <gdb-patches@sourceware.org>
>> Cc: Tom Tromey <tromey@adacore.com>
>> 
>> +for c in range(0, 0x10FFFF):
>> +    process_codepoint(c)

Eli> This script assumes that the version of Python which will run it is
Eli> up-to-date with the latest Unicode Character Database (UCD), right?
Eli> Is that a good assumption?  Wouldn't it be better to process the UCD
Eli> from the latest Unicode Standard directly?

Ordinarily, yes, but in practice the Ada compiler uses quite old data,
and so whatever is provided by a recent-ish Python is more than good
enough.

If the Ada compiler is changed, I'll update the script.  I suspect this
won't happen, though.

Tom
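
For readers unfamiliar with what the script's output is for: each row
it writes has the form (START, END, UPPER_DELTA, LOWER_DELTA), as the
comment in ada-unicode.py says.  The sketch below shows one way such a
table could be consulted to fold a code point.  The function name and
the two-row sample table are hypothetical, and the binary search is an
assumption about the consumer, not a description of GDB's code.

```python
import bisect

# Two hand-written rows in the (START, END, UPPER_DELTA, LOWER_DELTA)
# layout: ASCII upper-case letters and ASCII lower-case letters.
SAMPLE_RANGES = [
    (0x41, 0x5A, 0, 32),
    (0x61, 0x7A, -32, 0),
]


def fold(codepoint, ranges, to_upper):
    """Return CODEPOINT folded to upper or lower case using RANGES.

    RANGES must be sorted by START.  Code points outside every range
    are returned unchanged.
    """
    starts = [r[0] for r in ranges]
    index = bisect.bisect_right(starts, codepoint) - 1
    if index < 0:
        return codepoint
    start, end, upper_delta, lower_delta = ranges[index]
    if codepoint > end:
        return codepoint
    return codepoint + (upper_delta if to_upper else lower_delta)
```

Because each row stores deltas rather than target characters, a run of
letters with the same offset to its other case collapses into a single
row, which keeps the generated header small.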


* Re: [PATCH 5/5] Handle non-ASCII identifiers in Ada
  2022-02-28 20:59     ` Tom Tromey
@ 2022-03-01  3:28       ` Eli Zaretskii
  2022-03-01 14:49         ` Tom Tromey
  0 siblings, 1 reply; 15+ messages in thread
From: Eli Zaretskii @ 2022-03-01  3:28 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

> From: Tom Tromey <tromey@adacore.com>
> Cc: Tom Tromey <tromey@adacore.com>,  gdb-patches@sourceware.org
> Date: Mon, 28 Feb 2022 13:59:37 -0700
> 
> Eli> This script assumes that the version of Python which will run it is
> Eli> up-to-date with the latest Unicode Character Database (UCD), right?
> Eli> Is that a good assumption?  Wouldn't it be better to process the UCD
> Eli> from the latest Unicode Standard directly?
> 
> Ordinarily, yes, but in practice the Ada compiler uses quite old data,
> and so whatever is provided by a recent-ish Python is more than good
> enough.

How old is "old data", and how recent-ish should be "recent-ish
Python", for this purpose?  Or maybe we should document what is the
oldest version of Python that currently suits the needs?


* Re: [PATCH 3/5] Let phex and phex_nz handle sizeof_l==1
  2022-02-28 18:33 ` [PATCH 3/5] Let phex and phex_nz handle sizeof_l==1 Tom Tromey
@ 2022-03-01 14:26   ` Simon Marchi
  2022-03-01 14:32     ` Tom Tromey
  0 siblings, 1 reply; 15+ messages in thread
From: Simon Marchi @ 2022-03-01 14:26 UTC (permalink / raw)
  To: Tom Tromey, gdb-patches



On 2022-02-28 13:33, Tom Tromey via Gdb-patches wrote:
> Currently, neither phex nor phex_nz handle sizeof_l==1 -- they let
> this case fall through to the default case.  However, a subsequent
> patch in this series needs this case to work correctly.
> 
> I looked at all calls to these functions that pass a 1 for the
> sizeof_l parameter.  The only such case seems to be correct with this
> change.
> ---
>  gdbsupport/print-utils.cc | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/gdbsupport/print-utils.cc b/gdbsupport/print-utils.cc
> index 0ef8cb829a1..73ff1afda30 100644
> --- a/gdbsupport/print-utils.cc
> +++ b/gdbsupport/print-utils.cc
> @@ -168,6 +168,10 @@ phex (ULONGEST l, int sizeof_l)
>        str = get_print_cell ();
>        xsnprintf (str, PRINT_CELL_SIZE, "%04x", (unsigned short) (l & 0xffff));
>        break;
> +    case 1:
> +      str = get_print_cell ();
> +      xsnprintf (str, PRINT_CELL_SIZE, "%02x", (unsigned short) (l & 0xff));

I'm just wondering why you're casting to unsigned short specifically (and not
unsigned char).  But in practice it works, the patch LGTM.

Simon


* Re: [PATCH 3/5] Let phex and phex_nz handle sizeof_l==1
  2022-03-01 14:26   ` Simon Marchi
@ 2022-03-01 14:32     ` Tom Tromey
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Tromey @ 2022-03-01 14:32 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Tom Tromey, gdb-patches

>>>>> "Simon" == Simon Marchi <simon.marchi@polymtl.ca> writes:

>> str = get_print_cell ();
>> xsnprintf (str, PRINT_CELL_SIZE, "%04x", (unsigned short) (l & 0xffff));
>> break;
>> +    case 1:
>> +      str = get_print_cell ();
>> +      xsnprintf (str, PRINT_CELL_SIZE, "%02x", (unsigned short) (l & 0xff));

Simon> I'm just wondering why you're casting to unsigned short specifically (and not
Simon> unsigned char).  But in practice it works, the patch LGTM.

I just copied the code above.  I think any integer type that isn't wider
than "int" would be ok here, as they all get promoted.

Tom
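
The quoted cases pair a mask with a zero-padded width: "%04x" with
0xffff for a size of 2, and "%02x" with 0xff for the new size of 1.  A
small Python sketch of that relationship, using a hypothetical function
name (the real phex is C++ and has its own fixed set of sizes):

```python
def phex_sketch(value, size):
    """Format VALUE as zero-padded hex, truncated to SIZE bytes.

    Illustration only: the mask keeps the low SIZE bytes and the
    width is two hex digits per byte.
    """
    mask = (1 << (8 * size)) - 1
    return format(value & mask, "0{}x".format(2 * size))
```

With a size of 1 the result always has two digits, which is the shape
needed for the "U" escapes in encoded Ada names.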


* Re: [PATCH 5/5] Handle non-ASCII identifiers in Ada
  2022-03-01  3:28       ` Eli Zaretskii
@ 2022-03-01 14:49         ` Tom Tromey
  2022-03-01 15:17           ` Eli Zaretskii
  0 siblings, 1 reply; 15+ messages in thread
From: Tom Tromey @ 2022-03-01 14:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Tom Tromey, gdb-patches

>> Ordinarily, yes, but in practice the Ada compiler uses quite old data,
>> and so whatever is provided by a recent-ish Python is more than good
>> enough.

Eli> How old is "old data", and how recent-ish should be "recent-ish
Eli> Python", for this purpose?

The Ada front end doesn't actually document this, aside from:

   --  Note these tables are derived from those given in AI-285. For details
   --  see www.ada-auth.org/cgi-bin/cvsweb.cgi/AIs/AI-00285.TXT?rev=1.22.

... which I know to be false because other changes have been made to
some of these tables after this.  (You can see this code in
gcc/gnat/libgnat/s-utf_32.adb.)

However, when I examine the case-folding tables (e.g. look for
"Lower_Case_Letters"), the last letters seen are:

     (16#10428#, 16#1044F#),  -- DESERET SMALL LETTER LONG I .. DESERET SMALL LETTER EW
     (16#E0061#, 16#E007A#)); -- TAG LATIN SMALL LETTER A .. TAG LATIN SMALL LETTER Z

These were in Unicode back in 2001.

Eli> Or maybe we should document what is the
Eli> oldest version of Python that currently suits the needs?

Most people shouldn't run this script.  The output is checked in.  And
if they do and get wildly different results, that will be caught in
review.

Of course, it won't really matter, because you can't really write an Ada
program -- at least, not using GNAT -- that uses anything after 2001
anyway.  This covers all the Python versions that are in normal use.

For example Python 2.7, the oldest one I have around (and for which gdb
is going to drop support soon anyway):

>>> import unicodedata
>>> unicodedata.unidata_version
'5.2.0'

This version of the data comes from 2009, plenty new enough.

Tom
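
The version check above can be extended into a small sketch that asks
the local interpreter whether it knows the code points at the end of the
quoted Lower_Case_Letters table.  This is only an illustration and is not
part of ada-unicode.py; the helper names (QUOTED_RANGES,
unidata_version_tuple, range_is_known, data_is_new_enough) are
hypothetical, and it assumes Python 3, where chr accepts any code point.

```python
import unicodedata

# Ranges quoted above from GNAT's Lower_Case_Letters table (inclusive).
QUOTED_RANGES = [
    (0x10428, 0x1044F),  # DESERET SMALL LETTER LONG I .. DESERET SMALL LETTER EW
    (0xE0061, 0xE007A),  # TAG LATIN SMALL LETTER A .. TAG LATIN SMALL LETTER Z
]


def unidata_version_tuple():
    """Return this interpreter's Unicode data version as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def range_is_known(first, last):
    """Return True if every code point in first..last has a character name."""
    return all(
        unicodedata.name(chr(cp), None) is not None
        for cp in range(first, last + 1)
    )


def data_is_new_enough(ranges=QUOTED_RANGES):
    """Hypothetical check: does this Python know all the quoted ranges?"""
    return all(range_is_known(first, last) for first, last in ranges)


if __name__ == "__main__":
    print(unicodedata.unidata_version, data_is_new_enough())
```

The check looks only at whether a character name exists, not at the
general category, since it is meant to show that the data is recent
enough rather than to reproduce the case-folding tables.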

* Re: [PATCH 5/5] Handle non-ASCII identifiers in Ada
  2022-03-01 14:49         ` Tom Tromey
@ 2022-03-01 15:17           ` Eli Zaretskii
  0 siblings, 0 replies; 15+ messages in thread
From: Eli Zaretskii @ 2022-03-01 15:17 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

> From: Tom Tromey <tromey@adacore.com>
> Cc: Tom Tromey <tromey@adacore.com>,  gdb-patches@sourceware.org
> Date: Tue, 01 Mar 2022 07:49:29 -0700
> 
> These were in Unicode back in 2001.
> 
> Eli> Or maybe we should document what is the
> Eli> oldest version of Python that currently suits the needs?
> 
> Most people shouldn't run this script.  The output is checked in.  And
> if they do and get wildly different results, that will be caught in
> review.
> 
> Of course, it won't really matter, because you can't really write an Ada
> program -- at least, not using GNAT -- that uses anything after 2001
> anyway.  This covers all the Python versions that are in normal use.
> 
> For example Python 2.7, the oldest one I have around (and for which gdb
> is going to drop support soon anyway):
> 
> >>> import unicodedata
> >>> unicodedata.unidata_version
> '5.2.0'
> 
> This version of the data comes from 2009, plenty new enough.

Ok, I guess there's no real problem, then.

Thanks.

* Re: [PATCH 5/5] Handle non-ASCII identifiers in Ada
  2022-02-28 18:33 ` [PATCH 5/5] Handle non-ASCII identifiers in Ada Tom Tromey
  2022-02-28 18:59   ` Eli Zaretskii
@ 2022-03-01 15:33   ` Tom Tromey
  1 sibling, 0 replies; 15+ messages in thread
From: Tom Tromey @ 2022-03-01 15:33 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches

>>>>> "Tom" == Tom Tromey <tromey@adacore.com> writes:

I meant to mention...

Tom> diff --git a/gdb/ada-unicode.py b/gdb/ada-unicode.py
Tom> new file mode 100755
Tom> index 00000000000..31fa2f11520
Tom> --- /dev/null
Tom> +++ b/gdb/ada-unicode.py
Tom> @@ -0,0 +1,114 @@
[...]
Tom> +copyright = """/* *INDENT-OFF* */ /* THIS FILE IS GENERATED -*- buffer-read-only: t -*- */
Tom> +/* vi:set ro: */
[...]

I plan to rebase this onto:

https://sourceware.org/pipermail/gdb-patches/2022-February/185866.html

when appropriate.

Tom

* Re: [PATCH 0/5] Handle non-ASCII identifiers in Ada
  2022-02-28 18:32 [PATCH 0/5] Handle non-ASCII identifiers in Ada Tom Tromey
                   ` (4 preceding siblings ...)
  2022-02-28 18:33 ` [PATCH 5/5] Handle non-ASCII identifiers in Ada Tom Tromey
@ 2022-03-07 14:52 ` Tom Tromey
  5 siblings, 0 replies; 15+ messages in thread
From: Tom Tromey @ 2022-03-07 14:52 UTC (permalink / raw)
  To: Tom Tromey via Gdb-patches; +Cc: Tom Tromey

>>>>> "Tom" == Tom Tromey via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> Ada supports non-ASCII identifiers, but gdb cannot access them.  This
Tom> series adds the missing support.

I've adjusted the docs per review and I am going to check this in now.

I've also made one minor change to the new .py script, to adapt it to
use gdbcopyright.py.

Tom

Thread overview: 15+ messages
2022-02-28 18:32 [PATCH 0/5] Handle non-ASCII identifiers in Ada Tom Tromey
2022-02-28 18:33 ` [PATCH 1/5] Simplify a regular expression in ada-lex.l Tom Tromey
2022-02-28 18:33 ` [PATCH 2/5] Don't pre-size result string in ada_decode Tom Tromey
2022-02-28 18:33 ` [PATCH 3/5] Let phex and phex_nz handle sizeof_l==1 Tom Tromey
2022-03-01 14:26   ` Simon Marchi
2022-03-01 14:32     ` Tom Tromey
2022-02-28 18:33 ` [PATCH 4/5] Define HOST_UTF32 in charset.h Tom Tromey
2022-02-28 18:33 ` [PATCH 5/5] Handle non-ASCII identifiers in Ada Tom Tromey
2022-02-28 18:59   ` Eli Zaretskii
2022-02-28 20:59     ` Tom Tromey
2022-03-01  3:28       ` Eli Zaretskii
2022-03-01 14:49         ` Tom Tromey
2022-03-01 15:17           ` Eli Zaretskii
2022-03-01 15:33   ` Tom Tromey
2022-03-07 14:52 ` [PATCH 0/5] " Tom Tromey
