From: Tom de Vries <tdevries@suse.de>
To: Tom Tromey <tom@tromey.com>,
Tom de Vries via Gdb-patches <gdb-patches@sourceware.org>
Subject: Re: [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt
Date: Thu, 15 Jun 2023 11:42:17 +0200
Message-ID: <5add65be-93f8-8786-c987-447a1b31538d@suse.de>
In-Reply-To: <87ilbs5xmh.fsf@tromey.com>
[-- Attachment #1: Type: text/plain, Size: 399 bytes --]
On 6/12/23 20:44, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:
>
> Tom> The iterator constructor also needs a specification of encoding and
> Tom> width. I suppose for encoding we could use host_charset (), but I
> Tom> don't know how to get the base width of that char set.
>
> The base width is 1.
OK, then how about this?
Thanks,
- Tom
[-- Attachment #2: 0001-gdb-tui-Handle-unicode-chars-in-prompt.patch --]
[-- Type: text/x-patch, Size: 8339 bytes --]
From 02b786977e113302a133094ddd5b5e770679b569 Mon Sep 17 00:00:00 2001
From: Tom de Vries <tdevries@suse.de>
Date: Wed, 24 May 2023 19:54:34 +0200
Subject: [PATCH] [gdb/tui] Handle unicode chars in prompt
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Consider setting the prompt to a string containing a unicode character, say
'❯', aka U+276F (heavy right-pointing angle quotation mark ornament).
This works fine on an xterm with CLI (with X marking the position of the
blinking cursor):
...
$ gdb -q -ex "set prompt GDB❯ "
GDB❯ X
...
but with TUI:
...
$ gdb -q -tui -ex "set prompt GDB❯ "
...
we get instead:
...
GDB GDB X
...
We can use the test-case gdb.tui/unicode-prompt.exp to get more details, using
tuiterm.
With Term::dump_screen we have:
...
16 (gdb) set prompt GDB❯
17 GDB❯ GDB❯ GDB❯ set prompt (gdb)
18 (gdb)
...
and with Term::dump_screen_with_attrs (summarizing using attribute sets <attrs1>
and <attrs2>):
...
16 (gdb) set prompt GDB❯
17 GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> set prompt (gdb)
18 (gdb)
...
where:
...
<attrs1> == <reverse:1><invisible:1><blinking:1><intensity:bold>
<attrs2> == <reverse:0><invisible:0><blinking:0><intensity:normal>
...
This explains why we didn't see the unicode char on xterm: it's hidden
because the invisible attribute is set.
So, there seem to be two problems:
- the attributes are incorrect, and
- the prompt is repeated a couple of times.
In TUI, the prompt is written out by tui_puts_internal, which outputs one byte
at a time using waddch, so the bytes of a multi-byte char reach curses
separately, which apparently breaks multi-byte char support.
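One plausible cause of the bogus attributes, not confirmed by this patch, is
sign extension: a byte >= 0x80 held in a signed char becomes a large unsigned
value when converted to the curses character type, which sets every bit above
the character itself.  The sketch below only illustrates that arithmetic.
chtype_like, char_mask, attr_mask, widen_signed and widen_unsigned are
hypothetical names, and the "low 8 bits are the char, the rest are attributes"
layout is an assumption rather than something taken from <curses.h>.

```cpp
#include <cstdint>

// Hypothetical stand-in for the curses chtype.  Assumed layout: the low 8
// bits hold the character, all remaining bits hold attributes.
using chtype_like = std::uint32_t;
constexpr chtype_like char_mask = 0xFFu;
constexpr chtype_like attr_mask = ~char_mask;

// Widen a byte the way an implicit conversion from a signed char does:
// negative values are sign-extended, so all upper bits end up set.
constexpr chtype_like
widen_signed (signed char c)
{
  return static_cast<chtype_like> (c);
}

// Widen a byte after first making it unsigned: the upper bits stay clear.
constexpr chtype_like
widen_unsigned (signed char c)
{
  return static_cast<chtype_like> (static_cast<unsigned char> (c));
}
```

The first UTF-8 byte of U+276F is 0xE2, which is -30 as a signed 8-bit value.
widen_signed (-30) is 0xFFFFFFE2, so every bit in attr_mask is set, while
widen_unsigned (-30) is 0x000000E2 with no attribute bits set.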
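Fix this by detecting multi-byte chars in tui_puts_internal, using a
wchar_iterator over host_charset (), and printing each of them with a single
waddnstr call.  Single-byte chars, including invalid or incomplete sequences,
are still handled by the existing single-byte code.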
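The detect-then-print-whole idea can be sketched without gdb or curses.  This
is a simplified illustration, not the patch's implementation: the patch uses
wchar_iterator (iconv) with host_charset (), whereas the sketch hard-codes
UTF-8, does not reject overlong encodings, and uses the hypothetical helpers
utf8_seq_len and split_for_output.  Each returned chunk corresponds to one
output call: a multi-byte chunk is where the patch calls waddnstr, a
single-byte chunk is where the single-byte code runs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: byte length of the UTF-8 sequence starting at S with
// N bytes available.  Returns 1 for ASCII and also for invalid or incomplete
// sequences, so that the caller falls back to single-byte handling.
std::size_t
utf8_seq_len (const unsigned char *s, std::size_t n)
{
  if (n == 0)
    return 0;

  std::size_t len;
  if (s[0] < 0x80)
    return 1;			// ASCII
  else if ((s[0] & 0xE0) == 0xC0)
    len = 2;
  else if ((s[0] & 0xF0) == 0xE0)
    len = 3;
  else if ((s[0] & 0xF8) == 0xF0)
    len = 4;
  else
    return 1;			// stray continuation or invalid lead byte

  if (len > n)
    return 1;			// incomplete sequence at end of input

  for (std::size_t i = 1; i < len; i++)
    if ((s[i] & 0xC0) != 0x80)
      return 1;			// missing continuation byte

  return len;
}

// Split STR into the chunks an output loop would emit: multi-byte sequences
// are kept whole, everything else is emitted one byte at a time.
std::vector<std::string>
split_for_output (const std::string &str)
{
  std::vector<std::string> chunks;
  const unsigned char *p
    = reinterpret_cast<const unsigned char *> (str.data ());
  std::size_t left = str.size ();

  while (left > 0)
    {
      std::size_t len = utf8_seq_len (p, left);
      chunks.emplace_back (reinterpret_cast<const char *> (p), len);
      p += len;
      left -= len;
    }

  return chunks;
}
```

For the prompt "GDB❯ " (with '❯' encoded as the bytes E2 9D AF) this gives five
chunks: "G", "D", "B", the three-byte sequence, and " ".  A truncated input
such as the two bytes E2 9D gives two single-byte chunks, mirroring the
incomplete-sequence fallback in the patch.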
Tested on x86_64-linux.
Reported-By: wuzy01@qq.com
PR tui/28800
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28800
---
gdb/charset.c | 20 +++++
gdb/charset.h | 7 ++
gdb/testsuite/gdb.tui/unicode-prompt.exp | 52 ++++++++++++
gdb/tui/tui-io.c | 100 +++++++++++++++++++----
4 files changed, 164 insertions(+), 15 deletions(-)
create mode 100644 gdb/testsuite/gdb.tui/unicode-prompt.exp
diff --git a/gdb/charset.c b/gdb/charset.c
index bce6050c97f..765dce46fc3 100644
--- a/gdb/charset.c
+++ b/gdb/charset.c
@@ -690,6 +690,26 @@ wchar_iterator::iterate (enum wchar_iterate_result *out_result,
return -1;
}
+/* See charset.h. */
+
+void
+wchar_iterator::skip (size_t len)
+{
+ m_input += len;
+
+ gdb_assert (len <= m_bytes);
+ m_bytes -= len;
+}
+
+/* See charset.h. */
+
+void
+wchar_iterator::reset (const gdb_byte *input, size_t bytes)
+{
+ m_input = input;
+ m_bytes = bytes;
+}
+
struct charset_vector
{
~charset_vector ()
diff --git a/gdb/charset.h b/gdb/charset.h
index 52194547b0c..6bb0ce14af3 100644
--- a/gdb/charset.h
+++ b/gdb/charset.h
@@ -126,6 +126,13 @@ class wchar_iterator
int iterate (enum wchar_iterate_result *out_result, gdb_wchar_t **out_chars,
const gdb_byte **ptr, size_t *len);
+ /* Increase the input buffer pointer by LEN bytes. */
+ void skip (size_t len);
+
+ /* Reset the input buffer pointer to INPUT and the number of bytes in the
+ input buffer to BYTES. */
+ void reset (const gdb_byte *input, size_t bytes);
+
private:
/* The underlying iconv descriptor. */
diff --git a/gdb/testsuite/gdb.tui/unicode-prompt.exp b/gdb/testsuite/gdb.tui/unicode-prompt.exp
new file mode 100644
index 00000000000..1351235743d
--- /dev/null
+++ b/gdb/testsuite/gdb.tui/unicode-prompt.exp
@@ -0,0 +1,52 @@
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+require allow_tui_tests
+
+tuiterm_env
+
+save_vars { env(LC_ALL) } {
+ # Override "C" settings from default_gdb_init.
+ setenv LC_ALL "C.UTF-8"
+
+ Term::clean_restart 24 80
+
+ if {![Term::enter_tui]} {
+ unsupported "TUI not supported"
+ return
+ }
+
+ set unicode_char "\u276F"
+
+ set color_on "\\033\[31m"
+ set color_off "\\033\[0m"
+
+ set prompt "GDB$color_on$unicode_char$color_off "
+ set prompt_no_color "GDB$unicode_char "
+ set prompt_no_color_re [string_to_regexp $prompt_no_color]
+
+ # Set new prompt.
+ send_gdb "set prompt $prompt\n"
+ # Set old prompt back.
+ send_gdb "set prompt (gdb) \n"
+
+ gdb_assert { [Term::wait_for "^${prompt_no_color_re}set prompt $gdb_prompt "] } \
+ "prompt with unicode char"
+
+ set prompt_with_attrs_re "GDB<fg:red>$unicode_char<fg:default> "
+ set line [Term::get_line_with_attrs [expr $Term::_cur_row - 1]]
+ gdb_assert { [regexp "^$prompt_with_attrs_re.*$" $line] } \
+ "colored unicode char"
+}
diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index 8cb68d12408..45eb9c5b755 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -47,6 +47,7 @@
#include <map>
#include "pager.h"
#include "gdbsupport/gdb-checked-static-cast.h"
+#include "charset.h"
/* This redefines CTRL if it is not already defined, so it must come
after terminal state releated include files like <term.h> and
@@ -520,30 +521,99 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
char c;
int prev_col = 0;
bool saw_nl = false;
+ size_t skip = 0;
+ wchar_iterator it ((gdb_byte *)string, strlen (string), host_charset (), 1);
- while ((c = *string++) != 0)
+ while (true)
{
- if (c == '\1' || c == '\2')
- {
- /* Ignore these, they are readline escape-marking
- sequences. */
- continue;
- }
+ bool handled = false;
+
+ /* Get iterator in sync with string. */
+ it.skip (skip);
+ skip = 0;
+
+ /* Detect and handle multibyte chars. */
+ {
+ enum wchar_iterate_result res2;
+ gdb_wchar_t *dummy1;
+ const gdb_byte *dummy2;
+ size_t len;
+ int res = it.iterate (&res2, &dummy1, &dummy2, &len);
+ if (res < 0)
+ {
+ /* End of string. */
+ gdb_assert (res2 == wchar_iterate_eof);
+ break;
+ }
+
+ if (res == 0)
+ {
+ if (res2 == wchar_iterate_invalid)
+ {
+ /* Let single-byte char code handle it. */
+ gdb_assert (len == 1);
+ }
+ else if (res2 == wchar_iterate_incomplete)
+ {
+ /* Iterator has been set up to return end-of-string on next
+ call to iterate. Make that an advance-by-one instead, and
+ let single-byte char code handle it. */
+ it.reset ((gdb_byte *)(string + 1), strlen (string + 1));
+ }
+ else
+ gdb_assert_not_reached ("");
+ }
+ else
+ {
+ /* res > 0. */
+ gdb_assert (res2 == wchar_iterate_ok);
+ if (len > 1)
+ {
+ /* Multi-byte char. Handle it. */
+ waddnstr (w, string, len);
+ string += len;
+ handled = true;
+ }
+ else
+ {
+ /* Single-byte char. Let single-byte char code handle it. */
+ gdb_assert (len == 1);
+ }
+ }
+ }
- if (c == '\033')
+ if (!handled)
{
- size_t bytes_read = apply_ansi_escape (w, string - 1);
- if (bytes_read > 0)
+ c = *string++;
+ if (c == '\0')
+ {
+ /* End of string. */
+ break;
+ }
+
+ if (c == '\1' || c == '\2')
{
- string = string + bytes_read - 1;
+ /* Ignore these, they are readline escape-marking
+ sequences. */
continue;
}
- }
- if (c == '\n')
- saw_nl = true;
+ if (c == '\033')
+ {
+ size_t bytes_read = apply_ansi_escape (w, string - 1);
+ if (bytes_read > 0)
+ {
+ skip = bytes_read - 1;
+ string += skip;
+ continue;
+ }
+ }
+
+ if (c == '\n')
+ saw_nl = true;
- do_tui_putc (w, c);
+ do_tui_putc (w, c);
+ }
if (height != nullptr)
{
base-commit: 2b462da34de977f953a778afa0cb55e3286ece3d
--
2.35.3