[PATCH v2 1/2] [gdb/tui] Simplify tui_puts_internal

public inbox for gdb-patches@sourceware.org

* [PATCH v2 1/2] [gdb/tui] Simplify tui_puts_internal
@ 2023-06-09  9:18 Tom de Vries
  2023-06-09  9:18 ` [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt Tom de Vries
  2023-06-09 14:35 ` [PATCH v2 1/2] [gdb/tui] Simplify tui_puts_internal Tom Tromey
  0 siblings, 2 replies; 8+ messages in thread
From: Tom de Vries @ 2023-06-09  9:18 UTC
  To: gdb-patches; +Cc: Tom Tromey

Simplify tui_puts_internal by using continue, as per this [1] coding standard
rule, making the function more readable and easier to understand.

No functional changes.

Tested on x86_64-linux.

[1] https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code
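
As an illustration of the rule in [1], here is a minimal standalone sketch,
not the gdb code itself.  The names filter_nested and filter_flat are
hypothetical; both drop the readline marker bytes '\1' and '\2' and keep
everything else, so they are expected to behave identically.

```cpp
#include <string>

// Nested form: the common path lives inside an else branch.
static std::string
filter_nested (const std::string &in)
{
  std::string out;
  for (char c : in)
    {
      if (c == '\1' || c == '\2')
        {
          // Marker byte, nothing to do.
        }
      else
        {
          out += c;
        }
    }
  return out;
}

// Early-continue form: the special case leaves the iteration first, and
// the common path stays at the top nesting level of the loop body.
static std::string
filter_flat (const std::string &in)
{
  std::string out;
  for (char c : in)
    {
      if (c == '\1' || c == '\2')
        continue;

      out += c;
    }
  return out;
}
```

Calling both on the same input, for instance "\1(gdb) \2x", should give the
same string, which is the sense in which such a rewrite has no functional
changes.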
---
 gdb/tui/tui-io.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index 908cb834e4c..8cb68d12408 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -523,36 +523,37 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
 
   while ((c = *string++) != 0)
     {
-      if (c == '\n')
-	saw_nl = true;
-
       if (c == '\1' || c == '\2')
 	{
 	  /* Ignore these, they are readline escape-marking
 	     sequences.  */
+	  continue;
 	}
-      else
+
+      if (c == '\033')
 	{
-	  if (c == '\033')
+	  size_t bytes_read = apply_ansi_escape (w, string - 1);
+	  if (bytes_read > 0)
 	    {
-	      size_t bytes_read = apply_ansi_escape (w, string - 1);
-	      if (bytes_read > 0)
-		{
-		  string = string + bytes_read - 1;
-		  continue;
-		}
+	      string = string + bytes_read - 1;
+	      continue;
 	    }
-	  do_tui_putc (w, c);
+	}
 
-	  if (height != nullptr)
-	    {
-	      int col = getcurx (w);
-	      if (col <= prev_col)
-		++*height;
-	      prev_col = col;
-	    }
+      if (c == '\n')
+	saw_nl = true;
+
+      do_tui_putc (w, c);
+
+      if (height != nullptr)
+	{
+	  int col = getcurx (w);
+	  if (col <= prev_col)
+	    ++*height;
+	  prev_col = col;
 	}
     }
+
   if (TUI_CMD_WIN != nullptr && w == TUI_CMD_WIN->handle.get ())
     update_cmdwin_start_line ();
   if (saw_nl)

base-commit: 30711c89cc7dcd2bd4ea772b2f5dc639c5b1cfcc
-- 
2.35.3



* [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt
  2023-06-09  9:18 [PATCH v2 1/2] [gdb/tui] Simplify tui_puts_internal Tom de Vries
@ 2023-06-09  9:18 ` Tom de Vries
  2023-06-09 14:40   ` Tom Tromey
  2023-06-09 15:39   ` Tom Tromey
  2023-06-09 14:35 ` [PATCH v2 1/2] [gdb/tui] Simplify tui_puts_internal Tom Tromey
  1 sibling, 2 replies; 8+ messages in thread
From: Tom de Vries @ 2023-06-09  9:18 UTC
  To: gdb-patches; +Cc: Tom Tromey

Let's try to set the prompt using a unicode character, say '❯', aka U+276F
(heavy right-pointing angle quotation mark ornament).

This works fine on an xterm with CLI (with X marking the position of the
blinking cursor):
...
$ gdb -q -ex "set prompt GDB❯ "
GDB❯ X
...
but with TUI:
...
$ gdb -q -tui -ex "set prompt GDB❯ "
...
we get instead:
...
GDB  GDB  X
...

We can use the test-case gdb.tui/unicode-prompt.exp to get more details, using
tuiterm.

With Term::dump_screen we have:
...
   16 (gdb) set prompt GDB❯
   17 GDB❯ GDB❯ GDB❯ set prompt (gdb)
   18 (gdb)
...
and with Term::dump_screen_with_attrs (summarizing using attribute sets <attrs1>
and <attrs2>):
...
   16 (gdb) set prompt GDB❯
   17 GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> set prompt (gdb)
   18 (gdb)
...
where:
...
<attrs1> == <reverse:1><invisible:1><blinking:1><intensity:bold>
<attrs2> == <reverse:0><invisible:0><blinking:0><intensity:normal>
...

This explains why we didn't see the unicode char on xterm: it's hidden
because the invisible attribute is set.

So, there seem to be two problems:
- the attributes are incorrect, and
- the prompt is repeated a couple of times.

In TUI, the prompt is written out by tui_puts_internal, which outputs one byte
at a time using waddch, which apparently breaks multi-byte char support.

Fix this by detecting multi-byte chars in tui_puts_internal, and printing them using
waddnstr.
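
To make the byte-at-a-time problem concrete: in UTF-8, U+276F is the three
bytes E2 9D AF, so a per-byte loop hands the character over in three
separate pieces.  The sketch below is a simplification that assumes UTF-8
and looks only at the lead byte; the patch itself asks the C library through
mbrtowc, which follows the current locale.  The helper names utf8_seq_len
and count_chars are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length in bytes of the UTF-8 sequence starting at
// S[0], judged from the lead byte alone.  Returns 0 for a continuation
// byte or for a byte of 0xF8 and above.  It does not check the
// continuation bytes or reject overlong forms.
static std::size_t
utf8_seq_len (const char *s)
{
  unsigned char b = static_cast<unsigned char> (s[0]);
  if (b < 0x80)
    return 1;   // ASCII.
  if ((b & 0xE0) == 0xC0)
    return 2;
  if ((b & 0xF0) == 0xE0)
    return 3;
  if ((b & 0xF8) == 0xF0)
    return 4;
  return 0;
}

// Count the characters in S, stepping one whole sequence at a time.
static std::size_t
count_chars (const std::string &s)
{
  std::size_t n = 0;
  for (std::size_t i = 0; i < s.size ();)
    {
      std::size_t len = utf8_seq_len (s.c_str () + i);
      // Step over a stray byte one at a time.
      i += (len == 0 ? 1 : len);
      ++n;
    }
  return n;
}
```

For the prompt "GDB❯ " this gives seven bytes but five characters; a loop
that calls the output routine once per byte makes seven calls where five
characters were meant.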

Tested on x86_64-linux.

Reported-By: wuzy01@qq.com

PR tui/28800
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28800
---
 gdb/testsuite/gdb.tui/unicode-prompt.exp |  43 +++++++++
 gdb/tui/tui-io.c                         | 106 +++++++++++++++++++----
 2 files changed, 134 insertions(+), 15 deletions(-)
 create mode 100644 gdb/testsuite/gdb.tui/unicode-prompt.exp

diff --git a/gdb/testsuite/gdb.tui/unicode-prompt.exp b/gdb/testsuite/gdb.tui/unicode-prompt.exp
new file mode 100644
index 00000000000..84ac33d71bf
--- /dev/null
+++ b/gdb/testsuite/gdb.tui/unicode-prompt.exp
@@ -0,0 +1,43 @@
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+require allow_tui_tests
+
+tuiterm_env
+
+save_vars { env(LC_ALL) } {
+    # Override "C" settings from default_gdb_init.
+    setenv LC_ALL "C.UTF-8"
+
+    Term::clean_restart 24 80
+
+    if {![Term::enter_tui]} {
+	unsupported "TUI not supported"
+	return
+    }
+
+    set unicode_char "\u276F"
+
+    set prompt "GDB$unicode_char "
+    set prompt_re [string_to_regexp $prompt]
+
+    # Set new prompt.
+    send_gdb "set prompt $prompt\n"
+    # Set old prompt back.
+    send_gdb "set prompt (gdb) \n"
+
+    gdb_assert { [Term::wait_for "^${prompt_re}set prompt $gdb_prompt "] } \
+	"prompt with unicode char"
+}
diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index 8cb68d12408..75ad20a74d1 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -514,6 +514,55 @@ tui_puts (const char *string, WINDOW *w)
     update_cmdwin_start_line ();
 }
 
+/* Use HAVE_BTOWC as sign that we have functioning wchar_t support.  See also
+   gdb_wchar.h.  */
+
+#ifdef HAVE_BTOWC
+/* Return true if STRING starts with a multi-byte char.  Return the length of
+   the multi-byte char in LEN, or 0 in case it's a multi-byte null char.
+   Implementation based on _rl_read_mbchar.  */
+
+static bool
+is_mb_char (const char *string, int &len)
+{
+  for (len = 1; len <= MB_CUR_MAX; len++)
+    {
+      size_t res;
+
+      {
+	mbstate_t ps;
+	memset (&ps, 0, sizeof (mbstate_t));
+	res = mbrtowc (nullptr, string, len, &ps);
+      }
+
+      if (res == (size_t)(-1))
+	{
+	  /* Not a multi-byte char.  */
+	  return false;
+	}
+
+      if (res == (size_t)(-2))
+	{
+	  /* Part of a multi-byte char.  */
+	  continue;
+	}
+
+      if (res == 0)
+	{
+	  /* Multi-byte null char.  */
+	  len = 0;
+	  return true;
+	}
+
+      /* Complete multi-byte char.  */
+      gdb_assert (res == len);
+      return true;
+    }
+
+  return false;
+}
+#endif
+
 static void
 tui_puts_internal (WINDOW *w, const char *string, int *height)
 {
@@ -521,29 +570,56 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
   int prev_col = 0;
   bool saw_nl = false;
 
-  while ((c = *string++) != 0)
+  while (true)
     {
-      if (c == '\1' || c == '\2')
-	{
-	  /* Ignore these, they are readline escape-marking
-	     sequences.  */
-	  continue;
-	}
+      bool handled = false;
 
-      if (c == '\033')
+#ifdef HAVE_BTOWC
+      {
+	int mb_len;
+	if (is_mb_char (string, mb_len) && mb_len != 1)
+	  {
+	    if (mb_len == 0)
+	      {
+		/* Multi-byte null char.  */
+		break;
+	      }
+
+	    waddnstr (w, string, mb_len);
+	    string += mb_len;
+	    handled = true;
+	  }
+      }
+#endif
+
+      if (!handled)
 	{
-	  size_t bytes_read = apply_ansi_escape (w, string - 1);
-	  if (bytes_read > 0)
+	  c = *string++;
+	  if (c == '\0')
+	    break;
+
+	  if (c == '\1' || c == '\2')
 	    {
-	      string = string + bytes_read - 1;
+	      /* Ignore these, they are readline escape-marking
+		 sequences.  */
 	      continue;
 	    }
-	}
 
-      if (c == '\n')
-	saw_nl = true;
+	  if (c == '\033')
+	    {
+	      size_t bytes_read = apply_ansi_escape (w, string - 1);
+	      if (bytes_read > 0)
+		{
+		  string = string + bytes_read - 1;
+		  continue;
+		}
+	    }
+
+	  if (c == '\n')
+	    saw_nl = true;
 
-      do_tui_putc (w, c);
+	  do_tui_putc (w, c);
+	}
 
       if (height != nullptr)
 	{
-- 
2.35.3



* Re: [PATCH v2 1/2] [gdb/tui] Simplify tui_puts_internal
  2023-06-09  9:18 [PATCH v2 1/2] [gdb/tui] Simplify tui_puts_internal Tom de Vries
  2023-06-09  9:18 ` [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt Tom de Vries
@ 2023-06-09 14:35 ` Tom Tromey
  1 sibling, 0 replies; 8+ messages in thread
From: Tom Tromey @ 2023-06-09 14:35 UTC
  To: Tom de Vries via Gdb-patches; +Cc: Tom de Vries, Tom Tromey

>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> Simplify tui_puts_internal by using continue, as per this [1] coding standard
Tom> rule, making the function more readable and easier to understand.

Tom> No functional changes.

Tom> Tested on x86_64-linux.

Thanks, looks good.

Reviewed-By: Tom Tromey <tom@tromey.com>

Tom


* Re: [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt
  2023-06-09  9:18 ` [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt Tom de Vries
@ 2023-06-09 14:40   ` Tom Tromey
  2023-06-09 15:39   ` Tom Tromey
  1 sibling, 0 replies; 8+ messages in thread
From: Tom Tromey @ 2023-06-09 14:40 UTC
  To: Tom de Vries via Gdb-patches; +Cc: Tom de Vries, Tom Tromey

>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> Fix this by detecting multi-byte chars in tui_puts_internal, and printing them using
Tom> waddnstr.

Is the detection really needed?  What if tui_puts_internal instead just
always collected the longest span of printable characters and used
waddnstr?

Tom
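
A rough standalone sketch of the span idea, under the assumption that
"printable" means any byte that is not an ASCII control byte, so that bytes
of 0x80 and above stay inside a span and multi-byte sequences are never
split.  The names span, is_bulk_byte and split_spans are hypothetical; in
the real function each printable span would presumably go to waddnstr and
each control byte through the existing per-char handling.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One piece of the input: either a run of printable bytes or a single
// control byte that needs special handling.
struct span
{
  std::string text;
  bool printable;
};

// True for bytes that could be written out in bulk.
static bool
is_bulk_byte (unsigned char b)
{
  return b >= 0x20 && b != 0x7f;
}

// Split IN into maximal printable runs, with each control byte as its own
// one-byte span in between.
static std::vector<span>
split_spans (const std::string &in)
{
  std::vector<span> out;
  std::size_t i = 0;
  while (i < in.size ())
    {
      std::size_t start = i;
      while (i < in.size ()
             && is_bulk_byte (static_cast<unsigned char> (in[i])))
        ++i;
      if (i > start)
        out.push_back ({in.substr (start, i - start), true});
      if (i < in.size ())
        {
          out.push_back ({std::string (1, in[i]), false});
          ++i;
        }
    }
  return out;
}
```

With this split, "GDB❯ \n(gdb) " becomes a printable span, a newline, and a
second printable span, and no multi-byte detection is needed.  The escape
bytes '\033', '\1' and '\2' all fall out as control bytes, so their current
handling could stay as it is.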


* Re: [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt
  2023-06-09  9:18 ` [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt Tom de Vries
  2023-06-09 14:40   ` Tom Tromey
@ 2023-06-09 15:39   ` Tom Tromey
  2023-06-12 15:19     ` Tom de Vries
  1 sibling, 1 reply; 8+ messages in thread
From: Tom Tromey @ 2023-06-09 15:39 UTC
  To: Tom de Vries via Gdb-patches; +Cc: Tom de Vries, Tom Tromey

>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> +#ifdef HAVE_BTOWC
Tom> +      {
Tom> +	int mb_len;
Tom> +	if (is_mb_char (string, mb_len) && mb_len != 1)
Tom> +	  {
Tom> +	    if (mb_len == 0)
Tom> +	      {
Tom> +		/* Multi-byte null char.  */
Tom> +		break;
Tom> +	      }
Tom> +
Tom> +	    waddnstr (w, string, mb_len);
Tom> +	    string += mb_len;
Tom> +	    handled = true;
Tom> +	  }
Tom> +      }
Tom> +#endif

I wonder if this would be simplified by using wchar_iterator.

This iterator tries to convert just a single character, and has out
parameters that reflect which input bytes were converted.

The main benefit would be less #ifdef and no need for is_mb_char in
tui-io.c.

You may need to add a method to wchar_iterator to let the caller skip
some bytes (you wouldn't want to create a new one on each iteration, as
it calls iconv_open).  That way the escape handling could stay pretty
much the same.

Tom


* Re: [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt
  2023-06-09 15:39   ` Tom Tromey
@ 2023-06-12 15:19     ` Tom de Vries
  2023-06-12 18:44       ` Tom Tromey
  0 siblings, 1 reply; 8+ messages in thread
From: Tom de Vries @ 2023-06-12 15:19 UTC
  To: Tom Tromey, Tom de Vries via Gdb-patches

On 6/9/23 17:39, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Tom> +#ifdef HAVE_BTOWC
> Tom> +      {
> Tom> +	int mb_len;
> Tom> +	if (is_mb_char (string, mb_len) && mb_len != 1)
> Tom> +	  {
> Tom> +	    if (mb_len == 0)
> Tom> +	      {
> Tom> +		/* Multi-byte null char.  */
> Tom> +		break;
> Tom> +	      }
> Tom> +
> Tom> +	    waddnstr (w, string, mb_len);
> Tom> +	    string += mb_len;
> Tom> +	    handled = true;
> Tom> +	  }
> Tom> +      }
> Tom> +#endif
> 
> I wonder if this would be simplified by using wchar_iterator.
> 
> This iterator tries to convert just a single character, and has out
> parameters that reflect which input bytes were converted.
> 
> The main benefit would be less #ifdef and no need for is_mb_char in
> tui-io.c.
> 

The iterator constructor also needs a specification of encoding and 
width.  I suppose for encoding we could use host_charset (), but I don't 
know how to get the base width of that char set.

ISTM that's a problem that the multibyte functions take care of for us.

Thanks,
- Tom

> You may need to add a method to wchar_iterator to let the caller skip
> some bytes (you wouldn't want to create a new one on each iteration, as
> it calls iconv_open).  That way the escape handling could stay pretty
> much the same.
> 
> Tom



* Re: [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt
  2023-06-12 15:19     ` Tom de Vries
@ 2023-06-12 18:44       ` Tom Tromey
  2023-06-15  9:42         ` Tom de Vries
  0 siblings, 1 reply; 8+ messages in thread
From: Tom Tromey @ 2023-06-12 18:44 UTC
  To: Tom de Vries via Gdb-patches; +Cc: Tom Tromey, Tom de Vries

>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> The iterator constructor also needs a specification of encoding and
Tom> width.  I suppose for encoding we could use host_charset (), but I
Tom> don't know how to get the base width of that char set.

The base width is 1.

Tom


* Re: [PATCH v2 2/2] [gdb/tui] Handle unicode chars in prompt
  2023-06-12 18:44       ` Tom Tromey
@ 2023-06-15  9:42         ` Tom de Vries
  0 siblings, 0 replies; 8+ messages in thread
From: Tom de Vries @ 2023-06-15  9:42 UTC
  To: Tom Tromey, Tom de Vries via Gdb-patches

[-- Attachment #1: Type: text/plain, Size: 399 bytes --]

On 6/12/23 20:44, Tom Tromey wrote:
>>>>>> "Tom" == Tom de Vries via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Tom> The iterator constructor also needs a specification of encoding and
> Tom> width.  I suppose for encoding we could use host_charset (), but I
> Tom> don't know how to get the base width of that char set.
> 
> The base width is 1.

OK, then how about this?

Thanks,
- Tom

[-- Attachment #2: 0001-gdb-tui-Handle-unicode-chars-in-prompt.patch --]
[-- Type: text/x-patch, Size: 8339 bytes --]

From 02b786977e113302a133094ddd5b5e770679b569 Mon Sep 17 00:00:00 2001
From: Tom de Vries <tdevries@suse.de>
Date: Wed, 24 May 2023 19:54:34 +0200
Subject: [PATCH] [gdb/tui] Handle unicode chars in prompt
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Let's try to set the prompt using a unicode character, say '❯', aka U+276F
(heavy right-pointing angle quotation mark ornament).

This works fine on an xterm with CLI (with X marking the position of the
blinking cursor):
...
$ gdb -q -ex "set prompt GDB❯ "
GDB❯ X
...
but with TUI:
...
$ gdb -q -tui -ex "set prompt GDB❯ "
...
we get instead:
...
GDB  GDB  X
...

We can use the test-case gdb.tui/unicode-prompt.exp to get more details, using
tuiterm.

With Term::dump_screen we have:
...
   16 (gdb) set prompt GDB❯
   17 GDB❯ GDB❯ GDB❯ set prompt (gdb)
   18 (gdb)
...
and with Term::dump_screen_with_attrs (summarizing using attribute sets <attrs1>
and <attrs2>):
...
   16 (gdb) set prompt GDB❯
   17 GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> GDB<attrs1>❯<attrs2> set prompt (gdb)
   18 (gdb)
...
where:
...
<attrs1> == <reverse:1><invisible:1><blinking:1><intensity:bold>
<attrs2> == <reverse:0><invisible:0><blinking:0><intensity:normal>
...

This explains why we didn't see the unicode char on xterm: it's hidden
because the invisible attribute is set.

So, there seem to be two problems:
- the attributes are incorrect, and
- the prompt is repeated a couple of times.

In TUI, the prompt is written out by tui_puts_internal, which outputs one byte
at a time using waddch, which apparently breaks multi-byte char support.

Fix this by detecting multi-byte chars in tui_puts_internal, and printing them using
waddnstr.

Tested on x86_64-linux.

Reported-By: wuzy01@qq.com

PR tui/28800
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28800
---
 gdb/charset.c                            |  20 +++++
 gdb/charset.h                            |   7 ++
 gdb/testsuite/gdb.tui/unicode-prompt.exp |  52 ++++++++++++
 gdb/tui/tui-io.c                         | 100 +++++++++++++++++++----
 4 files changed, 164 insertions(+), 15 deletions(-)
 create mode 100644 gdb/testsuite/gdb.tui/unicode-prompt.exp

diff --git a/gdb/charset.c b/gdb/charset.c
index bce6050c97f..765dce46fc3 100644
--- a/gdb/charset.c
+++ b/gdb/charset.c
@@ -690,6 +690,26 @@ wchar_iterator::iterate (enum wchar_iterate_result *out_result,
   return -1;
 }
 
+/* See charset.h.  */
+
+void
+wchar_iterator::skip (size_t len)
+{
+  m_input += len;
+
+  gdb_assert (len <= m_bytes);
+  m_bytes -= len;
+}
+
+/* See charset.h.  */
+
+void
+wchar_iterator::reset (const gdb_byte *input, size_t bytes)
+{
+  m_input = input;
+  m_bytes = bytes;
+}
+
 struct charset_vector
 {
   ~charset_vector ()
diff --git a/gdb/charset.h b/gdb/charset.h
index 52194547b0c..6bb0ce14af3 100644
--- a/gdb/charset.h
+++ b/gdb/charset.h
@@ -126,6 +126,13 @@ class wchar_iterator
   int iterate (enum wchar_iterate_result *out_result, gdb_wchar_t **out_chars,
 	       const gdb_byte **ptr, size_t *len);
 
+  /* Increase the input buffer pointer by LEN bytes.  */
+  void skip (size_t len);
+
+  /* Reset the input buffer pointer to INPUT and the number of bytes in the
+     input buffer to BYTES.  */
+  void reset (const gdb_byte *input, size_t bytes);
+
  private:
 
   /* The underlying iconv descriptor.  */
diff --git a/gdb/testsuite/gdb.tui/unicode-prompt.exp b/gdb/testsuite/gdb.tui/unicode-prompt.exp
new file mode 100644
index 00000000000..1351235743d
--- /dev/null
+++ b/gdb/testsuite/gdb.tui/unicode-prompt.exp
@@ -0,0 +1,52 @@
+# Copyright 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+require allow_tui_tests
+
+tuiterm_env
+
+save_vars { env(LC_ALL) } {
+    # Override "C" settings from default_gdb_init.
+    setenv LC_ALL "C.UTF-8"
+
+    Term::clean_restart 24 80
+
+    if {![Term::enter_tui]} {
+	unsupported "TUI not supported"
+	return
+    }
+
+    set unicode_char "\u276F"
+
+    set color_on "\\033\[31m"
+    set color_off "\\033\[0m"
+
+    set prompt "GDB$color_on$unicode_char$color_off "
+    set prompt_no_color "GDB$unicode_char "
+    set prompt_no_color_re [string_to_regexp $prompt_no_color]
+
+    # Set new prompt.
+    send_gdb "set prompt $prompt\n"
+    # Set old prompt back.
+    send_gdb "set prompt (gdb) \n"
+
+    gdb_assert { [Term::wait_for "^${prompt_no_color_re}set prompt $gdb_prompt "] } \
+	"prompt with unicode char"
+
+    set prompt_with_attrs_re "GDB<fg:red>$unicode_char<fg:default> "
+    set line [Term::get_line_with_attrs [expr $Term::_cur_row - 1]]
+    gdb_assert { [regexp "^$prompt_with_attrs_re.*$" $line] } \
+	"colored unicode char"
+}
diff --git a/gdb/tui/tui-io.c b/gdb/tui/tui-io.c
index 8cb68d12408..45eb9c5b755 100644
--- a/gdb/tui/tui-io.c
+++ b/gdb/tui/tui-io.c
@@ -47,6 +47,7 @@
 #include <map>
 #include "pager.h"
 #include "gdbsupport/gdb-checked-static-cast.h"
+#include "charset.h"
 
 /* This redefines CTRL if it is not already defined, so it must come
    after terminal state releated include files like <term.h> and
@@ -520,30 +521,99 @@ tui_puts_internal (WINDOW *w, const char *string, int *height)
   char c;
   int prev_col = 0;
   bool saw_nl = false;
+  size_t skip = 0;
+  wchar_iterator it ((gdb_byte *)string, strlen (string), host_charset (), 1);
 
-  while ((c = *string++) != 0)
+  while (true)
     {
-      if (c == '\1' || c == '\2')
-	{
-	  /* Ignore these, they are readline escape-marking
-	     sequences.  */
-	  continue;
-	}
+      bool handled = false;
+
+      /* Get iterator in sync with string.  */
+      it.skip (skip);
+      skip = 0;
+
+      /* Detect and handle multibyte chars.  */
+      {
+	enum wchar_iterate_result res2;
+	gdb_wchar_t *dummy1;
+	const gdb_byte *dummy2;
+	size_t len;
+	int res = it.iterate (&res2, &dummy1, &dummy2, &len);
+	if (res < 0)
+	  {
+	    /* End of string.  */
+	    gdb_assert (res2 == wchar_iterate_eof);
+	    break;
+	  }
+
+	if (res == 0)
+	  {
+	    if (res2 == wchar_iterate_invalid)
+	      {
+		/* Let single-byte char code handle it. */
+		gdb_assert (len == 1);
+	      }
+	    else if (res2 == wchar_iterate_incomplete)
+	      {
+		/* Iterator has been setup to return end-of-string on next
+		   call to iterate.  Make that an advance-by-one instead, and
+		   let single-byte char code handle it. */
+		it.reset ((gdb_byte *)(string + 1), strlen (string + 1));
+	      }
+	    else
+	      gdb_assert_not_reached ("");
+	  }
+	else
+	  {
+	    /* res > 0.  */
+	    gdb_assert (res2 == wchar_iterate_ok);
+	    if (len > 1)
+	      {
+		/* Multi-byte char.  Handle it.  */
+		waddnstr (w, string, len);
+		string += len;
+		handled = true;
+	      }
+	    else
+	      {
+		/* Single-byte char.  Let single-byte char code handle it.  */
+		gdb_assert (len == 1);
+	      }
+	  }
+      }
 
-      if (c == '\033')
+      if (!handled)
 	{
-	  size_t bytes_read = apply_ansi_escape (w, string - 1);
-	  if (bytes_read > 0)
+	  c = *string++;
+	  if (c == '\0')
+	    {
+	      /* End of string.  */
+	      break;
+	    }
+
+	  if (c == '\1' || c == '\2')
 	    {
-	      string = string + bytes_read - 1;
+	      /* Ignore these, they are readline escape-marking
+		 sequences.  */
 	      continue;
 	    }
-	}
 
-      if (c == '\n')
-	saw_nl = true;
+	  if (c == '\033')
+	    {
+	      size_t bytes_read = apply_ansi_escape (w, string - 1);
+	      if (bytes_read > 0)
+		{
+		  skip = bytes_read - 1;
+		  string += skip;
+		  continue;
+		}
+	    }
+
+	  if (c == '\n')
+	    saw_nl = true;
 
-      do_tui_putc (w, c);
+	  do_tui_putc (w, c);
+	}
 
       if (height != nullptr)
 	{

base-commit: 2b462da34de977f953a778afa0cb55e3286ece3d
-- 
2.35.3

