public inbox for cygwin-patches@cygwin.com
 help / color / mirror / Atom feed
* Re: [Patch] Fix incorrect code page when setting console title on Win10
  2020-08-26  9:06 ` Corinna Vinschen
@ 2020-08-26  7:30   ` Johannes Schindelin
  2020-08-26 17:33     ` Corinna Vinschen
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Schindelin @ 2020-08-26  7:30 UTC (permalink / raw)
  To: cygwin-patches; +Cc: Dahan Gong, Corinna Vinschen

Hi Corinna,

On Wed, 26 Aug 2020, Corinna Vinschen wrote:

> On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote:
> > When Cygwin sets console titles on Win10 (has_con_24bit_colors && !con_is_legacy),
> > `WriteConsoleA` is used and causes an error if:
> > 1. the environment variable of `LANG` is `***.UTF-8`
> > 2. and the code page of console.exe is not UTF-8
> >   1. e.g. on my Computer, it's GB2312, for Chinese text
> >
> >
> > I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738,
> > and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25.

Just in case you want to have a look at it, you can download the patch via
https://github.com/git-for-windows/msys2-runtime/commit/334f52a53a2e6b7f560b0e8810b9f672ebb3ad24.patch

FWIW my original reviewer comment was: "why not fix wpbuf.send() in the
first place?" but after having a good look around, that method seemed to
be called from so many places that I "got cold feet" of that approach.

For one, I saw at least one caller that wants to send Escape sequences,
and I have no idea whether it is a good idea to do that in the `*W()`
version of the `WriteConsole()` function.

So the real question from my side is: how to address properly the many
uses of `WriteConsoleA()` (which breaks all kinds of encodings in many
situations because Windows' idea of the current code page and Cygwin's
idea of the current locale are pretty often at odds).

The patch discussed here circumvents one of those call sites.

However, even though there have not been any callers of `WriteConsoleA()`
in Cygwin v3.0.7 (but four callers of `WriteConsoleW()` which I suspect
were converted to `*A()` calls in v3.1.0 by the Pseudo Console patches),
there are now a whopping 15 callers of that `*A()` function in Cygwin
v3.1.7. See here:

	$ git grep WriteConsoleA
	winsup/cygwin/fhandler_console.cc:    WriteConsoleA (handle, buf, ixput, wn, 0);
	winsup/cygwin/fhandler_console.cc:          WriteConsoleA (get_output_handle (), "\033[?1h", 5, NULL, 0);
	winsup/cygwin/fhandler_console.cc:            WriteConsoleA (get_output_handle (), &last_char, 1, 0, 0);
	winsup/cygwin/fhandler_console.cc:                WriteConsoleA (get_output_handle (),
	winsup/cygwin/fhandler_console.cc:            WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0);
	winsup/cygwin/fhandler_console.cc:            WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0);
	winsup/cygwin/fhandler_console.cc:            WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0);
	winsup/cygwin/fhandler_console.cc:            WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0);
	winsup/cygwin/fhandler_console.cc:            WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0);
	winsup/cygwin/fhandler_console.cc:            WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0);
	winsup/cygwin/fhandler_console.cc:                    WriteConsoleA (get_output_handle (),
	winsup/cygwin/fhandler_tty.cc:DEF_HOOK (WriteConsoleA);
	winsup/cygwin/fhandler_tty.cc:WriteConsoleA_Hooked
	winsup/cygwin/fhandler_tty.cc:  return WriteConsoleA_Orig (h, p, l, n, o);
	winsup/cygwin/fhandler_tty.cc:      DO_HOOK (NULL, WriteConsoleA);

That cannot be intentional, can it? We should always thrive to use the
`*W()` functions so that we can be sure that the expected encoding is
used, right?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Patch] Fix incorrect code page when setting console title on Win10
@ 2020-08-26  8:43 =?gb18030?B?uay087q6?=
  2020-08-26  9:06 ` Corinna Vinschen
  0 siblings, 1 reply; 5+ messages in thread
From: =?gb18030?B?uay087q6?= @ 2020-08-26  8:43 UTC (permalink / raw)
  To: =?gb18030?B?Y3lnd2luLXBhdGNoZXM=?=

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="gb18030", Size: 2757 bytes --]

When Cygwin sets console titles on Win10 (has_con_24bit_colors && !con_is_legacy),
`WriteConsoleA` is used and causes an error if:
1. the environment variable of `LANG` is `***.UTF-8`
2. and the code page of console.exe is not UTF-8
  1. e.g. on my Computer, it's GB2312, for Chinese text


I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738,
and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25.
Then what should I do, in order to fix this on Cygwin?


The below is the commit's .patch file:


From 334f52a53a2e6b7f560b0e8810b9f672ebb3ad24 Mon Sep 17 00:00:00 2001
From: Dahan Gong <gdh1995@qq.com&gt;
Date: Fri, 31 Jul 2020 22:06:54 +0800
Subject: [PATCH] Fix incorrect code page in console title


`WriteConsoleA` always follows the current code page of a console window, so it's not suitable to pass a multi-byte string in `get_ttyp ()-&gt;term_code_page` to it directly.


This PR turns to `WriteConsoleW` so that most characters will be translated "as is", and I've tested it on Win 10 v2004 (CMD: ver 10.0.19041.388).


Signed-off-by: gdh1995 <gdh1995@qq.com&gt;
---
&nbsp;winsup/cygwin/fhandler_console.cc | 18 +++++++++++++++++-
&nbsp;1 file changed, 17 insertions(+), 1 deletion(-)


diff --git a/winsup/cygwin/fhandler_console.cc b/winsup/cygwin/fhandler_console.cc
index dd979fb8e2..7870eeda81 100644
--- a/winsup/cygwin/fhandler_console.cc
+++ b/winsup/cygwin/fhandler_console.cc
@@ -80,6 +80,15 @@ static class write_pending_buffer
&nbsp; &nbsp;{
&nbsp; &nbsp; &nbsp;WriteConsoleA (handle, buf, ixput, wn, 0);
&nbsp; &nbsp;}
+&nbsp; inline char *c_str (size_t *psize = NULL)
+&nbsp; {
+&nbsp; &nbsp; size_t size = ixput < WPBUF_LEN ? ixput : WPBUF_LEN - 1;
+&nbsp; &nbsp; buf[size] = '\0';
+&nbsp; &nbsp; if (psize != NULL) {
+&nbsp; &nbsp; &nbsp; *psize = size;
+&nbsp; &nbsp; }
+&nbsp; &nbsp; return (char *) buf;
+&nbsp; }
&nbsp;} wpbuf;
&nbsp;
&nbsp;static void
@@ -3203,7 +3212,14 @@ fhandler_console::write (const void *vsrc, size_t len)
&nbsp;	&nbsp; &nbsp; if (*src < ' ')
&nbsp;	&nbsp; &nbsp; &nbsp; {
&nbsp;		if (wincap.has_con_24bit_colors () &amp;&amp; !con_is_legacy)
-		&nbsp; wpbuf.send (get_output_handle ());
+		&nbsp; {
+		&nbsp; &nbsp; size_t nms;
+		&nbsp; &nbsp; char *ms = wpbuf.c_str(&amp;nms);
+		&nbsp; &nbsp; wchar_t write_buf[TITLESIZE + 1];
+		&nbsp; &nbsp; DWORD done;
+		&nbsp; &nbsp; DWORD buf_len = sys_mbstowcs (write_buf, TITLESIZE, ms);
+		&nbsp; &nbsp; write_console (write_buf, buf_len, done);
+		&nbsp; }
&nbsp;		else if (*src == '\007' &amp;&amp; con.state == gettitle)
&nbsp;		&nbsp; set_console_title (con.my_title_buf);
&nbsp;		con.state = normal;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Patch] Fix incorrect code page when setting console title on Win10
  2020-08-26  8:43 [Patch] Fix incorrect code page when setting console title on Win10 =?gb18030?B?uay087q6?=
@ 2020-08-26  9:06 ` Corinna Vinschen
  2020-08-26  7:30   ` Johannes Schindelin
  0 siblings, 1 reply; 5+ messages in thread
From: Corinna Vinschen @ 2020-08-26  9:06 UTC (permalink / raw)
  To: Dahan Gong; +Cc: cygwin-patches

Hi,

On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote:
> When Cygwin sets console titles on Win10 (has_con_24bit_colors &amp;&amp; !con_is_legacy),
> `WriteConsoleA` is used and causes an error if:
> 1. the environment variable of `LANG` is `***.UTF-8`
> 2. and the code page of console.exe is not UTF-8
> &nbsp; 1. e.g. on my Computer, it's GB2312, for Chinese text
> 
> 
> I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738,
> and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25.
> Then what should I do, in order to fix this on Cygwin?

Sending patches to the cygwin-patches mailing list is the right way to
go, so that's fine.  However, there are two small problems with your
patch:

- For non-trivial patches we need a 2-clause BSD waiver from you, as per
  https://cygwin.com/contrib.html, chapter "Before you get started".

- Your MUA apparently broke the patch.  No way to apply it in this form.
  Would you mind to resend it together with your BSD waiver, but as
  attachment?


Thanks,
Corinna



> 
> 
> The below is the commit's .patch file:
> 
> 
> From 334f52a53a2e6b7f560b0e8810b9f672ebb3ad24 Mon Sep 17 00:00:00 2001
> From: Dahan Gong <gdh1995@qq.com&gt;
> Date: Fri, 31 Jul 2020 22:06:54 +0800
> Subject: [PATCH] Fix incorrect code page in console title
> 
> 
> `WriteConsoleA` always follows the current code page of a console window, so it's not suitable to pass a multi-byte string in `get_ttyp ()-&gt;term_code_page` to it directly.
> 
> 
> This PR turns to `WriteConsoleW` so that most characters will be translated "as is", and I've tested it on Win 10 v2004 (CMD: ver 10.0.19041.388).
> 
> 
> Signed-off-by: gdh1995 <gdh1995@qq.com&gt;



> ---
> &nbsp;winsup/cygwin/fhandler_console.cc | 18 +++++++++++++++++-
> &nbsp;1 file changed, 17 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/winsup/cygwin/fhandler_console.cc b/winsup/cygwin/fhandler_console.cc
> index dd979fb8e2..7870eeda81 100644
> --- a/winsup/cygwin/fhandler_console.cc
> +++ b/winsup/cygwin/fhandler_console.cc
> @@ -80,6 +80,15 @@ static class write_pending_buffer
> &nbsp; &nbsp;{
> &nbsp; &nbsp; &nbsp;WriteConsoleA (handle, buf, ixput, wn, 0);
> &nbsp; &nbsp;}
> +&nbsp; inline char *c_str (size_t *psize = NULL)
> +&nbsp; {
> +&nbsp; &nbsp; size_t size = ixput < WPBUF_LEN ? ixput : WPBUF_LEN - 1;
> +&nbsp; &nbsp; buf[size] = '\0';
> +&nbsp; &nbsp; if (psize != NULL) {
> +&nbsp; &nbsp; &nbsp; *psize = size;
> +&nbsp; &nbsp; }
> +&nbsp; &nbsp; return (char *) buf;
> +&nbsp; }
> &nbsp;} wpbuf;
> &nbsp;
> &nbsp;static void
> @@ -3203,7 +3212,14 @@ fhandler_console::write (const void *vsrc, size_t len)
> &nbsp;	&nbsp; &nbsp; if (*src < ' ')
> &nbsp;	&nbsp; &nbsp; &nbsp; {
> &nbsp;		if (wincap.has_con_24bit_colors () &amp;&amp; !con_is_legacy)
> -		&nbsp; wpbuf.send (get_output_handle ());
> +		&nbsp; {
> +		&nbsp; &nbsp; size_t nms;
> +		&nbsp; &nbsp; char *ms = wpbuf.c_str(&amp;nms);
> +		&nbsp; &nbsp; wchar_t write_buf[TITLESIZE + 1];
> +		&nbsp; &nbsp; DWORD done;
> +		&nbsp; &nbsp; DWORD buf_len = sys_mbstowcs (write_buf, TITLESIZE, ms);
> +		&nbsp; &nbsp; write_console (write_buf, buf_len, done);
> +		&nbsp; }
> &nbsp;		else if (*src == '\007' &amp;&amp; con.state == gettitle)
> &nbsp;		&nbsp; set_console_title (con.my_title_buf);
> &nbsp;		con.state = normal;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Patch] Fix incorrect code page when setting console title on Win10
  2020-08-26  7:30   ` Johannes Schindelin
@ 2020-08-26 17:33     ` Corinna Vinschen
  2020-08-27  3:34       ` Takashi Yano
  0 siblings, 1 reply; 5+ messages in thread
From: Corinna Vinschen @ 2020-08-26 17:33 UTC (permalink / raw)
  To: cygwin-patches

On Aug 26 09:30, Johannes Schindelin wrote:
> Hi Corinna,
> 
> On Wed, 26 Aug 2020, Corinna Vinschen wrote:
> 
> > On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote:
> > > When Cygwin sets console titles on Win10 (has_con_24bit_colors &amp;&amp; !con_is_legacy),
> > > `WriteConsoleA` is used and causes an error if:
> > > 1. the environment variable of `LANG` is `***.UTF-8`
> > > 2. and the code page of console.exe is not UTF-8
> > > &nbsp; 1. e.g. on my Computer, it's GB2312, for Chinese text
> > >
> > >
> > > I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738,
> > > and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25.
> 
> Just in case you want to have a look at it, you can download the patch via
> https://github.com/git-for-windows/msys2-runtime/commit/334f52a53a2e6b7f560b0e8810b9f672ebb3ad24.patch
> 
> FWIW my original reviewer comment was: "why not fix wpbuf.send() in the
> first place?" but after having a good look around, that method seemed to
> be called from so many places that I "got cold feet" of that approach.
> 
> For one, I saw at least one caller that wants to send Escape sequences,
> and I have no idea whether it is a good idea to do that in the `*W()`
> version of the `WriteConsole()` function.

Yes, it is.  There's no good reason to use the A functions, actually.
They are just wrappers calling the W functions and WriteConsoleW
evaluates ESC sequences just fine (just given as UTF-16 chars).

> So the real question from my side is: how to address properly the many
> uses of `WriteConsoleA()` (which breaks all kinds of encodings in many
> situations because Windows' idea of the current code page and Cygwin's
> idea of the current locale are pretty often at odds).
> 
> The patch discussed here circumvents one of those call sites.
> 
> However, even though there have not been any callers of `WriteConsoleA()`
> in Cygwin v3.0.7 (but four callers of `WriteConsoleW()` which I suspect
> were converted to `*A()` calls in v3.1.0 by the Pseudo Console patches),
> there are now a whopping 15 callers of that `*A()` function in Cygwin
> v3.1.7. See here:
> [...]
> That cannot be intentional, can it? We should always thrive to use the
> `*W()` functions so that we can be sure that the expected encoding is
> used, right?

Takashi?  Any reason to use WriteConsoleA rather than WriteConsoleW?  If
at all, WriteConsoleA should only be used if it's 100% safe to assume
that the buffer only contains ASCII chars < 127.


Thanks,
Corinna

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Patch] Fix incorrect code page when setting console title on Win10
  2020-08-26 17:33     ` Corinna Vinschen
@ 2020-08-27  3:34       ` Takashi Yano
  0 siblings, 0 replies; 5+ messages in thread
From: Takashi Yano @ 2020-08-27  3:34 UTC (permalink / raw)
  To: cygwin-patches

On Wed, 26 Aug 2020 19:33:45 +0200
Corinna Vinschen wrote:
> On Aug 26 09:30, Johannes Schindelin wrote:
> > Hi Corinna,
> > 
> > On Wed, 26 Aug 2020, Corinna Vinschen wrote:
> > 
> > > On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote:
> > > > When Cygwin sets console titles on Win10 (has_con_24bit_colors &amp;&amp; !con_is_legacy),
> > > > `WriteConsoleA` is used and causes an error if:
> > > > 1. the environment variable of `LANG` is `***.UTF-8`
> > > > 2. and the code page of console.exe is not UTF-8
> > > > &nbsp; 1. e.g. on my Computer, it's GB2312, for Chinese text
> > > >
> > > >
> > > > I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738,
> > > > and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25.
> > 
> > Just in case you want to have a look at it, you can download the patch via
> > https://github.com/git-for-windows/msys2-runtime/commit/334f52a53a2e6b7f560b0e8810b9f672ebb3ad24.patch
> > 
> > FWIW my original reviewer comment was: "why not fix wpbuf.send() in the
> > first place?" but after having a good look around, that method seemed to
> > be called from so many places that I "got cold feet" of that approach.
> > 
> > For one, I saw at least one caller that wants to send Escape sequences,
> > and I have no idea whether it is a good idea to do that in the `*W()`
> > version of the `WriteConsole()` function.
> 
> Yes, it is.  There's no good reason to use the A functions, actually.
> They are just wrappers calling the W functions and WriteConsoleW
> evaluates ESC sequences just fine (just given as UTF-16 chars).
> 
> > So the real question from my side is: how to address properly the many
> > uses of `WriteConsoleA()` (which breaks all kinds of encodings in many
> > situations because Windows' idea of the current code page and Cygwin's
> > idea of the current locale are pretty often at odds).
> > 
> > The patch discussed here circumvents one of those call sites.
> > 
> > However, even though there have not been any callers of `WriteConsoleA()`
> > in Cygwin v3.0.7 (but four callers of `WriteConsoleW()` which I suspect
> > were converted to `*A()` calls in v3.1.0 by the Pseudo Console patches),
> > there are now a whopping 15 callers of that `*A()` function in Cygwin
> > v3.1.7. See here:
> > [...]
> > That cannot be intentional, can it? We should always thrive to use the
> > `*W()` functions so that we can be sure that the expected encoding is
> > used, right?
> 
> Takashi?  Any reason to use WriteConsoleA rather than WriteConsoleW?  If
> at all, WriteConsoleA should only be used if it's 100% safe to assume
> that the buffer only contains ASCII chars < 127.

No. I just did not realize that the escapce sequence cound contain
non-ASCII chars. I am sorry.

I will submit a patch for that issue.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-27  3:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-26  8:43 [Patch] Fix incorrect code page when setting console title on Win10 =?gb18030?B?uay087q6?=
2020-08-26  9:06 ` Corinna Vinschen
2020-08-26  7:30   ` Johannes Schindelin
2020-08-26 17:33     ` Corinna Vinschen
2020-08-27  3:34       ` Takashi Yano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).