* [Patch] Fix incorrect code page when setting console title on Win10 @ 2020-08-26 8:43 =?gb18030?B?uay087q6?= 2020-08-26 9:06 ` Corinna Vinschen 0 siblings, 1 reply; 5+ messages in thread From: =?gb18030?B?uay087q6?= @ 2020-08-26 8:43 UTC (permalink / raw) To: =?gb18030?B?Y3lnd2luLXBhdGNoZXM=?= [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="gb18030", Size: 2757 bytes --] When Cygwin sets console titles on Win10 (has_con_24bit_colors && !con_is_legacy), `WriteConsoleA` is used and causes an error if: 1. the environment variable of `LANG` is `***.UTF-8` 2. and the code page of console.exe is not UTF-8 1. e.g. on my Computer, it's GB2312, for Chinese text I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738, and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25. Then what should I do, in order to fix this on Cygwin? The below is the commit's .patch file: From 334f52a53a2e6b7f560b0e8810b9f672ebb3ad24 Mon Sep 17 00:00:00 2001 From: Dahan Gong <gdh1995@qq.com> Date: Fri, 31 Jul 2020 22:06:54 +0800 Subject: [PATCH] Fix incorrect code page in console title `WriteConsoleA` always follows the current code page of a console window, so it's not suitable to pass a multi-byte string in `get_ttyp ()->term_code_page` to it directly. This PR turns to `WriteConsoleW` so that most characters will be translated "as is", and I've tested it on Win 10 v2004 (CMD: ver 10.0.19041.388). Signed-off-by: gdh1995 <gdh1995@qq.com> --- winsup/cygwin/fhandler_console.cc | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/winsup/cygwin/fhandler_console.cc b/winsup/cygwin/fhandler_console.cc index dd979fb8e2..7870eeda81 100644 --- a/winsup/cygwin/fhandler_console.cc +++ b/winsup/cygwin/fhandler_console.cc @@ -80,6 +80,15 @@ static class write_pending_buffer { WriteConsoleA (handle, buf, ixput, wn, 0); } + inline char *c_str (size_t *psize = NULL) + { + size_t size = ixput < WPBUF_LEN ? ixput : WPBUF_LEN - 1; + buf[size] = '\0'; + if (psize != NULL) { + *psize = size; + } + return (char *) buf; + } } wpbuf; static void @@ -3203,7 +3212,14 @@ fhandler_console::write (const void *vsrc, size_t len) if (*src < ' ') { if (wincap.has_con_24bit_colors () && !con_is_legacy) - wpbuf.send (get_output_handle ()); + { + size_t nms; + char *ms = wpbuf.c_str(&nms); + wchar_t write_buf[TITLESIZE + 1]; + DWORD done; + DWORD buf_len = sys_mbstowcs (write_buf, TITLESIZE, ms); + write_console (write_buf, buf_len, done); + } else if (*src == '\007' && con.state == gettitle) set_console_title (con.my_title_buf); con.state = normal; ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch] Fix incorrect code page when setting console title on Win10 2020-08-26 8:43 [Patch] Fix incorrect code page when setting console title on Win10 =?gb18030?B?uay087q6?= @ 2020-08-26 9:06 ` Corinna Vinschen 2020-08-26 7:30 ` Johannes Schindelin 0 siblings, 1 reply; 5+ messages in thread From: Corinna Vinschen @ 2020-08-26 9:06 UTC (permalink / raw) To: Dahan Gong; +Cc: cygwin-patches Hi, On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote: > When Cygwin sets console titles on Win10 (has_con_24bit_colors && !con_is_legacy), > `WriteConsoleA` is used and causes an error if: > 1. the environment variable of `LANG` is `***.UTF-8` > 2. and the code page of console.exe is not UTF-8 > 1. e.g. on my Computer, it's GB2312, for Chinese text > > > I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738, > and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25. > Then what should I do, in order to fix this on Cygwin? Sending patches to the cygwin-patches mailing list is the right way to go, so that's fine. However, there are two small problems with your patch: - For non-trivial patches we need a 2-clause BSD waiver from you, as per https://cygwin.com/contrib.html, chapter "Before you get started". - Your MUA apparently broke the patch. No way to apply it in this form. Would you mind to resend it together with your BSD waiver, but as attachment? Thanks, Corinna > > > The below is the commit's .patch file: > > > From 334f52a53a2e6b7f560b0e8810b9f672ebb3ad24 Mon Sep 17 00:00:00 2001 > From: Dahan Gong <gdh1995@qq.com> > Date: Fri, 31 Jul 2020 22:06:54 +0800 > Subject: [PATCH] Fix incorrect code page in console title > > > `WriteConsoleA` always follows the current code page of a console window, so it's not suitable to pass a multi-byte string in `get_ttyp ()->term_code_page` to it directly. > > > This PR turns to `WriteConsoleW` so that most characters will be translated "as is", and I've tested it on Win 10 v2004 (CMD: ver 10.0.19041.388). > > > Signed-off-by: gdh1995 <gdh1995@qq.com> > --- > winsup/cygwin/fhandler_console.cc | 18 +++++++++++++++++- > 1 file changed, 17 insertions(+), 1 deletion(-) > > > diff --git a/winsup/cygwin/fhandler_console.cc b/winsup/cygwin/fhandler_console.cc > index dd979fb8e2..7870eeda81 100644 > --- a/winsup/cygwin/fhandler_console.cc > +++ b/winsup/cygwin/fhandler_console.cc > @@ -80,6 +80,15 @@ static class write_pending_buffer > { > WriteConsoleA (handle, buf, ixput, wn, 0); > } > + inline char *c_str (size_t *psize = NULL) > + { > + size_t size = ixput < WPBUF_LEN ? ixput : WPBUF_LEN - 1; > + buf[size] = '\0'; > + if (psize != NULL) { > + *psize = size; > + } > + return (char *) buf; > + } > } wpbuf; > > static void > @@ -3203,7 +3212,14 @@ fhandler_console::write (const void *vsrc, size_t len) > if (*src < ' ') > { > if (wincap.has_con_24bit_colors () && !con_is_legacy) > - wpbuf.send (get_output_handle ()); > + { > + size_t nms; > + char *ms = wpbuf.c_str(&nms); > + wchar_t write_buf[TITLESIZE + 1]; > + DWORD done; > + DWORD buf_len = sys_mbstowcs (write_buf, TITLESIZE, ms); > + write_console (write_buf, buf_len, done); > + } > else if (*src == '\007' && con.state == gettitle) > set_console_title (con.my_title_buf); > con.state = normal; ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch] Fix incorrect code page when setting console title on Win10 2020-08-26 9:06 ` Corinna Vinschen @ 2020-08-26 7:30 ` Johannes Schindelin 2020-08-26 17:33 ` Corinna Vinschen 0 siblings, 1 reply; 5+ messages in thread From: Johannes Schindelin @ 2020-08-26 7:30 UTC (permalink / raw) To: cygwin-patches; +Cc: Dahan Gong, Corinna Vinschen Hi Corinna, On Wed, 26 Aug 2020, Corinna Vinschen wrote: > On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote: > > When Cygwin sets console titles on Win10 (has_con_24bit_colors && !con_is_legacy), > > `WriteConsoleA` is used and causes an error if: > > 1. the environment variable of `LANG` is `***.UTF-8` > > 2. and the code page of console.exe is not UTF-8 > > 1. e.g. on my Computer, it's GB2312, for Chinese text > > > > > > I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738, > > and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25. Just in case you want to have a look at it, you can download the patch via https://github.com/git-for-windows/msys2-runtime/commit/334f52a53a2e6b7f560b0e8810b9f672ebb3ad24.patch FWIW my original reviewer comment was: "why not fix wpbuf.send() in the first place?" but after having a good look around, that method seemed to be called from so many places that I "got cold feet" of that approach. For one, I saw at least one caller that wants to send Escape sequences, and I have no idea whether it is a good idea to do that in the `*W()` version of the `WriteConsole()` function. So the real question from my side is: how to address properly the many uses of `WriteConsoleA()` (which breaks all kinds of encodings in many situations because Windows' idea of the current code page and Cygwin's idea of the current locale are pretty often at odds). The patch discussed here circumvents one of those call sites. However, even though there have not been any callers of `WriteConsoleA()` in Cygwin v3.0.7 (but four callers of `WriteConsoleW()` which I suspect were converted to `*A()` calls in v3.1.0 by the Pseudo Console patches), there are now a whopping 15 callers of that `*A()` function in Cygwin v3.1.7. See here: $ git grep WriteConsoleA winsup/cygwin/fhandler_console.cc: WriteConsoleA (handle, buf, ixput, wn, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), "\033[?1h", 5, NULL, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), &last_char, 1, 0, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), buf, strlen (buf), 0, 0); winsup/cygwin/fhandler_console.cc: WriteConsoleA (get_output_handle (), winsup/cygwin/fhandler_tty.cc:DEF_HOOK (WriteConsoleA); winsup/cygwin/fhandler_tty.cc:WriteConsoleA_Hooked winsup/cygwin/fhandler_tty.cc: return WriteConsoleA_Orig (h, p, l, n, o); winsup/cygwin/fhandler_tty.cc: DO_HOOK (NULL, WriteConsoleA); That cannot be intentional, can it? We should always thrive to use the `*W()` functions so that we can be sure that the expected encoding is used, right? Ciao, Dscho ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch] Fix incorrect code page when setting console title on Win10 2020-08-26 7:30 ` Johannes Schindelin @ 2020-08-26 17:33 ` Corinna Vinschen 2020-08-27 3:34 ` Takashi Yano 0 siblings, 1 reply; 5+ messages in thread From: Corinna Vinschen @ 2020-08-26 17:33 UTC (permalink / raw) To: cygwin-patches On Aug 26 09:30, Johannes Schindelin wrote: > Hi Corinna, > > On Wed, 26 Aug 2020, Corinna Vinschen wrote: > > > On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote: > > > When Cygwin sets console titles on Win10 (has_con_24bit_colors && !con_is_legacy), > > > `WriteConsoleA` is used and causes an error if: > > > 1. the environment variable of `LANG` is `***.UTF-8` > > > 2. and the code page of console.exe is not UTF-8 > > > 1. e.g. on my Computer, it's GB2312, for Chinese text > > > > > > > > > I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738, > > > and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25. > > Just in case you want to have a look at it, you can download the patch via > https://github.com/git-for-windows/msys2-runtime/commit/334f52a53a2e6b7f560b0e8810b9f672ebb3ad24.patch > > FWIW my original reviewer comment was: "why not fix wpbuf.send() in the > first place?" but after having a good look around, that method seemed to > be called from so many places that I "got cold feet" of that approach. > > For one, I saw at least one caller that wants to send Escape sequences, > and I have no idea whether it is a good idea to do that in the `*W()` > version of the `WriteConsole()` function. Yes, it is. There's no good reason to use the A functions, actually. They are just wrappers calling the W functions and WriteConsoleW evaluates ESC sequences just fine (just given as UTF-16 chars). > So the real question from my side is: how to address properly the many > uses of `WriteConsoleA()` (which breaks all kinds of encodings in many > situations because Windows' idea of the current code page and Cygwin's > idea of the current locale are pretty often at odds). > > The patch discussed here circumvents one of those call sites. > > However, even though there have not been any callers of `WriteConsoleA()` > in Cygwin v3.0.7 (but four callers of `WriteConsoleW()` which I suspect > were converted to `*A()` calls in v3.1.0 by the Pseudo Console patches), > there are now a whopping 15 callers of that `*A()` function in Cygwin > v3.1.7. See here: > [...] > That cannot be intentional, can it? We should always thrive to use the > `*W()` functions so that we can be sure that the expected encoding is > used, right? Takashi? Any reason to use WriteConsoleA rather than WriteConsoleW? If at all, WriteConsoleA should only be used if it's 100% safe to assume that the buffer only contains ASCII chars < 127. Thanks, Corinna ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Patch] Fix incorrect code page when setting console title on Win10 2020-08-26 17:33 ` Corinna Vinschen @ 2020-08-27 3:34 ` Takashi Yano 0 siblings, 0 replies; 5+ messages in thread From: Takashi Yano @ 2020-08-27 3:34 UTC (permalink / raw) To: cygwin-patches On Wed, 26 Aug 2020 19:33:45 +0200 Corinna Vinschen wrote: > On Aug 26 09:30, Johannes Schindelin wrote: > > Hi Corinna, > > > > On Wed, 26 Aug 2020, Corinna Vinschen wrote: > > > > > On Aug 26 16:43, 宫大汉 via Cygwin-patches wrote: > > > > When Cygwin sets console titles on Win10 (has_con_24bit_colors && !con_is_legacy), > > > > `WriteConsoleA` is used and causes an error if: > > > > 1. the environment variable of `LANG` is `***.UTF-8` > > > > 2. and the code page of console.exe is not UTF-8 > > > > 1. e.g. on my Computer, it's GB2312, for Chinese text > > > > > > > > > > > > I've done some tests on msys2 and details are on https://github.com/git-for-windows/git/issues/2738, > > > > and I filed a PR of https://github.com/git-for-windows/msys2-runtime/pull/25. > > > > Just in case you want to have a look at it, you can download the patch via > > https://github.com/git-for-windows/msys2-runtime/commit/334f52a53a2e6b7f560b0e8810b9f672ebb3ad24.patch > > > > FWIW my original reviewer comment was: "why not fix wpbuf.send() in the > > first place?" but after having a good look around, that method seemed to > > be called from so many places that I "got cold feet" of that approach. > > > > For one, I saw at least one caller that wants to send Escape sequences, > > and I have no idea whether it is a good idea to do that in the `*W()` > > version of the `WriteConsole()` function. > > Yes, it is. There's no good reason to use the A functions, actually. > They are just wrappers calling the W functions and WriteConsoleW > evaluates ESC sequences just fine (just given as UTF-16 chars). > > > So the real question from my side is: how to address properly the many > > uses of `WriteConsoleA()` (which breaks all kinds of encodings in many > > situations because Windows' idea of the current code page and Cygwin's > > idea of the current locale are pretty often at odds). > > > > The patch discussed here circumvents one of those call sites. > > > > However, even though there have not been any callers of `WriteConsoleA()` > > in Cygwin v3.0.7 (but four callers of `WriteConsoleW()` which I suspect > > were converted to `*A()` calls in v3.1.0 by the Pseudo Console patches), > > there are now a whopping 15 callers of that `*A()` function in Cygwin > > v3.1.7. See here: > > [...] > > That cannot be intentional, can it? We should always thrive to use the > > `*W()` functions so that we can be sure that the expected encoding is > > used, right? > > Takashi? Any reason to use WriteConsoleA rather than WriteConsoleW? If > at all, WriteConsoleA should only be used if it's 100% safe to assume > that the buffer only contains ASCII chars < 127. No. I just did not realize that the escapce sequence cound contain non-ASCII chars. I am sorry. I will submit a patch for that issue. -- Takashi Yano <takashi.yano@nifty.ne.jp> ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-08-27 3:34 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-08-26 8:43 [Patch] Fix incorrect code page when setting console title on Win10 =?gb18030?B?uay087q6?= 2020-08-26 9:06 ` Corinna Vinschen 2020-08-26 7:30 ` Johannes Schindelin 2020-08-26 17:33 ` Corinna Vinschen 2020-08-27 3:34 ` Takashi Yano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).