From: Takashi Yano <takashi.yano@nifty.ne.jp>
To: cygwin-patches@cygwin.com
Subject: Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"
Date: Sat, 5 Sep 2020 20:15:06 +0900
Message-ID: <20200905201506.8bbca09f51a2b2b06135affa@nifty.ne.jp>
In-Reply-To: <20200905174301.adbb3c147122fbe0636a0d56@nifty.ne.jp>
On Sat, 5 Sep 2020 17:43:01 +0900
Takashi Yano via Cygwin-patches <cygwin-patches@cygwin.com> wrote:
> Hi Corinna,
>
> On Fri, 4 Sep 2020 21:22:35 +0200
> Corinna Vinschen wrote:
> > Hi Takashi,
> >
> > On Sep 4 23:50, Takashi Yano via Cygwin-patches wrote:
> > > Hi Corinna,
> > >
> > > On Fri, 4 Sep 2020 14:44:00 +0200
> > > Corinna Vinschen wrote:
> > > > On Sep 4 18:21, Takashi Yano via Cygwin-patches wrote:
> > > > > I think I have found the answer to your request.
> > > > > Patch attached. What do you think of this patch?
> > > > >
> > > > > Calling initial_setlocale() is necessary because
> > > > > nl_langinfo() always returns "ANSI_X3.4-1968"
> > > > > regardless of the locale setting if it is not called.
> > > > [...]
> > > > However, the initial_setlocale() call in dll_crt0_1 calls
> > > > internal_setlocale(), and *that* function sets the conversion functions
> > > > for the internal conversions. What it *doesn't* do yet at the moment is
> > > > to store the charset name itself or, better, the equivalent codepage.
> > > >
> > > > If we change that, setup_locale can simply go away. Below is a patch
> > > > doing just that. Can you please check if that works in your test
> > > > scenarios?
> > >
> > > I tried your patch, but unfortunately it does not work.
> > > cygheap->locale.term_code_page is 0 in pty master.
> > >
> > > If the following lines in internal_setlocale()
> > >
> > >   const char *charset = __locale_charset (__get_global_locale ());
> > >   debug_printf ("Global charset set to %s", charset);
> > >   /* Store codepage to be utilized by pseudo console code. */
> > >   cygheap->locale.term_code_page =
> > >     __eval_codepage_from_internal_charset (charset);
> > >
> > > are moved before
> > >
> > >   /* Don't do anything if the charset hasn't actually changed. */
> > >   if (cygheap->locale.mbtowc == __get_global_locale ()->mbtowc)
> > >     return;
> >
> > Uh, that makes sense.
> >
> > > cygheap->locale.term_code_page is always 65001 even if mintty is
> > > started with
> > > mintty -o locale=ja_JP -o charset=CP932
> > > or
> > > mintty -o locale=ja_JP -o charset=EUCJP
> > >
> > > Perhaps, this is because LANG is not set properly yet when mintty
> > > is started.
> >
> > Yeah, that's the reason. The above settings of locale and charset on
> > the CLI should only take over when mintty calls setlocale() with a
> > matching string. The fact that it sets the matching value in the
> > environment, too, should only affect child processes, not mintty itself.
> >
> > But it's incorrect to call initial_setlocale() from setup_locale()
> > without resetting it to its original value.
> >
> > Unfortunately that doesn't solve any problem with the pseudo console
> > codepage. Drat. It sounds like you need the terminal's charset,
> > rather than the one set in the environment.
> >
> > So this boils down to the fact that term_code_page must be set
> > after the application is already running and as soon as it creates
> > the pty, methinks. What if __eval_codepage_from_internal_charset()
> > is called at pty creation? Or even on reading from /writing to
> > the pty the first time? That should always be late enough to fetch
> > the correct codepage.
> >
> > Patch attached. Does that work as expected?
>
> Thank you very much for the patch.
>
> Your new additional patch works well except for a test case such as:
>
>   int pm = getpt();
>   if (fork()) {
>     [do the master operations]
>   } else {
>     int ps = open(ptsname(pm), O_RDWR|O_NOCTTY);
>     close(pm);
>     setsid();
>     ioctl(ps, TIOCSCTTY, 1);
>     dup2(ps, 0);
>     dup2(ps, 1);
>     dup2(ps, 2);
>     close(ps);
>     [exec non-cygwin process]
>   }
>
> If this test case is run in the Cygwin console (command prompt),
> the output is garbled because term_code_page == 0.
>
> The second additional patch attached fixes the issue.
No. This is not a sufficient fix.
In the test case above, if the program does not call setlocale(),
__eval_codepage_from_internal_charset() always returns "ASCII"
regardless of the locale setting. Therefore, the output is garbled if
the terminal charset is not UTF-8.
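
To illustrate the underlying behavior outside of Cygwin internals, here is a
minimal sketch in plain POSIX C. It only shows that a process which has not
called setlocale() still runs in the "C" locale, so the charset it reports
does not reflect LANG/LC_ALL. The helper names current_charset() and
show_charset_before_and_after() are made up for this illustration and are
not part of the patch or of Cygwin.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the charset name of the process's current
   locale into buf (always NUL-terminated by snprintf). */
static void
current_charset (char *buf, size_t len)
{
  snprintf (buf, len, "%s", nl_langinfo (CODESET));
}

/* Hypothetical helper: print the charset seen before and after the
   process adopts the locale requested by the environment. */
static void
show_charset_before_and_after (void)
{
  char before[64], after[64];

  /* If setlocale() has not been called yet, the process is still in the
     "C" locale here, whatever LANG or LC_ALL say. */
  current_charset (before, sizeof before);

  /* Adopt the environment's locale (LC_ALL, LC_CTYPE, LANG). */
  if (setlocale (LC_ALL, "") == NULL)
    fputs ("setlocale() failed; locale left unchanged\n", stderr);

  current_charset (after, sizeof after);

  printf ("charset before setlocale(): %s\n", before);
  printf ("charset after  setlocale(): %s\n", after);
}
```

Calling show_charset_before_and_after() from main() with, say, a UTF-8 locale
set in the environment should print an ASCII-type name on the first line
(the "ANSI_X3.4-1968" mentioned earlier in this thread is one such name) and
the environment's charset on the second. The exact strings depend on the C
library and on which locales are installed.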
--
Takashi Yano <takashi.yano@nifty.ne.jp>