public inbox for cygwin-patches@cygwin.com
From: Takashi Yano <takashi.yano@nifty.ne.jp>
To: cygwin-patches@cygwin.com
Subject: Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"
Date: Thu, 3 Sep 2020 01:25:00 +0900
Message-ID: <20200903012500.640e36573c67328fc3e1bc70@nifty.ne.jp>
In-Reply-To: <20200902152450.GJ4127@calimero.vinschen.de>

Hi Corinna,

On Wed, 2 Sep 2020 17:24:50 +0200
Corinna Vinschen wrote:
> On Sep  2 19:54, Takashi Yano via Cygwin-patches wrote:
> > Hi Corinna,
> > 
> > On Wed, 2 Sep 2020 10:38:18 +0200
> > Corinna Vinschen wrote:
> > > On Sep  2 10:30, Corinna Vinschen wrote:
> > > > Ok guys, I'm not opposed to this change in terms of its result,
> > > > but I'm starting to wonder why all this locale code in fhandler_tty
> > > > is necessary at all.
> > > > 
> > > > I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> > > > on the charsets which looks like duplicates of the initial_setlocale()
> > > > call performed at DLL startup.
> > > > 
> > > > Is there anything missing in the initial_setlocale() call which would
> > > > be required by the pseudo tty code?  What exactly is it?  The codepage?
> > > > And why can't we just add the info to cygheap->locale at initial_setlocale()
> > > > time so it's available at exec time without going through all this hassle
> > > > every time?
> > > > 
> > > > Apart from that, all this locale/charset/lcid stuff should be concentrated
> > > > in nlsfunc.cc ideally.
> > > 
> > > get_locale_from_env() and get_langinfo() should go away.  If we just
> > > need a codepage for get_ttyp ()->term_code_page, we should really find a
> > > way to do this from within internal_setlocale().
> > 
> > I looked into the internal_setlocale() code, but I could not find
> > the code which handles the code page. I found the code handling
> > the code page in the __set_charset_from_locale() function in nlsfuncs.cc,
> > but it does not return the code page itself. Could you please explain
> > your idea in more detail?
> 
> I had none yet :)  I was just musing, without actually thinking about a
> solution.  But I think this isn't very complicated.  Given this is
> inside Cygwin, nothing keeps the function from having a well-defined
> side-effect, as in setting a (not yet existing) member "term_code_page"
> of cygheap->locale.
> 
> Kind of like this:
> 
> diff --git a/winsup/cygwin/cygheap.h b/winsup/cygwin/cygheap.h
> index 8877cc358c39..2b84f4252071 100644
> --- a/winsup/cygwin/cygheap.h
> +++ b/winsup/cygwin/cygheap.h
> @@ -341,6 +341,7 @@ struct cygheap_debug
>  struct cygheap_locale
>  {
>    mbtowc_p mbtowc;
> +  UINT term_code_page;
>  };
>  
>  struct user_heap_info
> diff --git a/winsup/cygwin/nlsfuncs.cc b/winsup/cygwin/nlsfuncs.cc
> index 668d7eb9e778..752f4239d911 100644
> --- a/winsup/cygwin/nlsfuncs.cc
> +++ b/winsup/cygwin/nlsfuncs.cc
> @@ -1298,6 +1298,9 @@ __set_charset_from_locale (const char *locale, char *charset)
>  			    LOCALE_IDEFAULTANSICODEPAGE | LOCALE_RETURN_NUMBER,
>  			    (PWCHAR) &cp, sizeof cp))
>      cp = 0;
> +  /* Store codepage in cygheap->locale so fhandler_tty can switch the
> +     pseudo console to the correct codepage. */
> +  cygheap->locale.term_code_page = cp ?: CP_UTF8;
>    /* Translate codepage and lcid to a charset closely aligned with the default
>       charsets defined in Glibc. */
>    const char *cs;
> 
> Make sense?

I tried your code, but it does not work as expected.
It seems that __set_charset_from_locale() is not called:
cygheap->locale.term_code_page is always 0.

As a test, I added the following lines to setup_locale() to make sure
that __set_charset_from_locale() is called:

  setlocale (LC_ALL, "");
  __set_charset_from_locale (__get_global_locale()->categories[LC_CTYPE], charset);
  get_ttyp ()->term_code_page = cygheap->locale.term_code_page;

However, term_code_page is then set to 932 when the locale is ja_JP.UTF-8.
In this case term_code_page should be CP_UTF8 (65001).

The code page retrieved in __set_charset_from_locale() is based not on
the "UTF-8" charset but on the "ja_JP" language part.
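
To illustrate the distinction, here is a minimal sketch, not Cygwin code.
The helper code_page_for_locale() and the macro CP_UTF8_SKETCH are
hypothetical names invented for this example. The parameter
lang_default_cp stands in for the value that the GetLocaleInfo lookup with
LOCALE_IDEFAULTANSICODEPAGE yields for the language part of the locale
(932 for ja_JP). The sketch assumes that an explicit UTF-8 charset in the
locale name should take precedence over that language default.

```c
#include <string.h>
#include <strings.h>

/* Same numeric value as the Windows CP_UTF8 constant; defined here so the
   sketch builds without <windows.h>. */
#define CP_UTF8_SKETCH 65001u

/* Hypothetical helper: pick a terminal code page from a POSIX locale name
   of the form lang_TERRITORY[.charset][@modifier]. */
static unsigned int
code_page_for_locale (const char *locale, unsigned int lang_default_cp)
{
  const char *dot = strchr (locale, '.');

  if (dot)
    {
      /* The charset runs from after the dot up to an optional "@modifier". */
      const char *charset = dot + 1;
      size_t len = strcspn (charset, "@");

      if ((len == 5 && strncasecmp (charset, "UTF-8", 5) == 0)
          || (len == 4 && strncasecmp (charset, "UTF8", 4) == 0))
        return CP_UTF8_SKETCH;
    }

  /* No explicit UTF-8 charset: use the language default, falling back to
     UTF-8 when the default is unknown (0), mirroring "cp ?: CP_UTF8". */
  return lang_default_cp ? lang_default_cp : CP_UTF8_SKETCH;
}
```

With this sketch, code_page_for_locale ("ja_JP.UTF-8", 932) yields 65001,
while code_page_for_locale ("ja_JP", 932) yields 932. Other explicit
charsets are deliberately not handled here and fall through to the
language default.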

Let me think about this for a while.

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>


