Date: Mon, 31 Aug 2020 21:37:36 +0200
From: Corinna Vinschen
To: cygwin-developers@cygwin.com
Subject: Re: New implementation of pseudo console support (experimental)
Message-ID: <20200831193736.GG3272@calimero.vinschen.de>

On Aug 31 21:17, Johannes Schindelin wrote:
> [...]
> So I had a look at the code, and it seems that
> `fhandler_pty_slave::setup_locale()` forces the output encoding to
> C.ASCII if Pseudo Console support is enabled:
>
>   char locale[ENCODING_LEN + 1] = "C";
>   char charset[ENCODING_LEN + 1] = "ASCII";
>   LCID lcid = get_langinfo (locale, charset);
>
>   /* Set console code page from locale */
>   if (get_pseudo_console ())
>     {
>       UINT code_page;
>       if (lcid == 0 || lcid == (LCID) -1)
>         code_page = 20127; /* ASCII */

This looks wrong, actually.  The default behaviour of Cygwin since
Cygwin 1.7 was to assume UTF-8, even if the application doesn't call
setlocale.
In that case the locale is "C", so ASCII is expected.  However, even
in this case, the internal conversions use UTF-8.  See function
internal_setlocale() in nlsfuncs.cc, lines 1553/1554.

We never switched the console codepage, though, because the codepage
doesn't make much sense when using wide character functions only,
i.e. WriteConsoleW.  Only the alternate charset is 437/ASCII.

So, if the pseudo console actually *requires* setting the charset...

>       else if (!GetLocaleInfo (lcid,
>                                LOCALE_IDEFAULTCODEPAGE | LOCALE_RETURN_NUMBER,
>                                (char *) &code_page, sizeof (code_page)))
>         code_page = 20127; /* ASCII */
>       SetConsoleCP (code_page);
>       SetConsoleOutputCP (code_page);

can we please default to UTF-8 here even if the code page is ASCII?

Corinna
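
A minimal sketch of the fallback requested above, written as a standalone helper so the decision can be read apart from the Windows calls. The function name pick_console_code_page and its two parameters are hypothetical and not part of Cygwin. The sketch assumes the caller has already reduced the LCID check and the GetLocaleInfo result to a flag and a number. The only behaviour it illustrates is that every path which ends at 20127 in the quoted code would end at 65001 (UTF-8) instead.

```c
#include <stdbool.h>

/* Windows code page identifiers, spelled out so the sketch does not
   need <windows.h>. */
#define SKETCH_CP_ASCII 20127u  /* us-ascii */
#define SKETCH_CP_UTF8  65001u  /* utf-8 */

/* Hypothetical helper, not Cygwin code.
   have_lcid: false for the "lcid == 0 || lcid == (LCID) -1" case.
   locale_cp: code page reported for the LCID, or 0 if the lookup
              failed (the "!GetLocaleInfo (...)" case).  */
unsigned int
pick_console_code_page (bool have_lcid, unsigned int locale_cp)
{
  /* No usable locale information: assume UTF-8 rather than ASCII. */
  if (!have_lcid || locale_cp == 0)
    return SKETCH_CP_UTF8;

  /* The locale maps to plain ASCII: still prefer UTF-8, which is a
     superset of ASCII. */
  if (locale_cp == SKETCH_CP_ASCII)
    return SKETCH_CP_UTF8;

  /* Any other code page is passed through unchanged. */
  return locale_cp;
}
```

In setup_locale() the result would then be what gets passed to SetConsoleCP and SetConsoleOutputCP. Whether the pseudo console needs the code page set at all is the open question from the mail and is not answered by this sketch.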