From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
To: cygwin@cygwin.com
Subject: Re: Cygwin 2.6.0: unreadable UTF-8 in Windows console
Date: Sat, 01 Oct 2016 05:17:00 -0000
Message-ID: <d1da6d0e-380b-ec15-7fac-89747f04dc30@SystematicSw.ab.ca>
In-Reply-To: <f4712f19-ef37-2040-1cda-3e352f09c8cd@SystematicSw.ab.ca>
On 2016-09-30 22:34, Brian Inglis wrote:
> On 2016-09-30 20:13, Ivan Vanyushkin wrote:
>> Something has changed in version 2.6.0, and now UTF-8 text can't be displayed in Windows console (cmd).
>> 1. Create a file "test.txt" with non-ASCII text in UTF-8 encoding.
>> 2. Run "cmd".
>> 3. Run:
>> C:\Cygwin\bin\cat test.txt
>> ââââââââââââââââ ââââââââââââââ ââââ ââââââ 8000 ââ. ââââ ââââââââââââââââââââââ ââââââââââ.
>> Non-ASCII text is not readable. Older Cygwin 2.5.2 has no such issue.
>> C:\Cygwin\bin\uname -a
>> CYGWIN_NT-10.0 PCName 2.6.0(0.304/5/3) 2016-08-31 14:32 x86_64 Cygwin
>> C:\Cygwin\bin\locale
>> LANG=
>> LC_CTYPE="C.UTF-8"
>> LC_NUMERIC="C.UTF-8"
>> LC_TIME="C.UTF-8"
>> LC_COLLATE="C.UTF-8"
>> LC_MONETARY="C.UTF-8"
>> LC_MESSAGES="C.UTF-8"
>> LC_ALL=
>> Same issue with any other commands like "grep", or with utilities built and run under Cygwin 2.6.0.
>> Same issue in other Windows consoles, like ConEmu or FAR Manager.
>> If I change Windows console encoding to UTF-8 (run: "chcp 65001"), file can be correctly displayed natively
>> (run: "type test.txt"), but Cygwin "cat" still has the same issue.
>> How should I display UTF-8 now?
>
> No problems here - same setup.
> Don't have files containing UTF-8 specials handy, but do have with Latin1 (ISO-8859-1) specials,
> convertible to UTF-8.
> Stripped common ASCII-only lines from output below.
> Default email encoding is Unicode (hopefully UTF-8) not Western (presumably Latin1), so should render accurately.
>
> $ uname -srvmo
> CYGWIN_NT-10.0 2.6.0(0.304/5/3) 2016-08-31 14:32 x86_64 Cygwin
> $ locale
> LANG=C.UTF-8
> LC_CTYPE="C.UTF-8"
> LC_NUMERIC="C.UTF-8"
> LC_TIME="C.UTF-8"
> LC_COLLATE="C.UTF-8"
> LC_MONETARY="C.UTF-8"
> LC_MESSAGES="C.UTF-8"
> LC_ALL=C.UTF-8
> $ egrep -a 'Deg|LF' latin1.txt # -a needed to override binary assumption - garbled characters
> DegN='âN'
> DegW='âW'
> Y2LF='%sâ%s %s %s'
> Y2LLF='|â%.0s|'
> LF='|â'.YFP.'|'
> $ iconv -f iso-8859-1 -t utf-8 latin1.txt | egrep 'Deg|LF' # good utf-8 characters
> DegN='°N'
> DegW='°W'
> Y2LF='%s±%s %s %s'
> Y2LLF='|±%.0s|'
> LF='|±'.YFP.'|'
Sorry - that was in mintty, and you used cmd!
In cmd I saw problems similar to yours until I set LC_ALL=C.UTF-8 (and LANG for consistency, though that doesn't really matter) and ran chcp 65001.
After that, type and the Cygwin commands produce the same output.
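As a rough illustration of why setting LC_ALL is enough on its own, here is a minimal sketch of the POSIX precedence rule (LC_ALL overrides LC_CTYPE, which overrides LANG). The function names effective_ctype and is_utf8_locale are hypothetical helpers, not part of Cygwin, and the final fallback to "C" is the POSIX default; the locale output quoted above suggests Cygwin itself may report C.UTF-8 when nothing is set, so treat that branch as an assumption.

```shell
# Sketch only: hypothetical helpers, not Cygwin tools.
# POSIX precedence for the character-type category:
#   LC_ALL (if non-empty) > LC_CTYPE (if non-empty) > LANG (if non-empty)
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Assumption: plain POSIX default; Cygwin may behave differently.
        printf '%s\n' "C"
    fi
}

# Succeeds if the effective setting names a UTF-8 charset.
is_utf8_locale() {
    case "$(effective_ctype)" in
        *.UTF-8*|*.utf-8*|*.UTF8*|*.utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_locale; then
    echo "UTF-8 locale in effect: $(effective_ctype)"
else
    echo "non-UTF-8 locale in effect: $(effective_ctype)"
fi
```

This only inspects the environment. It says nothing about the console code page, which still has to be switched separately with chcp 65001.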
Without CP65001 (and a Unicode console font that maps most characters - I use DejaVu Sans Mono everywhere I can), there may be no valid encoding for UTF-8 special characters in your default console CP (437 for US, 850 for non-US, others for localized versions).
Unfortunately, with CP65001 set, less displays spaces as squares, so you may have to set PAGER=more for readability.
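To make the encoding mismatch concrete, the sketch below dumps the bytes behind the degree sign from the quoted example. In Latin1 it is the single byte b0; after the same iconv conversion used above it becomes the two bytes c2 b0. A console using a single-byte code page such as 437 or 850 draws each of those bytes as its own glyph, so the text looks garbled. hex_bytes is a hypothetical helper name, and the example assumes od, tr and iconv are available.

```shell
# Sketch only: hex_bytes is a hypothetical helper, not a Cygwin tool.
# Print stdin as a contiguous string of lowercase hex byte values.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# Latin1 (ISO-8859-1): the degree sign is the single byte 0xb0 (octal 260).
printf '\260N' | hex_bytes                                   # b04e

# The same text converted to UTF-8: the degree sign becomes 0xc2 0xb0.
printf '\260N' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes    # c2b04e
```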
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple