From: Joe Wigglesworth <wiggles@ca.ibm.com>
To: cygwin@cygwin.com
Subject: Re: Anyone using bash shell in Japanese, Chinese, or Korean?
Date: Wed, 19 May 2004 20:54:00 -0000
Message-ID: <OF3F9F621A.766AE7D4-ON85256E99.006B5D42-85256E99.006DBEBB@ca.ibm.com>
In-Reply-To: <Pine.GSO.4.58.0405181753480.2073@slinky.cs.nyu.edu>
Your observation, Igor, is correct. The 0x5C character is the backslash,
and that is at the root of the problem, especially in a command like ls.
We haven't run across the problem with 0x2F, but that may just be a matter
of time.
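One way to check which double-byte characters carry 0x5C or 0x2F as their
second byte is to scan a code page's whole repertoire. The sketch below is
only an illustration and is not taken from Cygwin: it assumes Python 3 and
its built-in "cp932" codec (the Windows variant of Shift_JIS), and the
helper name trail_byte_hits is made up for this example.

```python
def trail_byte_hits(codec, target):
    """Return BMP characters that `codec` encodes as two bytes ending in `target`."""
    hits = []
    for cp in range(0x80, 0x10000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points are not characters
        ch = chr(cp)
        try:
            raw = ch.encode(codec)
        except UnicodeEncodeError:
            continue  # this character is not in the code page
        if len(raw) == 2 and raw[1] == target:
            hits.append(ch)
    return hits


# "cp932" is an assumption here; substitute the code page actually in use.
backslash_hits = trail_byte_hits("cp932", 0x5C)
slash_hits = trail_byte_hits("cp932", 0x2F)
print(len(backslash_hits), "characters have 0x5C as the second byte")
print(len(slash_hits), "characters have 0x2F as the second byte")
```

As far as I know, the second byte in Shift_JIS is restricted to 0x40-0x7E
and 0x80-0xFC, so the scan should find no 0x2F cases for that code page,
while common characters such as 表 (0x95 0x5C) and ソ (0x83 0x5C) do turn
up in the 0x5C list. Other code pages would need to be scanned separately.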
If Cygwin applications are handling multibyte characters as a plain
sequence of bytes, that would certainly explain what we are seeing. Moving to
multibyte-friendly Windows API calls would be a big step forward. Given
the large number of Cygwin users in Japan, I'm surprised that hasn't
already happened. It must be possible because I've been told that the
bash shell that comes with the MKS Toolkit can handle Japanese characters
correctly.
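To make the difference concrete, here is a minimal sketch of the two
approaches. It is not how ls or Cygwin actually splits paths; the function
names are hypothetical, and it assumes Python 3 with the "cp932" codec
standing in for the Japanese Windows code page.

```python
def split_bytewise(raw_path):
    """Naive approach: treat the path as plain bytes and split on 0x5C."""
    return raw_path.split(b"\\")


def split_multibyte(raw_path, codec="cp932"):
    """Decode to characters first, so a trail byte is never seen as a separator."""
    return raw_path.decode(codec).split("\\")


# 表 encodes as 0x95 0x5C in cp932, so its second byte looks like a backslash.
raw = "C:\\表\\file.txt".encode("cp932")
print(split_bytewise(raw))
print(split_multibyte(raw))
```

The byte-wise split cuts 表 in half, leaving a stray 0x95 lead byte and an
empty component, whereas decoding first yields the three components one
would expect.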
-Joe
FWIW, I think it might be more than a coincidence that 0x5c is the ASCII
code for '\'. I suspect the same problem will occur for characters with
0x2f ('/') in the second byte (if there are any).
The crux is that a lot of Cygwin applications don't have any handling of
multibyte characters -- they simply process each string as a sequence of
bytes. The problem appears when the multi-byte representation contains
(accidentally) a character that's being treated specially (e.g., '/' or
'\'). How much of this is due to the program looking at the string
itself, and how much is due to using the wrong type of Windows API calls
(that aren't multibyte-friendly), remains to be seen. It would be
interesting to strace the "ls ." invocation to see whether it breaks
somewhere inside "ls" or inside a Windows API call.
Igor
P.S. We all saw the (identical?) post from two days ago
(<http://cygwin.com/ml/cygwin/2004-05/msg00567.html>), so there was no
need to re-post this.
--
                                http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_        pechtcha@cs.nyu.edu
ZZZzz /,`.-'`'    -.  ;-;;,_    igor@watson.ibm.com
     |,4-  ) )-,_. ,\ (  `'-'   Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL     a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!
"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster." -- Patrick Naughton