From: Joe Wigglesworth <wiggles@ca.ibm.com>
To: cygwin@cygwin.com
Subject: Re: Anyone using bash shell in Japanese, Chinese, or Korean?
Date: Wed, 19 May 2004 20:54:00 -0000
Message-ID: <OF3F9F621A.766AE7D4-ON85256E99.006B5D42-85256E99.006DBEBB@ca.ibm.com>
In-Reply-To: <Pine.GSO.4.58.0405181753480.2073@slinky.cs.nyu.edu>

Your observation, Igor, is correct.  Byte 0x5C is the ASCII backslash,
and that is at the root of the problem, especially in a command like ls.
We haven't run across the problem with 0x2F yet, but that may just be a
matter of time.
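
As a small illustration of the collision (a sketch only, not taken from
the original report): the character "表" is a commonly cited example
whose second byte in the Japanese Windows code page (CP932) is 0x5C. The
file name and the naive split below are assumptions chosen for the demo.

```python
# Sketch: how byte-wise handling mistakes a trail byte for a backslash.
# "表.txt" is a made-up file name used only for illustration.
name = "表.txt"
raw = name.encode("cp932")  # CP932 is the Japanese Windows code page


def hex_bytes(data: bytes) -> str:
    """Render bytes as space-separated hex for easy inspection."""
    return " ".join(f"{b:02X}" for b in data)


# Naive handling: treat every 0x5C byte as a path separator.
naive_parts = raw.split(b"\\")

print("encoded name:", hex_bytes(raw))
print("naive split :", [hex_bytes(p) for p in naive_parts])
```

The split cuts the two-byte character in half, so neither piece is a
valid name any more.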

If Cygwin applications are handling multibyte characters as a plain
sequence of bytes, that would certainly explain the failures.  Moving to
multibyte-friendly Windows API calls would be a big step forward.  Given
the large number of Cygwin users in Japan, I'm surprised that hasn't 
already happened.  It must be possible because I've been told that the 
bash shell that comes with the MKS Toolkit can handle Japanese characters 
correctly.

-Joe


FWIW, I think it might be more than a coincidence that 0x5C is the ASCII
code for '\'.  I suspect the same problem will occur for characters with
0x2F ('/') in the second byte (if there are any).
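
Whether any such characters exist can be checked empirically. The sketch
below is an illustration, not part of the original exchange: it asks
Python's "cp932" codec which second bytes occur in valid two-byte
characters. The helper name trail_bytes is hypothetical. For Shift_JIS
and CP932 the documented trail-byte range starts at 0x40, so 0x2F should
not show up, while 0x5C should; other code pages would need their own
check.

```python
def trail_bytes(encoding: str) -> set:
    """Collect every second byte that appears in a valid two-byte
    character of the given codec (hypothetical helper for this demo)."""
    found = set()
    for lead in range(0x80, 0x100):
        for trail in range(0x00, 0x100):
            try:
                text = bytes([lead, trail]).decode(encoding)
            except UnicodeDecodeError:
                continue  # not a valid sequence in this codec
            # Exactly one character means the two bytes formed one
            # double-byte character rather than two single-byte ones.
            if len(text) == 1:
                found.add(trail)
    return found


cp932_trails = trail_bytes("cp932")
print("0x5C can be a second byte:", 0x5C in cp932_trails)
print("0x2F can be a second byte:", 0x2F in cp932_trails)
```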

The crux is that a lot of Cygwin applications don't have any handling of
multibyte characters -- they simply process each string as a sequence of
bytes.  The problem appears when the multibyte representation happens to
contain a byte that is treated specially (e.g., '/' or '\').  How much of
this is due to the program looking at the string
itself, and how much is due to using the wrong type of Windows API calls
(that aren't multibyte-friendly), remains to be seen.  It would be
interesting to strace the "ls ." invocation to see whether it breaks
somewhere inside "ls" or inside a Windows API call.
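
For contrast with byte-wise processing, here is a minimal sketch of a
multibyte-aware scan, assuming CP932 input. The function names are
hypothetical and this is not how Cygwin or any particular tool does it.
The idea is simply to skip the byte that follows a lead byte, so a trail
byte of 0x5C is never examined as a separator. The lead-byte ranges used
(0x81-0x9F and 0xE0-0xFC) are those of CP932; Windows offers
IsDBCSLeadByte for the same test against the active code page.

```python
def is_cp932_lead(byte: int) -> bool:
    """True if the byte starts a two-byte CP932 character."""
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xFC


def split_path_cp932(raw: bytes, seps: bytes = b"\\/") -> list:
    """Split a CP932-encoded path on separators, ignoring trail bytes."""
    parts = []
    start = 0
    i = 0
    while i < len(raw):
        if is_cp932_lead(raw[i]) and i + 1 < len(raw):
            i += 2  # skip lead byte and its trail byte together
            continue
        if raw[i] in seps:
            parts.append(raw[start:i])
            start = i + 1
        i += 1
    parts.append(raw[start:])
    return parts


# Both "表" and "ソ" have 0x5C as their second byte in CP932.
path = "dir\\表\\ソ.txt".encode("cp932")
print([p.decode("cp932") for p in split_path_cp932(path)])
```

A byte-wise split of the same path would produce extra, broken
components, which is the failure mode described above.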
                 Igor
P.S. We all saw the (identical?) post from two days ago
(<http://cygwin.com/ml/cygwin/2004-05/msg00567.html>), so there was no
need to re-post this.
-- 
                                                                 
http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_ pechtcha@cs.nyu.edu
ZZZzz /,`.-'`'    -.  ;-;;,_ igor@watson.ibm.com
     |,4-  ) )-,_. ,\ (  `'-'                Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL      a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster."  -- Patrick Naughton
