public inbox for cygwin@cygwin.com
* Anyone using bash shell in Japanese, Chinese, or Korean?
@ 2004-05-18 22:05 Joe Wigglesworth
  2004-05-18 22:11 ` Igor Pechtchanski
From: Joe Wigglesworth @ 2004-05-18 22:05 UTC
  To: cygwin



I'm having difficulty getting the bash shell to handle Japanese double 
byte characters correctly. The handling of double byte Japanese characters 
is improved by adding the definitions listed below, but some commands such 
as ls, find, and cygpath still have problems. Is there anything else I can 
do to improve the handling of Japanese double byte characters in the bash 
shell?  I believe the same problems would occur with Chinese and Korean 
(or any other double byte language for that matter), but would be happy to 
be corrected by someone who knows otherwise.

The details of the problem I'm encountering are given below.

>> Added to .inputrc >>
set kanji-code sjis
set convert-meta off
set meta-flag on
set output-meta on

>> Added to .profile >>
export LANG=ja_JP.SJIS
export TZ=JST-9
export JLESSCHARSET=japanese-sjis
alias ls='ls --show-control-chars --color -F'

>> Added to .vimrc (for vi editor) >>>
set enc=sjis
set fileencoding=sjis

Sample steps to reproduce the problem:
1. Set above variables
2. Open bash
3. Create a directory with Japanese characters
   mkdir '@@@@@'   (@ stands for a Japanese character)
4. Change the directory
   cd '@@@@@'

>>> Inside the Japanese-named directory, commands can't find any files. It
seems that the directory name isn't handled properly.
$ ls -la
ls: .: No such file or directory

>>> The find command can't search under the Japanese-named directory
$ find . -name test.txt
find: ./@@@@@: No such file or directory

This problem does not occur with all Japanese characters. The problematic
characters are Kanji whose second byte in Shift-JIS is 0x5c.
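
As an illustration (not part of the original report), the affected
characters can be identified by checking the second byte of their Shift-JIS
encoding. The sketch below assumes Python 3 and its built-in "shift_jis"
codec; the helper name and the sample characters are chosen here purely for
illustration.

```python
# Sketch: find characters whose Shift-JIS trail byte is 0x5c ('\').
# Assumes Python 3 and its built-in "shift_jis" codec; the helper name
# and the sample characters are illustrative, not from the report.

BACKSLASH = 0x5C


def trail_byte_is_backslash(ch, encoding="shift_jis"):
    """Return True if ch encodes to two bytes and the second is 0x5c."""
    encoded = ch.encode(encoding)
    return len(encoded) == 2 and encoded[1] == BACKSLASH


if __name__ == "__main__":
    for ch in ["表", "ソ", "能", "日", "本"]:
        print(ch, ch.encode("shift_jis").hex(), trail_byte_is_backslash(ch))
```

Of these samples, 表 (0x955c), ソ (0x835c), and 能 (0x945c) are flagged,
while 日 and 本 are not.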

Attached below is the result of executing the command
"cygcheck -s -v -r > cygversion.txt"



-Joe

[-- Attachment #1.2: cygversion.txt --]
[-- Type: text/plain, Size: 15131 bytes --]


Cygwin Win95/NT Configuration Diagnostics
Current System Time: Mon May 10 01:23:42 2004

Windows 2000 Server Ver 5.0 Build 2195 Service Pack 4

Path:	C:\cygwin\usr\local\bin
	C:\cygwin\bin
	C:\cygwin\bin
	C:\cygwin\usr\X11R6\bin
	C:\cygwin\home\thinkcontrol\bin
	c:\IBM\LDAP52\bin
	c:\IBM\WebSphereMQ\Java\lib
	c:\WINNT\system32
	c:\WINNT
	c:\WINNT\System32\Wbem
	"C
	C:\cygwin\Program Files\Maruo"
	c:\IBM\WebSphereMQ\bin
	c:\IBM\WebSphereMQ\WEMPS\bin
	c:\IBM\SQLLIB\BIN
	c:\IBM\SQLLIB\FUNCTION
	c:\usr\ov\bin
	c:\usr\ov\jre\bin\classic

Output from C:\cygwin\bin\id.exe (nontsec)
UID: 1002(tioadmin) GID: 513(なし)
513(なし)

Output from C:\cygwin\bin\id.exe (ntsec)
UID: 1002(tioadmin) GID: 513(なし)
0(root)              513(なし)
544(Administrators)  545(Users)

SysDir: C:\WINNT\system32
WinDir: C:\WINNT

HOME = `C:\cygwin\home\thinkcontrol'
MAKE_MODE = `unix'
PWD = `/'
TCL_LIBRARY = `C:\usr\ov\bin'
USER = `tioadmin'

ALLUSERSPROFILE = `C:\Documents and Settings\All Users'
APPDATA = `C:\Documents and Settings\tioadmin\Application Data'
APPMON_TIMER_SEC = `60'
CLASSPATH = `C:\IBM\WebSphereMQ\Java\lib\providerutil.jar;C:\IBM\WebSphereMQ\Java\lib\com.ibm.mqjms.jar;C:\IBM\WebSphereMQ\Java\lib\ldap.jar;C:\IBM\WebSphereMQ\Java\lib\jta.jar;C:\IBM\WebSphereMQ\Java\lib\jndi.jar;C:\IBM\WebSphereMQ\Java\lib\jms.jar;C:\IBM\WebSphereMQ\Java\lib\connector.jar;C:\IBM\WebSphereMQ\Java\lib\fscontext.jar;C:\IBM\WebSphereMQ\Java\lib\com.ibm.mq.jar;.;C:\IBM\SQLLIB\java\db2java.zip;C:\IBM\SQLLIB\java\db2jcc.jar;C:\IBM\SQLLIB\bin;C:\IBM\SQLLIB\java\common.jar'
COMMONPROGRAMFILES = `C:\Program Files\Common Files'
COMPUTERNAME = `TIV51114'
COMSPEC = `C:\WINNT\system32\cmd.exe'
CVS_RSH = `/bin/ssh'
CYG_ROOT = `C:\cygwin'
DB2INSTANCE = `DB2'
DB2TEMPDIR = `C:\IBM\SQLLIB\'
DB2_HOME = `C:\IBM\SQLLIB\'
DB2_NAME = `TC114'
HOMEDRIVE = `C:'
HOMEPATH = `\Cygwin\home\thinkcontrol'
HOSTNAME = `tiv51114'
INCLUDE = `C:\IBM\SQLLIB\INCLUDE;C:\IBM\SQLLIB\LIB'
INFOPATH = `/usr/local/info:/usr/info:/usr/share/info:/usr/autotool/devel/info:/usr/autotool/stable/info:'
JAVA_HOME = `C:\IBM\WebSphere\AppServer\Java'
LANG = `jajp932'
LDAPHOME = `C:\IBM\LDAP52'
LIB = `C:\IBM\SQLLIB\LIB'
LIBPATH = `C:\IBM\LDAP52\JAVA'
LOCPATH = `C:\IBM\LDAP52\bin\locale'
LOGONSERVER = `\\TIV51114'
MANPATH = `/usr/local/man:/usr/man:/usr/share/man:/usr/autotool/devel/man::/usr/ssl/man'
MIBDB = `C:\usr\ov\conf\snmpmib'
MIBFILES = `C:\usr\ov\snmp_mibs'
MQ_JAVA_DATA_PATH = `C:\IBM\WebSphereMQ'
MQ_JAVA_INSTALL_PATH = `C:\IBM\WebSphereMQ\Java'
NLSPATH = `C:\IBM\LDAP52\NLS\MSG\%L\%N'
NUMBER_OF_PROCESSORS = `1'
NVSNMP_CONF = `C:\usr\ov\conf\ovsnmp.conf'
NV_DB2PATH = `C:\IBM\SQLLIB'
NV_DRIVE = `C:'
NV_TCL = `C:\usr\ov\tcl'
OLDPWD = `/home/thinkcontrol'
OS2LIBPATH = `C:\WINNT\system32\os2\dll;'
OS = `Windows_NT'
OVWHELPDIR = `C:\usr\ov\help'
PATHEXT = `.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH'
PHOENIX_HOME = `C:\Cygwin\home\thinkcontrol\jakarta-avalon-phoenix'
PROCESSOR_ARCHITECTURE = `x86'
PROCESSOR_IDENTIFIER = `x86 Family 15 Model 2 Stepping 9, GenuineIntel'
PROCESSOR_LEVEL = `15'
PROCESSOR_REVISION = `0209'
PROGRAMFILES = `C:\Program Files'
PROMPT = `$P$G'
PS1 = `\[\033]0;\w\007
\033[32m\]\u@\h \[\033[33m\w\033[0m\]
$ '
SHLVL = `1'
SNMPCOLLECTDIR = `C:\usr\ov\databases\snmpcollect'
SNMPCOL_CONF = `C:\usr\ov\conf\snmpcol.conf'
SYSTEMDRIVE = `C:'
SYSTEMROOT = `C:\WINNT'
TC_ROOT = `C:\cygwin'
TEMP = `c:\DOCUME~1\tioadmin\LOCALS~1\Temp'
TERM = `cygwin'
TIO_HOME = `C:\Cygwin\home\thinkcontrol'
TIO_LOGS = `C:\Program Files\ibm\tivoli\common\COP\logs'
TISDIR = `C:\IBM\LDAP52'
TMP = `c:\DOCUME~1\tioadmin\LOCALS~1\Temp'
TRAPD_CONF = `C:\usr\ov\conf\trapd.conf'
USERDOMAIN = `TIV51114'
USERNAME = `tioadmin'
USERPROFILE = `C:\Documents and Settings\tioadmin'
WAS_HOME = `C:\IBM\WebSphere\AppServer'
WINDIR = `C:\WINNT'
ZCE_CLASSPATH = `C:\usr\ov\jars\nv_zce.jar;C:\usr\ov\jars\log.jar;C:\usr\ov\jars\xerces-3.2.1.jar;C:\usr\ov\jars\zce.jar;C:\usr\ov\jars\evd.jar;C:\usr\ov\jars\concurrent.jar;C:\usr\ov\jars\log4j-1.2.7.jar;C:\usr\ov\jars\ibmjsse.jar;C:\usr\ov\jars\netview.jar'
_ = `/usr/bin/cygcheck'

HKEY_CURRENT_USER\Software\Cygnus Solutions
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin\mounts v2
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin\Program Options
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin\mounts v2
  (default) = `/cygdrive'
  cygdrive flags = 0x00000022
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin\mounts v2\/
  (default) = `C:\cygwin'
  flags = 0x0000000a
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin\mounts v2\/usr/bin
  (default) = `C:\cygwin/bin'
  flags = 0x0000000a
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin\mounts v2\/usr/lib
  (default) = `C:\cygwin/lib'
  flags = 0x0000000a
HKEY_LOCAL_MACHINE\SOFTWARE\Cygnus Solutions\Cygwin\Program Options

a:  fd           N/A    N/A                    
c:  hd  NTFS   39001Mb  14% CP CS UN PA FC     
d:  hd  FAT32  34631Mb  33% CP    UN           IMAGE
e:  cd           N/A    N/A                    
f:  net NTFS   17351Mb  97% CP CS UN PA FC     TBSM31_Work
g:  net NTFS   17351Mb  77% CP CS UN PA FC     Tools

C:\cygwin      /          system  binmode
C:\cygwin/bin  /usr/bin   system  binmode
C:\cygwin/lib  /usr/lib   system  binmode
.              /cygdrive  system  binmode,cygdrive

Found: C:\cygwin\bin\awk.exe
Found: C:\cygwin\bin\bash.exe
Found: C:\cygwin\bin\cat.exe
Found: C:\cygwin\bin\cp.exe
Not Found: cpp (good!)
Found: C:\cygwin\bin\find.exe
Not Found: gcc
Not Found: gdb
Found: C:\cygwin\bin\grep.exe
Not Found: ld
Found: C:\cygwin\bin\ls.exe
Not Found: make
Found: C:\cygwin\bin\mv.exe
Found: C:\cygwin\bin\rm.exe
Found: C:\cygwin\bin\sed.exe
Found: C:\cygwin\bin\sh.exe
Found: C:\cygwin\bin\tar.exe

    7k 2003/10/19 C:\cygwin\bin\cygcrypt-0.dll - os=4.0 img=1.0 sys=4.0
                  "cygcrypt-0.dll" v0.0 ts=2003/10/19 16:57
  842k 2003/09/30 C:\cygwin\bin\cygcrypto-0.9.7.dll - os=4.0 img=1.0 sys=4.0
                  "cygcrypto-0.9.7.dll" v0.0 ts=2003/10/1 1:49
   45k 2001/04/25 C:\cygwin\bin\cygform5.dll - os=4.0 img=1.0 sys=4.0
                  "cygform5.dll" v0.0 ts=2001/4/25 14:28
   35k 2002/01/09 C:\cygwin\bin\cygform6.dll - os=4.0 img=1.0 sys=4.0
                  "cygform6.dll" v0.0 ts=2002/1/9 15:03
   48k 2003/08/09 C:\cygwin\bin\cygform7.dll - os=4.0 img=1.0 sys=4.0
                  "cygform7.dll" v0.0 ts=2003/8/9 18:25
   28k 2003/07/20 C:\cygwin\bin\cyggdbm-3.dll - os=4.0 img=1.0 sys=4.0
                  "cyggdbm-3.dll" v0.0 ts=2003/7/20 16:58
   30k 2003/08/11 C:\cygwin\bin\cyggdbm-4.dll - os=4.0 img=1.0 sys=4.0
                  "cyggdbm-4.dll" v0.0 ts=2003/8/11 11:12
   19k 2003/03/22 C:\cygwin\bin\cyggdbm.dll - os=4.0 img=1.0 sys=4.0
                  "cyggdbm.dll" v0.0 ts=2002/2/20 12:05
   15k 2003/07/20 C:\cygwin\bin\cyggdbm_compat-3.dll - os=4.0 img=1.0 sys=4.0
                  "cyggdbm_compat-3.dll" v0.0 ts=2003/7/20 17:00
   15k 2003/08/11 C:\cygwin\bin\cyggdbm_compat-4.dll - os=4.0 img=1.0 sys=4.0
                  "cyggdbm_compat-4.dll" v0.0 ts=2003/8/11 11:13
   69k 2003/08/10 C:\cygwin\bin\cyggettextlib-0-12-1.dll - os=4.0 img=1.0 sys=4.0
                  "cyggettextlib-0-12-1.dll" v0.0 ts=2003/8/11 7:10
   12k 2003/08/10 C:\cygwin\bin\cyggettextpo-0.dll - os=4.0 img=1.0 sys=4.0
                  "cyggettextpo-0.dll" v0.0 ts=2003/8/11 7:11
  134k 2003/08/10 C:\cygwin\bin\cyggettextsrc-0-12-1.dll - os=4.0 img=1.0 sys=4.0
                  "cyggettextsrc-0-12-1.dll" v0.0 ts=2003/8/11 7:10
   17k 2001/06/28 C:\cygwin\bin\cyghistory4.dll - os=4.0 img=1.0 sys=4.0
                  "cyghistory4.dll" v0.0 ts=2001/1/7 13:34
   29k 2003/08/10 C:\cygwin\bin\cyghistory5.dll - os=4.0 img=1.0 sys=4.0
                  "cyghistory5.dll" v0.0 ts=2003/8/11 8:16
  958k 2003/08/10 C:\cygwin\bin\cygiconv-2.dll - os=4.0 img=1.0 sys=4.0
                  "cygiconv-2.dll" v0.0 ts=2003/8/11 5:57
   22k 2001/12/13 C:\cygwin\bin\cygintl-1.dll - os=4.0 img=1.0 sys=4.0
                  "cygintl-1.dll" v0.0 ts=2001/12/13 18:28
   37k 2003/08/10 C:\cygwin\bin\cygintl-2.dll - os=4.0 img=1.0 sys=4.0
                  "cygintl-2.dll" v0.0 ts=2003/8/11 6:50
   26k 2001/04/25 C:\cygwin\bin\cygmenu5.dll - os=4.0 img=1.0 sys=4.0
                  "cygmenu5.dll" v0.0 ts=2001/4/25 14:27
   20k 2002/01/09 C:\cygwin\bin\cygmenu6.dll - os=4.0 img=1.0 sys=4.0
                  "cygmenu6.dll" v0.0 ts=2002/1/9 15:03
   29k 2003/08/09 C:\cygwin\bin\cygmenu7.dll - os=4.0 img=1.0 sys=4.0
                  "cygmenu7.dll" v0.0 ts=2003/8/9 18:25
  156k 2001/04/25 C:\cygwin\bin\cygncurses++5.dll - os=4.0 img=1.0 sys=4.0
                  "cygncurses++5.dll" v0.0 ts=2001/4/25 14:29
  175k 2002/01/09 C:\cygwin\bin\cygncurses++6.dll - os=4.0 img=1.0 sys=4.0
                  "cygncurses++6.dll" v0.0 ts=2002/1/9 15:03
  226k 2001/04/25 C:\cygwin\bin\cygncurses5.dll - os=4.0 img=1.0 sys=4.0
                  "cygncurses5.dll" v0.0 ts=2001/4/25 14:17
  202k 2002/01/09 C:\cygwin\bin\cygncurses6.dll - os=4.0 img=1.0 sys=4.0
                  "cygncurses6.dll" v0.0 ts=2002/1/9 15:03
  224k 2003/08/09 C:\cygwin\bin\cygncurses7.dll - os=4.0 img=1.0 sys=4.0
                  "cygncurses7.dll" v0.0 ts=2003/8/9 18:24
   15k 2001/04/25 C:\cygwin\bin\cygpanel5.dll - os=4.0 img=1.0 sys=4.0
                  "cygpanel5.dll" v0.0 ts=2001/4/25 14:27
   12k 2002/01/09 C:\cygwin\bin\cygpanel6.dll - os=4.0 img=1.0 sys=4.0
                  "cygpanel6.dll" v0.0 ts=2002/1/9 15:03
   19k 2003/08/09 C:\cygwin\bin\cygpanel7.dll - os=4.0 img=1.0 sys=4.0
                  "cygpanel7.dll" v0.0 ts=2003/8/9 18:24
   62k 2003/12/11 C:\cygwin\bin\cygpcre-0.dll - os=4.0 img=1.0 sys=4.0
                  "cygpcre-0.dll" v0.0 ts=2003/12/12 2:01
   63k 2003/04/11 C:\cygwin\bin\cygpcre.dll - os=4.0 img=1.0 sys=4.0
                  "cygpcre.dll" v0.0 ts=2003/4/11 17:31
    9k 2003/12/11 C:\cygwin\bin\cygpcreposix-0.dll - os=4.0 img=1.0 sys=4.0
                  "cygpcreposix-0.dll" v0.0 ts=2003/12/12 2:01
   61k 2003/04/11 C:\cygwin\bin\cygpcreposix.dll - os=4.0 img=1.0 sys=4.0
                  "cygpcreposix.dll" v0.0 ts=2003/4/11 17:31
   22k 2002/06/09 C:\cygwin\bin\cygpopt-0.dll - os=4.0 img=1.0 sys=4.0
                  "cygpopt-0.dll" v0.0 ts=2002/6/9 14:45
  108k 2001/06/28 C:\cygwin\bin\cygreadline4.dll - os=4.0 img=1.0 sys=4.0
                  "cygreadline4.dll" v0.0 ts=2001/1/7 13:34
  148k 2003/08/10 C:\cygwin\bin\cygreadline5.dll - os=4.0 img=1.0 sys=4.0
                  "cygreadline5.dll" v0.0 ts=2003/8/11 8:16
  171k 2003/09/30 C:\cygwin\bin\cygssl-0.9.7.dll - os=4.0 img=1.0 sys=4.0
                  "cygssl-0.9.7.dll" v0.0 ts=2003/10/1 1:49
   61k 2003/12/04 C:\cygwin\bin\cygz.dll - os=4.0 img=1.0 sys=4.0
                  "cygz.dll" v0.0 ts=2003/12/4 12:03
 1083k 2004/01/31 C:\cygwin\bin\cygwin1.dll - os=4.0 img=1.0 sys=4.0
                  "cygwin1.dll" v0.0 ts=2004/1/31 9:32
    Cygwin DLL version info:
        DLL version: 1.5.7
        DLL epoch: 19
        DLL bad signal mask: 19005
        DLL old termios: 5
        DLL malloc env: 28
        API major: 0
        API minor: 109
        Shared data: 3
        DLL identifier: cygwin1
        Mount registry: 2
        Cygnus registry name: Cygnus Solutions
        Cygwin registry name: Cygwin
        Program options name: Program Options
        Cygwin mount registry name: mounts v2
        Cygdrive flags: cygdrive flags
        Cygdrive prefix: cygdrive prefix
        Cygdrive default prefix: 
        Build date: Fri Jan 30 19:32:04 EST 2004
        CVS tag: cr-0x9e
        Shared id: cygwin1S3


Cygwin Package Information
Last downloaded files to: F:\Bach_21\Cygwin_0208
Last downloaded files from: http://cygwin.get-software.com

Package              Version            
_update-info-dir     00226-1            
ash                  20040127-1         
base-files           2.6-1              
base-passwd          1.1-1              
bash                 2.05b-16           
bzip2                1.0.2-5            
clear                1.0-1              
cron                 3.0.1-11           
crypt                1.1-1              
cvs                  1.11.6-3           
cygrunsrv            0.98-1             
cygutils             1.2.2-1            
cygwin               1.5.7-1            
cygwin-doc           1.3-6              
diffutils            2.8.4-1            
ed                   0.2-1              
editrights           1.01-1             
expect               20030128-1         
file                 4.06-1             
fileutils            4.1-2              
findutils            4.1.7-4            
gawk                 3.1.3-4            
gdbm                 1.8.3-7            
grep                 2.5-1              
groff                1.18.1-2           
gzip                 1.3.5-1            
inetutils            1.3.2-25           
less                 381-1              
libgdbm              1.8.0-5            
libgdbm-devel        1.8.3-7            
libgdbm3             1.8.3-3            
libgdbm4             1.8.3-7            
libgettextpo0        0.12.1-3           
libiconv2            1.9.1-3            
libintl1             0.10.40-1          
libintl2             0.12.1-3           
libncurses5          5.2-1              
libncurses6          5.2-8              
libncurses7          5.3-4              
libpcre              4.1-1              
libpcre0             4.5-1              
libpopt0             1.6.4-4            
libreadline4         4.1-2              
libreadline5         4.3-5              
login                1.9-7              
man                  1.5k-2             
mktemp               1.5-3              
more                 2.11o-1            
ncurses              5.3-4              
openssh              3.7.1p2-2          
openssl              0.9.7c-1           
readline             4.3-5              
sed                  4.0.8-1            
sh-utils             2.0.15-4           
sharutils            4.2.1-3            
shutdown             1.4-1              
tar                  1.13.25-5          
tcltk                20030901-1         
termcap              20021106-2         
terminfo             5.3_20030726-1     
texinfo              4.2-4              
textutils            2.0.21-1           
time                 1.7-1              
unzip                5.50-5             
vim                  6.2.098-1          
which                1.5-2              
whois                4.6.7-1            
zip                  2.3-5              
zlib                 1.2.1-1            
Use -h to see help about each section


* Re: Anyone using bash shell in Japanese, Chinese, or Korean?
  2004-05-18 22:05 Anyone using bash shell in Japanese, Chinese, or Korean? Joe Wigglesworth
@ 2004-05-18 22:11 ` Igor Pechtchanski
  2004-05-19 20:54   ` Joe Wigglesworth
From: Igor Pechtchanski @ 2004-05-18 22:11 UTC
  To: Joe Wigglesworth; +Cc: cygwin

On Tue, 18 May 2004, Joe Wigglesworth wrote:

> I'm having difficulty getting the bash shell to handle Japanese double
> byte characters correctly. The handling of double byte Japanese characters
> is improved by adding the definitions listed below, but some commands such
> as ls, find, and cygpath still have problems. Is there anything else I can
> do to improve the handling of Japanese double byte characters in the bash
> shell?  I believe the same problems would occur with Chinese and Korean
> (or any other double byte language for that matter), but would be happy to
> be corrected by someone who knows otherwise.
> [snip]
>
> This problem does not occur with all Japanese characters. The problematic
> characters are Kanji whose second byte in Shift-JIS is 0x5c.

FWIW, I think it might be more than a coincidence that 0x5c is the ASCII
code for '\'.  I suspect the same problem will occur for characters with
0x2f ('/') in the second byte (if there are any).
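
Whether a given byte can occur as a second byte is something that can be
checked by brute force. The following is a rough sketch, assuming Python 3
and its bundled CJK codecs (shift_jis, euc_jp, gbk, big5, euc_kr); the
function name is made up for illustration. Shift-JIS trail bytes are defined
as 0x40-0x7e and 0x80-0xfc, so 0x2f should never appear there, and the check
below is one way to confirm that for a particular codec.

```python
# Sketch: test whether any double-byte character in an encoding has a
# given trail byte. Assumes Python 3 and its bundled CJK codecs; the
# function name is hypothetical.

def has_trail_byte(encoding, trail):
    """Return True if some two-byte character in `encoding` ends in `trail`."""
    for lead in range(0x80, 0x100):
        try:
            decoded = bytes([lead, trail]).decode(encoding)
        except UnicodeDecodeError:
            continue  # not a valid byte pair in this encoding
        # Exactly one character means the pair is a real double-byte code;
        # two characters means `lead` was a single-byte character by itself.
        if len(decoded) == 1:
            return True
    return False


if __name__ == "__main__":
    for enc in ("shift_jis", "euc_jp", "gbk", "big5", "euc_kr"):
        print(enc,
              "0x5c:", has_trail_byte(enc, 0x5C),
              "0x2f:", has_trail_byte(enc, 0x2F))
```

With these codecs, 0x5c turns up as a trail byte in shift_jis, gbk, and big5
but not in euc_jp or euc_kr, and 0x2f in none of them. That suggests the
backslash problem depends on the particular encoding rather than on the
language.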

The crux is that a lot of Cygwin applications don't have any handling of
multibyte characters -- they simply process each string as a sequence of
bytes.  The problem appears when the multi-byte representation contains
(accidentally) a character that's being treated specially (e.g., '/' or
'\').  How much of this is due to the program looking at the string
itself, and how much is due to using the wrong type of Windows API calls
(that aren't multibyte-friendly), remains to be seen.  It would be
interesting to strace the "ls ." invocation to see whether it breaks
somewhere inside "ls" or inside a Windows API call.
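
The failure mode described above can be reproduced in a few lines. This is
only a sketch of the general idea, assuming Python 3 and its "shift_jis"
codec; both function names are invented for illustration, and neither
reflects how ls or the Cygwin DLL actually parses paths.

```python
# Sketch: byte-wise path splitting versus decode-then-split.
# Assumes Python 3 and its "shift_jis" codec; function names are
# hypothetical and do not mirror any Cygwin code.
import re


def split_bytewise(path_bytes):
    """Split on every 0x2f or 0x5c byte, ignoring character boundaries."""
    return re.split(rb"[\\/]", path_bytes)


def split_decoded(path_bytes, encoding="shift_jis"):
    """Decode first, then split, so trail bytes are never seen as separators."""
    return re.split(r"[\\/]", path_bytes.decode(encoding))


if __name__ == "__main__":
    raw = "./表".encode("shift_jis")  # the bytes 2e 2f 95 5c
    print(split_bytewise(raw))         # the 0x5c trail byte is taken as '\'
    print(split_decoded(raw))          # the character stays intact
```

The byte-wise version cuts the character in half and leaves a stray 0x95
byte as the directory name, which no longer matches anything on disk.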
	Igor
P.S. We all saw the (identical?) post from two days ago
(<http://cygwin.com/ml/cygwin/2004-05/msg00567.html>), so there was no
need to re-post this.
-- 
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_		pechtcha@cs.nyu.edu
ZZZzz /,`.-'`'    -.  ;-;;,_		igor@watson.ibm.com
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski, Ph.D.
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster."  -- Patrick Naughton

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/


* Re: Anyone using bash shell in Japanese, Chinese, or Korean?
  2004-05-18 22:11 ` Igor Pechtchanski
@ 2004-05-19 20:54   ` Joe Wigglesworth
From: Joe Wigglesworth @ 2004-05-19 20:54 UTC
  To: cygwin


Your observation, Igor, is correct.  The 0x5C character is the backslash 
and that is at the root of the problem, especially in a command like ls. 
We haven't run across the problem with 0x2F, but that may just be a matter 
of time.

If Cygwin applications are handling the multibyte characters as a sequence 
of bytes that would certainly be the root of the problem. Moving to 
multibyte-friendly Windows API calls would be a big step forward.  Given 
the large number of Cygwin users in Japan, I'm surprised that hasn't 
already happened.  It must be possible because I've been told that the 
bash shell that comes with the MKS Toolkit can handle Japanese characters 
correctly.

-Joe



