From: Thomas Wolff <towo@towo.net>
To: cygwin@cygwin.com
Subject: cygwin_conv_ functions and character encoding
Date: Fri, 05 Feb 2016 15:47:00 -0000
Message-ID: <56B4C40A.4060607@towo.net>

The Cygwin path conversion functions ignore the current locale as set
by setlocale(); they seem to always use the locale taken from the
environment when the program was started. See test program convloc.c:

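#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <sys/cygwin.h>

int main(void) {
   /* switch to UTF-8 regardless of the LC_* environment at startup */
   const char * loc = setlocale(LC_ALL, "C.UTF-8");
   const char * utfstring = "böh";   /* assumes a UTF-8 encoded source file */
   printf("ustring (%s) <%s>\n", loc ? loc : "?", utfstring);
   /* the result is malloc'ed; NULL is returned on error */
   wchar_t * wstring = cygwin_create_path(CCP_POSIX_TO_WIN_W, utfstring);
   if (wstring == NULL) {
      perror("cygwin_create_path");
      return 1;
   }
   printf("wstring (%s) <%ls>\n", loc ? loc : "?", wstring);
   free(wstring);
   return 0;
}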

Run in a UTF-8 terminal:
 > LC_CTYPE=de_DE ./convloc
ustring (C.UTF-8) <böh>
wstring (C.UTF-8) <D:\TEMP\böh>

In sys_wcstombs in strfuncs.cc I see:
   const char *charset = cygheap->locale.charset;
which is set in internal_setlocale ()...

In fact, the situation can be fixed by adding after setlocale():
   cygwin_internal(CW_INT_SETLOCALE);  // -> internal_setlocale();
(cf. https://sourceware.org/ml/cygwin-developers/2010-02/msg00054.html)
but I think those functions should use the proper locale implicitly.
According to the generic description in
http://linux.die.net/man/3/setlocale,
LC_CTYPE affects ... conversion ... functions. In my opinion this
includes Cygwin-specific conversion functions as well as implicitly
called conversions (see open() below).
The same problem applies to the open() function (which involves path
conversion).
The wide string function mbstowcs behaves as expected.
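
A minimal sketch of that workaround, wrapped in one helper so that the
two calls always happen together. The name set_locale_and_sync is made
up for illustration; cygwin_internal(CW_INT_SETLOCALE) is the call
mentioned above, and the #ifdef is only there so the file also compiles
outside Cygwin (where nothing needs to be synced):

```c
#include <locale.h>
#include <stddef.h>
#ifdef __CYGWIN__
#include <sys/cygwin.h>
#endif

/* Hypothetical helper (not part of any Cygwin API): switch the C library
   locale and, on Cygwin, ask the DLL to pick up the new setting so that
   path conversion uses the same charset.
   Returns 0 on success, -1 if the locale is not available. */
int set_locale_and_sync(const char *name)
{
    if (setlocale(LC_ALL, name) == NULL)
        return -1;                      /* locale left unchanged */
#ifdef __CYGWIN__
    cygwin_internal(CW_INT_SETLOCALE);  /* -> internal_setlocale() */
#endif
    return 0;
}
```

In convloc.c this would replace the plain setlocale() call, e.g.
set_locale_and_sync("C.UTF-8"), before any cygwin_create_path() or
open() call that should see the new charset.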


I ran into this while trying to work around a missing conversion
mode: one that only converts the pathname syntax between Unicode
(wide) strings. The desired options would be like:
   CCP_POSIX_W_TO_WIN_W,   /* from is wchar_t *posix, to is wchar_t *win32 */
   CCP_WIN_W_TO_POSIX_W,   /* from is wchar_t *win32, to is wchar_t *posix */
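
For illustration, the kind of workaround I mean could look roughly like
the sketch below: go from wchar_t to a multibyte string with wcstombs,
then use the existing CCP_POSIX_TO_WIN_W mode. Both function names are
made up; only wcstombs, cygwin_internal and cygwin_create_path are real
calls. It assumes the current LC_CTYPE charset can represent every
character of the path (e.g. a UTF-8 locale), and it needs the
CW_INT_SETLOCALE call so that both conversion steps agree on the
charset. Outside Cygwin the second function just returns NULL:

```c
#include <stdlib.h>
#include <wchar.h>
#ifdef __CYGWIN__
#include <sys/cygwin.h>
#endif

/* Convert a wide string to a malloc'ed multibyte string in the current
   LC_CTYPE encoding. Returns NULL if a character cannot be converted
   or memory runs out. */
char *wide_to_multibyte(const wchar_t *w)
{
    size_t n = wcstombs(NULL, w, 0);    /* required length, without NUL */
    if (n == (size_t)-1)
        return NULL;
    char *s = malloc(n + 1);
    if (s == NULL)
        return NULL;
    wcstombs(s, w, n + 1);
    return s;
}

/* Hypothetical stand-in for the missing CCP_POSIX_W_TO_WIN_W mode.
   The result is malloc'ed by cygwin_create_path; NULL on error. */
wchar_t *posix_w_to_win_w(const wchar_t *posix)
{
#ifdef __CYGWIN__
    char *mb = wide_to_multibyte(posix);
    if (mb == NULL)
        return NULL;
    cygwin_internal(CW_INT_SETLOCALE);  /* sync DLL charset with setlocale() */
    wchar_t *win = cygwin_create_path(CCP_POSIX_TO_WIN_W, mb);
    free(mb);
    return win;
#else
    (void)posix;                        /* no path conversion outside Cygwin */
    return NULL;
#endif
}
```

The detour through a multibyte string is exactly what a native
wide-to-wide mode would avoid, which is why I would prefer the options
above.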

------
Thomas
