public inbox for gcc-bugs@sourceware.org
* [Bug java/14636] New: Problem with UTF-8 in IOConverter/iconv on cygwin
From: erwin at klomp dot org @ 2004-03-18 16:42 UTC
To: gcc-bugs
I'm not sure whether to report this with cygwin or gcc, but my hunch is that the
problem is more generic than just cygwin.
I have a test class that I'll attach that shows the problem. When I try to
convert a UTF-8 byte array to a Java String, the byte order in the Java chars
is wrong. (This is on an Intel platform with MS Windows XP.)
However, the field iconv_byte_swap in gnu.gcj.convert.IOConverter is true, as the
test program shows.
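A minimal sketch of the symptom described above, written against a current JDK rather than gcj 3.3.1. It is not the attached test class; the class name Utf8DecodeCheck and its helper methods are hypothetical. It decodes a short UTF-8 byte array and prints the expected UTF-16 code units next to the byte-swapped values that a wrong byte order would produce.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not the reporter's attachment.
public class Utf8DecodeCheck {
    // Decode UTF-8 bytes into a Java String.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Render each UTF-16 code unit as four hex digits, separated by spaces.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Swap the two bytes of every char, simulating the reported symptom.
    static String swapped(String s) {
        char[] cs = s.toCharArray();
        for (int i = 0; i < cs.length; i++) {
            cs[i] = Character.reverseBytes(cs[i]);
        }
        return new String(cs);
    }

    public static void main(String[] args) {
        // "A" followed by U+00E9, encoded as UTF-8.
        byte[] utf8 = { 'A', (byte) 0xC3, (byte) 0xA9 };
        String s = decode(utf8);
        System.out.println("expected: " + hex(s));          // expected: 0041 00e9
        System.out.println("swapped:  " + hex(swapped(s))); // swapped:  4100 e900
    }
}
```

If a conversion yields the second line instead of the first, the chars carry the right bytes in the wrong order, which matches the behaviour reported here.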
An additional complication is that on most platforms, iconv isn't used for UTF-8,
but on cygwin with statically linked binaries, the Input_UTF8 converter class
isn't used because the linker throws it away, so IOConverter falls back on iconv.
I wonder if the native method gnu::gcj::convert::Input_iconv::read in
natIconv.cc does the byte swapping correctly. It reads characters from a local
variable of type jchar*, swaps the bytes, and then writes them back through a
variable of type char*.
Isn't a char 8 bits wide and a jchar 16 bits wide?
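To illustrate the width concern, here is a small Java model, not the natIconv.cc code itself. The class ByteSwapSketch and its methods are hypothetical. It assumes only what the report states: a 16-bit value is byte-swapped and then stored through an 8-bit type. Java's char stands in for jchar and Java's byte for the C++ char.

```java
// Hypothetical model of the concern; the real code lives in natIconv.cc.
public class ByteSwapSketch {
    // Swap the two bytes of a 16-bit code unit and keep all 16 bits.
    static char swapBytes(char c) {
        return (char) (((c & 0xff) << 8) | ((c >> 8) & 0xff));
    }

    // Swap the bytes, then narrow the result to 8 bits, as a store
    // through an 8-bit type would. Only the low byte survives.
    static int swapThenNarrow(char c) {
        return ((byte) swapBytes(c)) & 0xff;
    }

    public static void main(String[] args) {
        char c = 0x00e9;
        System.out.println(String.format("%04x", (int) swapBytes(c))); // e900
        System.out.println(String.format("%02x", swapThenNarrow(c)));  // 00
    }
}
```

In this model the swapped value 0xe900 collapses to 0x00 once it is narrowed, so the high byte is lost. Whether the native method behaves this way depends on the actual pointer types and indexing in natIconv.cc.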
Also, this piece of code hasn't changed between release 3.3.1 and the HEAD.
There is a workaround: include a reference to the class that implements the UTF8
converter in Java, to force the linker to include it in the executable.
- Erwin
Full gcj -v information:
Configured with: /GCC/gcc-3.3.1-3/configure --with-gcc --with-gnu-ld
--with-gnu-as --prefix=/usr --exec-prefix=/usr --sysconfdir=/etc
--libdir=/usr/lib --libexecdir=/usr/sbin --mandir=/usr/share/man
--infodir=/usr/share/info
--enable-languages=c,ada,c++,f77,pascal,java,objc --enable-libgcj
--enable-threads=posix --with-system-zlib --enable-nls
--without-included-gettext --enable-interpreter
--enable-sjlj-exceptions --disable-version-specific-runtime-libs
--enable-shared --disable-win32-registry --enable-java-gc=boehm
--disable-hash-synchronization --verbose --target=i686-pc-cygwin
--host=i686-pc-cygwin --build=i686-pc-cygwin
Thread model: posix
gcc version 3.3.1 (cygming special)
--
Summary: Problem with UTF-8 in IOConverter/iconv on cygwin
Product: gcc
Version: 3.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: java
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: erwin at klomp dot org
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: 3.3.1
GCC host triplet: i686-pc-cygwin
GCC target triplet: i686-pc-cygwin
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14636
* [Bug java/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: erwin at klomp dot org @ 2004-03-18 16:44 UTC
To: gcc-bugs
------- Additional Comments From erwin at klomp dot org 2004-03-18 16:44 -------
Created an attachment (id=5941)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=5941&action=view)
This piece of code will trigger the bug with gcj-3.3.1 and cygwin on windows.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14636
* [Bug java/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: pinskia at gcc dot gnu dot org @ 2004-03-18 16:56 UTC
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-18 16:56 -------
This is a dup of bug 12908.
*** This bug has been marked as a duplicate of 12908 ***
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14636
* [Bug java/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: erwin at klomp dot org @ 2004-03-18 17:09 UTC
To: gcc-bugs
------- Additional Comments From erwin at klomp dot org 2004-03-18 17:09 -------
Please, read more carefully.
The bug is NOT that Input_UTF8 is missing. Yes, Input_UTF8 is a good workaround
for this bug in the case of UTF8.
But _this_ bug report is about a bug in the IOConverter class and the iconv
interface, and bug 12908 is _not_ about that.
Also, there probably are more converters that are supported by iconv than by
Java implementations, and they probably all exhibit the same problem.
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14636
* [Bug libgcj/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: pinskia at gcc dot gnu dot org @ 2004-03-18 17:23 UTC
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-18 17:23 -------
Either it is a dup of bug 9715 or of bug 12908. The problem (9715) might be that iconv on cygwin is
not that good and does not support them. Also, PR 13708 is about making sure that the UTF8 converter
stays in no matter what, so this is a dup of bug 13708 then.
*** This bug has been marked as a duplicate of 13708 ***
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
          Component|java                        |libgcj
         Resolution|                            |DUPLICATE
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14636