public inbox for gcc-bugs@sourceware.org
* [Bug java/14636] New: Problem with UTF-8 in IOConverter/iconv on cygwin
@ 2004-03-18 16:42 erwin at klomp dot org
  2004-03-18 16:44 ` [Bug java/14636] " erwin at klomp dot org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: erwin at klomp dot org @ 2004-03-18 16:42 UTC (permalink / raw)
  To: gcc-bugs

I'm not sure whether to report this with cygwin or gcc, but my hunch is that the
problem is more generic than just cygwin.

I will attach a test class that shows the problem. When I convert a UTF-8 byte
array to a Java String, the byte order in the resulting Java chars is wrong.
(This is on an Intel platform running MS Windows XP.)

However, the field iconv_byte_swap in gnu.gcj.convert.IOConverter is true, as
the test program shows.
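
The attached test class is not reproduced in this message. As a rough sketch of
the symptom only, the following shows what a correctly decoded char looks like
next to the same char with its two bytes exchanged. The names Utf8SwapDemo,
decode and swapBytes are hypothetical and are not taken from the attachment, and
the sketch uses a current JDK API (StandardCharsets) rather than anything
specific to gcj 3.3.1.

```java
import java.nio.charset.StandardCharsets;

class Utf8SwapDemo {
    // Exchange the high and low bytes of one 16-bit Java char.
    static char swapBytes(char c) {
        return (char) (((c & 0xFF) << 8) | ((c >> 8) & 0xFF));
    }

    // Decode a UTF-8 byte array into a Java String.
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8 = { (byte) 0xC3, (byte) 0xA9 }; // UTF-8 encoding of U+00E9
        char expected = decode(utf8).charAt(0);
        char swapped = swapBytes(expected);
        // A correct decoder yields U+00E9; a byte-order fault would yield U+E900.
        System.out.printf("expected U+%04X, byte-swapped U+%04X%n",
                (int) expected, (int) swapped);
    }
}
```

On a correct runtime this prints "expected U+00E9, byte-swapped U+E900"; a
decoder with the byte-order fault described above would return the second value
where the first is expected.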

An additional complication is that on most platforms iconv isn't used for UTF-8
at all. On cygwin with statically linked binaries, however, the Input_UTF8
converter class isn't available because the linker throws it away, so
IOConverter falls back on iconv.

I wonder if the native method gnu::gcj::convert::Input_iconv::read in
natIconv.cc does the byte swapping correctly. It reads characters from a local
variable of type jchar*, swaps the bytes, and then writes them back through a
variable of type char*. Isn't a char 8 bits wide and a jchar 16 bits wide?
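
For comparison, here is a minimal sketch of a byte swap that respects the
16-bit unit size. It does not reproduce natIconv.cc; it models the conversion
buffer as a Java byte array, and the names SwapUnitsDemo and swapUnits are
hypothetical. The point it illustrates is that the index has to advance two
bytes per unit, which is the stride a 16-bit jchar implies and an 8-bit char
does not.

```java
import java.nio.charset.StandardCharsets;

class SwapUnitsDemo {
    // Swap the two bytes of every 16-bit unit in place.
    // The index advances by 2 because one unit spans two 8-bit bytes.
    static void swapUnits(byte[] buf) {
        for (int i = 0; i + 1 < buf.length; i += 2) {
            byte tmp = buf[i];
            buf[i] = buf[i + 1];
            buf[i + 1] = tmp;
        }
    }

    public static void main(String[] args) {
        byte[] buf = { 0x41, 0x00, 0x42, 0x00 }; // "AB" as little-endian UTF-16
        swapUnits(buf);
        // After the swap the same bytes read correctly as big-endian UTF-16.
        System.out.println(new String(buf, StandardCharsets.UTF_16BE)); // prints "AB"
    }
}
```

A loop that stepped one byte at a time, or stored only one byte per unit, would
leave the buffer in neither byte order.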

Also, this piece of code hasn't changed between release 3.3.1 and the HEAD.


There is a workaround: include a reference to the class that implements the UTF8
converter in Java, to force the linker to include it in the executable.

- Erwin



Full gcj -v information:


Configured with: /GCC/gcc-3.3.1-3/configure --with-gcc --with-gnu-ld
  --with-gnu-as --prefix=/usr --exec-prefix=/usr --sysconfdir=/etc
  --libdir=/usr/lib --libexecdir=/usr/sbin --mandir=/usr/share/man
  --infodir=/usr/share/info
  --enable-languages=c,ada,c++,f77,pascal,java,objc --enable-libgcj
  --enable-threads=posix --with-system-zlib --enable-nls
  --without-included-gettext --enable-interpreter --enable-sjlj-exceptions
  --disable-version-specific-runtime-libs --enable-shared
  --disable-win32-registry --enable-java-gc=boehm
  --disable-hash-synchronization --verbose --target=i686-pc-cygwin
  --host=i686-pc-cygwin --build=i686-pc-cygwin
Thread model: posix
gcc version 3.3.1 (cygming special)

-- 
           Summary: Problem with UTF-8 in IOConverter/iconv on cygwin
           Product: gcc
           Version: 3.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: java
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: erwin at klomp dot org
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: 3.3.1
  GCC host triplet: i686-pc-cygwin
GCC target triplet: i686-pc-cygwin


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14636



* [Bug java/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: erwin at klomp dot org @ 2004-03-18 16:44 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From erwin at klomp dot org  2004-03-18 16:44 -------
Created an attachment (id=5941)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=5941&action=view)
This piece of code will trigger the bug with gcj-3.3.1 and cygwin on windows.



* [Bug java/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: pinskia at gcc dot gnu dot org @ 2004-03-18 16:56 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-18 16:56 -------
This is a dup of bug 12908.

*** This bug has been marked as a duplicate of 12908 ***

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE



* [Bug java/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: erwin at klomp dot org @ 2004-03-18 17:09 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From erwin at klomp dot org  2004-03-18 17:09 -------
Please read more carefully.

The bug is NOT that Input_UTF8 is missing. Yes, including Input_UTF8 is a good
workaround for this bug in the case of UTF-8.

But _this_ bug report is about a bug in the IOConverter class and the iconv
interface, and bug 12908 is _not_ about that.

Also, iconv probably supports more encodings than have converters implemented
in Java, and any encoding that goes through iconv probably exhibits the same
problem.

 



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |



* [Bug libgcj/14636] Problem with UTF-8 in IOConverter/iconv on cygwin
From: pinskia at gcc dot gnu dot org @ 2004-03-18 17:23 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2004-03-18 17:23 -------
Either it is a dup of bug 9715 or of bug 12908. The problem (9715) might be
that iconv on cygwin is not that good and does not support them. Also, PR 13708
is about making sure that the UTF8 converter stays in no matter what, so this is
a dup of bug 13708 then.

*** This bug has been marked as a duplicate of 13708 ***

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
          Component|java                        |libgcj
         Resolution|                            |DUPLICATE

