public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/37298] New: no output when use wcout and cout, wrong utf16 -> utf8 conversion
From: jaworski at autograf dot pl @ 2008-08-31 16:20 UTC (permalink / raw)
To: gcc-bugs
Hi!
I wrote my first std::wstring program and found a bug in libstdc++!
* the exact version of GCC
gcc version 4.2.3 (4.2.3-6mnb1)
* the system type;
Linux Mandriva 2008.1 PowerPack
* the options given when GCC was configured/built;
./configure --prefix=/usr --libexecdir=/usr/lib --with-slibdir=/lib
--mandir=/usr/share/man --infodir=/usr/share/info --enable-checking=release
--enable-languages=c,c++,ada,fortran,objc,obj-c++,java
--host=i586-manbo-linux-gnu --with-cpu=generic --with-system-zlib
--enable-threads=posix --enable-shared --enable-long-long --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-clocale=gnu --enable-java-awt=gtk
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --enable-gtk-cairo
--disable-libjava-multilib --enable-ssp --disable-libssp
* the complete command line that triggers the bug;
g++ couttest.cpp -o couttest; ./couttest
* the compiler output (error messages, warnings, etc.)
There are no error or warning messages.
An important detail:
$ locale
LANG=pl_PL.UTF-8
...
I wrote this program (couttest.cpp):
#include <string>
#include <iostream>
#include <locale>

int main()
{
    std::wstring wstr(L"letters1:ąśłółżź");
    std::string str("letters2:ąśłółżź");
    // Use the environment's locale (pl_PL.UTF-8 here) for both streams.
    std::wcout.imbue(std::locale(""));
    std::cout.imbue(std::locale(""));
    // Wide output first, then narrow output.
    std::wcout << wstr << std::endl;
    std::cout << str << std::endl;
}
Output is:
letters1:???????
The second line doesn't appear.
Another problem: why can't I see the Polish letters? I expected the conversion
from UTF-16 to UTF-8 to be quite trivial. Besides, I think this conversion is
a common case.
But let's change the above program:
#include <string>
#include <iostream>
#include <locale>

int main()
{
    std::wstring wstr(L"letters1:ąśłółżź");
    std::string str("letters2:ąśłółżź");
    // Use the environment's locale (pl_PL.UTF-8 here) for both streams.
    std::wcout.imbue(std::locale(""));
    std::cout.imbue(std::locale(""));
    // Narrow output first this time, then wide output.
    std::cout << str << std::endl;
    std::wcout << wstr << std::endl;
}
Output is:
letters2:ąśłółżź
letters1:[B�B|z
Both lines appear. But in this case the wcout output is different from the
first program's - why? The correct cout output is no surprise, as the sources
are in UTF-8, so in this case there is no conversion.
I think there are two bugs: 1) a whole line is missing from the first program's
output; 2) wrong output (wrong conversion from UTF-16 to UTF-8).
Jacek Jaworski
--
Summary: no output when use wcout and cout, wrong utf16 -> utf8
conversion
Product: gcc
Version: 4.2.3
Status: UNCONFIRMED
Severity: trivial
Priority: P3
Component: libstdc++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jaworski at autograf dot pl
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37298
* [Bug libstdc++/37298] no output when use wcout and cout, wrong utf16 -> utf8 conversion
From: paolo dot carlini at oracle dot com @ 2008-08-31 18:16 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from paolo dot carlini at oracle dot com 2008-08-31 18:15 -------
*** This bug has been marked as a duplicate of 35353 ***
--
paolo dot carlini at oracle dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |DUPLICATE
* [Bug libstdc++/37298] no output when use wcout and cout, wrong utf16 -> utf8 conversion
From: paolo dot carlini at oracle dot com @ 2008-08-31 18:25 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from paolo dot carlini at oracle dot com 2008-08-31 18:24 -------
By the way, your testcases are, strictly speaking, invalid, because you cannot
mix operations on cout and wcout, per 27.3 (there is some discussion in
libstdc++/11705). Once that is fixed, the issue is the same as 35353: currently, by
default, cout, wcout, etc. are non-converting in our implementation.
* [Bug libstdc++/37298] no output when use wcout and cout, wrong utf16 -> utf8 conversion
From: jaworski at autograf dot pl @ 2008-08-31 20:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from jaworski at autograf dot pl 2008-08-31 20:14 -------
Subject: Re: no output when use wcout and cout, wrong utf16 -> utf8 conversion
> currently, by
> default, cout, wcout, etc, are non-converting in our implementation.
Very strange... UTF-16 -> UTF-8 conversion seems to be very easy, and it is
required for proper output on contemporary Linux. I don't know what special
information is needed, given that UTF-16 and UTF-8 are part of the same
standard. But some strange conversion is performed, because the output is
neither UTF-16 nor UTF-8, so what is the output's format? Is it some format of
your own? Is there any reason for this?
I want to use UTF-16 because I don't want to care about encodings. I expected
it to work straightforwardly. So what should I do? Set a UTF-8 encoding? In my
example I do this, because UTF-8 is the default setting on my system, but it
doesn't work. Should I choose a Polish locale? But if I create a Polish
locale and some day add a Russian translation, what will happen then?
In my opinion that isn't the Unicode way! I think that with Unicode one should
not have to care about character encoding.
Jacek Jaworski
* [Bug libstdc++/37298] no output when use wcout and cout, wrong utf16 -> utf8 conversion
From: paolo dot carlini at oracle dot com @ 2008-08-31 20:58 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from paolo dot carlini at oracle dot com 2008-08-31 20:57 -------
(In reply to comment #3)
> Very strange... UTF-16 -> UTF-8 conversion seems to be very easy, and it is
> required for proper output on contemporary Linux.
By the way, contemporary Linux, or any Linux for that matter, is normally part
of a GNU / Linux system and in that case wchar_t is always 32 bits wide, UCS-4
encoding. UTF-16 doesn't play an important role. See the glibc docs for further
information. Our implementation, if sync_with_stdio(false) is called, is
perfectly able to convert back and forth from an internal wchar_t (UCS-4)
encoding to an external char (UTF-8) encoding, via the delivered codecvt facet.