public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/16006] New: Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
@ 2004-06-15 16:48 olau at hardworking dot dk
  2004-06-16  9:13 ` [Bug libstdc++/16006] " pcarlini at suse dot de
From: olau at hardworking dot dk @ 2004-06-15 16:48 UTC
  To: gcc-bugs


It seems that the Finnish locale uses a non-breaking space as the thousands
separator. This character (0xA0 in ISO-8859-1, I believe) is converted
incorrectly to UTF-8 when numbers are output through a stream imbued with the
UTF-8 locale. This program demonstrates the problem:

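#include <clocale>
#include <cstdio>
#include <iostream>
#include <locale>

int main()
{
  // libstdc++: format the number through a stream imbued with the locale
  std::cout.imbue(std::locale("fi_FI.UTF-8"));
  std::cout << 1224 << std::endl;

  // libc: format the same number with the ' (digit grouping) flag
  std::setlocale(LC_ALL, "fi_FI.UTF-8");
  std::printf("%'d\n", 1224);
}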

Compile it with "g++ tmp.cpp -o tmp -Wall", then run "./tmp". It outputs:

ole:~/tmp$ ./tmp
1Â224
1Â 224

Note that libstdc++ emits the non-breaking space (you can see this character
by using .ISO-8859-1 instead of .UTF-8 in the program) as a single byte, which
is obviously not valid UTF-8, whereas libc emits it as two bytes.

I'm not from Finland myself, but a user of one of my programs had mysterious
crashes due to this problem - obviously they only occur when the numbers reach
1,000 or more.

-- 
           Summary: Conversions of numbers in fi_FI.UTF-8 produces incorrect
                    UTF-8
           Product: gcc
           Version: 3.3.4
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libstdc++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: olau at hardworking dot dk
                CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006



Thread overview: 9+ messages
     [not found] <bug-16006-6946@http.gcc.gnu.org/bugzilla/>
2006-10-07 19:48 ` [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8 pcarlini at suse dot de
2006-10-08 11:04 ` pcarlini at suse dot de
2006-11-06 11:24 ` bkoz at gcc dot gnu dot org
2010-01-08 18:48 ` paolo dot carlini at oracle dot com
     [not found] <bug-16006-4@http.gcc.gnu.org/bugzilla/>
2023-05-16 19:41 ` pinskia at gcc dot gnu.org
2004-06-15 16:48 [Bug libstdc++/16006] New: " olau at hardworking dot dk
2004-06-16  9:13 ` [Bug libstdc++/16006] " pcarlini at suse dot de
2004-06-16 14:18 ` olau at hardworking dot dk
2004-06-16 15:57 ` pcarlini at suse dot de
2004-06-16 16:08 ` pcarlini at suse dot de
