public inbox for gcc-bugs@sourceware.org
* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
       [not found] <bug-16006-4@http.gcc.gnu.org/bugzilla/>
@ 2023-05-16 19:41 ` pinskia at gcc dot gnu.org
  0 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-16 19:41 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
           Assignee|bkoz at gcc dot gnu.org            |unassigned at gcc dot gnu.org

--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Unassigning, since Benjamin has not been active in GCC development for over 8
years now.

* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
       [not found] <bug-16006-6946@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2006-11-06 11:24 ` bkoz at gcc dot gnu dot org
@ 2010-01-08 18:48 ` paolo dot carlini at oracle dot com
  3 siblings, 0 replies; 9+ messages in thread
From: paolo dot carlini at oracle dot com @ 2010-01-08 18:48 UTC
  To: gcc-bugs



------- Comment #7 from paolo dot carlini at oracle dot com  2010-01-08 18:47 -------
*** Bug 39243 has been marked as a duplicate of this bug. ***


-- 

paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Sergey dot Belyashov at
                   |                            |gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006


* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
       [not found] <bug-16006-6946@http.gcc.gnu.org/bugzilla/>
  2006-10-07 19:48 ` pcarlini at suse dot de
  2006-10-08 11:04 ` pcarlini at suse dot de
@ 2006-11-06 11:24 ` bkoz at gcc dot gnu dot org
  2010-01-08 18:48 ` paolo dot carlini at oracle dot com
  3 siblings, 0 replies; 9+ messages in thread
From: bkoz at gcc dot gnu dot org @ 2006-11-06 11:24 UTC
  To: gcc-bugs



-- 

bkoz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |bkoz at gcc dot gnu dot org
                   |dot org                     |
             Status|REOPENED                    |ASSIGNED
   Last reconfirmed|0000-00-00 00:00:00         |2006-11-06 11:24:16
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006


* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
       [not found] <bug-16006-6946@http.gcc.gnu.org/bugzilla/>
  2006-10-07 19:48 ` pcarlini at suse dot de
@ 2006-10-08 11:04 ` pcarlini at suse dot de
  2006-11-06 11:24 ` bkoz at gcc dot gnu dot org
  2010-01-08 18:48 ` paolo dot carlini at oracle dot com
  3 siblings, 0 replies; 9+ messages in thread
From: pcarlini at suse dot de @ 2006-10-08 11:04 UTC
  To: gcc-bugs



------- Comment #6 from pcarlini at suse dot de  2006-10-08 11:04 -------
Let's reopen this report as an enhancement request. In fact, we should
implement this:

  http://gcc.gnu.org/ml/libstdc++/2004-06/msg00256.html

probably by using iconv to implement the relevant char <-> char codecvt_byname,
like here:

  http://gcc.gnu.org/ml/libstdc++/2004-06/msg00252.html

Note that this is not going to happen very soon, since it will be quite a bit
of work. For now, people requiring UTF-8 output should, as a rule, rely on
wide-character (wchar_t) streams.


-- 

pcarlini at suse dot de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006


* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
       [not found] <bug-16006-6946@http.gcc.gnu.org/bugzilla/>
@ 2006-10-07 19:48 ` pcarlini at suse dot de
  2006-10-08 11:04 ` pcarlini at suse dot de
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: pcarlini at suse dot de @ 2006-10-07 19:48 UTC
  To: gcc-bugs



------- Comment #5 from pcarlini at suse dot de  2006-10-07 19:48 -------
*** Bug 29379 has been marked as a duplicate of this bug. ***


-- 

pcarlini at suse dot de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |debian-gcc at lists dot
                   |                            |debian dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006


* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
  2004-06-15 16:48 [Bug libstdc++/16006] New: " olau at hardworking dot dk
                   ` (2 preceding siblings ...)
  2004-06-16 15:57 ` pcarlini at suse dot de
@ 2004-06-16 16:08 ` pcarlini at suse dot de
  3 siblings, 0 replies; 9+ messages in thread
From: pcarlini at suse dot de @ 2004-06-16 16:08 UTC
  To: gcc-bugs


------- Additional Comments From pcarlini at suse dot de  2004-06-16 16:07 -------
... forgot to add: complete support for UTF-8 is available only in 3.4.0,
therefore, do not even try with 3.3.x ;-)

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006


* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
  2004-06-15 16:48 [Bug libstdc++/16006] New: " olau at hardworking dot dk
  2004-06-16  9:13 ` [Bug libstdc++/16006] " pcarlini at suse dot de
  2004-06-16 14:18 ` olau at hardworking dot dk
@ 2004-06-16 15:57 ` pcarlini at suse dot de
  2004-06-16 16:08 ` pcarlini at suse dot de
  3 siblings, 0 replies; 9+ messages in thread
From: pcarlini at suse dot de @ 2004-06-16 15:57 UTC
  To: gcc-bugs


------- Additional Comments From pcarlini at suse dot de  2004-06-16 15:57 -------
> This flag is a GNU extension.

So, we are in the realm of -extensions-, not of Standard C. That is fine if you
want to use it, but beware: no consistency with the C++ Standard is guaranteed.

> You are wrong -
> the output is _not_ OK. It is not UTF-8. Run the program with .ISO-8859-1
> instead of .UTF-8, and you get the non-breaking space in .ISO-8859-1. Then
> put that character through iconv from ISO-8859-1 to UTF-8 and you get _two_
> characters, not one (in fact it could not possibly be just one character when
> it's UTF-8).

In the ISO Standard the thousands separator is a -single- char_type of the
-internal- encoding. Therefore, in general, in order to accomplish what you
want, you have to use an internal encoding sufficiently wide (cout -> wcout)
and also you have to call std::ios::sync_with_stdio(false) before any other
I/O operation, otherwise no encoding to UTF-8 (from the internal
representation) will take place (despite the imbue).
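
To illustrate the point about the width of the internal encoding, here is a
minimal sketch, not the code from the original report. It assumes a
hypothetical facet, NbspGrouping, that stands in for the numeric facet of a
locale such as fi_FI.UTF-8, so that it runs even where that locale is not
installed. The helper names format_grouped and to_utf8 are likewise made up
for this example, and the encoder assumes a 32-bit wchar_t as on GNU/Linux.
In the wide stream the separator U+00A0 is a single wchar_t; it only becomes
two bytes when the internal representation is encoded to UTF-8.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet, for illustration only: group digits in threes and
// separate the groups with U+00A0 (NO-BREAK SPACE), which is one wchar_t.
class NbspGrouping : public std::numpunct<wchar_t> {
protected:
    wchar_t do_thousands_sep() const override { return L'\u00A0'; }
    std::string do_grouping() const override { return "\3"; }
};

// Format an integer through a wide stream imbued with the facet above.
std::wstring format_grouped(long value) {
    std::wostringstream out;
    out.imbue(std::locale(std::locale::classic(), new NbspGrouping));
    out << value;
    return out.str();
}

// Encode wide characters as UTF-8. Assumes each wchar_t holds one Unicode
// code point (true for a 32-bit wchar_t, as on GNU/Linux).
std::string to_utf8(const std::wstring& wide) {
    std::string bytes;
    for (wchar_t wc : wide) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        assert(cp <= 0x10FFFF);  // outside the Unicode range otherwise
        if (cp < 0x80) {
            bytes += static_cast<char>(cp);
        } else if (cp < 0x800) {
            bytes += static_cast<char>(0xC0 | (cp >> 6));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            bytes += static_cast<char>(0xE0 | (cp >> 12));
            bytes += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            bytes += static_cast<char>(0xF0 | (cp >> 18));
            bytes += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            bytes += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return bytes;
}
```

With these helpers, format_grouped(1224) yields five wide characters
(1, U+00A0, 2, 2, 4), and to_utf8 of that string yields six bytes, with the
separator encoded as 0xC2 0xA0. For real console output one would instead
call std::ios::sync_with_stdio(false) first and imbue std::wcout with the
named locale, assuming that locale is installed.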

Thanks, Paolo.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006


* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
  2004-06-15 16:48 [Bug libstdc++/16006] New: " olau at hardworking dot dk
  2004-06-16  9:13 ` [Bug libstdc++/16006] " pcarlini at suse dot de
@ 2004-06-16 14:18 ` olau at hardworking dot dk
  2004-06-16 15:57 ` pcarlini at suse dot de
  2004-06-16 16:08 ` pcarlini at suse dot de
  3 siblings, 0 replies; 9+ messages in thread
From: olau at hardworking dot dk @ 2004-06-16 14:18 UTC
  To: gcc-bugs


------- Additional Comments From olau at hardworking dot dk  2004-06-16 14:17 -------
The %'d is to make it output the thousands separator. Look in the glibc manual:

`''
     Separate the digits into groups as specified by the locale
     specified for the `LC_NUMERIC' category; *note General Numeric::.
     This flag is a GNU extension.

I'm not sure how you do it otherwise in C. But about the bug. You are wrong -
the output is _not_ OK. It is not UTF-8. Run the program with .ISO-8859-1
instead of .UTF-8, and you get the non-breaking space in .ISO-8859-1. Then put
that character through iconv from ISO-8859-1 to UTF-8 and you get _two_
characters, not one (in fact it could not possibly be just one character when
it's UTF-8).

So glibc is right (produces correct UTF-8 non-breaking space) and libstdc++ is
wrong (produces incorrect UTF-8 non-breaking space). The invalid UTF-8 from
libstdc++ makes my GTK+ program die horribly.
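
The byte-level argument above can be checked with a small sketch. The two
helpers below, latin1_to_utf8 and is_structurally_valid_utf8, are hypothetical
names written for this illustration; they are not part of glibc, iconv, or
libstdc++. The validity check is deliberately simplified: it verifies lead and
continuation bytes only and does not reject overlong forms or surrogates. The
thread does not show the exact bytes libstdc++ emitted, so any one-byte
separator fed to the checker is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Transcode ISO-8859-1 to UTF-8. Every Latin-1 byte maps directly to the code
// point of the same value, so bytes >= 0x80 always become two UTF-8 bytes.
std::string latin1_to_utf8(const std::string& latin1) {
    std::string utf8;
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            utf8 += static_cast<char>(byte);
        } else {
            utf8 += static_cast<char>(0xC0 | (byte >> 6));
            utf8 += static_cast<char>(0x80 | (byte & 0x3F));
        }
    }
    assert(utf8.size() >= latin1.size());  // transcoding never shrinks input
    return utf8;
}

// Simplified structural check: every lead byte must be followed by the right
// number of continuation bytes (10xxxxxx). Overlong forms are not rejected.
bool is_structurally_valid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        if (lead < 0x80) {
            extra = 0;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (extra > s.size() - i - 1) {
            return false;  // sequence is truncated
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char next = static_cast<unsigned char>(s[i + k]);
            if ((next & 0xC0) != 0x80) {
                return false;  // expected a continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}
```

Here latin1_to_utf8("\xA0") returns the two bytes 0xC2 0xA0, which matches the
iconv experiment described above. A string such as "1\xC2\xA0" "224" passes
the check, while a string with only one byte of that pair between the digits
(either 0xA0 or 0xC2 alone) fails it.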

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006


* [Bug libstdc++/16006] Conversions of numbers in fi_FI.UTF-8 produces incorrect UTF-8
  2004-06-15 16:48 [Bug libstdc++/16006] New: " olau at hardworking dot dk
@ 2004-06-16  9:13 ` pcarlini at suse dot de
  2004-06-16 14:18 ` olau at hardworking dot dk
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: pcarlini at suse dot de @ 2004-06-16  9:13 UTC
  To: gcc-bugs


------- Additional Comments From pcarlini at suse dot de  2004-06-16 09:13 -------
Hi. First, a comment about your C lines: what do you mean by "%'d"? This format
string looks definitely incorrect to me, and if I change it to just "%d" the
expected
1224
is produced.
About the C++ lines: the particular capital A is just the thousands separator
in the locale at issue. The output also seems OK (taking into account the
character set of your (and my ;) shell).

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16006

