public inbox for libc-locales@sourceware.org
* [Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK
@ 2016-11-25  4:49 arthur200126 at gmail dot com
  2016-11-25 21:25 ` [Bug localedata/20864] " vapier at gentoo dot org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: arthur200126 at gmail dot com @ 2016-11-25  4:49 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=20864

            Bug ID: 20864
           Summary: iconv: cp936 missing single-byte euro sign (0x80,
                    U+20AC), not same as GBK
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: arthur200126 at gmail dot com
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

The addition of a single-byte euro sign at 0x80 in CP936 is possibly the most
well-known difference between the Windows code page and the GBK specification.
However, current versions of glibc seem to alias CP936 to GBK and do not exhibit
this behavior.

The following session comes from GNU bash running in a UTF-8 console. $''
denotes bash's ANSI C-style quoting, where \xhh generates the raw byte with hex
value hh and \uhhhh generates the representation of U+hhhh under the current
locale.

# iconv (Ubuntu GLIBC 2.23-0ubuntu4) 2.23
$ iconv -f cp936 -t utf-8 <<< $'\x80'
iconv: illegal input sequence at position 0
$ iconv -t cp936 -f utf-8 <<< $'\u20ac' | hexdump -C
iconv: illegal input sequence at position 0 

Expected behavior (from libiconv) is shown below.

# iconv (GNU libiconv 1.14)
$ iconv -f cp936 -t utf-8 <<< $'\x80'
€
$ iconv -t cp936 -f utf-8 <<< $'\u20ac' | hexdump -C
00000000  80 0a                                             |..|
00000002
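
For illustration only, the CP936 behavior expected above can be sketched in
Python as a thin layer over a GBK decoder. This is a minimal sketch, not the
glibc or libiconv implementation: the names `decode_cp936` and `encode_cp936`
are hypothetical, and it assumes Python's built-in `gbk` codec handles
everything except a standalone 0x80 byte. Because 0x80 is also a valid trail
byte in GBK, the decoder walks the input rather than doing a blanket byte
replacement.

```python
EURO = "\u20ac"  # U+20AC EURO SIGN


def decode_cp936(data: bytes) -> str:
    """Decode as GBK, additionally mapping a standalone 0x80 to U+20AC."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == 0x80:
            # CP936 extension: single-byte euro sign.
            out.append(EURO)
            i += 1
        elif byte < 0x80:
            # Plain ASCII, one byte.
            out.append(chr(byte))
            i += 1
        else:
            # Start of a two-byte sequence; the gbk codec validates it
            # and raises UnicodeDecodeError if it is malformed.
            out.append(data[i:i + 2].decode("gbk"))
            i += 2
    return "".join(out)


def encode_cp936(text: str) -> bytes:
    """Encode as GBK, emitting the single byte 0x80 for U+20AC."""
    return b"".join(
        b"\x80" if ch == EURO else ch.encode("gbk") for ch in text
    )


if __name__ == "__main__":
    print(decode_cp936(b"\x80\n"), end="")
    print(encode_cp936("\u20ac\n").hex())
```

With this sketch, the bytes 80 0a decode to the euro sign followed by a
newline, and encoding that string gives back the same two bytes, mirroring the
libiconv session.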

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug localedata/20864] iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK
  2016-11-25  4:49 [Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK arthur200126 at gmail dot com
@ 2016-11-25 21:25 ` vapier at gentoo dot org
  2016-11-26 23:41 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vapier at gentoo dot org @ 2016-11-25 21:25 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=20864

Mike Frysinger <vapier at gentoo dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2016-11-25
                 CC|                            |vapier at gentoo dot org
           Assignee|unassigned at sourceware dot org   |vapier at gentoo dot org
     Ever confirmed|0                           |1

--- Comment #1 from Mike Frysinger <vapier at gentoo dot org> ---
i've posted a fix here:
https://sourceware.org/ml/libc-alpha/2016-11/msg00930.html

* [Bug localedata/20864] iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK
  2016-11-25  4:49 [Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK arthur200126 at gmail dot com
  2016-11-25 21:25 ` [Bug localedata/20864] " vapier at gentoo dot org
  2016-11-26 23:41 ` cvs-commit at gcc dot gnu.org
@ 2016-11-26 23:41 ` vapier at gentoo dot org
  2016-11-29  8:15 ` fweimer at redhat dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: vapier at gentoo dot org @ 2016-11-26 23:41 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=20864

Mike Frysinger <vapier at gentoo dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |2.25

--- Comment #3 from Mike Frysinger <vapier at gentoo dot org> ---
thanks, fix will be in the next release.  considering it's always been broken,
doesn't seem like it's worth it to backport to branches.

* [Bug localedata/20864] iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK
  2016-11-25  4:49 [Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK arthur200126 at gmail dot com
  2016-11-25 21:25 ` [Bug localedata/20864] " vapier at gentoo dot org
@ 2016-11-26 23:41 ` cvs-commit at gcc dot gnu.org
  2016-11-26 23:41 ` vapier at gentoo dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2016-11-26 23:41 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=20864

--- Comment #2 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  aa4d00ca39e604ac4e9fead401ccd4483e11a281 (commit)
      from  bf469f0ce98df9875daef625a85abd1160c44335 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=aa4d00ca39e604ac4e9fead401ccd4483e11a281

commit aa4d00ca39e604ac4e9fead401ccd4483e11a281
Author: Mike Frysinger <vapier@gentoo.org>
Date:   Fri Nov 25 11:12:10 2016 -0500

    localedata: GBK: add mapping for 0x80->Euro sign [BZ #20864]

    Microsoft long ago added a mapping for 0x80 to the Euro sign to their
    CP936.  While GBK 1.0 doesn't include this mapping, it is compatible,
    and Microsoft and glibc alias the two codepages.  We could split them
    apart so GBK wouldn't include the mapping, but that seems like a lot
    of work for little gain.

-----------------------------------------------------------------------

Summary of changes:
 localedata/ChangeLog    |    5 +++++
 localedata/charmaps/GBK |    7 +++++++
 2 files changed, 12 insertions(+), 0 deletions(-)
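
The compatibility argument in the commit message rests on 0x80 not being used
as a first byte in GBK 1.0, so assigning it cannot change how existing GBK data
decodes. A small Python sketch of that first-byte classification follows. The
helper name `first_byte_role` is hypothetical, and treating 0xFF as unused in
both variants is a simplifying assumption rather than something stated in this
thread.

```python
def first_byte_role(byte: int, cp936: bool = False) -> str:
    """Classify a byte that appears at the start of a character."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    if byte <= 0x7F:
        return "single"  # ASCII
    if 0x81 <= byte <= 0xFE:
        return "lead"    # first byte of a two-byte character
    if byte == 0x80 and cp936:
        return "single"  # CP936 extension: euro sign (U+20AC)
    return "unused"      # 0x80 in plain GBK; 0xFF (assumed unused here)


if __name__ == "__main__":
    # Collect every first byte whose role differs between the two variants.
    changed = [
        b for b in range(256)
        if first_byte_role(b) != first_byte_role(b, cp936=True)
    ]
    print([hex(b) for b in changed])
```

The only first byte whose role differs between the two variants is 0x80, which
is why a single charmap can serve both names.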

* [Bug localedata/20864] iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK
  2016-11-25  4:49 [Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK arthur200126 at gmail dot com
                   ` (2 preceding siblings ...)
  2016-11-26 23:41 ` vapier at gentoo dot org
@ 2016-11-29  8:15 ` fweimer at redhat dot com
  2016-11-29 19:26 ` fweimer at redhat dot com
  2016-12-16 19:25 ` vapier at gentoo dot org
  5 siblings, 0 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2016-11-29  8:15 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=20864

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com
              Flags|                            |security-

* [Bug localedata/20864] iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK
  2016-11-25  4:49 [Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK arthur200126 at gmail dot com
                   ` (3 preceding siblings ...)
  2016-11-29  8:15 ` fweimer at redhat dot com
@ 2016-11-29 19:26 ` fweimer at redhat dot com
  2016-12-16 19:25 ` vapier at gentoo dot org
  5 siblings, 0 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2016-11-29 19:26 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=20864

--- Comment #5 from Florian Weimer <fweimer at redhat dot com> ---
I checked in a follow-on fix for the test case failure in commit
0415d32187731ac03ef6c72c6cfb25314d4b0133.

* [Bug localedata/20864] iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK
  2016-11-25  4:49 [Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK arthur200126 at gmail dot com
                   ` (4 preceding siblings ...)
  2016-11-29 19:26 ` fweimer at redhat dot com
@ 2016-12-16 19:25 ` vapier at gentoo dot org
  5 siblings, 0 replies; 7+ messages in thread
From: vapier at gentoo dot org @ 2016-12-16 19:25 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=20864

--- Comment #6 from Mike Frysinger <vapier at gentoo dot org> ---
(In reply to Florian Weimer from comment #5)

thanks!  sorry for that.  i have some local tests i run for locale definitions,
but forgot that charmaps have more extensive tests.
