public inbox for libc-locales@sourceware.org
* [Bug localedata/23502] New: gconv(UTF-8 to GB18030)
@ 2018-08-10 10:26 286000435 at qq dot com
  2018-08-10 10:29 ` [Bug localedata/23502] " 286000435 at qq dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: 286000435 at qq dot com @ 2018-08-10 10:26 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

            Bug ID: 23502
           Summary: gconv(UTF-8 to GB18030)
           Product: glibc
           Version: 2.17
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: 286000435 at qq dot com
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

The gconv conversion from UTF-8 to GB18030 fails.
Affected versions: glibc 2.17 and later (>= 2.17).
The following program does not produce the expected result.

#include <iconv.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void print_buf(char* buf, size_t len)
{
        size_t i;
        for (i = 0; i < len; ++i) {
                printf("%02X ", (unsigned char)buf[i]);
        }
        printf("\n");
}

int main(int argc, char** argv)
{
        char s[] = "早見純";
        iconv_t cd = iconv_open("GB18030", "UTF-8");
        if (cd == (iconv_t)-1) {  /* iconv_open reports failure as (iconv_t)-1 */
                printf("iconv_open fail\n");
                return 1;
        }
        size_t inbytesleft = sizeof(s) - 1;
        char dst[6 * sizeof(s)] = { 0 };
        size_t outbytesleft = sizeof(dst) - 1;
        char* inbuf = s;
        char* outbuf = dst;

        if (iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft) == (size_t)-1) {
                printf("iconv fail: %d\n", errno);
        }
        iconv_close(cd);
        size_t n = strlen(dst);
        /* size_t values must be printed with %zu, not %u */
        printf("inbytesleft: %zu, outbytesleft: %zu, dst len: %zu\n",
               inbytesleft, outbytesleft, n);
        print_buf(dst, n);
        return 0;
}


Actual output (error):
iconv fail: 84
inbytesleft: 4, outbytesleft: 72, dst len: 6
D4 E7 D2 8A BC 83


With glibc 2.12 I get the correct result:
inbytesleft: 0, outbytesleft: 69, dst len: 8
D4 E7 D2 8A BC 83 FE 52
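
The value 84 printed above is the raw errno; on most Linux architectures that number is EILSEQ (invalid or unconvertible input sequence). To see exactly where the conversion stops, a small wrapper can report how many input bytes were consumed before the failure. The following is only a sketch: `convert_or_report` and its parameters are hypothetical names introduced for illustration, not part of glibc.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: run one iconv() call and report where it stopped.
   Returns 0 on success, otherwise the errno value from iconv_open/iconv.
   *consumed and *written receive the input/output byte counts. */
static int convert_or_report(const char *to, const char *from,
                             char *in, size_t inlen,
                             char *out, size_t outcap,
                             size_t *consumed, size_t *written)
{
        *consumed = 0;
        *written = 0;

        iconv_t cd = iconv_open(to, from);
        if (cd == (iconv_t)-1)
                return errno;

        char *inp = in;
        char *outp = out;
        size_t inleft = inlen;
        size_t outleft = outcap;
        int err = 0;

        if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
                err = errno;  /* save before iconv_close can change it */
                fprintf(stderr, "iconv stopped at input offset %zu: %s\n",
                        (size_t)(inp - in), strerror(err));
        }
        iconv_close(cd);

        *consumed = inlen - inleft;
        *written = outcap - outleft;
        return err;
}
```

Called with "GB18030" and "UTF-8" on the array s from the test program, the reported offset would point at the first byte of the character that the converter rejects.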

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug localedata/23502] gconv(UTF-8 to GB18030)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
@ 2018-08-10 10:29 ` 286000435 at qq dot com
  2018-08-10 10:38 ` [Bug localedata/23502] gconv(GB18030 to UTF-8) 286000435 at qq dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: 286000435 at qq dot com @ 2018-08-10 10:29 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

lance <286000435 at qq dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |critical


* [Bug localedata/23502] gconv(GB18030 to UTF-8)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
  2018-08-10 10:29 ` [Bug localedata/23502] " 286000435 at qq dot com
@ 2018-08-10 10:38 ` 286000435 at qq dot com
  2018-08-10 10:39 ` [Bug localedata/23502] gconv(UTF-8 to GB18030) 286000435 at qq dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: 286000435 at qq dot com @ 2018-08-10 10:38 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

lance <286000435 at qq dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|gconv(UTF-8 to GB18030)     |gconv(GB18030 to UTF-8)


* [Bug localedata/23502] gconv(GB18030 to UTF-8)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
                   ` (2 preceding siblings ...)
  2018-08-10 10:39 ` [Bug localedata/23502] gconv(UTF-8 to GB18030) 286000435 at qq dot com
@ 2018-08-10 10:39 ` 286000435 at qq dot com
  2018-08-10 10:42 ` 286000435 at qq dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: 286000435 at qq dot com @ 2018-08-10 10:39 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

lance <286000435 at qq dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|gconv(UTF-8 to GB18030)     |gconv(GB18030 to UTF-8)


* [Bug localedata/23502] gconv(UTF-8 to GB18030)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
  2018-08-10 10:29 ` [Bug localedata/23502] " 286000435 at qq dot com
  2018-08-10 10:38 ` [Bug localedata/23502] gconv(GB18030 to UTF-8) 286000435 at qq dot com
@ 2018-08-10 10:39 ` 286000435 at qq dot com
  2018-08-10 10:39 ` [Bug localedata/23502] gconv(GB18030 to UTF-8) 286000435 at qq dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: 286000435 at qq dot com @ 2018-08-10 10:39 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

lance <286000435 at qq dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|gconv(GB18030 to UTF-8)     |gconv(UTF-8 to GB18030)


* [Bug localedata/23502] gconv(GB18030 to UTF-8)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
                   ` (3 preceding siblings ...)
  2018-08-10 10:39 ` [Bug localedata/23502] gconv(GB18030 to UTF-8) 286000435 at qq dot com
@ 2018-08-10 10:42 ` 286000435 at qq dot com
  2018-08-10 12:04 ` schwab@linux-m68k.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: 286000435 at qq dot com @ 2018-08-10 10:42 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

--- Comment #1 from lance <286000435 at qq dot com> ---
> A bug of gconv(GB18030 to UTF-8).


* [Bug localedata/23502] gconv(GB18030 to UTF-8)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
                   ` (4 preceding siblings ...)
  2018-08-10 10:42 ` 286000435 at qq dot com
@ 2018-08-10 12:04 ` schwab@linux-m68k.org
  2018-09-05 14:04 ` schwab@linux-m68k.org
  2018-09-12  8:55 ` 286000435 at qq dot com
  7 siblings, 0 replies; 9+ messages in thread
From: schwab@linux-m68k.org @ 2018-08-10 12:04 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |normal


* [Bug localedata/23502] gconv(GB18030 to UTF-8)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
                   ` (5 preceding siblings ...)
  2018-08-10 12:04 ` schwab@linux-m68k.org
@ 2018-09-05 14:04 ` schwab@linux-m68k.org
  2018-09-12  8:55 ` 286000435 at qq dot com
  7 siblings, 0 replies; 9+ messages in thread
From: schwab@linux-m68k.org @ 2018-09-05 14:04 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2018-09-05
     Ever confirmed|0                           |1

--- Comment #2 from Andreas Schwab <schwab@linux-m68k.org> ---
What is the exact contents of the array s?


* [Bug localedata/23502] gconv(GB18030 to UTF-8)
  2018-08-10 10:26 [Bug localedata/23502] New: gconv(UTF-8 to GB18030) 286000435 at qq dot com
                   ` (6 preceding siblings ...)
  2018-09-05 14:04 ` schwab@linux-m68k.org
@ 2018-09-12  8:55 ` 286000435 at qq dot com
  7 siblings, 0 replies; 9+ messages in thread
From: 286000435 at qq dot com @ 2018-09-12  8:55 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=23502

--- Comment #3 from lance <286000435 at qq dot com> ---
(In reply to Andreas Schwab from comment #2)
> What is the exact contents of the array s?

Here is the updated code, which also prints the bytes of s:

#include <iconv.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void print_buf(char* buf, size_t len)
{
        size_t i;
        for (i = 0; i < len; ++i) {
                printf("%02X ", (unsigned char)buf[i]);
        }
        printf("\n");
}

int main(int argc, char** argv)
{
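        /* "早見純" followed by U+E817 (private use area), spelled as UTF-8
           bytes so the last character survives copy and paste */
        char s[] = "\xE6\x97\xA9\xE8\xA6\x8B\xE7\xB4\x94\xEE\xA0\x97";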
        print_buf(s, sizeof(s));
        iconv_t cd = iconv_open("GB18030", "UTF-8");
        if (cd == (iconv_t)-1) {  /* iconv_open reports failure as (iconv_t)-1 */
                printf("iconv_open fail\n");
                return 1;
        }
        size_t inbytesleft = sizeof(s) - 1;
        char dst[6 * sizeof(s)] = { 0 };
        size_t outbytesleft = sizeof(dst) - 1;
        char* inbuf = s;
        char* outbuf = dst;

        if (iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft) == (size_t)-1) {
                printf("iconv fail: %d\n", errno);
        }
        iconv_close(cd);
        size_t n = strlen(dst);
        /* size_t values must be printed with %zu, not %u */
        printf("inbytesleft: %zu, outbytesleft: %zu, dst len: %zu\n",
               inbytesleft, outbytesleft, n);
        print_buf(dst, n);
        return 0;
}

The contents of array s are:
E6 97 A9 E8 A6 8B E7 B4 94 EE A0 97 00
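
To read the dump above as characters, the bytes can be decoded into Unicode code points by hand or with a few lines of C. The sketch below is a minimal decoder written for this purpose; `utf8_decode` and `print_code_points` are hypothetical helpers, and the decoder deliberately skips the stricter checks (overlong forms, surrogates) that a production decoder would need.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one UTF-8 sequence from p (n bytes available) into *cp.
   Returns the sequence length in bytes, or 0 if the bytes are malformed.
   Minimal sketch: overlong forms and surrogates are not rejected. */
static size_t utf8_decode(const unsigned char *p, size_t n, uint32_t *cp)
{
        size_t len, i;
        uint32_t c;

        if (n == 0)
                return 0;
        if (p[0] < 0x80) {
                *cp = p[0];
                return 1;
        } else if ((p[0] & 0xE0) == 0xC0) {
                len = 2; c = p[0] & 0x1F;
        } else if ((p[0] & 0xF0) == 0xE0) {
                len = 3; c = p[0] & 0x0F;
        } else if ((p[0] & 0xF8) == 0xF0) {
                len = 4; c = p[0] & 0x07;
        } else {
                return 0;  /* stray continuation byte or invalid lead byte */
        }
        if (n < len)
                return 0;
        for (i = 1; i < len; ++i) {
                if ((p[i] & 0xC0) != 0x80)
                        return 0;
                c = (c << 6) | (p[i] & 0x3F);
        }
        *cp = c;
        return len;
}

/* Print every code point in buf as U+XXXX; returns how many were decoded.
   Stops at the first malformed sequence. */
static size_t print_code_points(const unsigned char *buf, size_t len)
{
        size_t off = 0, count = 0;
        uint32_t cp;

        while (off < len) {
                size_t k = utf8_decode(buf + off, len - off, &cp);
                if (k == 0)
                        break;
                printf("U+%04X ", (unsigned)cp);
                off += k;
                ++count;
        }
        printf("\n");
        return count;
}
```

Applied to the first 12 bytes of the dump (without the trailing 00), this gives U+65E9 U+898B U+7D14 U+E817. The fourth code point lies in the private use area (U+E000 to U+F8FF), which is why it may not be visible in the string literal.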

