public inbox for glibc-bugs@sourceware.org
* [Bug libc/13541] New: iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ
@ 2011-12-23  1:59 ezyang at mit dot edu
  2012-12-19 10:43 ` [Bug libc/13541] " schwab@linux-m68k.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: ezyang at mit dot edu @ 2011-12-23  1:59 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13541

             Bug #: 13541
           Summary: iconv //IGNORE charsets are inconsistent about INBUF*
                    state after EILSEQ
           Product: glibc
           Version: 2.14
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: ezyang@mit.edu
    Classification: Unclassified


The iconv infopage says the following:

    `EILSEQ'
          The conversion stopped because of an invalid byte sequence in
          the input.  After the call, `*INBUF' points at the first byte
          of the invalid byte sequence.
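For reference, the documented contract can be checked with a strict conversion (no //IGNORE). The following is only a minimal sketch: `first_invalid_offset` is a hypothetical helper name, and the 256-byte output buffer is an arbitrary assumption that suits short inputs only.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of glibc): run one strict conversion and
 * report where iconv left *INBUF when it failed with EILSEQ.
 * Returns the offset of that byte within `input`, or -1 if the conversion
 * did not fail with EILSEQ (or the descriptor could not be opened).
 */
static long first_invalid_offset(const char *from, const char *to,
                                 char *input, size_t len)
{
    char outbuf[256];              /* assumption: enough for short inputs */
    char *in = input;
    char *out = outbuf;
    size_t inleft = len;
    size_t outleft = sizeof outbuf;
    long offset = -1;

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    if (iconv(cd, &in, &inleft, &out, &outleft) == (size_t)-1
        && errno == EILSEQ)
        offset = (long)(in - input);   /* *INBUF was left at the bad byte */

    iconv_close(cd);
    return offset;
}
```

Called with from = "utf-8", to = "ascii" and the five input bytes 'a', 'b', 0xFF, 'c', 'd', this should report offset 2, since 0xFF can never start a UTF-8 sequence. Without //IGNORE, the pointer lands on the offending byte as documented.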

However, this is clearly not the case when an //IGNORE target charset is
specified:

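    #include <iconv.h>
    #include <string.h>
    #include <stdio.h>
    #include <errno.h>
    int main(void) {
        /* Convert UTF-8 to ASCII, asking iconv to drop unconvertible input. */
        iconv_t i = iconv_open("ascii//IGNORE", "utf-8");
        char inbuf[10000];
        char outbuf[10000];
        char *in = inbuf;
        char *out = outbuf;
        /* iconv takes size_t * for both counters, not int *. */
        size_t inleft = sizeof inbuf;
        size_t outleft = sizeof outbuf;
        size_t s;
        memset(inbuf, 0x77, sizeof inbuf);
        /* U+00A2 in UTF-8: well-formed input with no ASCII equivalent. */
        inbuf[0] = (char)0xC2;
        inbuf[1] = (char)0xA2;
        s = iconv(i, &in, &inleft, &out, &outleft);
        printf("s = %d, errno = %d, in[0] = %x, inleft = %zu\n",
               s == (size_t)-1 ? -1 : (int)s, errno,
               (unsigned char)*in, inleft);
        return 0;
    }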

Outputs the following:

    s = -1, errno = 84, in[0] = 77, inleft = 1839

'iconv' appears to have gobbled up another ~8000 bytes after the offending
byte sequence (8161 of the 10000 input bytes were consumed), before returning
EILSEQ (84).

The documentation here cannot possibly be correct, if we want 'IGNORE' to actually
do anything. So we have two options:

1. Claim that the semantics of EILSEQ change when the magic //IGNORE flag is
specified, and require user code to work around it properly. This is what the
'-c' flag in iconv_prog.c does, by magically "converting" these errors into
E2BIG errors, and re-running iconv appropriately.

2. Claim that this API is wrong, and modify the API such that an iconv
operating on an //IGNORE character set *never* returns EILSEQ (what one might
expect, since IGNORE is supposed to allow us to ignore sequences that are
illegal in the target). This would make glibc's iconv implementation consistent
with libiconv's.

I favor (2), since it makes client code considerably simpler and easier to
implement correctly.
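To illustrate what option (1) asks of client code, here is a minimal sketch of such a workaround. It is loosely modeled on the description of the '-c' handling above and is not copied from iconv_prog.c; `convert_ignoring` is a hypothetical name, and the "no progress" check is my own assumption to keep the loop from spinning.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of glibc): drive a descriptor opened with
 * //IGNORE, treating EILSEQ as "some input was skipped, call again" rather
 * than as a fatal error.
 *
 * Returns 0 once all input has been consumed (possibly with characters
 * dropped). Returns -1 with errno set otherwise: E2BIG (output buffer
 * full), EINVAL (truncated input), or EILSEQ (iconv made no progress).
 */
static int convert_ignoring(iconv_t cd, char **in, size_t *inleft,
                            char **out, size_t *outleft)
{
    while (*inleft > 0) {
        char *in_before = *in;
        char *out_before = *out;
        size_t r = iconv(cd, in, inleft, out, outleft);

        if (r != (size_t)-1)
            return 0;       /* plain success: everything was converted */
        if (errno != EILSEQ)
            return -1;      /* E2BIG or EINVAL: left to the caller */
        if (*in == in_before && *out == out_before)
            return -1;      /* no progress at all: do not loop forever */
        /* EILSEQ after progress: assume input was skipped and retry. */
    }
    return 0;
}
```

A caller would open the descriptor with iconv_open("ascii//IGNORE", "utf-8") and pass the same pointer and size arguments it would give to iconv. Under option (2) none of this wrapping would be needed, since the first call would simply succeed.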

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug libc/13541] iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ
  2011-12-23  1:59 [Bug libc/13541] New: iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ ezyang at mit dot edu
@ 2012-12-19 10:43 ` schwab@linux-m68k.org
  2014-02-16 18:25 ` jackie.rosen at hushmail dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: schwab@linux-m68k.org @ 2012-12-19 10:43 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=13541

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|drepper.fsp at gmail dot    |unassigned at sourceware
                   |com                         |dot org


* [Bug libc/13541] iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ
  2011-12-23  1:59 [Bug libc/13541] New: iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ ezyang at mit dot edu
  2012-12-19 10:43 ` [Bug libc/13541] " schwab@linux-m68k.org
@ 2014-02-16 18:25 ` jackie.rosen at hushmail dot com
  2014-05-28 19:41 ` schwab at sourceware dot org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jackie.rosen at hushmail dot com @ 2014-02-16 18:25 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13541

Jackie Rosen <jackie.rosen at hushmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jackie.rosen at hushmail dot com

--- Comment #1 from Jackie Rosen <jackie.rosen at hushmail dot com> ---
*** Bug 260998 has been marked as a duplicate of this bug. ***
Seen from the domain http://volichat.com
Page where seen: http://volichat.com/adult-chat-rooms
Marked for reference. Resolved as fixed @bugzilla.


* [Bug libc/13541] iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ
  2011-12-23  1:59 [Bug libc/13541] New: iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ ezyang at mit dot edu
  2012-12-19 10:43 ` [Bug libc/13541] " schwab@linux-m68k.org
  2014-02-16 18:25 ` jackie.rosen at hushmail dot com
@ 2014-05-28 19:41 ` schwab at sourceware dot org
  2014-06-27 11:21 ` fweimer at redhat dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: schwab at sourceware dot org @ 2014-05-28 19:41 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13541

Andreas Schwab <schwab at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|jackie.rosen at hushmail dot com   |


* [Bug libc/13541] iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ
  2011-12-23  1:59 [Bug libc/13541] New: iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ ezyang at mit dot edu
                   ` (2 preceding siblings ...)
  2014-05-28 19:41 ` schwab at sourceware dot org
@ 2014-06-27 11:21 ` fweimer at redhat dot com
  2015-08-27 22:06 ` [Bug locale/13541] " jsm28 at gcc dot gnu.org
  2015-09-17  0:14 ` ibaldo at adinet dot com.uy
  5 siblings, 0 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2014-06-27 11:21 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13541

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-


* [Bug locale/13541] iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ
  2011-12-23  1:59 [Bug libc/13541] New: iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ ezyang at mit dot edu
                   ` (3 preceding siblings ...)
  2014-06-27 11:21 ` fweimer at redhat dot com
@ 2015-08-27 22:06 ` jsm28 at gcc dot gnu.org
  2015-09-17  0:14 ` ibaldo at adinet dot com.uy
  5 siblings, 0 replies; 7+ messages in thread
From: jsm28 at gcc dot gnu.org @ 2015-08-27 22:06 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13541

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|libc                        |locale


* [Bug locale/13541] iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ
  2011-12-23  1:59 [Bug libc/13541] New: iconv //IGNORE charsets are inconsistent about INBUF* state after EILSEQ ezyang at mit dot edu
                   ` (4 preceding siblings ...)
  2015-08-27 22:06 ` [Bug locale/13541] " jsm28 at gcc dot gnu.org
@ 2015-09-17  0:14 ` ibaldo at adinet dot com.uy
  5 siblings, 0 replies; 7+ messages in thread
From: ibaldo at adinet dot com.uy @ 2015-09-17  0:14 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=13541

Ivan Baldo <ibaldo at adinet dot com.uy> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibaldo at adinet dot com.uy
