public inbox for glibc-bugs@sourceware.org
* [Bug libc/6050] iconv(1) buffers all of stdin in memory
       [not found] <bug-6050-131@http.sourceware.org/bugzilla/>
@ 2012-12-19 10:44 ` schwab@linux-m68k.org
  2013-08-10 16:01 ` bugdal at aerifal dot cx
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: schwab@linux-m68k.org @ 2012-12-19 10:44 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=6050

Andreas Schwab <schwab@linux-m68k.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|drepper.fsp at gmail dot    |unassigned at sourceware
                   |com                         |dot org

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug libc/6050] iconv(1) buffers all of stdin in memory
       [not found] <bug-6050-131@http.sourceware.org/bugzilla/>
  2012-12-19 10:44 ` [Bug libc/6050] iconv(1) buffers all of stdin in memory schwab@linux-m68k.org
@ 2013-08-10 16:01 ` bugdal at aerifal dot cx
  2013-10-07 13:58   ` Ondřej Bílka
  2013-10-07 13:58 ` neleai at seznam dot cz
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 8+ messages in thread
From: bugdal at aerifal dot cx @ 2013-08-10 16:01 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=6050

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #1 from Rich Felker <bugdal at aerifal dot cx> ---
Ping. This bug still exists. I traced it to a comment in the source:

/* we have a problem with reading from a desriptor since we must not
   provide the iconv() function an incomplete character or shift
   sequence at the end of the buffer.  Since we have to deal with
   arbitrary encodings we must read the whole text in a buffer and
   process it in one step.  */

See
http://sourceware.org/git/?p=glibc.git;a=blob;f=iconv/iconv_prog.c;h=1a1d0d0cf45c0d747a8090bc234addd9e49f1ba7;hb=HEAD#l561

The claims made in the comment are simply erroneous. Per POSIX, the iconv
function returns (size_t)-1 with errno set to EINVAL to indicate "Input
conversion stopped due to an incomplete character or shift sequence at the end
of the input buffer." This is a different condition from EILSEQ, and thus the
caller can detect and recover from it simply by moving the remaining bytes of
the input buffer to the beginning, re-filling the buffer, and calling iconv
again.

If glibc's iconv function does not support this behavior correctly, that's a
library-level bug which should be filed separately and fixed.
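
For illustration, the EINVAL recovery described in this comment could look
roughly like the sketch below. It is not the code from iconv_prog.c or from
any later fix; the helper name convert_stream and the 4096-byte buffer sizes
are assumptions made for the example. Only iconv(3) and standard stdio calls
are used.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not glibc code): convert everything readable from IN
   and write the result to OUT, using fixed-size buffers.
   Returns 0 on success, -1 on failure.  */
static int
convert_stream (iconv_t cd, FILE *in, FILE *out)
{
  char inbuf[4096];
  char outbuf[4096];
  size_t pending = 0;  /* Unconverted bytes kept at the start of inbuf.  */
  int eof = 0;

  while (!eof)
    {
      /* Re-fill the buffer behind the bytes carried over.  */
      size_t n = fread (inbuf + pending, 1, sizeof inbuf - pending, in);
      if (n == 0)
        {
          if (ferror (in))
            return -1;
          eof = 1;
        }

      char *inptr = inbuf;
      size_t inleft = pending + n;
      while (inleft > 0)
        {
          char *outptr = outbuf;
          size_t outleft = sizeof outbuf;
          size_t r = iconv (cd, &inptr, &inleft, &outptr, &outleft);
          int err = errno;
          size_t produced = (size_t) (outptr - outbuf);

          if (fwrite (outbuf, 1, produced, out) != produced)
            return -1;
          if (r != (size_t) -1)
            break;               /* Whole buffer converted.  */
          if (err == E2BIG)
            continue;            /* Output buffer was full; call again.  */
          if (err == EINVAL && !eof)
            break;               /* Incomplete sequence at the end: re-fill.  */
          return -1;             /* EILSEQ, or input truncated at EOF.  */
        }

      /* Move the remaining bytes to the beginning of the buffer.  */
      assert (inleft < sizeof inbuf);
      memmove (inbuf, inptr, inleft);
      pending = inleft;
    }

  /* Emit any final shift sequence for stateful target encodings.  */
  char *outptr = outbuf;
  size_t outleft = sizeof outbuf;
  if (iconv (cd, NULL, NULL, &outptr, &outleft) == (size_t) -1)
    return -1;
  size_t produced = (size_t) (outptr - outbuf);
  return fwrite (outbuf, 1, produced, out) == produced ? 0 : -1;
}
```

A caller would obtain the descriptor from iconv_open(3), pass it together
with stdin and stdout, and release it with iconv_close(3). Memory use stays
bounded by the two fixed buffers regardless of the input size.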


* [Bug libc/6050] iconv(1) buffers all of stdin in memory
       [not found] <bug-6050-131@http.sourceware.org/bugzilla/>
  2012-12-19 10:44 ` [Bug libc/6050] iconv(1) buffers all of stdin in memory schwab@linux-m68k.org
  2013-08-10 16:01 ` bugdal at aerifal dot cx
@ 2013-10-07 13:58 ` neleai at seznam dot cz
  2013-10-07 14:06 ` bugdal at aerifal dot cx
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: neleai at seznam dot cz @ 2013-10-07 13:58 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=6050

--- Comment #2 from Ondrej Bilka <neleai at seznam dot cz> ---
On Sat, Aug 10, 2013 at 04:01:22PM +0000, bugdal at aerifal dot cx wrote:
> The claims made in the comment are simply erroneous. Per POSIX, the iconv
> function returns (size_t)-1 with errno set to EINVAL to indicate "Input
> conversion stopped due to an incomplete character or shift sequence at the end
> of the input buffer." This is a different condition from EILSEQ, and thus the
> caller can detect and recover from it simply by moving the remaining bytes of
> the input buffer to the beginning, re-filling the buffer, and calling iconv
> again.
>
This would work only for stateless encodings. You cannot do this with
ISO-2022-JP, as you would need an additional argument to save the state.


* [Bug libc/6050] iconv(1) buffers all of stdin in memory
       [not found] <bug-6050-131@http.sourceware.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2013-10-07 13:58 ` neleai at seznam dot cz
@ 2013-10-07 14:06 ` bugdal at aerifal dot cx
  2013-10-09  7:50 ` neleai at seznam dot cz
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: bugdal at aerifal dot cx @ 2013-10-07 14:06 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=6050

--- Comment #3 from Rich Felker <bugdal at aerifal dot cx> ---
On Mon, Oct 07, 2013 at 01:58:09PM +0000, neleai at seznam dot cz wrote:
> This would work only for stateless encodings. You cannot do this with
> ISO-2022-JP, as you would need an additional argument to save the state.

No, the state is saved in the conversion descriptor, and iconv(3)
reports to the caller the exact point at which it stopped in the
input, so you simply resume from that point.
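
As a rough illustration of this point, the sketch below feeds one input
string to iconv(3) a few bytes at a time and resumes from wherever the
previous call stopped. The function name convert_incrementally and its
parameters are hypothetical; the only state kept between calls is the
descriptor itself and the input pointer that iconv updates.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert INPUT (LEN bytes) as if it arrived STEP bytes
   at a time.  Returns the number of bytes stored in OUT, or -1 on failure.
   The sketch expects OUT to be large enough for the whole result.  */
static long
convert_incrementally (iconv_t cd, const char *input, size_t len,
                       size_t step, char *out, size_t outsize)
{
  char *inptr = (char *) input;   /* iconv does not modify the input.  */
  char *outptr = out;
  size_t outleft = outsize;
  size_t avail = 0;               /* Bytes of INPUT that have "arrived".  */

  assert (step > 0);
  while (avail < len)
    {
      avail += step < len - avail ? step : len - avail;

      /* Offer everything from the last stopping point to the data seen so
         far.  The shift state lives in CD, so nothing else is carried.  */
      size_t inleft = (size_t) (input + avail - inptr);
      if (iconv (cd, &inptr, &inleft, &outptr, &outleft) == (size_t) -1
          && errno != EINVAL)
        return -1;                /* EILSEQ or E2BIG.  */
      /* On EINVAL, inptr points at the incomplete sequence; the next
         iteration retries from there with more bytes available.  */
    }

  if (inptr != input + len)
    return -1;                    /* Input ended inside a sequence.  */
  if (iconv (cd, NULL, NULL, &outptr, &outleft) == (size_t) -1)
    return -1;
  return (long) (outptr - out);
}
```

With a descriptor opened as iconv_open ("UTF-8", "ISO-2022-JP"), a step of 1
splits every escape sequence and every two-byte character across calls, and
the result should still match a single-call conversion, provided the
ISO-2022-JP converter is installed.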


* [Bug libc/6050] iconv(1) buffers all of stdin in memory
       [not found] <bug-6050-131@http.sourceware.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2013-10-07 14:06 ` bugdal at aerifal dot cx
@ 2013-10-09  7:50 ` neleai at seznam dot cz
  2015-08-27 22:03 ` [Bug locale/6050] " jsm28 at gcc dot gnu.org
  2024-09-20 11:54 ` fweimer at redhat dot com
  6 siblings, 0 replies; 8+ messages in thread
From: neleai at seznam dot cz @ 2013-10-09  7:50 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=6050

Ondrej Bilka <neleai at seznam dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |neleai at seznam dot cz

--- Comment #4 from Ondrej Bilka <neleai at seznam dot cz> ---
Ah, then it is OK. Could you write a patch?


* [Bug locale/6050] iconv(1) buffers all of stdin in memory
       [not found] <bug-6050-131@http.sourceware.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2013-10-09  7:50 ` neleai at seznam dot cz
@ 2015-08-27 22:03 ` jsm28 at gcc dot gnu.org
  2024-09-20 11:54 ` fweimer at redhat dot com
  6 siblings, 0 replies; 8+ messages in thread
From: jsm28 at gcc dot gnu.org @ 2015-08-27 22:03 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=6050

Joseph Myers <jsm28 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|libc                        |locale


* [Bug locale/6050] iconv(1) buffers all of stdin in memory
       [not found] <bug-6050-131@http.sourceware.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2015-08-27 22:03 ` [Bug locale/6050] " jsm28 at gcc dot gnu.org
@ 2024-09-20 11:54 ` fweimer at redhat dot com
  6 siblings, 0 replies; 8+ messages in thread
From: fweimer at redhat dot com @ 2024-09-20 11:54 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=6050

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
                 CC|                            |fweimer at redhat dot com
   Target Milestone|---                         |2.41
           Assignee|unassigned at sourceware    |fweimer at redhat dot com
                   |dot org                     |
             Status|NEW                         |RESOLVED

--- Comment #6 from Florian Weimer <fweimer at redhat dot com> ---
Fixed for 2.41 via:

commit fa1b0d5e9f6e0353e16339430770a7a8824c0468
Author: Florian Weimer <fweimer@redhat.com>
Date:   Fri Sep 20 13:10:54 2024 +0200

    iconv: Input buffering for the iconv program (bug 6050)

    Do not read the entire input file into memory.

    Reviewed-by: DJ Delorie <dj@redhat.com>

