* [Bug libc/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16
       [not found] <bug-10460-131@http.sourceware.org/bugzilla/>
@ 2013-07-06 15:41 ` krichter722 at aol dot de
  2013-07-06 15:42 ` krichter722 at aol dot de
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: krichter722 at aol dot de @ 2013-07-06 15:41 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=10460

Kalle Richter <krichter722 at aol dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
                 CC|                            |krichter722 at aol dot de
            Version|2.9                         |2.17
         Resolution|WORKSFORME                  |---

--- Comment #3 from Kalle Richter <krichter722 at aol dot de> ---
This behavior persists in iconv (Ubuntu EGLIBC 2.17-0ubuntu5) 2.17 if the
input file and the output file are the same (maybe only on x86_64). Did you
ever try to reproduce the error with identical input and output files? If I
process the files with the following Python script (using temporary files),
everything works fine:
<code>
#!/usr/bin/python

import os
import shutil
import subprocess
import tempfile

# The intention of this script is to avoid the "bus error" that occurs when
# iconv is run with identical input and output files.

for (dirpath, dirnames, filenames) in os.walk(
        "/home/richter/sources/Aristoteles", topdown=True, onerror=None,
        followlinks=False):
    for filename in filenames:
        # Only convert Java source files.
        if not filename.endswith(".java"):
            continue
        _file = os.path.join(dirpath, filename)
        # mkstemp() returns an open descriptor; close it so it does not leak.
        fd, _tempfile = tempfile.mkstemp()
        os.close(fd)
        # Pass the arguments as a list to avoid shell quoting problems.
        subprocess.check_call(["iconv", "-f", "ISO-8859-15", "-t", "utf-8",
                               _file, "-o", _tempfile])
        # Replace the original only after iconv has succeeded.
        shutil.move(_tempfile, _file)
</code>

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug libc/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16
       [not found] <bug-10460-131@http.sourceware.org/bugzilla/>
  2013-07-06 15:41 ` [Bug libc/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16 krichter722 at aol dot de
@ 2013-07-06 15:42 ` krichter722 at aol dot de
  2013-07-07  3:25 ` bugdal at aerifal dot cx
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: krichter722 at aol dot de @ 2013-07-06 15:42 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=10460

--- Comment #4 from Kalle Richter <krichter722 at aol dot de> ---
The idea is to add a warning, issued when the input and output files are
identical, that running iconv this way is discouraged.
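
A check of this kind could look roughly like the following. This is only a
sketch in Python, not code from iconv; the function names same_file and
warn_if_same are made up for illustration.

```python
import os


def same_file(input_path, output_path):
    """Return True if both paths refer to the same existing file."""
    # os.path.samefile compares device and inode numbers, so symlinks and
    # hard links to the input are detected as well. It raises if a path
    # does not exist, hence the existence check on the output first.
    return os.path.exists(output_path) and os.path.samefile(input_path, output_path)


def warn_if_same(input_path, output_path):
    """Hypothetical helper: return a warning string, or None if the paths differ."""
    if same_file(input_path, output_path):
        return "warning: input and output are the same file: %s" % input_path
    return None
```

Comparing the path strings alone would miss links and relative paths, which
is why the sketch compares the files themselves.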

* [Bug libc/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16
       [not found] <bug-10460-131@http.sourceware.org/bugzilla/>
  2013-07-06 15:41 ` [Bug libc/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16 krichter722 at aol dot de
  2013-07-06 15:42 ` krichter722 at aol dot de
@ 2013-07-07  3:25 ` bugdal at aerifal dot cx
  2013-07-07  9:23 ` [Bug manual/10460] " krichter722 at aol dot de
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: bugdal at aerifal dot cx @ 2013-07-07  3:25 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=10460

Rich Felker <bugdal at aerifal dot cx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugdal at aerifal dot cx

--- Comment #5 from Rich Felker <bugdal at aerifal dot cx> ---
None of the standard utilities work, or can be expected to work, when the input
and output files are the same. As far as I know this is well documented in
POSIX. If it's not documented in the manual for the GNU iconv utility, this
should probably be reclassified as a manual bug, or a new bug filed against the
manual.
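
A simplified illustration of why this fails, sketched in Python: a utility
that opens its output for writing before it reads its input truncates the
input when both are the same file. The function naive_convert is made up for
this example and is not how iconv is implemented.

```python
import os
import tempfile


def naive_convert(input_path, output_path):
    """Mimic a streaming utility: open the output first, then read the input."""
    with open(output_path, "wb") as out:   # truncates the file immediately
        with open(input_path, "rb") as src:
            data = src.read()              # already empty if the paths are equal
        out.write(data.upper())
    return len(data)


# Use the same path for input and output and observe the data loss.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)
bytes_seen = naive_convert(path, path)
remaining = os.path.getsize(path)
os.remove(path)
```

Here bytes_seen and remaining are both 0: the content is gone before a single
byte of it has been read.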

* [Bug manual/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16
       [not found] <bug-10460-131@http.sourceware.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2013-07-07  3:25 ` bugdal at aerifal dot cx
@ 2013-07-07  9:23 ` krichter722 at aol dot de
  2013-10-20 21:22 ` neleai at seznam dot cz
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 7+ messages in thread
From: krichter722 at aol dot de @ 2013-07-07  9:23 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=10460

Kalle Richter <krichter722 at aol dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mtk.manpages at gmail dot com,
                   |                            |roland at gnu dot org
          Component|libc                        |manual

--- Comment #6 from Kalle Richter <krichter722 at aol dot de> ---
Either the information that input and output have to be different, or a link
to the general POSIX information, should be added.
(As I have to choose one of duplicate, fixed, invalid, wontfix, worksforme
and moved, none of which matches reopened, I chose duplicate for no
particular reason.)

* [Bug manual/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16
       [not found] <bug-10460-131@http.sourceware.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2013-07-07  9:23 ` [Bug manual/10460] " krichter722 at aol dot de
@ 2013-10-20 21:22 ` neleai at seznam dot cz
  2013-10-21  1:38 ` bugdal at aerifal dot cx
  2014-07-01  7:29 ` fweimer at redhat dot com
  6 siblings, 0 replies; 7+ messages in thread
From: neleai at seznam dot cz @ 2013-10-20 21:22 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=10460

Ondrej Bilka <neleai at seznam dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |neleai at seznam dot cz
           Severity|normal                      |enhancement

--- Comment #7 from Ondrej Bilka <neleai at seznam dot cz> ---
This asks for new functionality; the easiest way to resolve it is to write a
patch yourself.

The relevant part is iconv/iconv_prog.c, around line 300. One solution would
be, if the input is the same as the output, to first load the entire input
file into memory and then proceed as in the mmap case.
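
The idea can be sketched as follows. This is a Python illustration only, not
code from iconv_prog.c, and the function name convert_in_place is made up.
It uses Python's own codecs in place of iconv's conversion.

```python
def convert_in_place(path, from_enc, to_enc):
    """Read the whole file into memory, then overwrite it with the converted text."""
    # Read everything before the file is opened for writing, so truncating
    # it afterwards cannot lose input.
    with open(path, "rb") as src:
        raw = src.read()
    converted = raw.decode(from_enc).encode(to_enc)
    # Only now reopen the same path for writing.
    with open(path, "wb") as out:
        out.write(converted)
    return len(converted)
```

Note that the whole input has to fit into memory, and that an interruption
during the write still leaves a damaged file; writing to a temporary file and
renaming it avoids both.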

* [Bug manual/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16
       [not found] <bug-10460-131@http.sourceware.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2013-10-20 21:22 ` neleai at seznam dot cz
@ 2013-10-21  1:38 ` bugdal at aerifal dot cx
  2014-07-01  7:29 ` fweimer at redhat dot com
  6 siblings, 0 replies; 7+ messages in thread
From: bugdal at aerifal dot cx @ 2013-10-21  1:38 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=10460

--- Comment #8 from Rich Felker <bugdal at aerifal dot cx> ---
There's another open bug, #6050, which is to be fixed by removing the useless
and harmful attempt to read the whole input file into memory, so I don't think
adding more cases to do this would be a good idea. Using the same file for
input and output is just ALWAYS wrong for all the standard utilities; iconv is
no exception.

* [Bug manual/10460] "iconv" corrupts all files over 17 KB from UTF8 to UTF16
       [not found] <bug-10460-131@http.sourceware.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2013-10-21  1:38 ` bugdal at aerifal dot cx
@ 2014-07-01  7:29 ` fweimer at redhat dot com
  6 siblings, 0 replies; 7+ messages in thread
From: fweimer at redhat dot com @ 2014-07-01  7:29 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=10460

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com
              Flags|                            |security-

--- Comment #9 from Florian Weimer <fweimer at redhat dot com> ---
(In reply to Rich Felker from comment #8)
> There's another open bug, #6050, which is to be fixed by removing the
> useless and harmful attempt to read the whole input file into memory, so I
> don't think adding more cases to do this would be a good idea. Using the
> same file for input and output is just ALWAYS wrong for all the standard
> utilities; iconv is no exception.

There are exceptions, sort being one of them.
