From: "soko246 at gmail dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug locale/24973] iconv encounters segmentation fault when converting 0x00 0xfe in EUC-KR to UTF-8 (CVE-2019-25013)
Date: Thu, 30 Sep 2021 17:45:15 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: locale
X-Bugzilla-Version: 2.30
X-Bugzilla-Severity: normal
X-Bugzilla-Who: soko246 at gmail dot com
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: FIXED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at sourceware dot org
X-Bugzilla-Target-Milestone: 2.33
X-Bugzilla-Flags: security+
X-Bugzilla-Changed-Fields: cc
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: http://sourceware.org/bugzilla/
List-Id: Glibc-bugs mailing list

https://sourceware.org/bugzilla/show_bug.cgi?id=24973

soko246 changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |soko246 at gmail dot com

--- Comment #2 from soko246 ---
Using iconv with the "-c" flag produces corrupted output when the input mixes
characters that *can* and *cannot* be converted. The issue only manifests for
rather large inputs (presumably > 32K).

Run in bash: >export LANG=3DC >perl -E 'say "\x58\xe2\x58\xc3\x92\x58\xe2\x58\x58\xe2\x58\xc3\x92\x58\xe2= \x58\n" x 15000' | iconv -c -f ISO-8859-3 -t UTF-8 | sort | uniq -c Expected output: >15000 X=C3=A2X=EF=BF=BDX=C3=A2XX=C3=A2X=EF=BF=BDX=C3=A2X Actual output: > 1 > 2 XX=C3=A2X=EF=BF=BDX=C3=A2X > 2 X=C3=A2X=EF=BF=BDXX=C3=A2X > 2 X=C3=A2X=EF=BF=BDX=C3=A2X > 1 X=C3=A2X=EF=BF=BDX=C3=A2XX > 2 X=C3=A2X=EF=BF=BDX=C3=A2XX=C3=A2X=EF=BF=BDX=EF=BF=BDX=C3=A2XX=C3=A2X=EF= =BF=BDX=C3=A2X > 14917 X=C3=A2X=EF=BF=BDX=C3=A2XX=C3=A2X=EF=BF=BDX=C3=A2X As can be seen, many lines just disappear (14917+2+1+2+2+2+1 don't sum up to 15000).=20 Actual specific input does not matter, as long as it has a mix of convertab= le and non-convertable characters. Reducing number of input lines to smaller number (ex. 1000) and all works as expected: >1000 X=C3=A2X=EF=BF=BDX=C3=A2XX=C3=A2X=EF=BF=BDX=C3=A2X I tried this for ISO-8859-3 and ISO-8859-8 (same input) with similar (wrong) results. Using piconv (Perl variant of iconv) instead of iconv produces correct resu= lts. --=20 You are receiving this mail because: You are on the CC list for the bug.=
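
For anyone who wants to narrow down where the output first goes wrong, here is a rough sketch in Python. It rebuilds the same input, converts it once with Python's own ISO-8859-3 codec (dropping unconvertible bytes, which is assumed here to match what "iconv -c" should produce), and reports the first byte offset at which the system iconv disagrees. The helper names (make_input, reference_convert, run_iconv, first_divergence) are made up for illustration, and using Python's codec as the reference instead of piconv is an assumption, not something taken from the report.

```python
import shutil
import subprocess

# One input line from the report, without the trailing newline.
LINE = b"\x58\xe2\x58\xc3\x92\x58\xe2\x58\x58\xe2\x58\xc3\x92\x58\xe2\x58"


def make_input(n_lines: int) -> bytes:
    """Repeat the sample line n_lines times, one copy per line."""
    return (LINE + b"\n") * n_lines


def reference_convert(data: bytes, charset: str = "iso8859_3") -> bytes:
    """Convert with Python's codec, silently dropping unconvertible bytes.

    Assumption: this matches the intended behaviour of `iconv -c`.
    """
    return data.decode(charset, errors="ignore").encode("utf-8")


def run_iconv(data: bytes, charset: str = "ISO-8859-3") -> bytes:
    """Pipe data through the system iconv with -c and return its stdout.

    A non-zero exit status is tolerated on purpose (check=False).
    """
    proc = subprocess.run(
        ["iconv", "-c", "-f", charset, "-t", "UTF-8"],
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return proc.stdout


def first_divergence(n_lines: int):
    """Return the first byte offset where iconv and the reference differ, or None."""
    data = make_input(n_lines)
    expected = reference_convert(data)
    actual = run_iconv(data)
    if actual == expected:
        return None
    for offset, (a, b) in enumerate(zip(actual, expected)):
        if a != b:
            return offset
    # One output is a strict prefix of the other.
    return min(len(actual), len(expected))


if __name__ == "__main__":
    if shutil.which("iconv") is None:
        print("iconv not found on PATH")
    else:
        for n in (1000, 15000):
            print(n, "lines: first divergence at byte", first_divergence(n))
```

If the first divergence for the 15000-line case falls close to a multiple of the suspected buffer size, that would support the "> 32K" guess above; the sketch itself does not say anything about the cause.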