From: "glibcbugz at ghalkes dot nl"
To: glibc-bugs@sources.redhat.com
Subject: [Bug libc/12830] New: ISO-2022-JP-2 maps C1 control characters incorrectly
Date: Wed, 01 Jun 2011 07:42:00 -0000

http://sourceware.org/bugzilla/show_bug.cgi?id=12830

           Summary: ISO-2022-JP-2 maps C1 control characters incorrectly
           Product: glibc
           Version: 2.13
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: glibcbugz@ghalkes.nl

In the ISO-2022-JP-2 converter, the C1 control codes (U+0080-U+009F) are encoded as 1B 2E 41 1B 4E [00 - 1F] (i.e., load ISO-8859-1 into the G2 graphics set, use single shift 2 to select G2, and encode the byte [00 - 1F]).
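To make the byte layout described above concrete, here is a small sketch that builds the sequence as I read it from the converter's output. It is not taken from the glibc source; the function name c1_as_reported is made up for illustration.

```python
ESC = 0x1B

def c1_as_reported(code_point: int) -> bytes:
    """Build the byte sequence this report describes for a C1 control.

    Hypothetical helper for illustration only, not glibc code.
    """
    if not 0x80 <= code_point <= 0x9F:
        raise ValueError("not a C1 control code point")
    designate_g2 = bytes([ESC, 0x2E, 0x41])  # ESC . A: ISO-8859-1 high part into G2
    single_shift_2 = bytes([ESC, 0x4E])      # ESC N: take the next byte from G2
    payload = bytes([code_point - 0x80])     # 00-1F, which lies outside 20-7F
    return designate_g2 + single_shift_2 + payload

print(c1_as_reported(0x85).hex(" "))  # 1b 2e 41 1b 4e 05
```

The final byte falls in 00-1F, which is the part of the sequence this report questions.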
However, if I understand the standard correctly, switching to the G2 set _only_ changes the mapping of the 96 characters in the range 20-7F (or the 94 characters in the range 21-7E if a smaller set is used). The control characters are unaffected. To access the C1 control set, one should instead use the two-byte sequence 1B [40 - 5F]. This is actually already done for the "single shift 2" control (U+008E) in the sequence above, which is encoded as 1B 4E.
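As a sketch of the encoding I would expect instead, assuming my reading of the standard is right: each C1 control U+0080-U+009F becomes ESC followed by a byte in 40-5F, i.e. the code point minus 0x40. The function name c1_as_esc_fe is made up for illustration and is not part of glibc.

```python
ESC = 0x1B

def c1_as_esc_fe(code_point: int) -> bytes:
    """Encode a C1 control as the two-byte sequence 1B [40 - 5F].

    Hypothetical helper for illustration only, not glibc code.
    """
    if not 0x80 <= code_point <= 0x9F:
        raise ValueError("not a C1 control code point")
    # U+0080 maps to final byte 40, U+009F maps to final byte 5F.
    return bytes([ESC, code_point - 0x40])

print(c1_as_esc_fe(0x8E).hex(" "))  # 1b 4e
```

For U+008E this gives 1B 4E, the same bytes the converter already emits for single shift 2.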