Date: Thu, 14 May 2009 17:26:00 -0000
From: Corinna Vinschen
To: newlib@sourceware.org, cygwin@cygwin.com
Subject: Re: [Fwd: [1.7] wcwidth failing configure tests]
Message-ID: <20090514172625.GA20688@calimero.vinschen.de>
Reply-To: newlib@sourceware.org
In-Reply-To: <3f0ad08d0905140858j17c7b374paa649f18ef18178d@mail.gmail.com>
References: <20090512165404.GW21324@calimero.vinschen.de>
 <416096c60905120956n5521929bm69586f5e6325a994@mail.gmail.com>
 <20090512173153.GY21324@calimero.vinschen.de>
 <3f0ad08d0905140858j17c7b374paa649f18ef18178d@mail.gmail.com>

On May 15 00:58, IWAMURO Motonori wrote:
> 2009/5/13 Corinna Vinschen:
> >> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
> >
> > This looks nice.
>
> Do you import Markus Kuhn's wcwidth implementation?
>
> >> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> >> category of characters, which consists of things like Greek and
> >> Cyrillic letters as well as line drawing symbols. Those have a width
> >> of 1 in Western use, yet with CJK fonts they have a width of 2.
> >> That's why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
> >
> > We should use the standard variation alone, imho.
>
> I don't think so.
>
> 1) It is very very inconvenient for me :-)
> (Now, I apply the local patch of CJK width support to cygwin1.dll in
> my environment.)
>
> 2) Unicode Standard Annex #11
> http://www.unicode.org/unicode/reports/tr11/ recommends:
>
> 5 Recommendations
> (snip)
> > When processing or displaying data
> (snip)
> > Ambiguous characters behave like wide or narrow characters depending
> > on the context (language tag, script identification, associated
> > font, source of data, or explicit markup; all can provide the
> > context). If the context cannot be established reliably, they should
> > be treated as narrow characters by default.
>
> The recommendation is independent of legacy encoding.
>
> I think that a new locale category that specifies the "context" is necessary.
> Because the "context" influences only the display or text layout.
>
> However, there is no such standard now.
>
> Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
> is 'ja', 'ko', 'vi' or 'zh'.

That would be fine with me, but tests for the actual language are not
used anywhere in newlib, so that's something very new.

Can we check in my patch for the time being and extend it with the CJK
variation later? I will not be available for the next two weeks, but
I'd be glad if at least the default variation can go in so I can create
another Cygwin test release before I'm offline.


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
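
A rough sketch of the language test proposed in the quoted message, for
illustration only. It assumes POSIX-style locale names such as
"ja_JP.UTF-8". The names lang_wants_cjk_width() and ambiguous_char_width()
are hypothetical; they are not existing newlib functions, and this is not
the patch under discussion:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of newlib: return 1 if the language part
   of a locale name is one of the codes named in the proposal
   (ja, ko, vi, zh), otherwise 0. */
static int
lang_wants_cjk_width (const char *locale)
{
  static const char *const cjk_langs[] = { "ja", "ko", "vi", "zh" };
  size_t i;

  if (locale == NULL)
    return 0;
  for (i = 0; i < sizeof cjk_langs / sizeof cjk_langs[0]; i++)
    {
      /* Match the two-letter code, then require that the language part
         ends there: end of string, '_' (territory), '.' (charset) or
         '@' (modifier).  This keeps "jam_JM" from matching "ja". */
      if (strncmp (locale, cjk_langs[i], 2) == 0
          && (locale[2] == '\0' || locale[2] == '_'
              || locale[2] == '.' || locale[2] == '@'))
        return 1;
    }
  return 0;
}

/* Hypothetical helper: column width to use for an East Asian Ambiguous
   character.  Narrow (1) is the default, wide (2) only for the CJK
   languages listed above. */
static int
ambiguous_char_width (const char *locale)
{
  return lang_wants_cjk_width (locale) ? 2 : 1;
}
```

A caller could pass the string returned by setlocale (LC_CTYPE, NULL) and
route to the *_cjk() variant when lang_wants_cjk_width() returns nonzero.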