From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 16394 invoked by alias); 15 Dec 2002 03:26:01 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 16380 invoked by uid 71); 15 Dec 2002 03:26:01 -0000
Date: Sat, 14 Dec 2002 19:26:00 -0000
Message-ID: <20021215032601.16379.qmail@sources.redhat.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org
From: starner@okstate.edu
Subject: Re: ada/6726: -gnaty miscounts characters in UTF-8 source text
Reply-To: starner@okstate.edu
X-SW-Source: 2002-12/txt/msg00809.txt.bz2

The following reply was made to PR ada/6726; it has been noted by GNATS.

From: starner@okstate.edu
To: bosch@gcc.gnu.org, gcc-bugs@gcc.gnu.org, gcc-prs@gcc.gnu.org,
    nobody@gcc.gnu.org, starner@okstate.edu, gcc-gnats@gcc.gnu.org
Cc:
Subject: Re: ada/6726: -gnaty miscounts characters in UTF-8 source text
Date: Sat, 14 Dec 2002 21:19:12 -0600 (CST)

>State-Changed-From-To: open->closed
[...]
> The line length limitation of -gnaty switch is intended to make sure
> that all lines fit on a regular terminal screen, so that the source
> can be viewed without problems on all screens. Another issue is that
> not all wide characters necessarily are the same width: many Asian
> fixed-spacing terminal fonts use double-width characters for certain
> glyphs.

You've stated the problems involved in getting this to work completely
right. What about actually fixing the bug in some way? You could:

* Drop in a wcwidth implementation. Markus Kuhn has a wcwidth
  implementation in a page or two of code.

* Just call all non-ASCII Unicode characters single width or double
  width. It's a better approximation than the triple width, which is
  what most UTF-8 characters are counted as and which is never
  actually true.

* Disable character counts for lines on which non-ASCII characters
  appear, possibly with a warning. It's ugly, and the warning is
  probably overkill, but it works.

* Document it. That would be a good start. The current documentation
  says:

    If the ^letter m^word LINE_LENGTH^ appears in the string after
    @option{-gnaty} then the length of source lines must not exceed
    79 characters, including any trailing blanks. The value of 79
    allows convenient display on an 80 character wide device or
    window, allowing for possible special treatment of 80 character
    lines.

If you choose to interpret this as a documentation error instead of a
code error, then so be it. But it's clearly one or the other or both.
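
To make the difference between these counts concrete, here is a minimal
Python sketch. It is not GNAT code and not Markus Kuhn's wcwidth; the
function names are made up for illustration, and the assumption that the
current check effectively counts UTF-8 bytes is only inferred from the
"triple width" behaviour described above. The width function is a rough
approximation built on the East Asian Width property from Python's
standard unicodedata module.

```python
import unicodedata

MAX_COLUMNS = 79  # limit quoted from the -gnaty documentation


def byte_length(line: str) -> int:
    """Length in UTF-8 bytes (assumed to be what the check counts today)."""
    return len(line.encode("utf-8"))


def code_point_length(line: str) -> int:
    """Every character counted as one column ("single width" approximation)."""
    return len(line)


def display_width(line: str) -> int:
    """Rough terminal column count.

    East Asian wide/fullwidth characters take two columns, combining
    marks take none, and everything else takes one.
    """
    width = 0
    for ch in line:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn over the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def fits(line: str, measure=display_width) -> bool:
    """True if the line is within MAX_COLUMNS under the given measure."""
    return measure(line) <= MAX_COLUMNS


# A line of 30 CJK characters: 90 bytes, 30 code points, 60 columns.
cjk_line = "語" * 30
```

With this sketch, `fits(cjk_line)` is true under the display-width
measure (60 columns) but `fits(cjk_line, byte_length)` is false (90
bytes), which is the kind of mismatch the PR complains about.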