public inbox for cygwin@cygwin.com
* [Fwd: [1.7] wcwidth failing configure tests]
@ 2009-05-12 16:54 Corinna Vinschen
  2009-05-12 16:56 ` Andy Koppe
From: Corinna Vinschen @ 2009-05-12 16:54 UTC (permalink / raw)
  To: newlib; +Cc: cygwin

Forwarded to newlib.

----- Forwarded message from Eric Blake -----
> Date: Tue, 12 May 2009 16:02:04 +0000 (UTC)
> From: Eric Blake
> Subject:  [1.7] wcwidth failing configure tests
> To: cygwin AT cygwin DOT com
> 
> I noticed this failure in various configure scripts (findutils, coreutils, ...):
> 
> checking whether wcwidth works reasonably in UTF-8 locales... no
> 
> I've reduced it to a STC:
> 
> #include <locale.h>
> #include <wchar.h>
> int main ()
> {
>   int i = 0;
>   if (setlocale (LC_ALL, "fr_FR.UTF-8") != NULL)
>     {
>       if (wcwidth (0x0301) > 0)
>         i |= 1;
>       if (wcwidth (0x200B) > 0)
>         i |= 2;
>     }
>   return i;
> }
> 
> The return value should be 0 but is coming back as 3; 0x0301 is a combining 
> mark which should occupy no space on its own, and 0x200b is a 0-width space, 
> according to Unicode 5.1 (and earlier, to some extent).  And that probably 
> means that other places within wcwidth() are broken.
----- End forwarded message -----

wcwidth returns 1 whenever iswprint returns true.  A quick debugging
session shows that the entire range 0x0300..0x034f is marked as
printable in the u3 array in libc/ctype/utf8print.h.  All characters in
that range are combining characters, which are printable but have
zero width.

Likewise, 0x200b..0x200d are all zero-width characters, but all three
are also printable.

Scanning the Unicode 5.1 standard, I see several ranges of characters
which are printable but have zero width:

0300..036f
0483..0489
200b..200f
20d0..20ea
3099..309a
fe20..fe23 (not sure about these.  Each of them is one half of a combining
	    mark which doesn't make sense on its own, afaics)
feff
and a couple of musical symbols in the 0x1d1xx range

How can we fix this problem?  Should we hardcode a check for the above
character values in wcwidth?
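
For illustration only, a hardcoded check along those lines might look
roughly like the sketch below.  This is not the newlib code and not a
proposed patch.  The names zw_range, zero_width, is_zero_width and
sketch_wcwidth are hypothetical, the table contains only the ranges
listed above (the musical symbols are left out because no exact values
are given for them), and the "printable" argument stands in for
whatever iswprint would return for the character.

```c
#include <stddef.h>

/* Inclusive code point ranges, copied from the list above and kept in
   ascending order so that a binary search works.  The musical symbols
   in the 0x1d1xx range are omitted: the list gives no exact values. */
struct zw_range
{
  unsigned long first;
  unsigned long last;
};

static const struct zw_range zero_width[] = {
  { 0x0300, 0x036f },
  { 0x0483, 0x0489 },
  { 0x200b, 0x200f },
  { 0x20d0, 0x20ea },
  { 0x3099, 0x309a },
  { 0xfe20, 0xfe23 },	/* uncertain, see the note in the list */
  { 0xfeff, 0xfeff },
};

/* Hypothetical helper: return 1 if ucs lies in one of the ranges
   above, 0 otherwise. */
int
is_zero_width (unsigned long ucs)
{
  size_t lo = 0;
  size_t hi = sizeof zero_width / sizeof zero_width[0];

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (ucs < zero_width[mid].first)
	hi = mid;
      else if (ucs > zero_width[mid].last)
	lo = mid + 1;
      else
	return 1;
    }
  return 0;
}

/* Hypothetical wrapper: check the zero-width table first, then fall
   back to the "1 if printable" rule described above.  Returning -1 for
   non-printable characters is an assumption of this sketch. */
int
sketch_wcwidth (unsigned long ucs, int printable)
{
  if (is_zero_width (ucs))
    return 0;
  return printable ? 1 : -1;
}
```

With such a check in place, sketch_wcwidth (0x0301, 1) and
sketch_wcwidth (0x200b, 1) would both yield 0, which is what the test
case in the forwarded message expects.  A table like this still has to
be kept in sync with the Unicode data by hand, which is one reason the
question of regenerating the utf8*.h files matters.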

And here's another question.  The utf8*.h files claim they have been
generated from the unicode.txt file of the Unicode 3.2 standard.  Do we
have the script which generated the utf8*.h files?  Can we regenerate
the files to match the current Unicode 5.1 standard?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat




Thread overview: 36+ messages
2009-05-12 16:54 [Fwd: [1.7] wcwidth failing configure tests] Corinna Vinschen
2009-05-12 16:56 ` Andy Koppe
2009-05-12 17:32   ` Corinna Vinschen
2009-05-13 19:04     ` Andy Koppe
2009-05-13 19:40       ` Corinna Vinschen
2009-05-13 19:55         ` Andy Koppe
2009-05-14 15:58     ` IWAMURO Motonori
2009-05-14 17:26       ` Corinna Vinschen
2009-05-14 21:51         ` Jeff Johnston
2009-05-15 11:43           ` Corinna Vinschen
2009-05-20 16:52       ` Thomas Wolff
2009-05-20 19:41         ` IWAMURO Motonori
2009-06-05 16:25         ` Thomas Wolff
2009-06-06  7:24           ` Andy Koppe
2009-06-06 12:53             ` IWAMURO Motonori
2009-06-06  9:31           ` Corinna Vinschen
2009-06-06  9:56             ` Andy Koppe
2009-06-06 13:06             ` IWAMURO Motonori
2009-06-06 12:22           ` IWAMURO Motonori
     [not found]           ` <3f0ad08d0906060242t275a78e7tb9913bf78d1c5e83@mail.gmail.com>
2009-06-06  9:46             ` IWAMURO Motonori
2009-06-12 18:56             ` Thomas Wolff
2009-06-12 19:12               ` Corinna Vinschen
2009-06-15  0:30               ` IWAMURO Motonori
2009-06-15  4:34                 ` IWAMURO Motonori
2009-06-15 11:43                   ` [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests]) Corinna Vinschen
2009-06-15 15:58                     ` IWAMURO Motonori
2009-06-15 17:08                       ` Corinna Vinschen
2009-06-15 17:14                         ` IWAMURO Motonori
2009-06-18 15:57                         ` Thomas.Wolff
2009-06-18 16:49                           ` Corinna Vinschen
2009-06-19  0:08                           ` Andy Koppe
2009-06-19 14:45                           ` Thomas Wolff
2009-06-19 14:49                             ` Corinna Vinschen
2009-06-27 22:03                     ` Andy Koppe
2009-06-28  8:18                       ` IWAMURO Motonori
2009-05-26 16:46       ` [Fwd: [1.7] wcwidth failing configure tests] IWAMURO Motonori
