[Fwd: [1.7] wcwidth failing configure tests]

public inbox for cygwin@cygwin.com

* [Fwd: [1.7] wcwidth failing configure tests]
@ 2009-05-12 16:54 Corinna Vinschen
  2009-05-12 16:56 ` Andy Koppe
  0 siblings, 1 reply; 36+ messages in thread
From: Corinna Vinschen @ 2009-05-12 16:54 UTC (permalink / raw)
  To: newlib; +Cc: cygwin

Forwarded to newlib.

----- Forwarded message from Eric Blake -----
> Date: Tue, 12 May 2009 16:02:04 +0000 (UTC)
> From: Eric Blake
> Subject:  [1.7] wcwidth failing configure tests
> To: cygwin AT cygwin DOT com
> 
> I noticed this failure in various configure scripts (findutils, coreutils, ...):
> 
> checking whether wcwidth works reasonably in UTF-8 locales... no
> 
> I've reduced it to a STC:
> 
> #include <locale.h>
> #include <wchar.h>
> int main ()
> {
>   int i = 0;
>   if (setlocale (LC_ALL, "fr_FR.UTF-8") != NULL)
>     {
>       if (wcwidth (0x0301) > 0)
>         i |= 1;
>       if (wcwidth (0x200B) > 0)
>         i |= 2;
>     }
>   return i;
> }
> 
> The return value should be 0 but is coming back as 3; 0x0301 is a combining 
> mark which should occupy no space on its own, and 0x200b is a 0-width space, 
> according to Unicode 5.1 (and earlier, to some extent).  And that probably 
> means that other places within wcwidth() are broken.
----- End forwarded message -----

wcwidth returns 1 if iswprint returns true.  I had a quick debug attempt
and it turns out that the entire range 0x0300..0x034f is marked as
printable in the u3 array in libc/ctype/utf8print.h.  All characters in
that range are combining characters, which are printable but have zero
width.

200b..200d are all three zero-width characters but all three are also
printable.

Scanning the Unicode 5.1 standard, I see a couple of these characters,
which are printable but have zero width:

0300..036f
0483..0489
200b..200f
20d0..20ea
3099..309a
fe20..fe23 (not sure about them.  Each of them is the half of a full combined
	    char which doesn't make sense alone, afaics)
feff
and a couple of musical symbols in the 0x1d1xx range

How can we fix this problem?  Should we hardcode a check for the above
character values in wcwidth?
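
A minimal sketch of what such a hardcoded check could look like, assuming a sorted table of inclusive ranges and a binary search.  The table only holds the ranges listed above (the musical symbols are left out), and the names zw_range, zero_width and is_zero_width are made up for illustration; none of this is existing newlib code.

```c
#include <stddef.h>

/* Inclusive code point ranges from the list above.  Illustrative and
   incomplete; the table must stay sorted for the binary search. */
struct zw_range { unsigned long first, last; };

static const struct zw_range zero_width[] = {
  { 0x0300, 0x036f }, { 0x0483, 0x0489 }, { 0x200b, 0x200f },
  { 0x20d0, 0x20ea }, { 0x3099, 0x309a }, { 0xfe20, 0xfe23 },
  { 0xfeff, 0xfeff },
};

/* Return 1 if c falls into one of the zero-width ranges, else 0. */
int is_zero_width (unsigned long c)
{
  size_t lo = 0;
  size_t hi = sizeof zero_width / sizeof zero_width[0];

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (c < zero_width[mid].first)
        hi = mid;
      else if (c > zero_width[mid].last)
        lo = mid + 1;
      else
        return 1;
    }
  return 0;
}
```

wcwidth would consult is_zero_width first and return 0 on a hit, falling back to the iswprint-based logic otherwise.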

And here's another question.  The utf8*.h files claim they have been
generated from the unicode.txt file of the Unicode 3.2 standard.  Do we
have the script which generated the utf8*.h files?  Can we regenerate
the files to match the current Unicode 5.1 standard?

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-12 16:54 [Fwd: [1.7] wcwidth failing configure tests] Corinna Vinschen
@ 2009-05-12 16:56 ` Andy Koppe
  2009-05-12 17:32   ` Corinna Vinschen
  0 siblings, 1 reply; 36+ messages in thread
From: Andy Koppe @ 2009-05-12 16:56 UTC (permalink / raw)
  To: newlib, cygwin

> And here's another question.  The utf8*.h files claim they have been
> generated from the unicode.txt file of the Unicode 3.2 standard.  Do we
> have the script which generated the utf8*.h files?  Can we regenerate
> the files to match the current Unicode 5.1 standard?

There's Markus Kuhn's wcwidth implementation, which says it's based on
Unicode 5.0:

http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
category of characters, which consists of things like Greek and
Cyrillic letters as well as line drawing symbols. Those have a width
of 1 in Western use, yet with CJK fonts they have a width of 2. That's
why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.

Andy


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-12 16:56 ` Andy Koppe
@ 2009-05-12 17:32   ` Corinna Vinschen
  2009-05-13 19:04     ` Andy Koppe
  2009-05-14 15:58     ` IWAMURO Motonori
  0 siblings, 2 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-05-12 17:32 UTC (permalink / raw)
  To: newlib, cygwin

On May 12 17:56, Andy Koppe wrote:
> > And here's another question.  The utf8*.h files claim they have been
> > generated from the unicode.txt file of the Unicode 3.2 standard.  Do we
> > have the script which generated the utf8*.h files?  Can we regenerate
> > the files to match the current Unicode 5.1 standard?
> 
> There's Markus Kuhn's wcwidth implementation, which says it's based on
> Unicode 5.0:
> 
> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

This looks nice.

> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> category of characters, which consists of things like Greek and
> Cyrillic letters as well as line drawing symbols. Those have a width
> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.

We should use the standard variation alone, imho.

And we need some workaround for UTF-16 systems like Cygwin.
Unfortunately, surrogate pairs only work well as part of a string, not
as standalone chars.  So wcwidth would return -1 for each single char,
but wcswidth could be tweaked to handle them gracefully.
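
A rough sketch of how a wcswidth-like function could treat surrogate pairs on a UTF-16 system: combine each pair into one code point before asking for its width, and fail on a lone surrogate.  The names utf16_swidth and toy_width are hypothetical, and toy_width is only a stand-in for a real per-code-point width table.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in width function (assumption): plane 2 and above counts as
   wide, everything else as narrow.  A real table would go here. */
int toy_width (uint32_t cp)
{
  return cp >= 0x20000 ? 2 : 1;
}

/* Sum the column widths of up to n UTF-16 units, stopping at a NUL.
   Returns -1 for a lone surrogate or a non-printable code point. */
int utf16_swidth (const uint16_t *s, size_t n, int (*width) (uint32_t))
{
  int total = 0;
  size_t i;

  for (i = 0; i < n && s[i] != 0; i++)
    {
      uint32_t cp = s[i];
      int w;

      if (cp >= 0xD800 && cp <= 0xDBFF)
        {
          /* High surrogate: a low surrogate must follow. */
          if (i + 1 >= n || s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
            return -1;
          cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
          i++;
        }
      else if (cp >= 0xDC00 && cp <= 0xDFFF)
        return -1;              /* low surrogate without a high one */

      w = width (cp);
      if (w < 0)
        return -1;
      total += w;
    }
  return total;
}
```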


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-12 17:32   ` Corinna Vinschen
@ 2009-05-13 19:04     ` Andy Koppe
  2009-05-13 19:40       ` Corinna Vinschen
  2009-05-14 15:58     ` IWAMURO Motonori
  1 sibling, 1 reply; 36+ messages in thread
From: Andy Koppe @ 2009-05-13 19:04 UTC (permalink / raw)
  To: newlib, cygwin

2009/5/12 Corinna Vinschen:
>> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
>> category of characters, which consists of things like Greek and
>> Cyrillic letters as well as line drawing symbols. Those have a width
>> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
>> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
>
> We should use the standard variation alone, imho.

I'm not sure that CJK users would be happy with that. See MinTTY issue
88 for my misguided attempts to dismiss this as a legacy issue:
http://code.google.com/p/mintty/issues/detail?id=88

In comment 8 on that, "deenheart" mentioned that he was working on a
fix for wcwidth(). I don't know what he had in mind, but I'd suspect
something based on an environment variable setting.

> And we need some workaround for UTF-16 systems like Cygwin.
> Unfortunately, surrogate pairs only work well as part of a string, not
> as standalone chars.  So wcwidth would return -1 for each single char,
> but wcswidth could be tweaked to handle them gracefully.

Looking at the ranges in wcwidth.c, it might be possible to decide the
width of a surrogate pair based on the high surrogate only, and then
treat the low surrogate as a combining character with length 0.

Andy


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-13 19:04     ` Andy Koppe
@ 2009-05-13 19:40       ` Corinna Vinschen
  2009-05-13 19:55         ` Andy Koppe
  0 siblings, 1 reply; 36+ messages in thread
From: Corinna Vinschen @ 2009-05-13 19:40 UTC (permalink / raw)
  To: newlib, cygwin

On May 13 20:04, Andy Koppe wrote:
> 2009/5/12 Corinna Vinschen:
> >> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> >> category of characters, which consists of things like Greek and
> >> Cyrillic letters as well as line drawing symbols. Those have a width
> >> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
> >> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
> >
> > We should use the standard variation alone, imho.
> 
> I'm not sure that CJK users would be happy with that. See MinTTY issue
> 88 for my misguided attempts to dismiss this as a legacy issue:
> http://code.google.com/p/mintty/issues/detail?id=88
> 
> In comment 8 on that, "deenheart" mentioned that he was working on a
> fix for wcwidth(). I don't know what he had in mind, but I'd suspect
> something based on an environment variable setting.
> 
> > And we need some workaround for UTF-16 systems like Cygwin.
> > Unfortunately, surrogate pairs only work well as part of a string, not
> > as standalone chars.  So wcwidth would return -1 for each single char,
> > but wcswidth could be tweaked to handle them gracefully.
> 
> Looking at the ranges in wcwidth.c, it might be possible to decide the
> width of a surrogate pair based on the high surrogate only, and then
> treat the low surrogate as a combining character with length 0.

How should that work?  The first half of the surrogate pair has not
enough information to decide that.  For instance, take the ranges
{ 0x10A01, 0x10A03 }, { 0x10A05, 0x10A06 }.  The information about the low
10 bits of the Unicode value is in the second half of the pair.  From
the first half you don't know if the char is perhaps the 0x10A04 value
or one of the other.  So you need both halves to make a decision.

A surrogate pair half alone is also always invalid.  That's something
you can't handle in wcwidth.
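
To make the point concrete, here is a small sketch of the standard UTF-16 encoding arithmetic.  The function names are made up for illustration.  0x10A03 (inside a combining range) and 0x10A04 (outside of it) get the same high surrogate, 0xD802, and differ only in the low surrogate.

```c
#include <stdint.h>

/* High (first) half of the surrogate pair for cp >= 0x10000:
   carries the upper 10 bits of cp - 0x10000. */
uint16_t high_surrogate (uint32_t cp)
{
  return (uint16_t) (0xD800 + ((cp - 0x10000) >> 10));
}

/* Low (second) half: carries the lower 10 bits of cp - 0x10000. */
uint16_t low_surrogate (uint32_t cp)
{
  return (uint16_t) (0xDC00 + ((cp - 0x10000) & 0x3FF));
}
```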


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-13 19:40       ` Corinna Vinschen
@ 2009-05-13 19:55         ` Andy Koppe
  0 siblings, 0 replies; 36+ messages in thread
From: Andy Koppe @ 2009-05-13 19:55 UTC (permalink / raw)
  To: newlib, cygwin

> How should that work?  The first half of the surrogate pair has not
> enough information to decide that.  For instance, take the ranges
> { 0x10A01, 0x10A03 }, { 0x10A05, 0x10A06 }.  The information about the low
> 10 bits of the Unicode value is in the second half of the pair.  From
> the first half you don't know if the char is perhaps the 0x10A04 value
> or one of the other.  So you need both halves to make a decision.

You're right. I'd somehow overlooked the end of the combining[] array.

Andy


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-12 17:32   ` Corinna Vinschen
  2009-05-13 19:04     ` Andy Koppe
@ 2009-05-14 15:58     ` IWAMURO Motonori
  2009-05-14 17:26       ` Corinna Vinschen
                         ` (2 more replies)
  1 sibling, 3 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-05-14 15:58 UTC (permalink / raw)
  To: newlib, cygwin

2009/5/13 Corinna Vinschen <vinschen@redhat.com>:
>> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
>
> This looks nice.

Do you import Markus Kuhn's wcwidth implementation?

>> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
>> category of characters, which consists of things like Greek and
>> Cyrillic letters as well as line drawing symbols. Those have a width
>> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
>> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
>
> We should use the standard variation alone, imho.

I don't think so.

1) It is very very inconvenient for me :-)
(Now, I apply the local patch of CJK width support to cygwin1.dll in
my environment.)

2) Unicode Standard Annex #11
http://www.unicode.org/unicode/reports/tr11/ recommends:
> 5 Recommendations
(snip)
> When processing or displaying data
(snip)
> Ambiguous characters behave like wide or narrow characters depending
> on the context (language tag, script identification, associated
> font, source of data, or explicit markup; all can provide the
> context). If the context cannot be established reliably, they should
> be treated as narrow characters by default.

The recommendation is independent of legacy encoding.

I think that a new locale category that specifies the "context" is necessary.
Because the "context" influences only the display or text layout.

However, there is no such standard now.

Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
is 'ja', 'ko', 'vi' or 'zh'.
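
A minimal sketch of the proposed test, assuming the locale name is available as a string such as "ja_JP.UTF-8".  The function name locale_wants_cjk_width is hypothetical and nothing like it exists in newlib.

```c
#include <string.h>

/* Return 1 if the language part of the locale name is one of
   ja, ko, vi or zh, else 0. */
int locale_wants_cjk_width (const char *locale)
{
  static const char *const langs[] = { "ja", "ko", "vi", "zh" };
  size_t i;

  if (locale == NULL || strlen (locale) < 2)
    return 0;
  /* The language part must end at '_', '.', '@' or the end of the
     string, so that e.g. a three-letter language code is rejected. */
  if (locale[2] != '\0' && strchr ("_.@", locale[2]) == NULL)
    return 0;
  for (i = 0; i < sizeof langs / sizeof langs[0]; i++)
    if (strncmp (locale, langs[i], 2) == 0)
      return 1;
  return 0;
}
```

The current name could be obtained with setlocale (LC_CTYPE, NULL), and the result would then select between the normal and the *_cjk() width functions.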
-- 
IWAMURO Motnori <http://vmi.jp/>


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-14 15:58     ` IWAMURO Motonori
@ 2009-05-14 17:26       ` Corinna Vinschen
  2009-05-14 21:51         ` Jeff Johnston
  2009-05-20 16:52       ` Thomas Wolff
  2009-05-26 16:46       ` IWAMURO Motonori
  2 siblings, 1 reply; 36+ messages in thread
From: Corinna Vinschen @ 2009-05-14 17:26 UTC (permalink / raw)
  To: newlib, cygwin

On May 15 00:58, IWAMURO Motonori wrote:
> 2009/5/13 Corinna Vinschen <vinschen@redhat.com>:
> >> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
> >
> > This looks nice.
> 
> Do you import Markus Kuhn's wcwidth implementation?
> 
> >> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> >> category of characters, which consists of things like Greek and
> >> Cyrillic letters as well as line drawing symbols. Those have a width
> >> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
> >> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
> >
> > We should use the standard variation alone, imho.
> 
> I don't think so.
> 
> 1) It is very very inconvenient for me :-)
> (Now, I apply the local patch of CJK width support to cygwin1.dll in
> my environment.)
> 
> 2) Unicode Standard Annex #11
> http://www.unicode.org/unicode/reports/tr11/ recommends:
> > 5 Recommendations
> (snip)
> > When processing or displaying data
> (snip)
> > Ambiguous characters behave like wide or narrow characters depending
> > on the context (language tag, script identification, associated
> > font, source of data, or explicit markup; all can provide the
> > context). If the context cannot be established reliably, they should
> > be treated as narrow characters by default.
> 
> The recommendation is independent of legacy encoding.
> 
> I think that a new locale category that specifies the "context" is necessary.
> Because the "context" influences only the display or text layout.
> 
> However, there is no such standard now.
> 
> Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
> is 'ja', 'ko', 'vi' or 'zh'.

That would be fine with me, but tests for the actual language are not
used anywhere in newlib, so that's something very new.  Can we check in
my patch for the time being and extend it with the CJK variation later?
I will not be available for the next two weeks, but I'd be glad if at
least the default variation can go in so I can create another Cygwin
test release before I'm offline.


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-14 17:26       ` Corinna Vinschen
@ 2009-05-14 21:51         ` Jeff Johnston
  2009-05-15 11:43           ` Corinna Vinschen
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff Johnston @ 2009-05-14 21:51 UTC (permalink / raw)
  To: newlib, cygwin

Corinna Vinschen wrote:
> On May 15 00:58, IWAMURO Motonori wrote:
>> 2009/5/13 Corinna Vinschen <vinschen@redhat.com>:
>>>> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
>>>
>>> This looks nice.
>>
>> Do you import Markus Kuhn's wcwidth implementation?
>>
>>>> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
>>>> category of characters, which consists of things like Greek and
>>>> Cyrillic letters as well as line drawing symbols. Those have a width
>>>> of 1 in Western use, yet with CJK fonts they have a width of 2. That's
>>>> why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
>>>
>>> We should use the standard variation alone, imho.
>>
>> I don't think so.
>>
>> 1) It is very very inconvenient for me :-)
>> (Now, I apply the local patch of CJK width support to cygwin1.dll in
>> my environment.)
>>
>> 2) Unicode Standard Annex #11
>> http://www.unicode.org/unicode/reports/tr11/ recommends:
>>> 5 Recommendations
>> (snip)
>>> When processing or displaying data
>> (snip)
>>> Ambiguous characters behave like wide or narrow characters depending
>>> on the context (language tag, script identification, associated
>>> font, source of data, or explicit markup; all can provide the
>>> context). If the context cannot be established reliably, they should
>>> be treated as narrow characters by default.
>>
>> The recommendation is independent of legacy encoding.
>>
>> I think that a new locale category that specifies the "context" is necessary.
>> Because the "context" influences only the display or text layout.
>>
>> However, there is no such standard now.
>>
>> Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
>> is 'ja', 'ko', 'vi' or 'zh'.
>
> That would be fine with me, but tests for the actual language are not
> used anywhere in newlib, so that's something very new.  Can we check in
> my patch for the time being and extend it with the CJK variation later?
> I will not be available for the next two weeks, but I'd be glad if at
> least the default variation can go in so I can create another Cygwin
> test release before I'm offline.

Corinna, I have no problem with checking the new patch in and extending 
this later, assuming you have thoroughly tested this implementation.

-- Jeff J.
> Corinna
>
>   



* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-14 21:51         ` Jeff Johnston
@ 2009-05-15 11:43           ` Corinna Vinschen
  0 siblings, 0 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-05-15 11:43 UTC (permalink / raw)
  To: newlib, cygwin

On May 14 17:51, Jeff Johnston wrote:
> Corinna, I have no problem with checking the new patch in and extending  
> this later, assuming you have thoroughly tested this implementation.

I tested it with _MB_CAPABLE defined and with _MB_CAPABLE undefined.
Both variations worked as expected, the latter using the old newlib
implementation using iswprint/iswcntrl.

Patch applied.  I have put adding the CJK variation on my todo list for
when I'm back from vacation.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-14 15:58     ` IWAMURO Motonori
  2009-05-14 17:26       ` Corinna Vinschen
@ 2009-05-20 16:52       ` Thomas Wolff
  2009-05-20 19:41         ` IWAMURO Motonori
  2009-06-05 16:25         ` Thomas Wolff
  2009-05-26 16:46       ` IWAMURO Motonori
  2 siblings, 2 replies; 36+ messages in thread
From: Thomas Wolff @ 2009-05-20 16:52 UTC (permalink / raw)
  To: newlib, cygwin

Corinna Vinschen wrote:

> On May 12 17:56, Andy Koppe wrote:
> > > And here's another question.  The utf8*.h files claim they have been
> > > generated from the unicode.txt file of the Unicode 3.2 standard.  Do we
> > > have the script which generated the utf8*.h files?  Can we regenerate
> > > the files to match the current Unicode 5.1 standard?
I've updated my editor mined to Unicode 5.1 data already. I can provide
a corresponding wcwidth function if that's desired. I also have scripts
for semi-automatic generation of this information; the "semi" part, as
I said, still needs to be improved.

> > There's Markus Kuhn's wcwidth implementation, which says it's based on
> > Unicode 5.0:
> > 
> > http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
> 
> This looks nice.
I'm sure Markus will update to 5.1 one day too...


> > Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> > category of characters, which consists of things like Greek and
> > Cyrillic letters as well as line drawing symbols. Those have a width
> > of 1 in Western use, yet with CJK fonts they have a width of 2. That's
> > why Markus Kuhn's code includes the mk_wcswidth_cjk() variant.
> 
> We should use the standard variation alone, imho.
> 
> And we need some workaround for UTF-16 systems like Cygwin.
> Unfortunately, surrogate pairs only work well as part of a string, not
> as standalone chars.  So wcwidth would return -1 for each single char,
> but wcswidth could be tweaked to handle them gracefully.
This brings me to the related question of how to output non-BMP characters;
currently, the cygwin console displays them all as two square boxes,
using two screen columns. This indicates that probably just the single
surrogate characters are being output.
Could proper non-BMP character display be achieved by simply combining
the surrogates and outputting them to Windows as a true Unicode character?
(The Windows function would need to accept 32-bit characters, and I don't
know whether it does; the string elements could stay as they are.)
Just an idea which might lead to a simple solution.


> On May 15 00:58, IWAMURO Motonori wrote:
> > 2009/5/13 Corinna Vinschen <vinschen@redhat.com>:
> > >> Trouble is, there's the thorny issue of the "CJK Ambiguous Width"
> > >> ... (see above)
> > > We should use the standard variation alone, imho.
> > I don't think so.
> > 
> > 1) It is very very inconvenient for me :-)
> > 
> > 2) Unicode Standard Annex #11
> > http://www.unicode.org/unicode/reports/tr11/ recommends:
> > > 5 Recommendations
> > (snip)
> > > When processing or displaying data
> > (snip)
> > > Ambiguous characters behave like wide or narrow characters depending
> > > on the context (language tag, script identification, associated
> > > font, source of data, or explicit markup; all can provide the
> > > context). If the context cannot be established reliably, they should
> > > be treated as narrow characters by default.
> > 
> > The recommendation is independent of legacy encoding.
> > 
> > I think that a new locale category that specifies the "context" is necessary.
> > Because the "context" influences only the display or text layout.
> > 
> > However, there is no such standard now.
> > 
> > Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
> > is 'ja', 'ko', 'vi' or 'zh'.
The problem with this is
1. As you say, there is no standard.
2. If you wish to handle character widths compliant with the terminal 
   your application is running in, there is no guarantee that your 
   assumption of CJK width (or the actual locale setting if that model 
   would be implemented) does indeed reflect the terminal's width properties.
3. In mintty, you can dynamically change width properties by selecting
   different fonts; mintty changes CJK width behaviour according to certain
   font properties. A "static" configuration in your shell using a locale
   variable would not reflect this change.
   I see two ways to handle this:
   a) Ask Andy (author of mintty) to not do this switching; however,
      I don't know what display consequences that might have. On the
      other hand, other terminals don't switch either. Or maybe mintty
      could at least issue a warning on CJK width switching, or
      maintain two separate font lists, or...
   b) Determine the actual CJK width behaviour dynamically. That's what 
      mined does (in addition to other width property detection in general).
      That's why it can handle the alternative quite seamlessly.
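
One way an application could do such a detection (a sketch only, not taken from mined's source) is to ask the terminal for the cursor position with the ESC [ 6 n request, print an ambiguous-width character, ask again, and compare the reported columns.  The terminal replies with ESC [ row ; col R.  The helper names below are hypothetical, and the terminal I/O itself is left out.

```c
#include <stdio.h>

/* Parse a cursor position report of the form ESC [ row ; col R.
   Returns the 1-based column, or -1 if the reply is malformed. */
int cpr_column (const char *reply)
{
  int row, col;
  char final;

  if (sscanf (reply, "\033[%d;%d%c", &row, &col, &final) != 3
      || final != 'R')
    return -1;
  return col;
}

/* Given the columns reported before and after printing a single
   ambiguous-width character, tell whether the terminal drew it wide. */
int ambiguous_is_wide (int col_before, int col_after)
{
  return col_after - col_before == 2;
}
```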

> That would be fine with me, but tests for the actual language are not
> used anywhere in newlib, so that's something very new.
So I would suggest not to introduce it before the concept is sufficiently discussed.
And I'm not happy with the idea of a cygwin-specific solution (or workaround).


Kind regards,
Thomas


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-20 16:52       ` Thomas Wolff
@ 2009-05-20 19:41         ` IWAMURO Motonori
  2009-06-05 16:25         ` Thomas Wolff
  1 sibling, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-05-20 19:41 UTC (permalink / raw)
  To: newlib, cygwin

2009/5/21 Thomas Wolff <towo@towo.net>:
>> > Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
>> > is 'ja', 'ko', 'vi' or 'zh'.
> The problem with this is
> 1. As you say, there is no standard.

But,
- I think that my proposal doesn't violate any specification.
- I heard that there is an existing implementation that behaves like my
proposal. (Sorry, I didn't hear the system name.)

> 2. If you wish to handle character widths compliant with the terminal
>   your application is running in, there is no guarantee that your
>   assumption of CJK width (or the actual locale setting if that model
>   would be implemented) does indeed reflect the terminal's width properties.

Yes, I understand that, too. My proposal is purely a workaround.
But it is the best solution because we have no specification/standard
for my wish.

> 3. In mintty, you can dynamically change width properties by selecting
>   different fonts; mintty changes CJK width behaviour according to certain
>   font properties. "static" configuration in your shell using a locale
>   variable would not reflect this change

It is no problem because we -- most Japanese language users -- need
not change the settings of mintty and locale after first setup.
We set LANG=ja_JP.UTF-8 and select a Japanese font for mintty.

>   I see two ways to handle this:
>   a) Ask Andy (author of mintty) to not do this switching;

It is not necessary because the mechanism is based on another
proposal of mine. ("deenheart" is my handle on google code.)

> other terminals don't switch either.

If we use other terminals, we need to switch the CJK width option manually.
(xterm, mlterm, putty, ...)

>   b) Determine the actual CJK width behaviour dynamically. That's what
>      mined does (in addition to other width property detection in general).

It is the best solution. I think that we need to specify the following:
- the escape sequence about language context for the terminal emulator.
-- setting language context
-- getting language context
-- getting capability of language context
   (context is fixed, static or dynamic / acceptable languages)
- a new multilingualized string/terminal API for terminal based applications.

And we would need to rewrite a great many applications to use the new API.

> I'm not happy with the idea of a cygwin-specific solution (or workaround).

I think that it is not a cygwin-specific solution.
-- 
IWAMURO Motnori <http://vmi.jp/>


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-20 16:52       ` Thomas Wolff
  2009-05-20 19:41         ` IWAMURO Motonori
@ 2009-06-05 16:25         ` Thomas Wolff
  2009-06-06  7:24           ` Andy Koppe
                             ` (3 more replies)
  1 sibling, 4 replies; 36+ messages in thread
From: Thomas Wolff @ 2009-06-05 16:25 UTC (permalink / raw)
  To: newlib, cygwin

IWAMURO Motonori wrote:
> 2009/5/21 Thomas Wolff <towo@towo.net>:
> >> > Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
> >> > is 'ja', 'ko', 'vi' or 'zh'.
> > The problem with this is
> > 1. As you say, there is no standard.

> But,
> - I think that my proposal doesn't violate any specification.
I think it does. Part of the locale information is the "charmap"
(called "codepage" on DOS/Windows). It may be implicit, as
with LC_CTYPE=zh_CN, which defines "GB2312" as its charmap, but it
is typically explicit, as in en_US.UTF-8. The intention is
that the "codepage" information should be the same for all locales
having the "UTF-8" (or any other) charmap. So you cannot freely
change width information among locales with the same charmap.
Also, if ja_JP.UTF-8 meant "CJK width", how would you specify
a working locale setting for a terminal that does not run a CJK width
font but should still use other Japanese settings? E.g. with rxvt, which
does not support CJK width.

However, there is one resort within the locale mechanism that can be used;
the locale syntax allows for an optional "modifier" which can be used to 
specify deviations, e.g.
	de_DE           has charmap ISO-8859-1
	de_DE@euro      has charmap ISO-8859-15
	uz_UZ           has charmap ISO-8859-1
	uz_UZ@cyrillic  has charmap UTF-8
	aa_ER and aa_ER@saaho both have charmap UTF-8 (with some other difference).
Thus you could define e.g.
	ja_JP.UTF-8@cjk
or
	ja_JP.UTF-8@cjkwidth
to indicate CJK width properties. I guess this is the most compliant way to go.
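
A small sketch of how the modifier could be picked out of a locale name, assuming the usual language_TERRITORY.charmap@modifier syntax.  The function name locale_has_modifier is hypothetical, and "cjkwidth" is just the modifier suggested above, not an established one.

```c
#include <string.h>

/* Return 1 if the locale name carries exactly the given modifier
   after its '@', else 0. */
int locale_has_modifier (const char *locale, const char *modifier)
{
  const char *at;

  if (locale == NULL || modifier == NULL)
    return 0;
  at = strchr (locale, '@');
  return at != NULL && strcmp (at + 1, modifier) == 0;
}
```

With this, "ja_JP.UTF-8@cjkwidth" would select the CJK width tables, while plain "ja_JP.UTF-8" would keep the default widths.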


> - I heard that there is an existing implementation that behaves like my
> proposal. (Sorry, I didn't hear the system name.)
Even if so, I think the way I described is more compatible with the locale 
mechanism as used elsewhere.


> > 2. If you wish to handle character widths compliant with the terminal
> >   your application is running in, there is no guarantee that your
> >   assumption of CJK width (or the actual locale setting if that model
> >   would be implemented) does indeed reflect the terminal's width properties.

> Yes, I understand that, too. My proposal is purely a workaround.
> But it is the best solution because we have no specification/standard
> for my wish.
A well-chosen option like above, that stays within the described standard 
options, would be best accepted by other communities, I think, and could 
be established for this purpose.


> > 3. In mintty, you can dynamically change width properties by selecting
> >    different fonts; mintty changes CJK width behaviour according to certain
> >    font properties. "static" configuration in your shell using a locale
> >    variable would not reflect this change

> It is no problem because we -- most Japanese language users -- need
> not change the settings of mintty and locale after first setup.
> We set LANG=ja_JP.UTF-8 and select a Japanese font for mintty.
In any case, mined running in mintty will detect CJK width itself, 
regardless of the locale setting; with the coming versions of both 
programs it will do so even when the width is changed on the fly :)


> >    b) Determine the actual CJK width behaviour dynamically. That's what
> >       mined does (in addition to other width property detection in general).

> It is the best solution. I think that we need specify the following:
> - the escape sequence about language context for terminal emulater.
> -- setting language context
> -- getting language context
> -- getting capability of language context
>    (context is fixed, static or dynamic / acceptable languages)
> - new multilingualized string/terminal API for terminal based applications.
This sounds complicated.
With my proposal, an application that wishes to auto-adjust on width 
properties (maybe even when changing) and which (unlike mined) uses 
the system wcwidth functions could proceed as follows:
* Detect CJK width by using a simple test string width detection.
* (Optional) When receiving a SIGWINCH signal (future version of MinTTY), 
  repeat this detection.
* If e.g. LC_CTYPE starts with "ja_JP.UTF-8", call setlocale with 
  either "ja_JP.UTF-8@cjkwidth" or "ja_JP.UTF-8".
  The application would need to stay with the same locale prefix 
  "ja_JP..." because there is no reasonable way to choose a completely 
  different locale, which is another reason to just use the modifier 
  suffix, rather than reserving the complete "ja_JP..." setting for 
  CJK width.
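
The last step above could be sketched in C as follows. This is only an
illustration: pick_width_locale is a hypothetical helper, the "@cjkwidth"
modifier is the name proposed in this mail and is not known to any existing
locale system, and the width detection itself (the first step) is assumed
to have been done elsewhere and is passed in as a flag.

```c
#include <stdio.h>
#include <string.h>

/* Build the locale name to request, based on the current LC_CTYPE value
   and the detected terminal behaviour.  Any modifier already present is
   dropped, so the language and charset prefix stays the same.
   Returns 0 on success, or -1 if BUF is too small. */
static int
pick_width_locale (const char *current, int ambiguous_is_wide,
		   char *buf, size_t buflen)
{
  /* Length of the name up to, but not including, an '@' modifier. */
  size_t base_len = strcspn (current, "@");
  int n = snprintf (buf, buflen, "%.*s%s", (int) base_len, current,
		    ambiguous_is_wide ? "@cjkwidth" : "");

  return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

An application would then pass the result to setlocale (LC_CTYPE, buf).
setlocale returns NULL if the system cannot honor the requested name, in
which case the application keeps its current locale.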

Advantage of this approach: The system does not have to care about 
this issue and can just follow the locale setting.


> And, we need rewrite too many applications by new API.
Well, alternatively, the system could follow the approach outlined 
above, but maybe that's not the proper level to do it (?)


> > I'm not happy with the idea of a cygwin-specific solution (or workaround).
> I think that it is not cygwin-specific solution.
As I tried to suggest above, using "UTF-8" for different width data on one 
system would be quite specific; using the "@" modifier syntax would not.


Kind regards,
Thomas

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-05 16:25         ` Thomas Wolff
@ 2009-06-06  7:24           ` Andy Koppe
  2009-06-06 12:53             ` IWAMURO Motonori
  2009-06-06  9:31           ` Corinna Vinschen
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: Andy Koppe @ 2009-06-06  7:24 UTC (permalink / raw)
  To: cygwin; +Cc: newlib

2009/6/5 Thomas Wolff:
> the locale syntax allows for an optional "modifier" which can be used to
> specify deviations, e.g.
>        de_DE           has charmap ISO-8859-1
>        de_DE@euro      has charmap ISO-8859-15
>        uz_UZ           has charmap ISO-8859-1
>        uz_UZ@cyrillic  has charmap UTF-8
>        aa_ER and aa_ER@saaho both have charmap UTF-8 (with some other difference).
> Thus you could define e.g.
>        ja_JP.UTF-8@cjk
> or
>        ja_JP.UTF-8@cjkwidth
> to indicate CJK width properties. I guess this is the most compliant way to go.

This looks the right approach to me.

However, to make the locale setting more convenient for CJK users,
there could be modifiers for both widths. Without modifier, the CJK
locales would default to "Ambiguous Wide", while everything else would
default to "Ambiguous Narrow".

In the time-honoured tradition of keeping Unix identifiers brief and
obscure, I propose the modifiers should be "@aw" and "@an". Otherwise,
how about "@ambigwide" and "@ambignarrow"?

Calling it something like "cjkwide" has the problem that it gives the
impression that the actual CJK ideographs are affected by this,
whereas this really concerns things like line drawing characters and
non-latin non-CJK letters. That confused me to start with anyway.

Puzzled that this hasn't been solved in glibc years ago ...

Andy


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-06  7:24           ` Andy Koppe
@ 2009-06-06 12:53             ` IWAMURO Motonori
  0 siblings, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-06 12:53 UTC (permalink / raw)
  To: cygwin; +Cc: newlib

2009/6/6 Andy Koppe <andy.koppe@gmail.com>:
> However, to make the locale setting more convenient for CJK users,
> there could be modifiers for both widths. Without modifier, the CJK
> locales would default to "Ambiguous Wide", while everything else would
> default to "Ambiguous Narrow".

It is acceptable for me.

> Puzzled that this hasn't been solved in glibc years ago ...

I also looked into it, but I was not able to find out the reason.

One Debian user is trying to fix it, but it is not making progress...

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=471021
http://sourceware.org/bugzilla/show_bug.cgi?id=4335
-- 
IWAMURO Motnori <http://vmi.jp/>


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-05 16:25         ` Thomas Wolff
  2009-06-06  7:24           ` Andy Koppe
@ 2009-06-06  9:31           ` Corinna Vinschen
  2009-06-06  9:56             ` Andy Koppe
  2009-06-06 13:06             ` IWAMURO Motonori
       [not found]           ` <3f0ad08d0906060242t275a78e7tb9913bf78d1c5e83@mail.gmail.com>
  2009-06-06 12:22           ` [Fwd: [1.7] wcwidth failing configure tests] IWAMURO Motonori
  3 siblings, 2 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-06-06  9:31 UTC (permalink / raw)
  To: cygwin, newlib

On Jun  5 18:25, Thomas Wolff wrote:
> IWAMURO Motonori wrote:
> > 2009/5/21 Thomas Wolff <towo@towo.net>:
> > >> > Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
> > >> > is 'ja', 'ko', 'vi' or 'zh'.
> > > The problem with this is
> > > 1. As you say, there is no standard.
> 
> > But,
> > - I think that my proposal doesn't violate any specification.
> I think it does. Part of the locale information is the "charmap" 
> (called "codepage" on DOS/Windows). It may be implicit like 
> with LC_CTYPE=zh_CN which defines "GB2312" as its charmap, but it 
> is typically explicit like in en_US.UTF-8 - the intention is 
> that the "codepage" information should be the same for all locales 
> having the "UTF-8" (or any other) charmap. So you cannot freely 
> change width information among locales with the same charmap.
> Also, if ja_JP.UTF-8 would mean "CJK width", how would you specify 
> a working locale setting for a terminal that does not run a CJK width 
> font but should yet use other Japanese settings? E.g. with rxvt which 
> does not support CJK width.
> 
> However, there is one resort within the locale mechanism that can be used;
> the locale syntax allows for an optional "modifier" which can be used to 
> specify deviations, e.g.
> 	de_DE           has charmap ISO-8859-1
> 	de_DE@euro      has charmap ISO-8859-15
> 	uz_UZ           has charmap ISO-8859-1
> 	uz_UZ@cyrillic  has charmap UTF-8
> 	aa_ER and aa_ER@saaho both have charmap UTF-8 (with some other difference).
> Thus you could define e.g.
> 	ja_JP.UTF-8@cjk
> or
> 	ja_JP.UTF-8@cjkwidth
> to indicate CJK width properties. I guess this is the most compliant way to go.

I like this approach.  It's also more flexible than using the language
specifier.

<nit-picking>
Thomas, couldn't you have discussed this in the two weeks I was on
vacation?  Why did you wait until I implemented the language-based
approach?
</nit-picking>

Now, we just have to agree on the modifier and somebody has to implement
this in newlib/libc/locale/locale.c.  So far the modifier is ignored
entirely (de_DE@euro will still use ISO-8859-1).

I vote for @cjkwide, regardless of Andy's objection.  People using CJK
will know the meaning, and it has the additional advantage of being an
identifier that is rather simple to memorize.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-06  9:31           ` Corinna Vinschen
@ 2009-06-06  9:56             ` Andy Koppe
  2009-06-06 13:06             ` IWAMURO Motonori
  1 sibling, 0 replies; 36+ messages in thread
From: Andy Koppe @ 2009-06-06  9:56 UTC (permalink / raw)
  To: cygwin, newlib

> <nit-picking>
> Thomas, couldn't you have discussed this in the two weeks I was on
> vacation?  Why did you wait until I implemented the language-based
> approach?
> </nit-picking>

Sorry, that's largely my fault. Among a bunch of other MinTTY issues
we were privately discussing various more or less mad schemes to
communicate the ambiguous width between terminal and application and
so it took a while for us to realise that a locale-based scheme really
is the best approach.

Andy


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-06  9:31           ` Corinna Vinschen
  2009-06-06  9:56             ` Andy Koppe
@ 2009-06-06 13:06             ` IWAMURO Motonori
  1 sibling, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-06 13:06 UTC (permalink / raw)
  To: cygwin, newlib

2009/6/6 Corinna Vinschen <corinna-cygwin@cygwin.com>:
> I vote for @cjkwide, regardless of Andy's objection.  People using CJK
> will know the meaning and it has the additional advantage to be a rather
> simple to memorize identifier.

I oppose the @cjkwide approach because I don't think that special
cases should be given priority over general cases.

I think that Andy's approach is better.
-- 
IWAMURO Motnori <http://vmi.jp/>


[parent not found: <3f0ad08d0906060242t275a78e7tb9913bf78d1c5e83@mail.gmail.com>]

* Re: [Fwd: [1.7] wcwidth failing configure tests]
       [not found]           ` <3f0ad08d0906060242t275a78e7tb9913bf78d1c5e83@mail.gmail.com>
@ 2009-06-06  9:46             ` IWAMURO Motonori
  2009-06-12 18:56             ` Thomas Wolff
  1 sibling, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-06  9:46 UTC (permalink / raw)
  To: cygwin, newlib

I oppose your proposal because I think that it is useless for us.

2009/6/6 Thomas Wolff <towo@towo.net>:
> the intention is that the "codepage" information should be the same
> for all locales having the "UTF-8" (or any other) charmap.  So you
> cannot freely change width information among locales with the same
> charmap.

I don't think that there is such a restriction.
The standard of the character doesn't provide for the width of the
character as a standard.

> Also, if ja_JP.UTF-8 would mean "CJK width", how would you specify a
> working locale setting for a terminal that does not run a CJK width
> font but should yet use other Japanese settings? E.g. with rxvt
> which does not support CJK width.

Oh, we ALWAYS have a hard time in this problem VERY VERY VERY much.

case1: We use only the application that treats the width of the
character without locale.
case2: We make the patch that solves the character width problem, and
throw it out up-stream.
case3: We make the patch, and apply it locally.
case4: We tearfully give up the correct display of the screen.
case5: We tearfully give up using the application.

I selected case5 for rxvt.

> Thus you could define e.g.
>        ja_JP.UTF-8@cjk
> or
>        ja_JP.UTF-8@cjkwidth
> to indicate CJK width properties. I guess this is the most compliant way to go.

I don't think that it is the good idea because:

- It is "a cygwin-specific solution (or workaround)".
- In NetBSD, the change to which wcwidth of East Asian Ambiguous
Characters returns 2 by CJK locale is planned.

# to be continued.
-- 
IWAMURO Motnori <http://vmi.jp/>


* Re: [Fwd: [1.7] wcwidth failing configure tests]
       [not found]           ` <3f0ad08d0906060242t275a78e7tb9913bf78d1c5e83@mail.gmail.com>
  2009-06-06  9:46             ` IWAMURO Motonori
@ 2009-06-12 18:56             ` Thomas Wolff
  2009-06-12 19:12               ` Corinna Vinschen
  2009-06-15  0:30               ` IWAMURO Motonori
  1 sibling, 2 replies; 36+ messages in thread
From: Thomas Wolff @ 2009-06-12 18:56 UTC (permalink / raw)
  To: newlib, cygwin, IWAMURO Motonori

IWAMURO Motonori wrote to me by private mail:
> I oppose your proposal because I think that it is useless for us.
> 
> 2009/6/6 Thomas Wolff <towo@towo.net>:
>> the intention is that the "codepage" information should be the same
>> for all locales having the "UTF-8" (or any other) charmap.  So you
>> cannot freely change width information among locales with the same
>> charmap.
> 
> I don't think that there is such a restriction.
> The standard of the character doesn't provide for the width of the
> character as a standard.
I'm not sure which "standard" you are referring to.
I have checked source data files in /usr/share/i18n/charmaps on my Linux system, e.g. "UTF-8.gz".
These files are used when creating a new locale with the "localedef" command.
They contain not only the mapping but also (at the end of the file) a 
list of combining and double-width characters. So, even more strongly 
than I had argued, this implies a scheme of predefined 
character widths defined by each such "charmap", and thus assumes that 
character widths are the same for all locales with the same "charmap".

>> Also, if ja_JP.UTF-8 would mean "CJK width", how would you specify a
>> working locale setting for a terminal that does not run a CJK width
>> font but should yet use other Japanese settings? E.g. with rxvt
>> which does not support CJK width.
> 
> Oh, we ALWAYS have a hard time in this problem VERY VERY VERY much.
> 
> case1: We use only the application that treats the width of the
> character without locale.
No problem.
> case2: We make the patch that solves the character width problem, and
> throw it out up-stream.
Yes, you should go ahead "up-stream", whatever that means in the case of locales.
> case3: We make the patch, and apply it locally.
No, bad idea. All locale-dogmatic people (I'm not one, just warning) 
will bash you for this. What is the situation after remote login? The 
remote system will assume its own locale setting (e.g. "ja_JP.UTF-8") 
to indicate the actual behaviour of its environment properly, which is 
not the case after local implementation of a solution.
> case4: We tearfully give up the correct display of the screen.
> case5: We tearfully give up using the application.
> I selected case5 for rxvt.
> 
No reason to give up.
The approach I've taken in mined is quite successful. The other 
approach, via locale names, will also have limited success provided it 
is taken "up-stream".

>> Thus you could define e.g.
>>        ja_JP.UTF-8@cjk
>> or
>>        ja_JP.UTF-8@cjkwidth
>> to indicate CJK width properties. I guess this is the most compliant way to go.
> 
> I don't think that it is the good idea because:
> 
> - It is "a cygwin-specific solution (or workaround)".
Apparently we agree that a solution should be found that is not cygwin-specific, 
but should be established "up-stream". The question is thus which of the 
discussed mechanisms has a better chance to get accepted up-stream:
- ja_JP.UTF-8 meaning different width data than en_US.UTF-8
or
- ja_JP.UTF-8@cjkwidth meaning different width data than ja_JP.UTF-8

My assumption is that the second proposal (that I made) has a better 
chance, given the existing paradigms of the locale community. But that's 
speculative. If you think you can get your proposal passed "up-stream", 
go ahead and try it, please! If you succeed, everything is fine.

> - In NetBSD, the change to which wcwidth of East Asian Ambiguous Characters returns 2 by CJK locale is planned.
So the same issue (of compliance and portability, especially in the 
remote case) should be discussed in the NetBSD community.
(Is there a suitable forum or mailing list to check?)

> - and, I don't think that I need make special cases give priority more
> than general cases.

> >> - I heard that there is an existing implementation that behave like my
> >> proposal. (Sorry, I didn't hear the system name.)
> > Even if so, I think the way I described is more compatible with the locale
> > mechanism as used elsewhere.

> I think that ALL locale implementations should treat East Asian
> Ambiguous Character Width as 2 for CJK locale.
Again, I agree that IF you manage to get ALL implementations to follow 
this approach, the solution is fine. Please go ahead.


> >> It is no problem because we -- most Japanese language users -- need
> >> not change the settings of mintty and locale after first setup.
> >> We set LANG=ja_JP.UTF-8 and select a Japanese font for mintty.
> > In any case, mined running in mintty will detect CJK width itself,
> > regardless of locale setting, with coming versions of both programs
> > even when it gets changed on-the-fly :)
> Sorry, I can't understand above because I am not good at English.
Well, even if your proposal were finally implemented, MinTTY would 
still be able to choose different fonts and, depending on which font is 
selected, run in locale-width-compliant or width-breaking mode.
* My solution could be tweaked to handle this.
* Auto-detection (of mined) can handle it already.
* Your solution could probably not handle it.



> I don't think so. I think that we should consider the following issues
> if a new mechanism is introduced.

> The existing locale / terminal API don't support:
> - Unicode BiDi.
> - Unicode control characters.
> - Unicode combining characters.
> - Multilingualization. (*)
> - Detect font/fontset information selected with terminal emulator.
> (including, need to consider the case of no-tty)
Not sure what you intend to say with these remarks. Locale and 
terminal APIs are actually two different things. And locale API can 
e.g. handle combining characters (by wcwidth returning 0).



> * Now, we can't use Japanese, Chinese, and Korean at the same time
> even if we use Unicode.
>   Because many font glyphs are quite different even if the code point
> is the same in each language.
This is a completely different issue and it should be easy to solve it 
by simply choosing an appropriate font.



> > With my proposal, an application that wishes to auto-adjust on width
> > properties (maybe even when changing) and which (unlike mined) uses
> > the system wcwidth functions could proceed as follows:
> > * Detect CJK width by using a simple test string width detection.
> > * (Optional) When receiving a SIGWINCH signal (future version of MinTTY),
> >  repeat this detection.
> > * If e.g. LC_CTYPE starts with "ja_JP.UTF-8", call setlocale with
> >  either "ja_JP.UTF-8@cjkwidth" or "ja_JP.UTF-8".
> How to detect it? The application using wcwidth is not necessarily
> executed with terminal emulator. (e.g. text formatter)
OK, my arguments refer to an interactive application that wants to 
control the precise representation of text on the screen.
If for example a text formatter formats for paper printing, it would 
need to apply completely different assumptions anyway. The dreadful 
single/double width issue of cell-based terminals isn't relevant at 
all in that case.



> >> > I'm not happy with the idea of a cygwin-specific solution (or workaround).
> >> I think that it is not cygwin-specific solution.
> > As I tried to suggest above, using "UTF-8" for different width data on one
> > system would be quite specific, using the "@" modifier syntax would not.
> "UTF-8" is only an encoding scheme. It does not specify the character width.
OK, we had this argument above, and we were both not quite right before.
The essence is that whatever you get established up-stream may turn out 
to be a working solution, so I would appreciate if you go ahead and persuade 
some "up-stream" people...


Best regards,
Thomas


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-12 18:56             ` Thomas Wolff
@ 2009-06-12 19:12               ` Corinna Vinschen
  2009-06-15  0:30               ` IWAMURO Motonori
  1 sibling, 0 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-06-12 19:12 UTC (permalink / raw)
  To: newlib, cygwin

On Jun 12 17:38, Thomas Wolff wrote:
> IWAMURO Motonori wrote to me by private mail:
> > I oppose your proposal because I think that it is useless for us.
> > 
> > 2009/6/6 Thomas Wolff <towo@towo.net>:
> >> the intention is that the "codepage" information should be the same
> >> for all locales having the "UTF-8" (or any other) charmap.  So you
> >> cannot freely change width information among locales with the same
> >> charmap.
> > 
> > I don't think that there is such a restriction.
> > The standard of the character doesn't provide for the width of the
> > character as a standard.
> I'm not sure which "standard" you are referring to.

The problem appears to be that there is no standard for the handling
of ambiguous characters.

> I have checked source data files in /usr/share/i18n/charmaps on my Linux system, e.g. "UTF-8.gz".
> These files are used when creating a new locale with the "localedef" command.
> They contain not only the mapping but also (by the end of the file) a 
> list of combining and double-width characters. So obviously, even 
> stronger than I had argued, this would imply a scheme of predefined 
> character widths defined by each such "charmap", thus assuming that 
> character widths are the same for all locales with the same "charmap".

I'm not sure the Linux solution is overly flexible.  AFAICS, when using
the UTF-8 charset, the ambiguous characters always have width 1.  Only
when switching to GB18030, the width of these chars is two.  That seems
to be a bit unsatisfying for CJK users.

> >> Also, if ja_JP.UTF-8 would mean "CJK width", how would you specify a
> >> working locale setting for a terminal that does not run a CJK width
> >> font but should yet use other Japanese settings? E.g. with rxvt
> >> which does not support CJK width.

Wouldn't that be covered by using your own proposal just backwards?
Define the default for ja, ko, and zh to use width = 2, with a
@cjknarrow (or whatever) modifier to use width = 1.

> The approach I've taken in mined is quite successful. The other 
> approach, via locale names, will also have limited success provided it 
> is taken "up-stream".

Whatever "upstream" means.


Corinna

-- 
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-12 18:56             ` Thomas Wolff
  2009-06-12 19:12               ` Corinna Vinschen
@ 2009-06-15  0:30               ` IWAMURO Motonori
  2009-06-15  4:34                 ` IWAMURO Motonori
  1 sibling, 1 reply; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-15  0:30 UTC (permalink / raw)
  To: newlib, cygwin

2009/6/13 Thomas Wolff <towo@towo.net>:
> I have checked source data files in /usr/share/i18n/charmaps on my Linux system, e.g. "UTF-8.gz".
<snip>
> character widths are the same for all locales with the same "charmap".

It was reported as a bug, but it isn't fixed now...X-(

http://sourceware.org/bugzilla/show_bug.cgi?id=4335
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=471021

> If you think you can get your proposal passed "up-stream",
> go ahead and try it, please! If you succeed, everything is fine.

Hmmm, I think that you have misunderstood something because my
explanation was bad.
By "up-stream" I meant the maintenance team of each OS, library, or
application.
I don't think that there is a single "up-stream".

Japanese language users have tried to fix the problem for many
years, but there has not been much progress so far.

>> - In NetBSD, the change to which wcwidth of East Asian Ambiguous Characters returns 2 by CJK locale is planned.
> So the same issue (of compliance and portability, especially in the
> remote case) should be discussed in the NetBSD community.
> (Is there a suitable forum or mailing list to check?)

Sorry, I don't know, because I was personally advised by one of the
NetBSD maintainers ( http://www.hi-matic.org/ (written in Japanese) ).

>> I think that ALL locale implementations should treat East Asian
>> Ambiguous Character Width as 2 for CJK locale.
> Again, I agree that IF you manage to get ALL implementations to follow
> this approach, the solution is fine. Please go ahead.

I will do so, but I want to solve the problem on Cygwin first of all.

>> How to detect it? The application using wcwidth is not necessarily
>> executed with terminal emulator. (e.g. text formatter)
> OK, my arguments refer to an interactive application that wants to
> control the precise representation of text on the screen.
> If for example a text formatter formats for paper printing, it would
> need to apply completely different assumptions anyway. The dreadful
> single/double width issue of cell-based terminals isn't relevant at
> all in that case.

I am thinking of a text formatter that depends on a fixed-pitch font
(like the 'indent' command).

I hope the following two results become the same.
- the auto-format filter program using 'wcwidth'.
- run auto-format command on editor. (e.g. "fill-paragraph",
"indent-region", etc on Emacs)
-- 
IWAMURO Motnori <http://vmi.jp/>


* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-15  0:30               ` IWAMURO Motonori
@ 2009-06-15  4:34                 ` IWAMURO Motonori
  2009-06-15 11:43                   ` [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests]) Corinna Vinschen
  0 siblings, 1 reply; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-15  4:34 UTC (permalink / raw)
  To: newlib, cygwin

2009/6/13 Corinna Vinschen <vinschen@redhat.com>:
>> I'm not sure which "standard" you are referring to.
>
> The problem appears to be that there is no standard for the handling
> of ambiguous characters.

Yes, but the guideline exists.
http://cygwin.com/ml/cygwin/2009-05/msg00444.html
> 2) Unicode Standard Annex #11
> http://www.unicode.org/unicode/reports/tr11/ recommends:
> > 5 Recommendations
> (snip)
> > When processing or displaying data
> (snip)
> > Ambiguous characters behave like wide or narrow characters depending
> > on the context (language tag, script identification, associated
> > font, source of data, or explicit markup; all can provide the
> > context). If the context cannot be established reliably, they should
> > be treated as narrow characters by default.

> Define the default for ja, ko, and zh to use width = 2, with a
> @cjknarrow (or whatever) modifier to use width = 1.

I think it is good idea.
-- 
IWAMURO Motnori <http://vmi.jp/>


* [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth  failing configure tests])
  2009-06-15  4:34                 ` IWAMURO Motonori
@ 2009-06-15 11:43                   ` Corinna Vinschen
  2009-06-15 15:58                     ` IWAMURO Motonori
  2009-06-27 22:03                     ` Andy Koppe
  0 siblings, 2 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-06-15 11:43 UTC (permalink / raw)
  To: cygwin, newlib

On Jun 14 22:18, IWAMURO Motonori wrote:
> 2009/6/13 Corinna Vinschen 
> > The problem appears to be that there is no standard for the handling
> > of ambiguous characters.
> 
> Yes, but the guideline exists.
> http://cygwin.com/ml/cygwin/2009-05/msg00444.html

A single mail in a single mailing list of a single project.  That's rather
a suggestion than a guideline...

> > > Ambiguous characters behave like wide or narrow characters depending
> > > on the context (language tag, script identification, associated
> > > font, source of data, or explicit markup; all can provide the
> > > context). If the context cannot be established reliably, they should
> > > be treated as narrow characters by default.
> 
> > Define the default for ja, ko, and zh to use width = 2, with a
> > @cjknarrow (or whatever) modifier to use width = 1.
> 
> I think it is good idea.

If everybody agrees to this suggestion, here's the patch.  Tested
with various combinations like

  LANG=ja_JP.UTF-8@cjknarrow
  LANG=ja_JP@cjknarrow
  LANG=ja.UTF-8@cjknarrow
  LANG=ja@cjknarrow
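
The rule that these combinations exercise can be written down as a small
standalone C function. This is a sketch of the behaviour proposed in this
thread, not the code of the patch below; ambiguous_width is a hypothetical
name, and the check of the character following the two-letter language
code is an added assumption.

```c
#include <string.h>

/* Return the column width (1 or 2) to assume for East Asian Ambiguous
   characters under the locale name LOCALE: 2 for the languages "ja",
   "ko" and "zh", unless the "@cjknarrow" modifier is given; 1 for
   everything else. */
static int
ambiguous_width (const char *locale)
{
  static const char *const cjk[] = { "ja", "ko", "zh" };
  const char *at = strchr (locale, '@');
  size_t i;

  /* The modifier overrides the language-based default. */
  if (at && strcmp (at + 1, "cjknarrow") == 0)
    return 1;

  for (i = 0; i < sizeof cjk / sizeof cjk[0]; i++)
    /* locale[2] is only read after two characters have matched, so it
       is at worst the terminating NUL, which strchr() also finds. */
    if (strncmp (locale, cjk[i], 2) == 0
	&& strchr ("_.@", locale[2]) != NULL)
      return 2;

  return 1;
}
```

Under this rule, all four settings listed above give a width of 1, the
same names without the modifier give 2, and en_US.UTF-8 gives 1.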


Corinna


	* libc/locale/locale.c (loadlocale): Add handling of "@cjknarrow"
	modifier on _MB_CAPABLE targets.  Add comment to explain.


Index: libc/locale/locale.c
===================================================================
RCS file: /cvs/src/src/newlib/libc/locale/locale.c,v
retrieving revision 1.20
diff -u -p -r1.20 locale.c
--- libc/locale/locale.c	3 Jun 2009 19:28:22 -0000	1.20
+++ libc/locale/locale.c	15 Jun 2009 08:40:46 -0000
@@ -397,6 +397,9 @@ loadlocale(struct _reent *p, int categor
   int (*l_wctomb) (struct _reent *, char *, wchar_t, const char *, mbstate_t *);
   int (*l_mbtowc) (struct _reent *, wchar_t *, const char *, size_t,
 		   const char *, mbstate_t *);
+#ifdef _MB_CAPABLE
+  int cjknarrow = 0;
+#endif
   
   /* "POSIX" is translated to "C", as on Linux. */
   if (!strcmp (locale, "POSIX"))
@@ -427,10 +430,14 @@ loadlocale(struct _reent *p, int categor
       if (c[0] == '.')
 	{
 	  /* Charset */
-	  strcpy (charset, c + 1);
-	  if ((c = strchr (charset, '@')))
+	  char *chp;
+
+	  ++c;
+	  strcpy (charset, c);
+	  if ((chp = strchr (charset, '@')))
 	    /* Strip off modifier */
-	    *c = '\0';
+	    *chp = '\0';
+	  c += strlen (charset);
 	}
       else if (c[0] == '\0' || c[0] == '@')
 	/* End of string or just a modifier */
@@ -442,6 +449,17 @@ loadlocale(struct _reent *p, int categor
       else
 	/* Invalid string */
       	return NULL;
+#ifdef _MB_CAPABLE
+      if (c[0] == '@')
+	{
+	  /* Modifier */
+	  /* Only one modifier is recognized right now.  "cjknarrow" is used
+	     to modify the behaviour of wcwidth() for East Asian languages.
+	     For details see the comment at the end of this function. */
+	  if (!strcmp (c + 1, "cjknarrow"))
+	    cjknarrow = 1;
+	}
+#endif
     }
   /* We only support this subset of charsets. */
   switch (charset[0])
@@ -604,13 +622,15 @@ loadlocale(struct _reent *p, int categor
       __mbtowc = l_mbtowc;
       __set_ctype (charset);
       /* Check for the language part of the locale specifier.  In case
-         of "ja", "ko", or "zh", assume the use of CJK fonts.  This is
-	 stored in lc_ctype_cjk_lang and tested in wcwidth() to figure
-	 out the width to return (1 or 2) for the "CJK Ambiguous Width"
-	 category of characters. */
-      lc_ctype_cjk_lang = (strncmp (locale, "ja", 2) == 0
-			   || strncmp (locale, "ko", 2) == 0
-			   || strncmp (locale, "zh", 2) == 0);
+         of "ja", "ko", or "zh", assume the use of CJK fonts, unless the
+	 "@cjknarrow" modifier has been specified.
+	 The result is stored in lc_ctype_cjk_lang and tested in wcwidth()
+	 to figure out the width to return (1 or 2) for the "CJK Ambiguous
+	 Width" category of characters. */
+      lc_ctype_cjk_lang = !cjknarrow
+			  && ((strncmp (locale, "ja", 2) == 0
+			      || strncmp (locale, "ko", 2) == 0
+			      || strncmp (locale, "zh", 2) == 0));
 #endif
     }
   else if (category == LC_MESSAGES)
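
A minimal standalone sketch of the decision the patch above makes, for readers who want to try it outside of newlib. The helper name cjk_ambiguous_wide() is hypothetical; in the patch the result is stored in lc_ctype_cjk_lang inside loadlocale(), and the modifier is only parsed on _MB_CAPABLE targets. The sketch assumes the modifier is whatever follows the first '@' in the locale string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of newlib: returns 1 if "CJK Ambiguous
   Width" characters should be treated as wide (width 2) for the given
   locale string, 0 otherwise.  Mirrors the patch: wide for "ja", "ko",
   and "zh" unless the "@cjknarrow" modifier is present. */
static int
cjk_ambiguous_wide (const char *locale)
{
  const char *mod;
  int cjknarrow;

  assert (locale != NULL);

  /* Everything after the first '@' is taken as the modifier. */
  mod = strchr (locale, '@');
  cjknarrow = mod != NULL && strcmp (mod + 1, "cjknarrow") == 0;

  return !cjknarrow
         && (strncmp (locale, "ja", 2) == 0
             || strncmp (locale, "ko", 2) == 0
             || strncmp (locale, "zh", 2) == 0);
}
```

With this sketch, "ja_JP.UTF-8" and "zh_CN" yield 1, while "ja@cjknarrow" and "en_US.UTF-8" yield 0.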


-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth   failing configure tests])
  2009-06-15 11:43                   ` [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests]) Corinna Vinschen
@ 2009-06-15 15:58                     ` IWAMURO Motonori
  2009-06-15 17:08                       ` Corinna Vinschen
  2009-06-27 22:03                     ` Andy Koppe
  1 sibling, 1 reply; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-15 15:58 UTC (permalink / raw)
  To: cygwin, newlib

2009/6/15 Corinna Vinschen <corinna-cygwin@cygwin.com>:
>> Yes, but the guideline exists.
>> http://cygwin.com/ml/cygwin/2009-05/msg00444.html
>
> A single mail in a single mailing list of a single project.  That's rather
> a suggestion than a guideline...

Sorry, I was unclear. My quotation is from Unicode Standard Annex #11,
"East Asian Width".
Please see "When processing or displaying data" under "5 Recommendations"
at http://www.unicode.org/unicode/reports/tr11/ .

> If everybody agrees to this suggestion, here's the patch.

Is "cjk" a good prefix for the modifier name? It affects not CJK
characters but some symbols and European characters.
Please refer to Andy's opinion:
http://cygwin.com/ml/cygwin/2009-06/msg00240.html

I personally propose "ambinarrow", because the corresponding Vim option is
"ambiwidth".

Also, the current scheme is not symmetrical. How about the following
patch? (I have not changed the prefix of the modifier name.)

--- libc/locale/locale.c.ORIG	2009-06-15 23:05:40.812500000 +0900
+++ libc/locale/locale.c	2009-06-15 22:56:35.546875000 +0900
@@ -398,7 +398,8 @@
   int (*l_mbtowc) (struct _reent *, wchar_t *, const char *, size_t,
 		   const char *, mbstate_t *);
 #ifdef _MB_CAPABLE
-  int cjknarrow = 0;
+#define CJK_DEFAULT -1
+  int cjk_lang = CJK_DEFAULT;
 #endif

   /* "POSIX" is translated to "C", as on Linux. */
@@ -453,11 +454,14 @@
       if (c[0] == '@')
 	{
 	  /* Modifier */
-	  /* Only one modifier is recognized right now.	 "cjknarrow" is used
-	     to modify the behaviour of wcwidth() for East Asian languages.
-	     For details see the comment at the end of this function. */
+	  /* Only one modifier is recognized right now.	 "cjknarrow" and
+	     "cjkwide" are used to modify the behaviour of wcwidth() for
+	     East Asian languages. For details see the comment at the
+	     end of this function. */
 	  if (!strcmp (c + 1, "cjknarrow"))
-	    cjknarrow = 1;
+	    cjk_lang = 0;
+	  else if (!strcmp (c + 1, "cjkwide"))
+	    cjk_lang = 1;
 	}
 #endif
     }
@@ -627,10 +631,11 @@
 	The result is stored in lc_ctype_cjk_lang and tested in wcwidth()
 	to figure out the width to return (1 or 2) for the "CJK Ambiguous
 	Width" category of characters. */
-      lc_ctype_cjk_lang = !cjknarrow
-			 && ((strncmp (locale, "ja", 2) == 0
-			     || strncmp (locale, "ko", 2) == 0
-			     || strncmp (locale, "zh", 2) == 0));
+      lc_ctype_cjk_lang = cjk_lang != CJK_DEFAULT
+			? cjk_lang
+			: ((strncmp (locale, "ja", 2) == 0
+			   || strncmp (locale, "ko", 2) == 0
+			   || strncmp (locale, "zh", 2) == 0));
 #endif
     }
   else if (category == LC_MESSAGES)
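
The symmetric variant proposed in this patch can be sketched as a standalone function. This is only an illustration under assumptions: the helper name cjk_ambiguous_wide_tristate() is hypothetical, and the modifier is taken to be whatever follows the first '@'. As in the patch, an explicit "@cjknarrow" or "@cjkwide" overrides the language-based default.

```c
#include <assert.h>
#include <string.h>

/* Same sentinel as in the patch: no explicit modifier was given. */
enum { CJK_DEFAULT = -1 };

/* Hypothetical helper, not part of newlib: returns 1 if "CJK Ambiguous
   Width" characters should be wide, 0 if narrow. */
static int
cjk_ambiguous_wide_tristate (const char *locale)
{
  const char *mod;
  int cjk_lang = CJK_DEFAULT;

  assert (locale != NULL);

  mod = strchr (locale, '@');
  if (mod != NULL)
    {
      if (strcmp (mod + 1, "cjknarrow") == 0)
        cjk_lang = 0;
      else if (strcmp (mod + 1, "cjkwide") == 0)
        cjk_lang = 1;
    }

  /* An explicit modifier wins; otherwise fall back to the language. */
  if (cjk_lang != CJK_DEFAULT)
    return cjk_lang;

  return strncmp (locale, "ja", 2) == 0
         || strncmp (locale, "ko", 2) == 0
         || strncmp (locale, "zh", 2) == 0;
}
```

The visible difference from the one-sided scheme is that a non-CJK locale such as "en_US.UTF-8@cjkwide" yields 1 here.
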
-- 
IWAMURO Motnori <http://vmi.jp/>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth  failing configure tests])
  2009-06-15 15:58                     ` IWAMURO Motonori
@ 2009-06-15 17:08                       ` Corinna Vinschen
  2009-06-15 17:14                         ` IWAMURO Motonori
  2009-06-18 15:57                         ` Thomas.Wolff
  0 siblings, 2 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-06-15 17:08 UTC (permalink / raw)
  To: cygwin, newlib

On Jun 15 23:35, IWAMURO Motonori wrote:
> 2009/6/15 Corinna Vinschen:
> > If everybody agrees to this suggestion, here's the patch.
> 
> Is the name of modifier prefix "cjk-" good? It influences not CJK
> characters but a part of symbols and European characters.
> Please refer to Andy's opinion:
> http://cygwin.com/ml/cygwin/2009-06/msg00240.html
> 
> It personally proposes "ambinarrow" because the switch of Vim is "ambiwidth".

I think "cjk" in the name is the right choice.  There are no ambiguous
characters in western languages (well, probably there are, but the
ambiguity is not on the level of character widths).  This is a problem
which only has a meaning in these so called CJK languages.  It makes
sense to me to use this in the modifier name.

> And, I don't think that it is symmetrical. How about the following
> patch? (I have not changed the name of modifier prefix)

I'm not convinced that we need symmetry.  It looks like a nice idea for
Cygwin or newlib, given that the setlocale language string is checked
and picked to pieces hardcoded in the loadlocale function.

However, besides being unnecessary, other systems like Linux or BSD
use the language string as directory name relative to the
/usr/share/locale directory.  If this ever gets used on non-Cygwin
systems, the symmetry (which has no precedent in the locale arena) would
require these systems to create yet another subdirectory or symlink for
the same purpose.  Even worse, if you propose that @cjkwide is a valid
modifier for *any* language, you would make the whole mechanism on
non-newlib based systems more complicated for no apparent reason.

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth   failing configure tests])
  2009-06-15 17:08                       ` Corinna Vinschen
@ 2009-06-15 17:14                         ` IWAMURO Motonori
  2009-06-18 15:57                         ` Thomas.Wolff
  1 sibling, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-15 17:14 UTC (permalink / raw)
  To: cygwin, newlib

OK. I withdraw my proposal.

2009/6/16 Corinna Vinschen <corinna-cygwin@cygwin.com>:
> On Jun 15 23:35, IWAMURO Motonori wrote:
>> 2009/6/15 Corinna Vinschen:
>> > If everybody agrees to this suggestion, here's the patch.
>>
>> Is the name of modifier prefix "cjk-" good? It influences not CJK
>> characters but a part of symbols and European characters.
>> Please refer to Andy's opinion:
>> http://cygwin.com/ml/cygwin/2009-06/msg00240.html
>>
>> It personally proposes "ambinarrow" because the switch of Vim is "ambiwidth".
>
> I think "cjk" in the name is the right choice.  There are no ambiguous
> characters in western languages (well, probably there are, but the
> ambiguity is not on the level of character widths).  This is a problem
> which only has a meaning in these so called CJK languages.  It makes
> sense to me to use this in the modifier name.
>
>> And, I don't think that it is symmetrical. How about the following
>> patch? (I have not changed the name of modifier prefix)
>
> I'm not convinced that we need symmetry.  It looks like a nice idea for
> Cygwin or newlib, given that the setlocale language string is checked
> and picked to pieces hardcoded in the loadlocale function.
>
> However, besides of being unnecessary, other systems like Linux or BSD
> use the language string as directory name relative to the
> /usr/share/locale directory.  If this gets ever used on non-Cygwin
> systems, the symmetry (which has no precedent in the locale arena) would
> require these systems to create yet another subdirectory or symlink for
> the same purpose.  Even worse, if you propose that @cjkwide is a valid
> modifier for *any* language, you would make the whole mechanism on
> non-newlib based systems more complicated for no apparent reason.
>
>
> Corinna
>
> --
> Corinna Vinschen                  Please, send mails regarding Cygwin to
> Cygwin Project Co-Leader          cygwin AT cygwin DOT com
> Red Hat
>



-- 
IWAMURO Motnori <http://vmi.jp/>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests])
  2009-06-15 17:08                       ` Corinna Vinschen
  2009-06-15 17:14                         ` IWAMURO Motonori
@ 2009-06-18 15:57                         ` Thomas.Wolff
  2009-06-18 16:49                           ` Corinna Vinschen
                                             ` (2 more replies)
  1 sibling, 3 replies; 36+ messages in thread
From: Thomas.Wolff @ 2009-06-18 15:57 UTC (permalink / raw)
  To: cygwin, newlib; +Cc: IWAMURO Motonori, Andy Koppe

2009/6/16 Corinna Vinschen <corinna-cygwin@cygwin.com>:
> On Jun 15 23:35, IWAMURO Motonori wrote:
>> 2009/6/15 Corinna Vinschen:
>> > If everybody agrees to this suggestion, here's the patch.
>>
>> Is the name of modifier prefix "cjk-" good? It influences not CJK
>> characters but a part of symbols and European characters.
>> Please refer to Andy's opinion:
>> http://cygwin.com/ml/cygwin/2009-06/msg00240.html
>>
>> It personally proposes "ambinarrow" because the switch of Vim is "ambiwidth".
>
> I think "cjk" in the name is the right choice.  There are no ambiguous
> characters in western languages (well, probably there are, but the
> ambiguity is not on the level of character widths).  This is a problem
> which only has a meaning in these so called CJK languages.  It makes
> sense to me to use this in the modifier name.
I agree with keeping "cjk" in the modifier name (also because the xterm 
option is called -cjk_width) but for the historic understanding, it's 
actually quite the other way round:

In traditional CJK character encodings, fonts, and terminal 
applications, basically ALL characters were wide, including a subset 
of Latin characters as it happened to be included in those character 
sets, and sometimes even including the ASCII range.
These are the ones considered "ambiguous" since they used to be wide, 
while in all non-CJK environments they are not (excluding ASCII which 
is thus mirrored in the range "Halfwidth and Fullwidth Forms", 
U+FF00 ... U+FF5E).
This also explains the chaotic mix of wide and narrow characters in ranges 
like Latin-1 Supplement, Latin Extended, Greek and Cyrillic which is in no 
way useful for any user; it's just a legacy compatibility issue.
I think the major usage for CJK users nowadays is about ranges like 
Arrows, Enclosed Alphanumerics (with circled digits), Box Drawing etc.
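
The "mirroring" of ASCII mentioned above can be illustrated with a small sketch. It relies on the fact that the fullwidth ASCII variants are assigned at U+FF01 through U+FF5E, in the same order as U+0021 through U+007E, so the two ranges differ by a constant offset of 0xFEE0. The helper name to_fullwidth() is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical helper: map a printable ASCII character other than
   space (U+0021..U+007E) to its counterpart in the "Halfwidth and
   Fullwidth Forms" block (U+FF01..U+FF5E).  Any other code point is
   returned unchanged. */
static wchar_t
to_fullwidth (wchar_t wc)
{
  if (wc >= 0x21 && wc <= 0x7E)
    {
      wchar_t fw = wc + 0xFEE0;

      /* Postcondition: the result lies in the fullwidth ASCII range. */
      assert (fw >= 0xFF01 && fw <= 0xFF5E);
      return fw;
    }
  return wc;
}
```

For example, 'A' (U+0041) maps to U+FF21, FULLWIDTH LATIN CAPITAL LETTER A. The space character is left alone here, since its wide counterpart lives elsewhere (U+3000).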


>> And, I don't think that it is symmetrical. How about the following
>> patch? (I have not changed the name of modifier prefix)
>
> I'm not convinced that we need symmetry.  It looks like a nice idea for
> Cygwin or newlib, given that the setlocale language string is checked
> and picked to pieces hardcoded in the loadlocale function.
Despite IWAMURO Motonori's withdrawal, I think symmetry would be the 
right approach to take. The major aspect is how to reflect the actual 
behaviour of existing terminal environments. And as a matter of fact, 
you can run both xterm and MinTTY with a non-CJK locale and ambiguous 
characters being wide. This is achieved by invoking xterm -cjk_width or 
by selecting an according font in MinTTY, e.g. Ming, SimSun, MS Mincho, 
or even just the popular Lucida Typewriter.
(Although it occurs to me that in the case of Lucida Typewriter this 
might be a bug since the wideness of ambiguous characters is just 
simulated in this configuration rather than using wide font characters -
Andy, can you please check this?)


> However, besides of being unnecessary, other systems like Linux or BSD
> use the language string as directory name relative to the
> /usr/share/locale directory.  If this gets ever used on non-Cygwin
> systems, the symmetry (which has no precedent in the locale arena) would
> require these systems to create yet another subdirectory or symlink for
> the same purpose.  Even worse, if you propose that @cjkwide is a valid
> modifier for *any* language, you would make the whole mechanism on
> non-newlib based systems more complicated for no apparent reason.
The silly unmodular way that some systems implement the locale mechanism 
(the worst of them being SunOS) 
should not be an argument to not propagate a reasonable solution.
[Who was in favour of these double negations?]

The "locale interface" (syntax and semantics of LC_* strings) is defined 
in a modular way and so the implementations should be - let them fix it.


Thomas


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth  failing configure tests])
  2009-06-18 15:57                         ` Thomas.Wolff
@ 2009-06-18 16:49                           ` Corinna Vinschen
  2009-06-19  0:08                           ` Andy Koppe
  2009-06-19 14:45                           ` Thomas Wolff
  2 siblings, 0 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-06-18 16:49 UTC (permalink / raw)
  To: cygwin, newlib

On Jun 18 14:09, Thomas.Wolff@nsn.com wrote:
> 2009/6/16 Corinna Vinschen 
> > However, besides of being unnecessary, other systems like Linux or BSD
> > use the language string as directory name relative to the
> > /usr/share/locale directory.  If this gets ever used on non-Cygwin
> > systems, the symmetry (which has no precedent in the locale arena) would
> > require these systems to create yet another subdirectory or symlink for
> > the same purpose.  Even worse, if you propose that @cjkwide is a valid
> > modifier for *any* language, you would make the whole mechanism on
> > non-newlib based systems more complicated for no apparent reason.
> The silly unmodular way that some systems implement the locale mechanism 
> (the worst of them being SunOS) 
> should not be an argument to not propagate a reasonable solution.
> [Who was in favour of these double negations?]
> 
> The "locale interface" (syntax and semantics of LC_* strings) is defined 
> in a modular way and so the implementations should be - let them fix it.

What do you think, how big will be the acceptance of this approach
outside of newlib/Cygwin?


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth   failing configure tests])
  2009-06-18 15:57                         ` Thomas.Wolff
  2009-06-18 16:49                           ` Corinna Vinschen
@ 2009-06-19  0:08                           ` Andy Koppe
  2009-06-19 14:45                           ` Thomas Wolff
  2 siblings, 0 replies; 36+ messages in thread
From: Andy Koppe @ 2009-06-19  0:08 UTC (permalink / raw)
  To: cygwin

2009/6/18 Thomas.Wolff:
> And as a matter of fact,
> you can run both xterm and MinTTY with a non-CJK locale and ambiguous
> characters being wide. This is achieved by invoking xterm -cjk_width or
> by selecting an according font in MinTTY, e.g. Ming, SimSun, MS Mincho,
> or even just the popular Lucida Typewriter.
> (Although it occurs to me that in the case of Lucida Typewriter this
> might be a bug since the wideness of ambiguous characters is just
> simulated in this configuration rather than using wide font characters -
> Andy, can you please check this?)

Yep, there's a problem here, thanks. I haven't got Lucida Typewriter,
but found my Vista install has Lucida Sans Typewriter. That font
doesn't actually have Greek or box drawing characters, so all I'm
getting is the square replacement character, but it does indeed take
up two cells for those. Turns out that's because Latin characters are
reported as having a width of 0.5 (of whatever unit) whereas the
replacement character is reported as being 0.625 wide. I'll adjust the
ambiguous-width detection.
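
The arithmetic behind this can be sketched as follows. This is not MinTTY's code, and the thread does not say how the detection was actually adjusted; both function names and the 1.5 threshold are assumptions made purely for illustration. The point is only that a rule of the form "wider than a Latin glyph means two cells" misfires on a replacement glyph that is 0.625 wide against a Latin width of 0.5, whereas a rule requiring a clearly larger ratio does not.

```c
#include <assert.h>

/* Hypothetical: cells occupied by a glyph if anything wider than the
   Latin reference glyph is treated as double-width. */
static int
cells_naive (double glyph, double latin)
{
  assert (latin > 0.0);
  return glyph > latin ? 2 : 1;
}

/* Hypothetical: cells occupied by a glyph if it must be at least 1.5
   times as wide as the Latin reference glyph to count as double-width.
   The factor 1.5 is an arbitrary choice for this sketch. */
static int
cells_thresholded (double glyph, double latin)
{
  assert (latin > 0.0);
  return glyph >= 1.5 * latin ? 2 : 1;
}
```

With the widths reported above, cells_naive (0.625, 0.5) gives 2, while cells_thresholded (0.625, 0.5) gives 1; a genuinely wide glyph of width 1.0 gives 2 under both rules.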

Andy


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests])
  2009-06-18 15:57                         ` Thomas.Wolff
  2009-06-18 16:49                           ` Corinna Vinschen
  2009-06-19  0:08                           ` Andy Koppe
@ 2009-06-19 14:45                           ` Thomas Wolff
  2009-06-19 14:49                             ` Corinna Vinschen
  2 siblings, 1 reply; 36+ messages in thread
From: Thomas Wolff @ 2009-06-19 14:45 UTC (permalink / raw)
  To: cygwin, newlib

I wrote:
> Despite IWAMURO Motonori's withdrawal, I think symmetry would be the 
> right approach to take. The major aspect is how to reflect the actual 
> behaviour of existing terminal environments. ...

> ...
> The "locale interface" (syntax and semantics of LC_* strings) is defined 
> in a modular way and so the implementations should be - let them fix it.

Corinna Vinschen wrote:
> What do you think, how big will be the acceptance of this approach
> outside of newlib/Cygwin?

I have no idea about the acceptance of the whole concept, especially 
(as I had warned) about changing the width of the CJK locales 
WITHOUT modifier as IWAMURO Motonori insisted.
But I guess a general solution of the width issue will be more 
appreciated than one that handles only the CJK locales.

Thomas


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth  failing configure tests])
  2009-06-19 14:45                           ` Thomas Wolff
@ 2009-06-19 14:49                             ` Corinna Vinschen
  0 siblings, 0 replies; 36+ messages in thread
From: Corinna Vinschen @ 2009-06-19 14:49 UTC (permalink / raw)
  To: cygwin, newlib

On Jun 19 13:02, Thomas Wolff wrote:
> I wrote:
> > Despite IWAMURO Motonori's withdrawal, I think symmetry would be the 
> > right approach to take. The major aspect is how to reflect the actual 
> > behaviour of existing terminal environments. ...
> 
> > ...
> > The "locale interface" (syntax and semantics of LC_* strings) is defined 
> > in a modular way and so the implementations should be - let them fix it.
> 
> Corinna Vinschen wrote:
> > What do you think, how big will be the acceptance of this approach
> > outside of newlib/Cygwin?
> 
> I have no idea about the acceptance of the whole concept, especially 
> (as I had warned) about changing the width of the CJK locales 
> WITHOUT modifier as IWAMURO Motonori insisted.
> But I guess a general solution of the width issue will be more 
> appreciated than one that handles only the CJK locales.

Well, if your proposal will be accepted by other projects, we can easily
extend our own implementation without changing existing functionality.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth   failing configure tests])
  2009-06-15 11:43                   ` [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests]) Corinna Vinschen
  2009-06-15 15:58                     ` IWAMURO Motonori
@ 2009-06-27 22:03                     ` Andy Koppe
  2009-06-28  8:18                       ` IWAMURO Motonori
  1 sibling, 1 reply; 36+ messages in thread
From: Andy Koppe @ 2009-06-27 22:03 UTC (permalink / raw)
  To: cygwin, newlib

2009/6/15 Corinna Vinschen:
>> > Define the default for ja, ko, and zh to use width = 2, with a
>> > @cjknarrow (or whatever) modifier to use width = 1.
>>
>> I think it is good idea.
>
> If everybody agrees to this suggestion, here's the patch.  Tested
> with various combinations like
>
>  LANG=ja_JP.UTF-8@cjknarrow
>  LANG=ja_JP@cjknarrow
>  LANG=ja.UTF-8@cjknarrow
>  LANG=ja@cjknarrow

Apologies for harping on about this, especially as it was me who
suggested the @narrow scheme in the first place, but I do think this
is the wrong way to go.

MinTTY currently ignores POSIX locales completely, so I've been
pondering how to deal with locales and codepages more properly. One
thing I'd like to do is to automatically set LANG depending on the
Windows locale and the codepage and font settings in MinTTY (if LANG
isn't set already, that is).

Trouble is, what do I do if a cjkwide font is selected, yet the
Windows locale is not East Asian? I can't just randomly stick the user
into one of the three CJK countries, because people don't always take
kindly to being put into the wrong country.

That could be addressed by adding the @cjkwide modifier for non-CJK
languages, as discussed previously, but then MinTTY would still need
to parse the language setting to decide which modifier (if any) needs
to be used. Having the @cjkwide modifier only, independent of the
selected language, would keep things much easier to use and explain.

And then there's the Linux compatibility angle, where ja_JP.UTF-8
means ambiguous width 1 not 2.

To try to help with changing this, here's some text for the user guide.

Replace this:
"Right now the language and territory, as well as the modifier, are
not important to Cygwin, except to fix a single problem. There's a
class of characters in the Unicode character set, called the "CJK
Ambiguous Width Character set". For these characters the width
returned by the wcwidth/wcswidth function is usually 1. This is often
a problem in East-Asian languages, which historically use character
sets in which these characters have a width of 2. Kind of explains why
they are called "ambiguous"...

The problem has been fixed for now like this. wcwidth/wcswidth usually
return 1 as the width of these characters. However, if the language is
specifed as "ja" (Japanese), "ko" (Korean), or "zh" (Chinese), wcwidth
returns 2 for these characters. Unfortunately this isn't correct in
all circumstances, so the user can specify the modifier "@cjknarrow",
which modifies the behaviour of wcwidth/wcswidth to return 1 for the
ambiguous width characters to return 1 even in those languages."

With this:
"Right now the language and territory are not important to Cygwin, but
the modifier is used to deal with the issue of "CJK Ambiguous Width"
characters. For these characters the width returned by the wcwidth
function is usually 1. This is often a problem in East Asian
languages, which historically use character sets in which these
characters have a width of 2. Kind of explains why they are called
"ambiguous"... (See http://unicode.org/reports/tr11/ for a full
explanation.)

Therefore, if the modifier "@cjkwide" is specified, wcwidth returns 2
for these characters. For example, with ja_JP.UTF-8 their width is 1,
whereas with ja_JP.UTF-8@cjkwide it is 2."

Andy


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth   failing configure tests])
  2009-06-27 22:03                     ` Andy Koppe
@ 2009-06-28  8:18                       ` IWAMURO Motonori
  0 siblings, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-28  8:18 UTC (permalink / raw)
  To: cygwin

Hi.

2009/6/27 Andy Koppe <andy.koppe@gmail.com>:
> And then there's the Linux compatibility angle, where ja_JP.UTF-8
> means ambiguous width 1 not 2.

Please don't judge this based on the current behavior of Linux,
because:
- I don't think that behavior is correct.
- I am currently writing a patch for the problem.
-- 
IWAMURO Motnori <http://vmi.jp/>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-06-05 16:25         ` Thomas Wolff
                             ` (2 preceding siblings ...)
       [not found]           ` <3f0ad08d0906060242t275a78e7tb9913bf78d1c5e83@mail.gmail.com>
@ 2009-06-06 12:22           ` IWAMURO Motonori
  3 siblings, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-06-06 12:22 UTC (permalink / raw)
  To: newlib, cygwin

# Continuation of discussion.
#
# I hope that all applications will work correctly just by setting
# "LANG=ja_JP.UTF-8".
# I don't want to have to give up using the binary packages and keep
# applying many local patches.


> I don't think that it is the good idea because:
>
> - It is "a cygwin-specific solution (or workaround)".
> - In NetBSD, the change to which wcwidth of East Asian Ambiguous Characters returns 2 by CJK locale is planned.

- And I don't think that special cases should be given priority over
the general case.

>> - I heard that there is an existing implementation that behave like my
>> proposal. (Sorry, I didn't hear the system name.)
> Even if so, I think the way I described is more compatible with the locale
> mechanism as used elsewhere.

I think that ALL locale implementations should treat East Asian
Ambiguous Character Width as 2 for CJK locale.

>> It is no problem because we -- most Japanese language users -- need
>> not change the settings of mintty and locale after first setup.
>> We set LANG=ja_JP.UTF-8 and select a Japanese font for mintty.
> In any case, mined running in mintty will detect CJK width itself,
> regardless of locale setting, with coming versions of both programs
> even when it gets changed on-the-fly :)

Sorry, I can't understand above because I am not good at English.

> This sounds complicated.

I don't think so. I think that we should consider the following issues
if a new mechanism is introduced.

The existing locale / terminal API don't support:
- Unicode BiDi.
- Unicode control characters.
- Unicode combining characters.
- Multilingualization. (*)
- Detect font/fontset information selected with terminal emulator.
(including, need to consider the case of no-tty)

* Now, we can't use Japanese, Chinese, and Korean at the same time
even if we use Unicode.
  Because many font glyphs are quite different even if the code point
is the same in each language.

> With my proposal, an application that wishes to auto-adjust on width
> properties (maybe even when changing) and which (unlike mined) uses
> the system wcwidth functions could proceed as follows:
> * Detect CJK width by using a simple test string width detection.
> * (Optional) When receiving a SIGWINCH signal (future version of MinTTY),
>  repeat this detection.
> * If e.g. LC_CTYPE starts with "ja_JP.UTF-8", call setlocale with
>  either "ja_JP.UTF-8@cjkwidth" or "ja_JP.UTF-8".

How would it detect that? An application using wcwidth is not necessarily
run in a terminal emulator (e.g. a text formatter).
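
For context, one common way to do the "test string width detection" quoted above is to print an ambiguous-width character at column 1, send the standard "report cursor position" request (ESC [ 6 n), and read the column from the terminal's reply (ESC [ row ; col R). The thread does not say whether mined works exactly this way, so the sketch below is an assumption, and both helper names are hypothetical. It shows only the parsing step, which also makes the objection above concrete: there is a reply to parse only when a terminal is attached.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse a cursor position report of the form
   ESC [ row ; col R.  Returns 1 and fills in *row and *col on success,
   returns 0 if the reply is malformed. */
static int
parse_cursor_report (const char *reply, int *row, int *col)
{
  char end;

  assert (reply != NULL && row != NULL && col != NULL);

  if (sscanf (reply, "\033[%d;%d%c", row, col, &end) != 3 || end != 'R')
    return 0;
  return *row > 0 && *col > 0;
}

/* Hypothetical helper: width in cells of a character that was printed
   at column 1, given the column the terminal reports afterwards. */
static int
width_from_column (int col_after)
{
  return col_after - 1;
}
```

If the terminal reports column 3 after the test character, the character occupied two cells, so the application would pick the wide setting.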

>> > I'm not happy with the idea of a cygwin-specific solution (or workaround).
>> I think that it is not cygwin-specific solution.
> As I tried to suggest above, using "UTF-8" for different width data on one
> system would be quite specific, using the "@" modifier syntax would not.

"UTF-8" is only an encoding scheme. It does not specify the character width.
-- 
IWAMURO Motnori <http://vmi.jp/>



* Re: [Fwd: [1.7] wcwidth failing configure tests]
  2009-05-14 15:58     ` IWAMURO Motonori
  2009-05-14 17:26       ` Corinna Vinschen
  2009-05-20 16:52       ` Thomas Wolff
@ 2009-05-26 16:46       ` IWAMURO Motonori
  2 siblings, 0 replies; 36+ messages in thread
From: IWAMURO Motonori @ 2009-05-26 16:46 UTC (permalink / raw)
  To: newlib, cygwin

I am correcting my proposal.

2009/5/15 IWAMURO Motonori <deenheart@gmail.com>:
> I propose to use *_cjk() when the language part of LC_CTYPE
> is 'ja', 'ko', 'vi' or 'zh'.

Corrected: use *_cjk() when the language part of LC_CTYPE is 'ja', 'ko',
or 'zh'. I have removed 'vi', on the advice of a NetBSD locale maintainer.
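
The rule can be sketched as a small C check on the locale name. This is
only an illustration, not the code that went into newlib or Cygwin: the
function name locale_lang_is_cjk() is hypothetical, and it assumes the
usual "language_TERRITORY.codeset@modifier" layout of locale names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if the language part of LOCALE is
   "ja", "ko", or "zh", and 0 otherwise (including for NULL). */
int locale_lang_is_cjk(const char *locale)
{
    static const char *const cjk_langs[] = { "ja", "ko", "zh" };
    size_t i, len;

    if (locale == NULL)
        return 0;

    /* The language part ends at the first '_', '.', or '@', or at the
       end of the string. */
    len = strcspn(locale, "_.@");
    if (len != 2)
        return 0;

    for (i = 0; i < sizeof cjk_langs / sizeof cjk_langs[0]; i++)
        if (memcmp(locale, cjk_langs[i], 2) == 0)
            return 1;
    return 0;
}
```

With this check, "ja_JP.UTF-8" and "zh_CN" select the *_cjk() variants,
while "vi_VN.UTF-8", "fr_FR.UTF-8", and "C" do not.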
-- 
IWAMURO Motonori <http://vmi.jp/>



end of thread, other threads:[~2009-06-28  5:40 UTC | newest]

Thread overview: 36+ messages
2009-05-12 16:54 [Fwd: [1.7] wcwidth failing configure tests] Corinna Vinschen
2009-05-12 16:56 ` Andy Koppe
2009-05-12 17:32   ` Corinna Vinschen
2009-05-13 19:04     ` Andy Koppe
2009-05-13 19:40       ` Corinna Vinschen
2009-05-13 19:55         ` Andy Koppe
2009-05-14 15:58     ` IWAMURO Motonori
2009-05-14 17:26       ` Corinna Vinschen
2009-05-14 21:51         ` Jeff Johnston
2009-05-15 11:43           ` Corinna Vinschen
2009-05-20 16:52       ` Thomas Wolff
2009-05-20 19:41         ` IWAMURO Motonori
2009-06-05 16:25         ` Thomas Wolff
2009-06-06  7:24           ` Andy Koppe
2009-06-06 12:53             ` IWAMURO Motonori
2009-06-06  9:31           ` Corinna Vinschen
2009-06-06  9:56             ` Andy Koppe
2009-06-06 13:06             ` IWAMURO Motonori
     [not found]           ` <3f0ad08d0906060242t275a78e7tb9913bf78d1c5e83@mail.gmail.com>
2009-06-06  9:46             ` IWAMURO Motonori
2009-06-12 18:56             ` Thomas Wolff
2009-06-12 19:12               ` Corinna Vinschen
2009-06-15  0:30               ` IWAMURO Motonori
2009-06-15  4:34                 ` IWAMURO Motonori
2009-06-15 11:43                   ` [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests]) Corinna Vinschen
2009-06-15 15:58                     ` IWAMURO Motonori
2009-06-15 17:08                       ` Corinna Vinschen
2009-06-15 17:14                         ` IWAMURO Motonori
2009-06-18 15:57                         ` Thomas.Wolff
2009-06-18 16:49                           ` Corinna Vinschen
2009-06-19  0:08                           ` Andy Koppe
2009-06-19 14:45                           ` Thomas Wolff
2009-06-19 14:49                             ` Corinna Vinschen
2009-06-27 22:03                     ` Andy Koppe
2009-06-28  8:18                       ` IWAMURO Motonori
2009-06-06 12:22           ` [Fwd: [1.7] wcwidth failing configure tests] IWAMURO Motonori
2009-05-26 16:46       ` IWAMURO Motonori
