On Oct 29 08:50, Corinna Vinschen wrote:
> On Oct 28 21:58, Eric Blake wrote:
> > On 10/28/2015 04:14 PM, Ken Brown wrote:
> > > It's my understanding that collation is supposed to take whitespace and
> > > punctuation into account in the POSIX locale but not in other locales.
> >
> > Not quite right.  It is up to the locale definition whether whitespace
> > affects collation.  But you are correct that in the POSIX locale,
> > whitespace must not be ignored in collation.
> >
> > > This doesn't seem to be the case on Cygwin.  Here's a test case using
> > > wcscoll, but the same problem occurs with strcoll.
> >
> > That's because the locale definitions are different in cygwin than they
> > are in glibc.  But it is not a bug in Cygwin; POSIX allows for different
> > systems to have different locale definitions while still using the same
> > locale name like en_US.UTF-8.
>
> Btw, strcoll and wcscoll in Cygwin are implemented using the Windows
> function CompareStringW with the LCID set to the locale matching the
> POSIX locale setting.  I'm rather glad I didn't have to implement this
> by myself... :}

OTOH, CompareString has a couple of flags to control its behaviour, see
https://msdn.microsoft.com/en-us/library/windows/desktop/dd317761%28v=vs.85%29.aspx

Right now Cygwin calls CompareStringW with dwCmpFlags set to 0, but
there are flags like NORM_IGNORENONSPACE and NORM_IGNORESYMBOLS.

I'm open to a discussion of how to change the settings to more closely
resemble the rules on Linux.  E.g. wcscoll simply calls wcscmp rather
than CompareStringW for the C/POSIX locale anyway.  So, would it make
sense to set the flags to NORM_IGNORESYMBOLS in other locales?


Corinna

--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
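
To make the proposal above concrete, here is a rough, portable sketch of the dispatch being discussed: plain wcscmp for the C/POSIX locale, and a symbol-ignoring comparison otherwise. This is not Cygwin's actual code. The names sketch_wcscoll, compare_ignoring_symbols and skip_symbols are hypothetical, and skipping everything that iswspace/iswpunct reports is only an assumed stand-in for what CompareStringW does with NORM_IGNORESYMBOLS; the real Windows rules (and real locale ordering of the remaining characters) will differ.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Advance past characters this sketch treats as "symbols": whitespace and
   punctuation.  ASSUMPTION: this only approximates NORM_IGNORESYMBOLS. */
static const wchar_t *skip_symbols(const wchar_t *s)
{
    while (*s != L'\0' && (iswspace((wint_t)*s) || iswpunct((wint_t)*s)))
        s++;
    return s;
}

/* Compare two strings by code point while ignoring symbols on both sides.
   A real implementation would hand the remaining characters to the
   locale-aware comparison instead of comparing raw code points. */
static int compare_ignoring_symbols(const wchar_t *a, const wchar_t *b)
{
    for (;;) {
        a = skip_symbols(a);
        b = skip_symbols(b);
        if (*a != *b)
            return (*a < *b) ? -1 : 1;
        if (*a == L'\0')
            return 0;           /* both strings exhausted: equal */
        a++;
        b++;
    }
}

/* Hypothetical dispatcher: is_c_locale is nonzero for the C/POSIX locale,
   where whitespace and punctuation must take part in the comparison. */
int sketch_wcscoll(const wchar_t *a, const wchar_t *b, int is_c_locale)
{
    if (is_c_locale)
        return wcscmp(a, b);
    return compare_ignoring_symbols(a, b);
}
```

With this sketch, L" b" sorts before L"a" when is_c_locale is nonzero (the space counts), but after it otherwise (the space is skipped), which is roughly the glibc-like behaviour Ken was expecting outside the POSIX locale.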