public inbox for cygwin@cygwin.com
From: Ken Brown <kbrown@cornell.edu>
To: cygwin@cygwin.com
Subject: Re: Bug in collation functions?
Date: Thu, 29 Oct 2015 15:51:00 -0000
Message-ID: <56321815.7000203@cornell.edu>
In-Reply-To: <20151029083057.GH5319@calimero.vinschen.de>

On 10/29/2015 4:30 AM, Corinna Vinschen wrote:
> On Oct 29 08:50, Corinna Vinschen wrote:
>> On Oct 28 21:58, Eric Blake wrote:
>>> On 10/28/2015 04:14 PM, Ken Brown wrote:
>>>> It's my understanding that collation is supposed to take whitespace and
>>>> punctuation into account in the POSIX locale but not in other locales.
>>>
>>> Not quite right. It is up to the locale definition whether whitespace
>>> affects collation.  But you are correct that in the POSIX locale,
>>> whitespace must not be ignored in collation.
>>>
>>>> This doesn't seem to be the case on Cygwin.  Here's a test case using
>>>> wcscoll, but the same problem occurs with strcoll.
>>>
>>> That's because the locale definitions are different in cygwin than they
>>> are in glibc.  But it is not a bug in Cygwin; POSIX allows for different
>>> systems to have different locale definitions while still using the same
>>> locale name like en_US.UTF-8.
>>
>> Btw, strcoll and wcscoll in Cygwin are implemented using the Windows
>> function CompareStringW with the LCID set to the locale matching the
>> POSIX locale setting.  I'm rather glad I didn't have to implement this
>> by myself... :}
>
> OTOH, CompareString has a couple of flags to control its behaviour, see
> https://msdn.microsoft.com/en-us/library/windows/desktop/dd317761%28v=vs.85%29.aspx
>
> Right now Cygwin calls CompareStringW with dwCmpFlags set to 0, but there
> are flags like NORM_IGNORENONSPACE, NORM_IGNORESYMBOLS.  I'm open to a
> discussion how to change the settings to more closely resemble the rules
> on Linux.
>
> E.g.  wcscoll simply calls wcscmp rather than CompareStringW for the
> C/POSIX locale anyway.  So, would it make sense to set the flags to
> NORM_IGNORESYMBOLS in other locales?

I think so.  That's what the native Windows build of Emacs does in this
situation.  (I came across the issue because one of the tests in the
Emacs test suite was failing on Cygwin.)
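
For illustration only, here is a rough, portable sketch of the behaviour
being proposed: plain wcscmp in the C/POSIX locale, and a comparison that
skips symbols in any other locale.  It does not call CompareStringW and
is not Cygwin's actual code.  The names skip_ignorable,
compare_ignoring_symbols and sketch_wcscoll are hypothetical, and treating
whitespace as ignorable alongside punctuation is an assumption about what
NORM_IGNORESYMBOLS does, not something confirmed above.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: advance past characters that a symbol-ignoring
   collation would skip (assumed here: whitespace and punctuation). */
static const wchar_t *skip_ignorable(const wchar_t *s)
{
    while (*s != L'\0' && (iswspace((wint_t)*s) || iswpunct((wint_t)*s)))
        s++;
    return s;
}

/* Portable stand-in for a NORM_IGNORESYMBOLS-style comparison: compare
   code points one by one, skipping ignorable characters on both sides.
   A real locale-aware collation would do much more than this. */
static int compare_ignoring_symbols(const wchar_t *a, const wchar_t *b)
{
    for (;;) {
        a = skip_ignorable(a);
        b = skip_ignorable(b);
        if (*a != *b)
            return (*a < *b) ? -1 : 1;
        if (*a == L'\0')
            return 0;           /* both strings exhausted: equal */
        a++;
        b++;
    }
}

/* Hypothetical dispatch: wcscmp in the C/POSIX locale, the
   symbol-ignoring comparison in every other locale. */
static int sketch_wcscoll(const wchar_t *a, const wchar_t *b)
{
    const char *loc = setlocale(LC_COLLATE, NULL);  /* query only */

    assert(a != NULL && b != NULL);
    if (loc == NULL || strcmp(loc, "C") == 0 || strcmp(loc, "POSIX") == 0)
        return wcscmp(a, b);
    return compare_ignoring_symbols(a, b);
}
```

With the "C" locale that is in effect at program startup,
sketch_wcscoll(L"a b", L"ab") goes through wcscmp and returns a negative
value because the space sorts before 'b', whereas
compare_ignoring_symbols(L"a b", L"ab") returns 0.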

Ken



