From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 29 Oct 2015 15:35:00 -0000
From: Corinna Vinschen
To: cygwin@cygwin.com
Subject: Re: Bug in collation functions?
Message-ID: <20151029083057.GH5319@calimero.vinschen.de>
Reply-To: cygwin@cygwin.com
Mail-Followup-To: cygwin@cygwin.com
References: <563148AF.1000502@cornell.edu> <5631996D.7040908@redhat.com> <20151029075050.GE5319@calimero.vinschen.de>
In-Reply-To: <20151029075050.GE5319@calimero.vinschen.de>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="BoBwh7s2kSeeheTs"
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)
X-SW-Source: 2015-10/txt/msg00532.txt.bz2

--BoBwh7s2kSeeheTs
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

On Oct 29 08:50, Corinna Vinschen wrote:
> On Oct 28 21:58, Eric Blake wrote:
> > On 10/28/2015 04:14 PM, Ken Brown wrote:
> > > It's my understanding that collation is supposed to take whitespace and
> > > punctuation into account in the POSIX locale but not in other locales.
> >
> > Not quite right. It is up to the locale definition whether whitespace
> > affects collation. But you are correct that in the POSIX locale,
> > whitespace must not be ignored in collation.
> >
> > > This doesn't seem to be the case on Cygwin. Here's a test case using
> > > wcscoll, but the same problem occurs with strcoll.
> >
> > That's because the locale definitions are different in cygwin than they
> > are in glibc. But it is not a bug in Cygwin; POSIX allows for different
> > systems to have different locale definitions while still using the same
> > locale name like en_US.UTF-8.
>
> Btw, strcoll and wcscoll in Cygwin are implemented using the Windows
> function CompareStringW with the LCID set to the locale matching the
> POSIX locale setting. I'm rather glad I didn't have to implement this
> by myself... :}

OTOH, CompareString has a couple of flags to control its behaviour, see
https://msdn.microsoft.com/en-us/library/windows/desktop/dd317761%28v=vs.85%29.aspx

Right now Cygwin calls CompareStringW with dwCmpFlags set to 0, but
there are flags like NORM_IGNORENONSPACE and NORM_IGNORESYMBOLS. I'm open
to a discussion of how to change the settings to more closely resemble
the rules on Linux.

E.g. wcscoll simply calls wcscmp rather than CompareStringW for the
C/POSIX locale anyway. So, would it make sense to set the flags to
NORM_IGNORESYMBOLS in other locales?

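A rough sketch of what such a change could look like, purely as an
illustration and not the actual Cygwin source: sketch_wcscoll, its
posix_locale parameter and the two helper functions are made-up names,
and LOCALE_USER_DEFAULT merely stands in for the LCID that would be
derived from the POSIX locale setting. The non-Windows branch is a crude
stand-in that skips whitespace and punctuation and then compares code
points, so the effect can be tried anywhere; it is not a claim about
what NORM_IGNORESYMBOLS does in every case.

```c
#include <wchar.h>
#include <wctype.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Hypothetical helper: advance past whitespace and punctuation. */
static const wchar_t *skip_symbols(const wchar_t *s)
{
  while (*s != L'\0' && (iswspace((wint_t)*s) || iswpunct((wint_t)*s)))
    s++;
  return s;
}

/* Hypothetical portable stand-in for a symbol-ignoring comparison.
   It orders by code point, which is NOT real linguistic collation. */
static int compare_ignoring_symbols(const wchar_t *a, const wchar_t *b)
{
  for (;;)
    {
      a = skip_symbols(a);
      b = skip_symbols(b);
      if (*a != *b)
        return *a < *b ? -1 : 1;
      if (*a == L'\0')
        return 0;            /* both strings exhausted: equal */
      a++;
      b++;
    }
}
#endif

/* Hypothetical sketch, not Cygwin code.  posix_locale is nonzero when
   the current LC_COLLATE locale is C/POSIX. */
int sketch_wcscoll(const wchar_t *a, const wchar_t *b, int posix_locale)
{
  if (posix_locale)
    return wcscmp(a, b);     /* C/POSIX: plain code point order */
#ifdef _WIN32
  /* Assumption: LOCALE_USER_DEFAULT instead of the locale's real LCID. */
  int r = CompareStringW(LOCALE_USER_DEFAULT, NORM_IGNORESYMBOLS,
                         a, -1, b, -1);
  if (r == 0)                /* 0 signals failure; fall back to wcscmp */
    return wcscmp(a, b);
  return r - CSTR_EQUAL;     /* map 1/2/3 to -1/0/1 */
#else
  return compare_ignoring_symbols(a, b);
#endif
}
```

In the portable fallback, L"a b" and L"ab" compare equal when
posix_locale is 0, while the C/POSIX path still sorts L"a b" first
because the space has a lower code point than 'b'.
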
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat

--BoBwh7s2kSeeheTs--