From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 29 Oct 2015 16:14:00 -0000
From: Corinna Vinschen
To: cygwin@cygwin.com
Subject: Re: Bug in collation functions?
Message-ID: <20151029153516.GJ5319@calimero.vinschen.de>
References: <563148AF.1000502@cornell.edu> <5631996D.7040908@redhat.com> <20151029075050.GE5319@calimero.vinschen.de> <20151029083057.GH5319@calimero.vinschen.de> <56321815.7000203@cornell.edu>
In-Reply-To: <56321815.7000203@cornell.edu>

On Oct 29 08:59, Ken Brown wrote:
> On 10/29/2015 4:30 AM, Corinna Vinschen wrote:
> >On Oct 29 08:50, Corinna Vinschen wrote:
> >>On Oct 28 21:58, Eric Blake wrote:
> >>>On 10/28/2015 04:14 PM, Ken Brown wrote:
> >>>>It's my understanding that collation is supposed to take whitespace and
> >>>>punctuation into account in the POSIX locale but not in other locales.
> >>>
> >>>Not quite right. It is up to the locale definition whether whitespace
> >>>affects collation. But you are correct that in the POSIX locale,
> >>>whitespace must not be ignored in collation.
> >>>
> >>>>This doesn't seem to be the case on Cygwin. Here's a test case using
> >>>>wcscoll, but the same problem occurs with strcoll.
> >>>
> >>>That's because the locale definitions are different in cygwin than they
> >>>are in glibc. But it is not a bug in Cygwin; POSIX allows for different
> >>>systems to have different locale definitions while still using the same
> >>>locale name like en_US.UTF-8.
> >>
> >>Btw, strcoll and wcscoll in Cygwin are implemented using the Windows
> >>function CompareStringW with the LCID set to the locale matching the
> >>POSIX locale setting. I'm rather glad I didn't have to implement this
> >>by myself... :}
> >
> >OTOH, CompareString has a couple of flags to control its behaviour, see
> >https://msdn.microsoft.com/en-us/library/windows/desktop/dd317761%28v=vs.85%29.aspx
> >
> >Right now Cygwin calls CompareStringW with dwCmpFlags set to 0, but there
> >are flags like NORM_IGNORENONSPACE, NORM_IGNORESYMBOLS. I'm open to a
> >discussion how to change the settings to more closely resemble the rules
> >on Linux.
> >
> >E.g. wcscoll simply calls wcscmp rather than CompareStringW for the
> >C/POSIX locale anyway. So, would it makes sense to set the flags to
> >NORM_IGNORESYMBOLS in other locales?
>
> I think so. That's what the native Windows build of emacs does in this
> situation.

Is that all it's doing? I'm asking because using NORM_IGNORESYMBOLS
does not exactly resemble the behaviour on Linux on my W10 box:

      "11" > "1.1" in POSIX locale
  !!! "11" > "1.1" in en_US.UTF-8 locale
      "11" > "1 2" in POSIX locale
      "11" < "1 2" in en_US.UTF-8 locale


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
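
For anyone who wants to rerun the four comparisons listed in the mail on
their own system, here is a minimal sketch in portable C. It is not the test
case posted earlier in the thread; the helper names coll_sign, print_row and
run_comparisons are made up for illustration, and the program only uses the
standard setlocale, wcscoll and printf calls. The results for en_US.UTF-8
depend on the platform's locale definition, so none are claimed here.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper (not from the thread): returns the sign of
   wcscoll(a, b) under the named locale as -1, 0 or 1, or 2 if
   setlocale() reports that the locale is not available.
   Note that this changes the process-wide LC_COLLATE setting. */
int coll_sign(const char *locale_name, const wchar_t *a, const wchar_t *b)
{
    assert(a != NULL && b != NULL);
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return 2;
    int r = wcscoll(a, b);
    return (r > 0) - (r < 0);
}

/* Prints one row in the same form as the list in the mail. */
void print_row(const char *locale_name, const wchar_t *a, const wchar_t *b)
{
    static const char *const op[] = { "<", "=", ">" };
    int s = coll_sign(locale_name, a, b);
    if (s == 2)
        printf("locale %s not available\n", locale_name);
    else
        printf("\"%ls\" %s \"%ls\" in %s locale\n",
               a, op[s + 1], b, locale_name);
}

/* Runs "11" against "1.1" and "1 2" in the POSIX and en_US.UTF-8 locales. */
void run_comparisons(void)
{
    const char *const locales[] = { "POSIX", "en_US.UTF-8" };
    const wchar_t *const rhs[] = { L"1.1", L"1 2" };
    for (size_t i = 0; i < 2; i++)
        for (size_t j = 0; j < 2; j++)
            print_row(locales[j], L"11", rhs[i]);
}
```

Calling run_comparisons() from main prints the four rows in the same order
as above. In the POSIX locale both rows should come out as ">", since
collation there follows code point order and '1' sorts after both '.' and
' '. The two en_US.UTF-8 rows are the ones worth comparing between Cygwin
and Linux.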