From: Ken Brown
To: cygwin
Subject: Bug in collation functions?
Date: Thu, 29 Oct 2015 07:41:00 -0000
Message-ID: <563148AF.1000502@cornell.edu>

It's my understanding that collation is supposed to take whitespace and punctuation into account in the POSIX locale but not in other locales. This doesn't seem to be the case on Cygwin.
Here's a test case using wcscoll, but the same problem occurs with strcoll.

$ cat wcscoll_test.c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

void
compare (const wchar_t *a, const wchar_t *b, const char *loc)
{
  setlocale (LC_COLLATE, loc);
  char res = wcscoll (a, b) < 0 ? '<' : '>';
  printf ("\"%ls\" %c \"%ls\" in %s locale\n", a, res, b, loc);
}

int
main ()
{
  compare (L"11", L"1.1", "POSIX");
  compare (L"11", L"1.1", "en_US.UTF-8");
  compare (L"11", L"1 2", "POSIX");
  compare (L"11", L"1 2", "en_US.UTF-8");
}

$ gcc wcscoll_test.c -o wcscoll_test

$ ./wcscoll_test
"11" > "1.1" in POSIX locale
"11" > "1.1" in en_US.UTF-8 locale
"11" > "1 2" in POSIX locale
"11" > "1 2" in en_US.UTF-8 locale

On Linux, the output from the same program is

"11" > "1.1" in POSIX locale
"11" < "1.1" in en_US.UTF-8 locale
"11" > "1 2" in POSIX locale
"11" < "1 2" in en_US.UTF-8 locale

Ken