From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 5401 invoked by alias); 9 May 2006 16:01:08 -0000
Received: (qmail 5369 invoked by uid 48); 9 May 2006 16:00:51 -0000
Date: Tue, 09 May 2006 16:01:00 -0000
Message-ID: <20060509160051.5368.qmail@sourceware.org>
From: "mfabian at suse dot de"
To: glibc-bugs@sources.redhat.com
In-Reply-To: <20060509154921.2648.mfabian@suse.de>
References: <20060509154921.2648.mfabian@suse.de>
Reply-To: sourceware-bugzilla@sourceware.org
Subject: [Bug libc/2648] localedata/locales/es_ES has incorrect LC_COLLATE handling
X-Bugzilla-Reason: CC
Mailing-List: contact glibc-bugs-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe:
List-Post:
List-Help: ,
Sender: glibc-bugs-owner@sourceware.org
X-SW-Source: 2006-05/txt/msg00069.txt.bz2
List-Id:

------- Additional Comments From mfabian at suse dot de 2006-05-09 16:00 -------
Original comment in the Novell bugzilla:

When LC_COLLATE=es_ES, the sort command ignores spaces in its sorting
algorithm, so it sorts

    MAS PUJADAS, FRANCESC

after

    MASOLIVER GARCIA, JAIME

instead of before, even though the comments in
/usr/share/i18n/locales/es_ES indicate that the sorting algorithm for
this locale should take spaces into account (and sort them before
punctuation characters, numbers and letters).

This Spanish customer is not using LC_COLLATE="POSIX" because the sort
command gives incorrect results when dealing with characters with
Spanish accents, so he has to use LC_COLLATE="es_ES.UTF-8", which is
ignoring spaces.

Even /usr/share/i18n/locales/es_ES states:

LC_COLLATE
% Base collation scheme: 1994-03-22
% Ordering algorithm:
%  1. Spaces and hyphen (but not soft
%     hyphen) before punctuation
%     characters, punctuation characters
%     before numbers,
%     numbers before letters.

I also tested it with every other language setting and the results are
always the same:

mortlach:~ # export LC_COLLATE="POSIX"
mortlach:~ # sort demo
AB CDESY
ABC DETZ
ABCD ETX
mortlach:~ # export LC_COLLATE="en_GB.UTF-8"
mortlach:~ # sort demo
AB CDESY
ABCD ETX
ABC DETZ
mortlach:~ # export LC_COLLATE="de_DE.UTF-8"
mortlach:~ # sort demo
AB CDESY
ABCD ETX
ABC DETZ

So the question is why LC_COLLATE="POSIX" behaves differently from every
other language setting. If this is a feature, where is it documented,
and why is it so? It doesn't make sense that LC_COLLATE="POSIX" behaves
differently from the English settings (UK & US), which in turn behave
exactly the same way as any other language setting, so there must be a
reason for this.

-- 
http://sourceware.org/bugzilla/show_bug.cgi?id=2648
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
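
The outputs quoted in the report are consistent with two different comparison strategies: a plain byte-wise comparison under LC_COLLATE="POSIX", and a comparison that skips spaces on its first pass under the other locales. The following is a minimal Python sketch of that difference. It is only a simplified model, not glibc's actual collation code: the function names posix_key and space_ignoring_key are hypothetical, and the "ignore spaces, then break ties on the raw string" rule is an assumption chosen because it reproduces the orderings shown in the report.

```python
# Simplified model of the two orderings seen in the report.
# Assumption: non-POSIX collation is approximated by dropping spaces for
# the primary comparison and using the raw string only as a tie-breaker.
# Real locale collation uses multi-level weights; this is not that code.

def posix_key(s: str) -> bytes:
    """Byte-wise comparison, as in the POSIX/C locale (space is 0x20)."""
    return s.encode("ascii")

def space_ignoring_key(s: str) -> tuple:
    """Compare with spaces removed first, then fall back to the raw string."""
    return (s.replace(" ", ""), s)

# Contents of the 'demo' file from the report (input order is arbitrary).
demo = ["ABCD ETX", "ABC DETZ", "AB CDESY"]

# The two customer names from the original Novell bug comment.
names = ["MASOLIVER GARCIA, JAIME", "MAS PUJADAS, FRANCESC"]

for label, key in (("byte-wise", posix_key),
                   ("space-ignoring", space_ignoring_key)):
    print(label)
    for line in sorted(demo, key=key):
        print("   ", line)
    for line in sorted(names, key=key):
        print("   ", line)
```

In this model the byte-wise key puts "MAS PUJADAS, FRANCESC" first because the space (0x20) compares lower than "O", while the space-ignoring key compares "MASP..." against "MASO..." and therefore puts "MASOLIVER GARCIA, JAIME" first, which matches the behaviour the customer observed.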