From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-locales-return-540-listarch-libc-locales=sources.redhat.com@sourceware.org>
Received: (qmail 4195 invoked by alias); 19 Oct 2006 23:54:03 -0000
Received: (qmail 4183 invoked by uid 22791); 19 Oct 2006 23:54:02 -0000
X-Spam-Status: No, hits=-0.0 required=5.0
	tests=AWL,BAYES_00,DNS_FROM_RFC_ABUSE,SPF_NEUTRAL
X-Spam-Check-By: sourceware.org
Message-ID: <45381092.2070401@gmail.com>
Date: Thu, 19 Oct 2006 23:54:00 -0000
From: =?UTF-8?B?IlJlc2hhdCBTYWJpcSAoUmXFn2F0KSI=?=
 <tatar.iqtelif.i18n@gmail.com>
User-Agent: Mozilla Thunderbird 1.5.0.7 (Windows/20060909)
MIME-Version: 1.0
To: =?UTF-8?B?THVkb3ZpYyBDb3VydMOocw==?= <ludovic.courtes@laas.fr>
CC: libc-locales@sources.redhat.com
Subject: Re: Weird case-insensitive collation
References: <87k62w1r7f.fsf@laas.fr>
In-Reply-To: <87k62w1r7f.fsf@laas.fr>
X-Enigmail-Version: 0.94.1.0
OpenPGP: id=262839AF;
	url=http://keyserver.veridis.com:11371
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Virus-Scanned: Symantec AntiVirus Scan Engine
Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:libc-locales-subscribe@sourceware.org>
List-Post: <mailto:libc-locales@sourceware.org>
List-Help: <mailto:libc-locales-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: libc-locales-owner@sourceware.org
X-SW-Source: 2006-q4/txt/msg00028.txt.bz2

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ludovic Courtès wrote:
> Hi,
> 
> `strcasecmp ()' behaves wrongly under the `fr_FR' locale.  Consider the
> following example program:
> 
>   #include <stdlib.h>
>   #include <stdio.h>
>   #include <locale.h>
>   #include <strings.h>
> 
>   int
>   main (int argc, char *argv[])
>   {
>     int result;
> 
>     if (!setlocale (LC_ALL, "fr_FR.ISO-8859-1"))
>       abort ();
> 
>     result = strcasecmp ("été", "Hiver");
>     printf ("result=%i\n", result);
> 
>     return (result < 0) ? 0 : 1;
>   }
> 
> Under French collation conventions, letter `é' (`e' with acute) comes
> before `h'.  Thus, the word "été" should be "lower than" the word
> "hiver".  `strcoll ()' returns the right answer (a negative number) but
> `strcasecmp ()' wrongfully returns a positive number, regardless of
> whether "hiver" is spelt with a capital `H' or not.
> 
> Is this a bug or am I missing something?
> 
> Thanks,
> Ludovic.
> 
I think this function is not locale-aware, so it compares the characters'
integer values, which naturally produces a positive result here. See
http://opengroup.org/onlinepubs/007908799/xsh/strcasecmp.html, which says:

  In the POSIX locale, strcasecmp() and strncasecmp() do upper to lower
  conversions, then a byte comparison. The results are unspecified in
  other locales.
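
To make the quoted description concrete, here is a minimal sketch of a
"lowercase, then compare bytes" routine. The name bytewise_casecmp is
hypothetical and this is not glibc's actual implementation, only an
illustration of the behaviour the specification describes:

```c
#include <ctype.h>

/* Hypothetical helper, not glibc's code: lowercase each byte with
   tolower () and compare the results as unsigned char values.  */
int
bytewise_casecmp (const char *a, const char *b)
{
  const unsigned char *p = (const unsigned char *) a;
  const unsigned char *q = (const unsigned char *) b;

  for (;; p++, q++)
    {
      int ca = tolower (*p);
      int cb = tolower (*q);

      /* The first differing pair of bytes decides the result.  */
      if (ca != cb)
        return ca - cb;

      /* Both strings ended at the same position: they are equal.  */
      if (ca == 0)
        return 0;
    }
}
```

The bytes are read as unsigned char so that values above 0x7f stay
positive; no collation table is consulted at any point.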

Since `é' is 0xe9 in ISO-8859-1, an extended (non-ASCII) character whose
value is greater than both h (0x68) and H (0x48), this doesn't seem to be
a bug to me.
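
If what you want is a comparison that ignores case but still follows the
locale's collation order, one possible approach is to fold case yourself
and then call strcoll (). The sketch below is only an illustration: the
names lower_copy and coll_casecmp are hypothetical, and it assumes a
single-byte encoding such as ISO-8859-1, where tolower () can be applied
one byte at a time.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed copy of S with every byte
   passed through tolower ().  Assumes a single-byte encoding.  */
static char *
lower_copy (const char *s)
{
  size_t n = strlen (s);
  char *r = malloc (n + 1);

  if (r != NULL)
    {
      /* i <= n so that the terminating NUL is copied as well.  */
      for (size_t i = 0; i <= n; i++)
        r[i] = (char) tolower ((unsigned char) s[i]);
    }
  return r;
}

/* Hypothetical helper: fold case in both strings, then let strcoll ()
   order them according to LC_COLLATE.  Aborts if allocation fails.  */
int
coll_casecmp (const char *a, const char *b)
{
  char *la = lower_copy (a);
  char *lb = lower_copy (b);
  int result;

  if (la == NULL || lb == NULL)
    abort ();

  result = strcoll (la, lb);
  free (la);
  free (lb);
  return result;
}
```

After a successful setlocale (LC_ALL, "fr_FR.ISO-8859-1"), a call such as
coll_casecmp ("été", "Hiver") should follow the same ordering that
strcoll () gave in your test, though I have not tried it on that locale.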

HTH,
Reshat.

- --
My public GPG key (ID 0x262839AF) is at: http://keyserver.veridis.com:11371
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)

iD8DBQFFOBCSO75ytyYoOa8RAtvJAJ4hh3k83W6rXdW5OQk1AzbZmybKDwCfWM98
y+onNxS2erMCG+Rc3S+sMmk=
=PuTh
-----END PGP SIGNATURE-----