public inbox for glibc-bugs@sourceware.org
* [Bug localedata/11559] New: Wrong sorting of the space in cs_CZ locale
@ 2010-04-30  5:40 martin dot edlman at gmail dot com
From: martin dot edlman at gmail dot com @ 2010-04-30  5:40 UTC
  To: glibc-bugs

There is a problem with how the space character is sorted in the cs_CZ locale
(and in many other locales).

According to Czech Standard 97 6030 Alphabetical ordering (Czech Institute of
Standards, Prague 1993) [Czech: ČSN 97 6030 Abecední řazení (Český normalizační
institut, Praha 1993)]:
The space between two contextual characters should be considered as a single
character, and it is sorted before the first letter of the alphabet. For
example, the following order is correct:

Novak Zdenek
Novakova Jana
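
To illustrate why the weight of the space matters for these two names, here is
a rough Python sketch. It is not glibc's collation algorithm: it models only
the first comparison level, uses lowercased ASCII code points in place of the
Czech alphabet, and the function names are made up for this example.

```python
# Simplified model: one comparison level, ASCII names, code-point order.

def key_space_ignored(name):
    # Models a rule where the space carries no weight at this level.
    return name.replace(" ", "").lower()

def key_space_first(name):
    # Models a rule where the space sorts before every letter
    # (U+0020 already precedes all ASCII letters by code point).
    return name.lower()

names = ["Novakova Jana", "Novak Zdenek"]

ignored_order = sorted(names, key=key_space_ignored)
space_first_order = sorted(names, key=key_space_first)

print(ignored_order)      # space ignored: "novako..." < "novakz..."
print(space_first_order)  # space first:   "novak " < "novako"
```

With the space ignored, "Novakova Jana" comes first; with the space sorted
before the letters, "Novak Zdenek" comes first, as the standard requires.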

So I propose changing the collation entry for the space from
<U0020> IGNORE;IGNORE;IGNORE;<U0020>
to
<U0020> <U0020>;IGNORE;<U0020>;<U0020> 

The problem is that this change also takes spaces at the beginning of a line
into account, which is not correct: it sorts "Novák" and "(space)Zounar" as

(space)Zounar
Novák

instead of the correct

Novák
(space)Zounar

The same applies to multiple spaces. They should be considered as one space, but
"Novák(space)Jan" and "Novák(space)(space)Zdenek" are incorrectly sorted as

Novák(space)(space)Zdenek
Novák(space)Jan

instead of the correct

Novák(space)Jan
Novák(space)(space)Zdenek
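
The two cases above (leading spaces and runs of spaces) can be sketched in
Python as a normalization step applied before comparison. This is only an
illustration of the desired ordering, not a proposal for how glibc should
implement it: the helper names are hypothetical, and lowercased code-point
comparison stands in for real Czech collation.

```python
import re

def normalize_spaces(text):
    # Drop leading spaces, then collapse each run of spaces into one.
    return re.sub(r" +", " ", text.lstrip(" "))

def naive_key(text):
    # Every space counts, including leading and repeated ones.
    return text.lower()

def standard_key(text):
    # Desired behavior: leading spaces ignored, runs treated as one space.
    return normalize_spaces(text).lower()

leading = ["Novák", " Zounar"]
multiple = ["Novák Jan", "Novák  Zdenek"]

naive_leading = sorted(leading, key=naive_key)
naive_multiple = sorted(multiple, key=naive_key)
fixed_leading = sorted(leading, key=standard_key)
fixed_multiple = sorted(multiple, key=standard_key)

print(naive_leading, naive_multiple)
print(fixed_leading, fixed_multiple)
```

The naive key reproduces the incorrect orders described above, and the
normalized key gives the expected ones.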

Is it possible to fix this behavior in the locale definition? That should
solve the problem and comply with the standard.

-- 
           Summary: Wrong sorting of the space in cs_CZ locale
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
        AssignedTo: libc-locales at sources dot redhat dot com
        ReportedBy: martin dot edlman at gmail dot com
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=11559
