public inbox for libc-locales@sourceware.org
* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: aldocassola at gmail dot com @ 2014-02-19 20:03 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

Aldo <aldocassola at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|2.18                        |unspecified

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug localedata/16608] New: es_US locale has invalid collation rules for 'ch' and 'll'
From: aldocassola at gmail dot com @ 2014-02-19 20:03 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

            Bug ID: 16608
           Summary: es_US locale has invalid collation rules for 'ch' and
                    'll'
           Product: glibc
           Version: 2.18
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: aldocassola at gmail dot com
                CC: libc-locales at sourceware dot org

The es_EC locale file (which depends on es_US) defines 'ch' and 'll' as
standalone letters for collation. This collation has been incorrect under the
rules of the Spanish Royal Academy since 1994 (see
http://www.rae.es/consultas/exclusion-de-ch-y-ll-del-abecedario).

Under those rules, 'ch' and 'll' are sorted as ordinary letter sequences:
a 'c' followed by an 'h', and a double 'l'.

E.g.:
incorrect (current): file_ce, file_cf, file_cg, file_cz, file_ch
correct (expected): file_ce, file_cf, file_cg, file_ch, file_cz

The es_ES file specifies the correct behavior, and the rest of the es_* files
depend on it.

Please either make es_EC depend on es_ES or fix es_US.
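
For illustration only, the two orderings above can be reproduced with a small
Python sketch. The helper name traditional_key is hypothetical, and the
comparison is a simplified model (lowercase ASCII, no 'ñ' handling), not the
actual glibc or CLDR collation algorithm:

```python
def traditional_key(word):
    """Model the pre-reform order: 'ch' sorts after every other 'c' sequence
    and 'll' after every other 'l' sequence (lowercase input assumed)."""
    key = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            # The digraph counts as one letter placed right after its first character.
            key.append((pair[0], 1))
            i += 2
        else:
            key.append((word[i], 0))
            i += 1
    return key


names = ["file_cz", "file_ch", "file_ce", "file_cg", "file_cf"]

# Behavior reported as incorrect: 'ch' treated as a standalone letter.
print(" ".join(sorted(names, key=traditional_key)))
# file_ce file_cf file_cg file_cz file_ch

# Expected behavior: 'c' and 'h' compared as two ordinary letters.
print(" ".join(sorted(names)))
# file_ce file_cf file_cg file_ch file_cz
```

The plain sorted() call stands in for the expected order only because these
names are lowercase ASCII; a real Spanish collation also has to place 'ñ'
between 'n' and 'o'.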


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: carlos at redhat dot com @ 2014-02-19 20:04 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |WAITING
                 CC|                            |carlos at redhat dot com

--- Comment #1 from Carlos O'Donell <carlos at redhat dot com> ---
Do we know what CLDR does here?


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: aldocassola at gmail dot com @ 2014-02-19 20:05 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

--- Comment #2 from Aldo <aldocassola at gmail dot com> ---
I'm not entirely sure this is the right link, but CLDR seems to agree with
the RAE rules:

http://st.unicode.org/cldr-apps/v#/es_EC/Alphabetic_Information/


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: carlos at redhat dot com @ 2014-02-19 20:05 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

--- Comment #3 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Aldo from comment #2)
> Not entirely sure if this link is the right one, but it seems they agree
> with the rules:
> 
> http://st.unicode.org/cldr-apps/v#/es_EC/Alphabetic_Information/

That doesn't provide enough information. If, for example, you sort with libicu
(http://site.icu-project.org/) and the result comes out as expected, that
would show that CLDR has the same interpretation. Given our desire to
harmonize better with CLDR, we would then make the change locally.


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: aldocassola at gmail dot com @ 2014-02-19 20:05 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

--- Comment #4 from Aldo <aldocassola at gmail dot com> ---
That makes sense. However, es_EC is the only locale of a Latin American country
that inherits its collation from es_US. We do use the US dollar, but text is
collated as specified by the language authority, as in the other countries, so
this looks more like a bug to me.


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: aldocassola at gmail dot com @ 2014-02-19 21:06 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

--- Comment #5 from Aldo <aldocassola at gmail dot com> ---
Created attachment 7429
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7429&action=edit
icu sort sample program


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: aldocassola at gmail dot com @ 2014-02-19 21:31 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

--- Comment #6 from Aldo <aldocassola at gmail dot com> ---
Comment on attachment 7429
  --> https://sourceware.org/bugzilla/attachment.cgi?id=7429
icu sort sample program

I have written a small sort program using libicu to sort the strings "ca",
"ch", "cz", and "cñ". Compile it with:
gcc sort.c -licui18n -licuuc -licuio

It takes the locale as the first command-line argument.

A sample run:

$ ./a.out es_EC
Unsorted array (using: es_EC)
ch cñ cz ca
Sorted array (using: es_EC)
ca ch cñ cz
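
The run above shows what ICU (and therefore CLDR) does. As a rough counterpart
on the glibc side, the same four strings can be sorted through the C library's
collation from Python. This is only a sketch: the function name glibc_sorted is
hypothetical, and it assumes a locale named "es_EC.UTF-8" has been generated on
the system. It returns None when the locale is not available:

```python
import locale


def glibc_sorted(words, locale_name="es_EC.UTF-8"):
    """Sort words using the C library's LC_COLLATE rules for locale_name.

    Returns None if the locale is not installed on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm maps each string to a key that compares in collation order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


print(glibc_sorted(["ch", "cñ", "cz", "ca"]))
```

If the output differs from the ICU order shown above, the installed es_EC
locale and CLDR disagree on these strings.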


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: fweimer at redhat dot com @ 2014-06-13  8:14 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: maiku.fabian at gmail dot com @ 2017-10-21  8:25 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maiku.fabian at gmail dot com


* [Bug localedata/16608] es_US locale has invalid collation rules for 'ch' and 'll'
From: maiku.fabian at gmail dot com @ 2024-01-04 12:05 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=16608

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|---                         |FIXED
           Assignee|unassigned at sourceware dot org   |maiku.fabian at gmail dot com
   Target Milestone|---                         |2.38

--- Comment #7 from Mike FABIAN <maiku.fabian at gmail dot com> ---
I think this is fixed.

Currently **all** es locales inherit their collation from es_ES:

mfabian@hathi:/local/mfabian/src/glibc/localedata/locales (master $%)
$ grep -A2 ^LC_COLLATE es_*
es_AR:LC_COLLATE
es_AR-copy "es_ES"
es_AR-END LC_COLLATE
--
es_BO:LC_COLLATE
es_BO-copy "es_ES"
es_BO-END LC_COLLATE
--
es_CL:LC_COLLATE
es_CL-copy "es_ES"
es_CL-END LC_COLLATE
--
es_CO:LC_COLLATE
es_CO-copy "es_ES"
es_CO-END LC_COLLATE
--
es_CR:LC_COLLATE
es_CR-copy "es_ES"
es_CR-END LC_COLLATE
--
es_CU:LC_COLLATE
es_CU-copy "es_ES"
es_CU-END LC_COLLATE
--
es_DO:LC_COLLATE
es_DO-copy "es_ES"
es_DO-END LC_COLLATE
--
es_EC:LC_COLLATE
es_EC-copy "es_ES"
es_EC-END LC_COLLATE
--
es_ES:LC_COLLATE
es_ES-% CLDR collation rules for Spanish:
es_ES-% (see: https://unicode.org/cldr/trac/browser/trunk/common/collation/es.xml)
--
es_ES@euro:LC_COLLATE
es_ES@euro-copy "es_ES"
es_ES@euro-END LC_COLLATE
--
es_GT:LC_COLLATE
es_GT-copy "es_ES"
es_GT-END LC_COLLATE
--
es_HN:LC_COLLATE
es_HN-copy "es_ES"
es_HN-END LC_COLLATE
--
es_MX:LC_COLLATE
es_MX-copy "es_ES"
es_MX-END LC_COLLATE
--
es_NI:LC_COLLATE
es_NI-copy "es_ES"
es_NI-END LC_COLLATE
--
es_PA:LC_COLLATE
es_PA-copy "es_ES"
es_PA-END LC_COLLATE
--
es_PE:LC_COLLATE
es_PE-copy "es_ES"
es_PE-END LC_COLLATE
--
es_PR:LC_COLLATE
es_PR-copy "es_ES"
es_PR-END LC_COLLATE
--
es_PY:LC_COLLATE
es_PY-copy "es_ES"
es_PY-END LC_COLLATE
--
es_SV:LC_COLLATE
es_SV-copy "es_ES"
es_SV-END LC_COLLATE
--
es_US:LC_COLLATE
es_US-copy "es_ES"
es_US-END LC_COLLATE
--
es_UY:LC_COLLATE
es_UY-copy "es_ES"
es_UY-END LC_COLLATE
--
es_VE:LC_COLLATE
es_VE-copy "es_ES"
es_VE-END LC_COLLATE
mfabian@hathi:/local/mfabian/src/glibc/localedata/locales (master $%)
$
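
The grep check above can also be scripted. The following is a minimal sketch,
not part of glibc: the function names collate_source and report are
hypothetical, and the parser only looks for the first copy directive between
LC_COLLATE and END LC_COLLATE in each locale source file:

```python
import re
from pathlib import Path


def collate_source(locale_file):
    """Return the name given by the first `copy "..."` line inside the
    LC_COLLATE section, or None if the section has no copy line."""
    in_section = False
    text = Path(locale_file).read_text(encoding="utf-8", errors="replace")
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "LC_COLLATE":
            in_section = True
        elif stripped == "END LC_COLLATE":
            return None
        elif in_section:
            match = re.match(r'copy\s+"([^"]+)"', stripped)
            if match:
                return match.group(1)
    return None


def report(directory, pattern="es_*"):
    """Map each matching locale file name to its LC_COLLATE copy target."""
    return {p.name: collate_source(p) for p in sorted(Path(directory).glob(pattern))}
```

Calling report("localedata/locales") in a glibc checkout should list "es_ES"
for every locale that inherits its collation from es_ES. A file that copies a
base table and then tailors it reports that base table instead.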

