public inbox for libc-locales@sourceware.org
* [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1"
@ 2017-11-29 12:28 maiku.fabian at gmail dot com
  2017-11-29 12:29 ` [Bug localedata/22515] " maiku.fabian at gmail dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-29 12:28 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

            Bug ID: 22515
           Summary: hsb_DE LC_COLLATE does not use copy "iso14651_t1"
           Product: glibc
           Version: 2.26
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: maiku.fabian at gmail dot com
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

LC_COLLATE in localedata/locales/hsb_DE does not build upon

copy "iso14651_t1"

and is therefore missing all updates made there.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
@ 2017-11-29 12:29 ` maiku.fabian at gmail dot com
  2017-11-29 12:33 ` maiku.fabian at gmail dot com
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-29 12:29 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at sourceware dot org   |maiku.fabian at gmail dot com


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
  2017-11-29 12:29 ` [Bug localedata/22515] " maiku.fabian at gmail dot com
@ 2017-11-29 12:33 ` maiku.fabian at gmail dot com
  2017-12-06  8:50 ` maiku.fabian at gmail dot com
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-29 12:33 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

--- Comment #1 from Mike FABIAN <maiku.fabian at gmail dot com> ---
https://unicode.org/cldr/trac/browser/trunk/common/collation/hsb.xml

contains:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE ldml SYSTEM "../../common/dtd/ldml.dtd">
<!--
Copyright © 2014 Unicode, Inc.
CLDR data files are interpreted according to the LDML specification
(http://unicode.org/reports/tr35/)
For terms of use, see http://www.unicode.org/copyright.html
-->
<ldml>
  <identity>
    <version number="$Revision: 11914 $" />
    <language type="hsb" />
  </identity>
  <collations>
    <collation type="standard" references="Prawopisny słownik hornjoserbskeje rěče, Pawoł Völkel,
                                           wobdźěłał Timo Meškank, 1970/2005, ISBN 3-7420-1920-1 ">
      <cr><![CDATA[
      &C<č<<<Č<ć<<<Ć
      &E<ě<<<Ě
      &H<ch<<<cH<<<Ch<<<CH
      &[before 1] L<ł<<<Ł
      &R<ř<<<Ř
      &S<š<<<Š
      &Z<ž<<<Ž<ź<<<Ź
      ]]></cr>
    </collation>
  </collations>
</ldml>

In glibc, in localedata/locales/hsb_DE, LC_COLLATE contains:

collating-element <D-Z'> from "<U0044><U0179>"
collating-element <D-z'> from "<U0044><U017A>"
collating-element <d-Z'> from "<U0064><U0179>"
collating-element <d-z'> from "<U0064><U017A>"
[...]
<d8>
<D-Z'>  <D-Z'>;<NONE>;<CAPITAL>;IGNORE
<D-z'>  <D-Z'>;<NONE>;<CAPITAL-SMALL>;IGNORE
<d-Z'>  <D-Z'>;<NONE>;<SMALL-CAPITAL>;IGNORE
<d-z'>  <D-Z'>;<NONE>;<SMALL>;IGNORE
[...]

I.e. the glibc locale contains special rules for sorting dź, which CLDR does not have.
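
To make the difference concrete, here is a minimal Python sketch of the primary
order that the CLDR rules quoted above imply. It is an illustration only, not
how glibc or ICU compute collation keys: the names ALPHABET, RANK and
hsb_sort_key are made up for this example, and accents are folded into a single
primary rank with a simple lowercase-before-uppercase tie-break.

```python
# Primary order implied by the CLDR hsb tailoring quoted above:
# č, ć after c; ě after e; ch after h; ł before l; ř after r;
# š after s; ž, ź after z.
ALPHABET = [
    "a", "b", "c", "č", "ć", "d", "e", "ě", "f", "g", "h", "ch", "i", "j",
    "k", "ł", "l", "m", "n", "o", "p", "q", "r", "ř", "s", "š", "t", "u",
    "v", "w", "x", "y", "z", "ž", "ź",
]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def hsb_sort_key(word):
    """Return a (primary, tertiary) key; a simplification of real UCA keys."""
    primary, tertiary = [], []
    i = 0
    while i < len(word):
        # Treat "ch" (in any case mix) as one contraction, as the rules do.
        if word[i:i + 2].lower() == "ch":
            token = word[i:i + 2]
        else:
            token = word[i]
        lowered = token.lower()
        # Characters outside the alphabet sort after it, by code point.
        primary.append(RANK.get(lowered, len(ALPHABET) + ord(lowered[0])))
        # Lowercase sorts before uppercase at the tertiary level ("<<<").
        tertiary.extend(1 if ch.isupper() else 0 for ch in token)
        i += len(token)
    return (primary, tertiary)


words = ["Z", "ź", "Ž", "ž", "Ź", "dź", "d", "e", "ch", "h", "i", "l", "ł", "k"]
print(sorted(words, key=hsb_sort_key))
```

Because the sketch has no contraction for dź, "dź" is compared as d followed
by ź and lands between "d" and "e". That matches the CLDR rules, whereas the
old glibc file defines a separate collating element for it.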


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
  2017-11-29 12:29 ` [Bug localedata/22515] " maiku.fabian at gmail dot com
  2017-11-29 12:33 ` maiku.fabian at gmail dot com
@ 2017-12-06  8:50 ` maiku.fabian at gmail dot com
  2017-12-06  8:52 ` maiku.fabian at gmail dot com
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-12-06  8:50 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

--- Comment #2 from Mike FABIAN <maiku.fabian at gmail dot com> ---
The current hsb_DE locale sorts ć and Ć after t:

<t8>
<U0106> <U0106>;<NONE>;<CAPITAL>;IGNORE
<U0107> <U0106>;<NONE>;<SMALL>;IGNORE

I.e. it sorts like this:

   S
   š
   Š
   ć
   Ć
   Z

This seems wrong.


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
                   ` (2 preceding siblings ...)
  2017-12-06  8:50 ` maiku.fabian at gmail dot com
@ 2017-12-06  8:52 ` maiku.fabian at gmail dot com
  2017-12-06  8:59 ` maiku.fabian at gmail dot com
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-12-06  8:52 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

--- Comment #3 from Mike FABIAN <maiku.fabian at gmail dot com> ---
The current hsb_DE sorting also contradicts the CLDR sort order,
because it sorts like this:

   Z
   ź
   Ź
   ž
   Ž

i.e. it sorts ž after ź. In CLDR it is the other way round:

&Z<ž<<<Ž<ź<<<Ź
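
One way to check which order an installed C library produces is to sort the
five strings through the system collation. The following Python sketch assumes
that a locale named hsb_DE.UTF-8 has been generated on the machine. The helper
name sort_with_locale is made up for this example, and it returns None when
the locale is not available.

```python
import locale


def sort_with_locale(words, locale_name):
    """Sort words using the C library's collation for locale_name.

    Returns None if the locale is not available on this system.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        # Restore the previous collation locale for the rest of the process.
        locale.setlocale(locale.LC_COLLATE, saved)


result = sort_with_locale(["Z", "ź", "Ź", "ž", "Ž"], "hsb_DE.UTF-8")
if result is None:
    print("hsb_DE.UTF-8 is not installed; nothing to compare")
else:
    print(result)
```

If ź is printed before ž, the installed locale data still has the order
described in this comment; ž before ź agrees with the CLDR rule.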


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
                   ` (3 preceding siblings ...)
  2017-12-06  8:52 ` maiku.fabian at gmail dot com
@ 2017-12-06  8:59 ` maiku.fabian at gmail dot com
  2017-12-06  9:07 ` maiku.fabian at gmail dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-12-06  8:59 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

--- Comment #4 from Mike FABIAN <maiku.fabian at gmail dot com> ---
There is a little bit of a contradiction in the CLDR data
for collation.

https://unicode.org/cldr/trac/browser/trunk/common/collation/hsb.xml

contains:

    &C<č<<<Č<ć<<<Ć
    &E<ě<<<Ě
    &H<ch<<<cH<<<Ch<<<CH
    &[before 1] L<ł<<<Ł
    &R<ř<<<Ř
    &S<š<<<Š
    &Z<ž<<<Ž<ź<<<Ź

but

https://unicode.org/cldr/trac/browser/trunk/common/main/hsb.xml

contains:

<exemplarCharacters type="index">[A B C Č Ć D {DŹ} E F G H {CH} I J K Ł L M N O P Q R S Š T U V W X Y Z Ž]</exemplarCharacters>

I.e. in the index, DŹ is considered as a special character whereas in
the sorting rules it is not.

Also, Ź is special in the sorting rules but not in the index.
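
The mismatch can be computed mechanically from the two snippets quoted above.
This Python sketch is only an illustration: the parsing is a rough
approximation of the LDML rule syntax that is good enough for these seven
lines, and the helper names are made up for this example.

```python
import re

# Tailoring rules and index exemplars copied from the two CLDR files above.
RULES = """
&C<č<<<Č<ć<<<Ć
&E<ě<<<Ě
&H<ch<<<cH<<<Ch<<<CH
&[before 1] L<ł<<<Ł
&R<ř<<<Ř
&S<š<<<Š
&Z<ž<<<Ž<ź<<<Ź
"""
INDEX = "A B C Č Ć D {DŹ} E F G H {CH} I J K Ł L M N O P Q R S Š T U V W X Y Z Ž"


def tailored_letters(rules):
    """Collect, uppercased, every element placed after a '<' operator."""
    found = set()
    for line in rules.split("\n"):
        # Split on runs of '<'; the first piece is the reset anchor ('&...').
        pieces = re.split(r"<+", line.strip())
        found.update(p.strip().upper() for p in pieces[1:] if p.strip())
    return found


def index_letters(index):
    """Parse the exemplar set; braces mark multi-character entries."""
    return {entry.strip("{}") for entry in index.split()}


tailored = tailored_letters(RULES)
in_index = index_letters(INDEX)
basic = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

print("special in index, not tailored:", sorted(in_index - basic - tailored))
print("tailored, not in index:", sorted(tailored - in_index))
```

The first set contains only DŹ. The second contains Ź and also Ě and Ř, which
this comment does not mention; whether their absence from the index is
intended is not something this sketch can answer.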


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
                   ` (4 preceding siblings ...)
  2017-12-06  8:59 ` maiku.fabian at gmail dot com
@ 2017-12-06  9:07 ` maiku.fabian at gmail dot com
  2017-12-06 11:33 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-12-06  9:07 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |digitalfreak@lingonborough.com


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
                   ` (5 preceding siblings ...)
  2017-12-06  9:07 ` maiku.fabian at gmail dot com
@ 2017-12-06 11:33 ` cvs-commit at gcc dot gnu.org
  2017-12-06 11:38 ` maiku.fabian at gmail dot com
  2017-12-06 13:48 ` maiku.fabian at gmail dot com
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2017-12-06 11:33 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

--- Comment #5 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  62ea2193ee4b538b13da1c579113761e0b92376c (commit)
      from  37ac8e635a29810318f6d79902102e2e96b2b5bf (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=62ea2193ee4b538b13da1c579113761e0b92376c

commit 62ea2193ee4b538b13da1c579113761e0b92376c
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Wed Dec 6 10:02:48 2017 +0100

    hsb_DE locale: Base collation on copy "iso14651_t1" [BZ #22515]

        [BZ #22515]
        * localedata/Makefile: Add hsb_DE.UTF-8 to test-input
        and to the list of locales to be built for testing.
        * localedata/hsb_DE.UTF-8.in: New file for testing the collation.
        * localedata/locales/hsb_DE (LC_COLLATE): Use “copy "iso14651_t1"”
        and build the collation rules upon that.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                  |    9 +
 localedata/Makefile        |    5 +-
 localedata/hsb_DE.UTF-8.in |   35 +
 localedata/locales/hsb_DE  | 2159 ++------------------------------------------
 4 files changed, 133 insertions(+), 2075 deletions(-)
 create mode 100644 localedata/hsb_DE.UTF-8.in


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
                   ` (6 preceding siblings ...)
  2017-12-06 11:33 ` cvs-commit at gcc dot gnu.org
@ 2017-12-06 11:38 ` maiku.fabian at gmail dot com
  2017-12-06 13:48 ` maiku.fabian at gmail dot com
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-12-06 11:38 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |2.27

--- Comment #6 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Fixed in glibc master.


* [Bug localedata/22515] hsb_DE LC_COLLATE does not use copy "iso14651_t1"
  2017-11-29 12:28 [Bug localedata/22515] New: hsb_DE LC_COLLATE does not use copy "iso14651_t1" maiku.fabian at gmail dot com
                   ` (7 preceding siblings ...)
  2017-12-06 11:38 ` maiku.fabian at gmail dot com
@ 2017-12-06 13:48 ` maiku.fabian at gmail dot com
  8 siblings, 0 replies; 10+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-12-06 13:48 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22515

--- Comment #7 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Mike FABIAN from comment #4)
> There is a little bit of a contradiction in the CLDR data
> for collation.
> 
> https://unicode.org/cldr/trac/browser/trunk/common/collation/hsb.xml
> 
> contains:
> 
>     &C<č<<<Č<ć<<<Ć
>     &E<ě<<<Ě
>     &H<ch<<<cH<<<Ch<<<CH
>     &[before 1] L<ł<<<Ł
>     &R<ř<<<Ř
>     &S<š<<<Š
>     &Z<ž<<<Ž<ź<<<Ź
> 
> but
> 
> https://unicode.org/cldr/trac/browser/trunk/common/main/hsb.xml
> 
> contains:
> 
> <exemplarCharacters type="index">[A B C Č Ć D {DŹ} E F G H {CH} I J K Ł L M N O P Q R S Š T U V W X Y Z Ž]</exemplarCharacters>
> 
> I.e. in the index, DŹ is considered as a special character whereas in
> the sorting rules it is not.
> 
> Also, Ź is special in the sorting rules but not in the index.

I reported this to CLDR:

https://unicode.org/cldr/trac/ticket/10797

