public inbox for libc-locales@sourceware.org
* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
@ 2017-11-21  3:53 ` maiku.fabian at gmail dot com
  2017-11-21  6:33 ` maiku.fabian at gmail dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-21  3:53 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at sourceware dot org   |maiku.fabian at gmail dot com

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n
@ 2017-11-21  3:53 maiku.fabian at gmail dot com
  2017-11-21  3:53 ` [Bug localedata/22469] " maiku.fabian at gmail dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-21  3:53 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

            Bug ID: 22469
           Summary: pl_PL LC_COLLATE does not use i18n
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: maiku.fabian at gmail dot com
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

localedata/locales/pl_PL does not build upon localedata/locales/i18n, so it
misses all updates made there.


* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
  2017-11-21  3:53 ` [Bug localedata/22469] " maiku.fabian at gmail dot com
@ 2017-11-21  6:33 ` maiku.fabian at gmail dot com
  2017-11-22 15:42 ` piotrdrag at gmail dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-21  6:33 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |digitalfreak@lingonborough.
                   |                            |com


* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
  2017-11-21  3:53 ` [Bug localedata/22469] " maiku.fabian at gmail dot com
  2017-11-21  6:33 ` maiku.fabian at gmail dot com
@ 2017-11-22 15:42 ` piotrdrag at gmail dot com
  2017-11-23 10:42 ` maiku.fabian at gmail dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: piotrdrag at gmail dot com @ 2017-11-22 15:42 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

Piotr Drąg <piotrdrag at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |piotrdrag at gmail dot com


* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
                   ` (2 preceding siblings ...)
  2017-11-22 15:42 ` piotrdrag at gmail dot com
@ 2017-11-23 10:42 ` maiku.fabian at gmail dot com
  2017-11-24  5:06 ` maiku.fabian at gmail dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-23 10:42 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

--- Comment #1 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Created attachment 10630
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10630&action=edit
0001-pl_PL-locale-Base-collation-on-iso14651_t1.patch

This patch uses “copy "iso14651_t1"”

and then implements the collation rules for Polish from CLDR on top of that,
see:

https://unicode.org/cldr/trac/browser/trunk/common/collation/pl.xml

It also adds some rules for handling spaces so as not
to cause a regression for bug#388, see:

https://sourceware.org/bugzilla/show_bug.cgi?id=388


* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
                   ` (3 preceding siblings ...)
  2017-11-23 10:42 ` maiku.fabian at gmail dot com
@ 2017-11-24  5:06 ` maiku.fabian at gmail dot com
  2017-11-24  5:06 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-24  5:06 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |2.27

--- Comment #3 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Fixed in glibc master


* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
                   ` (4 preceding siblings ...)
  2017-11-24  5:06 ` maiku.fabian at gmail dot com
@ 2017-11-24  5:06 ` cvs-commit at gcc dot gnu.org
  2017-11-24  5:07 ` maiku.fabian at gmail dot com
  2017-12-01  1:09 ` digitalfreak at lingonborough dot com
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2017-11-24  5:06 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

--- Comment #2 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  3ffc4cc1ad37fb36e419c9a3a72e1916d7d893d3 (commit)
      from  3a327316ad615f7e4264d3e13d23052d9dc84694 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3ffc4cc1ad37fb36e419c9a3a72e1916d7d893d3

commit 3ffc4cc1ad37fb36e419c9a3a72e1916d7d893d3
Author: Mike FABIAN <mfabian@redhat.com>
Date:   Mon Nov 20 17:55:33 2017 +0530

    pl_PL locale: Base collation on iso14651_t1

        [BZ #22469]
        * localedata/locales/pl_PL (LC_COLLATE): Use “copy "iso14651_t1"”
        and implement the collation rules for pl from CLDR on top of that.
        * Makefile: Add pl_PL.UTF-8 to test-input and to the list
        of locales to be built for testing.
        * pl_PL.UTF-8.in: New file with test data to test the Polish sorting.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                 |    9 +
 localedata/Makefile       |    6 +-
 localedata/locales/pl_PL  | 2116 ++-------------------------------------------
 localedata/pl_PL.UTF-8.in |  162 ++++
 4 files changed, 231 insertions(+), 2062 deletions(-)
 create mode 100644 localedata/pl_PL.UTF-8.in


* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
                   ` (5 preceding siblings ...)
  2017-11-24  5:06 ` cvs-commit at gcc dot gnu.org
@ 2017-11-24  5:07 ` maiku.fabian at gmail dot com
  2017-12-01  1:09 ` digitalfreak at lingonborough dot com
  7 siblings, 0 replies; 9+ messages in thread
From: maiku.fabian at gmail dot com @ 2017-11-24  5:07 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #4 from Mike FABIAN <maiku.fabian at gmail dot com> ---
FIXED.


* [Bug localedata/22469] pl_PL LC_COLLATE does not use i18n
  2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
                   ` (6 preceding siblings ...)
  2017-11-24  5:07 ` maiku.fabian at gmail dot com
@ 2017-12-01  1:09 ` digitalfreak at lingonborough dot com
  7 siblings, 0 replies; 9+ messages in thread
From: digitalfreak at lingonborough dot com @ 2017-12-01  1:09 UTC (permalink / raw)
  To: libc-locales

https://sourceware.org/bugzilla/show_bug.cgi?id=22469

--- Comment #5 from Rafal Luzynski <digitalfreak at lingonborough dot com> ---
For the record and for future reference: Polish alphabetical sorting is
standardized by the PN–80/N–01223 standard (from the Polish Committee for
Standardization). Some of its rules:

1. Alphabetical order must follow the Polish alphabet with the letters q, v, x
added.
2. Non-Polish diacritical marks are ignored, e.g.: Hašek < Hass
2a. It is also allowed to ignore Polish diacritical marks (although nobody
seems to apply this rule; Polish diacritical marks are always respected).
3. Spaces and punctuation characters sort before letters, e.g.: "mur z cegły"
< "murawa".
4. A lowercase letter sorts before its uppercase form, e.g.: arab < Arab.
5. Numbers (including spelled-out ones) must be sorted by their numerical value
and placed before the letters, e.g.: 1 < 5 < ósmy < trzynaście < 17 < XXI <
Agnieszka < Antoni ... (This rule is difficult to implement, let's skip it.)
6. The placement of the Icelandic letter Þ (Thorn) is not regulated, but the
Icelandic alphabet places it at the end, after Z. We are encouraged to follow
this rule as well, e.g.: X < Y < Z < Þ.

Source: https://pl.wikipedia.org/wiki/Porz%C4%85dek_alfabetyczny
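
Rules 1 to 4 and 6 above can be sketched as a sort key. This is only an
illustration in Python, not the glibc locale definition from the patch. The
names POLISH_ALPHABET, primary_weight and polish_sort_key are made up for this
example, rule 5 (numbers) is not attempted, and placing every other letter
that is not in the Polish alphabet after ż, next to Þ, is an assumption.

```python
import unicodedata

# Rule 1: the Polish alphabet with q, v, x added.
POLISH_ALPHABET = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {ch: i + 1 for i, ch in enumerate(POLISH_ALPHABET)}
AFTER_Z = len(POLISH_ALPHABET) + 1  # rule 6: Þ and friends go to the end


def primary_weight(ch):
    """Return the alphabetical weight of one character, ignoring case."""
    low = ch.lower()
    if low in RANK:
        return RANK[low]
    # Rule 2: strip non-Polish diacritical marks (š is weighted like s).
    decomposed = unicodedata.normalize("NFD", low)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    if stripped in RANK:
        return RANK[stripped]
    if ch.isalpha():
        return AFTER_Z  # assumption: unknown letters sort after ż
    # Rule 3: spaces and punctuation sort before letters.
    # Digits land here too because rule 5 is skipped in this sketch.
    return 0


def polish_sort_key(text):
    """Build a key: alphabetical weights first, then case, then raw text."""
    text = unicodedata.normalize("NFC", text)
    primary = tuple(primary_weight(ch) for ch in text)
    # Rule 4: lowercase before uppercase, used only to break ties.
    case = tuple(1 if ch.isupper() else 0 for ch in text)
    return (primary, case, text)


print(sorted(["dom", "ćma", "cyna"], key=polish_sort_key))
# ['cyna', 'ćma', 'dom']
print(sorted(["murawa", "Hass", "mur z cegły", "Hašek"], key=polish_sort_key))
# ['Hašek', 'Hass', 'mur z cegły', 'murawa']
```

A plain sorted() without the key would put "ćma" after "dom", since it
compares code points rather than positions in the Polish alphabet.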

Another scientific source says that the Polish language has two sorting rules:
for dictionaries, spaces and punctuation characters are ignored
(letter-by-letter order), but for encyclopedias they are not (word-by-word
order). Thanks to the dictionary rule, people who don't know whether the
correct spelling is „na pewno” or „napewno” will still find the word
("na pewno" == "napewno"). On the other hand, in encyclopedias all monarchs
named Jan are grouped together: "Jan III Sobieski" < "Jan XXIII" < "Janina".
We can't implement two different rules; here we have implemented the
word-by-word rule, and it is correct. The same has been requested in bug 388.

Source:
https://sjp.pwn.pl/poradnia/haslo/porzadek-alfabetyczny-ale-jaki;16226.html
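
The difference between the two orders can be shown with a small Python
sketch. The function names letter_by_letter_key and word_by_word_key are
hypothetical, and the comparison inside each word uses plain code points
rather than the Polish alphabet, which is enough for these ASCII examples
but is not how the locale itself works.

```python
def letter_by_letter_key(text):
    """Dictionary style: spaces and punctuation are ignored entirely."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def word_by_word_key(text):
    """Encyclopedia style: compare whole words, one after another."""
    words = text.casefold().split()
    return tuple("".join(ch for ch in w if ch.isalnum()) for w in words)


names = ["Janina", "Jan XXIII", "Jan III Sobieski"]

# Word-by-word keeps all entries whose first word is "Jan" together.
print(sorted(names, key=word_by_word_key))
# ['Jan III Sobieski', 'Jan XXIII', 'Janina']

# Letter-by-letter lets "Janina" fall between the two monarchs.
print(sorted(names, key=letter_by_letter_key))
# ['Jan III Sobieski', 'Janina', 'Jan XXIII']

# Both spellings collapse to the same dictionary key.
print(letter_by_letter_key("na pewno") == letter_by_letter_key("napewno"))
# True
```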

One more source saying that non-Polish diacritical characters should be
ignored: https://sjp.pwn.pl/poradnia/haslo/porzadek-alfabetyczny;4208.html


end of thread, other threads:[~2017-12-01  1:09 UTC | newest]

Thread overview: 9+ messages
2017-11-21  3:53 [Bug localedata/22469] New: pl_PL LC_COLLATE does not use i18n maiku.fabian at gmail dot com
2017-11-21  3:53 ` [Bug localedata/22469] " maiku.fabian at gmail dot com
2017-11-21  6:33 ` maiku.fabian at gmail dot com
2017-11-22 15:42 ` piotrdrag at gmail dot com
2017-11-23 10:42 ` maiku.fabian at gmail dot com
2017-11-24  5:06 ` maiku.fabian at gmail dot com
2017-11-24  5:06 ` cvs-commit at gcc dot gnu.org
2017-11-24  5:07 ` maiku.fabian at gmail dot com
2017-12-01  1:09 ` digitalfreak at lingonborough dot com
