public inbox for glibc-bugs@sourceware.org
* [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests
@ 2015-09-08  8:33 egmont at gmail dot com
  2015-09-08  8:38 ` [Bug localedata/18934] " egmont at gmail dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-09-08  8:33 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

            Bug ID: 18934
           Summary: [PATCH] Hungarian collate: fix multiple bugs and add
                    tests
           Product: glibc
           Version: 2.22
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: egmont at gmail dot com
                CC: libc-locales at sourceware dot org
        Depends on: 18589
  Target Milestone: ---

Created attachment 8587
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8587&action=edit
Fix

Please apply the attached patch which addresses multiple bugs in Hungarian
collation.

It also adds an extensive unittest (including all the examples from the
official rules and much more), a significantly bigger one than any other locale
has.

Note that these unittests pass with glibc-2.21 but fail with 2.22 and current
git due to bug 18589 which points to a broken change in the collate algorithm
that needs to be reverted first.
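
For readers unfamiliar with how a collation test of this kind works, here is a
minimal sketch in Python. It is not the test from the attached patch: the
helper names (first_misordered, locale_key) and the three sample words are
made up for illustration. The idea is to take a list that is already in the
expected order, shuffle it, sort it with the locale's collation key, and
report the first position where the result differs from the expectation.

```python
import locale
import random


def first_misordered(expected, key, seed=0):
    """Shuffle `expected`, sort it with `key`, and return the first
    (index, wanted, got) where the result differs, or None if it matches."""
    shuffled = list(expected)
    random.Random(seed).shuffle(shuffled)
    result = sorted(shuffled, key=key)
    for i, (want, got) in enumerate(zip(expected, result)):
        if want != got:
            return i, want, got
    return None


def locale_key(name):
    """Switch LC_COLLATE to `name` and return locale.strxfrm,
    or None if that locale is not installed on this system."""
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    return locale.strxfrm


if __name__ == "__main__":
    # Hypothetical sample data; a real test would use a much longer list.
    expected = ["alma", "bor", "cukor"]
    key = locale_key("hu_HU.UTF-8")
    if key is None:
        print("hu_HU.UTF-8 is not installed; skipping")
    else:
        print(first_misordered(expected, key))
```

Shuffling before sorting matters: an input that is already in order could
pass even if the collation key treated every string as equal.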

(I know that generally one patch per issue is a cleaner approach, but this time
I apologize for an all-in-one patch: the individual patches would heavily
conflict, and it would be really cumbersome to unittest an incremental series.
Instead, think about it as TDD (test-driven development): I attach a decent
unittest with explanations and pointers to the rules, and a locale definition
that implements it.)

The addressed bugs are (in no particular order):

- The fix to bug 13547 was incorrect. It fixed a corner case, but I didn't
realize that it broke a more frequent one. See details over there.

- Two bugs/inconsistencies wrt. sorting upper/lowercase values, as described in
bug 18587.

- Someone enabled backwards ordering of diacritics by default (bug 17750),
breaking tons of locales including Hungarian.

- Foreign accents should be sorted after the native Hungarian ones; this wasn't
the case so far.
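
To illustrate what forward versus backward ordering of diacritics means (the
item about bug 17750 above), here is a toy two-level comparison in Python. It
is only a sketch under simplifying assumptions, not glibc's collation
algorithm: the function names are hypothetical, base letters are compared
first, and accents only break ties. The well-known French sample words are used
because they show the difference in four strings.

```python
import unicodedata


def split_letters(word):
    """Split a word into (base letter, combining marks) pairs using NFD."""
    letters = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and letters:
            base, marks = letters[-1]
            letters[-1] = (base, marks + ch)
        else:
            letters.append((ch, ""))
    return letters


def toy_key(word, backward=False):
    """Toy sort key: base letters first, then the accents.

    With backward=True the accent level is compared from the end of the
    word, so the LAST accent difference decides instead of the first.
    """
    letters = split_letters(word)
    primary = tuple(base.lower() for base, _ in letters)
    secondary = tuple(marks for _, marks in letters)
    if backward:
        secondary = secondary[::-1]
    return primary, secondary


words = ["côté", "coté", "côte", "cote"]
forward = sorted(words, key=toy_key)
backward = sorted(words, key=lambda w: toy_key(w, backward=True))
print(forward)
print(backward)
```

In this toy model the forward order puts "coté" before "côte", while the
backward order swaps the two, which is why flipping the default changes the
results for locales that expect forward ordering.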

I hope that these changes will not only fix Hungarian, but also provide a
better overall quality for all the locales and a guideline to follow for other
locale implementations, since these extensive tests probably would have helped
(and probably will help in the future) catch bugs similar to 18589 and 17750
before they get committed.


Referenced Bugs:

https://sourceware.org/bugzilla/show_bug.cgi?id=18589
[Bug 18589] sort-test.sh fails at random



end of thread, other threads:[~2015-10-13 22:25 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
2015-09-08  8:38 ` [Bug localedata/18934] " egmont at gmail dot com
2015-09-09 18:51 ` egmont at gmail dot com
2015-09-09 20:03 ` egmont at gmail dot com
2015-09-13  1:06 ` [Bug localedata/18934] hu_HU: " vapier at gentoo dot org
2015-10-10  0:48 ` carlos at redhat dot com
2015-10-12 20:05 ` egmont at gmail dot com
2015-10-13  0:16 ` vapier at gentoo dot org
2015-10-13 21:34 ` egmont at gmail dot com
2015-10-13 21:39 ` egmont at gmail dot com
2015-10-13 22:25 ` egmont at gmail dot com
