public inbox for glibc-bugs@sourceware.org
* [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests
@ 2015-09-08  8:33 egmont at gmail dot com
  2015-09-08  8:38 ` [Bug localedata/18934] " egmont at gmail dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-09-08  8:33 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

            Bug ID: 18934
           Summary: [PATCH] Hungarian collate: fix multiple bugs and add
                    tests
           Product: glibc
           Version: 2.22
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: egmont at gmail dot com
                CC: libc-locales at sourceware dot org
        Depends on: 18589
  Target Milestone: ---

Created attachment 8587
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8587&action=edit
Fix

Please apply the attached patch which addresses multiple bugs in Hungarian
collation.

It also adds an extensive unittest (including all the examples from the
official rules and much more), significantly bigger than that of any other
locale.

Note that these unittests pass with glibc-2.21 but fail with 2.22 and current
git due to bug 18589 which points to a broken change in the collate algorithm
that needs to be reverted first.
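
For readers who want to try a check of this kind outside the glibc tree, here
is a minimal sketch in Python. It is not the test from the attached patch: the
helper name collation_matches and the default locale name "hu_HU.UTF-8" are
assumptions made for illustration, and the locale must be installed on the
system for the call to succeed. The idea is to take a list that is already in
the expected order, shuffle it, sort it with the locale's collation, and see
whether the original order comes back.

```python
import locale
import random


def collation_matches(expected, locale_name="hu_HU.UTF-8", seed=0):
    """Return True if sorting a shuffled copy of `expected` under
    `locale_name` reproduces `expected` exactly.

    Raises locale.Error if `locale_name` is not available on this system.
    """
    # Remember the current LC_COLLATE setting so it can be restored.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        shuffled = list(expected)
        # A fixed seed keeps the shuffle reproducible between runs.
        random.Random(seed).shuffle(shuffled)
        return sorted(shuffled, key=locale.strxfrm) == list(expected)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)
```

Calling collation_matches(words) with a word list taken from the official
rules would then report whether the installed hu_HU locale agrees with that
list; passing "C" as the locale name gives a quick sanity check that needs no
extra locale to be installed.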

(I know that generally one patch per issue is a cleaner approach, but this time
I apologize for an all-in-one: the patches would heavily conflict, and it would
be really cumbersome to unittest an incremental series. Instead, think about it
as TDD (test driven development): I attach a decent unittest with explanations
and pointers to the rules, and a locale definition that implements it.)

The addressed bugs are (in no particular order):

- The fix to bug 13547 was incorrect. It fixed a corner case, whereas I didn't
realize it broke a more frequent one. See details over there.

- Two bugs/inconsistencies wrt. sorting upper/lowercase values, as described in
bug 18587.

- Someone enabled backwards ordering of diacritics by default (bug 17750),
breaking tons of locales including Hungarian.

- Foreign accents should be sorted after the native Hungarian ones; this wasn't
the case so far.
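
To illustrate what "backwards ordering of diacritics" means in the item above,
here is a toy two-level collation in Python. It is only a sketch: the ACCENT
table and the sort_key helper are made up for this example and have nothing to
do with glibc's real weight tables. It uses the well-known French example,
where scanning the accent level from the end of the word changes the result.

```python
# Toy weights: each character maps to (base letter, accent weight).
# Characters not listed are treated as unaccented.
ACCENT = {"ô": ("o", 1), "é": ("e", 1)}


def sort_key(word, backward):
    """Build a two-level key: base letters first, accent weights second."""
    pairs = [ACCENT.get(ch, (ch, 0)) for ch in word]
    primary = [base for base, _ in pairs]
    secondary = [weight for _, weight in pairs]
    if backward:
        # Backward ordering compares accents starting from the last character.
        secondary.reverse()
    return (primary, secondary)


words = ["côté", "cote", "coté", "côte"]
forward = sorted(words, key=lambda w: sort_key(w, backward=False))
backward = sorted(words, key=lambda w: sort_key(w, backward=True))

print(forward)   # ['cote', 'coté', 'côte', 'côté']
print(backward)  # ['cote', 'côte', 'coté', 'côté']
```

With forward ordering the first differing accent from the left decides; with
backward ordering the last one does. A locale that expects one rule will
produce wrongly sorted output if the other is switched on by default.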

I hope that these changes will not only fix Hungarian, but also provide a
better overall quality for all the locales and a guideline to follow for other
locale implementations, since these extensive tests probably would have helped
(and probably will help in the future) catch bugs similar to 18589 and 17750
before they get committed.


Referenced Bugs:

https://sourceware.org/bugzilla/show_bug.cgi?id=18589
[Bug 18589] sort-test.sh fails at random
-- 
You are receiving this mail because:
You are on the CC list for the bug.



* [Bug localedata/18934] [PATCH] Hungarian collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
@ 2015-09-08  8:38 ` egmont at gmail dot com
  2015-09-09 18:51 ` egmont at gmail dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-09-08  8:38 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

--- Comment #1 from Egmont Koblinger <egmont at gmail dot com> ---
*** Bug 18587 has been marked as a duplicate of this bug. ***


* [Bug localedata/18934] [PATCH] Hungarian collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
  2015-09-08  8:38 ` [Bug localedata/18934] " egmont at gmail dot com
@ 2015-09-09 18:51 ` egmont at gmail dot com
  2015-09-09 20:03 ` egmont at gmail dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-09-09 18:51 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

Egmont Koblinger <egmont at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #8587|0                           |1
        is obsolete|                            |

--- Comment #2 from Egmont Koblinger <egmont at gmail dot com> ---
Created attachment 8593
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8593&action=edit
Fix v2

A small fix for a (hopefully) innocent problem: accidentally the 4th level was
set to garbage rather than IGNORE for some letters.


* [Bug localedata/18934] [PATCH] Hungarian collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
  2015-09-08  8:38 ` [Bug localedata/18934] " egmont at gmail dot com
  2015-09-09 18:51 ` egmont at gmail dot com
@ 2015-09-09 20:03 ` egmont at gmail dot com
  2015-09-13  1:06 ` [Bug localedata/18934] hu_HU: " vapier at gentoo dot org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-09-09 20:03 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

Egmont Koblinger <egmont at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #8593|0                           |1
        is obsolete|                            |

--- Comment #3 from Egmont Koblinger <egmont at gmail dot com> ---
Created attachment 8594
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8594&action=edit
Fix v3

No functional change. Just shuffled a few lines around so that definitions of
compound letters are in consistent order.


* [Bug localedata/18934] hu_HU: collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
                   ` (2 preceding siblings ...)
  2015-09-09 20:03 ` egmont at gmail dot com
@ 2015-09-13  1:06 ` vapier at gentoo dot org
  2015-10-10  0:48 ` carlos at redhat dot com
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: vapier at gentoo dot org @ 2015-09-13  1:06 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

Mike Frysinger <vapier at gentoo dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vapier at gentoo dot org
            Summary|[PATCH] Hungarian collate:  |hu_HU: collate: fix
                   |fix multiple bugs and add   |multiple bugs and add tests
                   |tests                       |

--- Comment #4 from Mike Frysinger <vapier at gentoo dot org> ---
please post patches to the libc locale mailing list


* [Bug localedata/18934] hu_HU: collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
                   ` (3 preceding siblings ...)
  2015-09-13  1:06 ` [Bug localedata/18934] hu_HU: " vapier at gentoo dot org
@ 2015-10-10  0:48 ` carlos at redhat dot com
  2015-10-12 20:05 ` egmont at gmail dot com
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: carlos at redhat dot com @ 2015-10-10  0:48 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934
Bug 18934 depends on bug 18589, which changed state.

Bug 18589 Summary: sort-test.sh fails at random
https://sourceware.org/bugzilla/show_bug.cgi?id=18589

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED


* [Bug localedata/18934] hu_HU: collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
                   ` (4 preceding siblings ...)
  2015-10-10  0:48 ` carlos at redhat dot com
@ 2015-10-12 20:05 ` egmont at gmail dot com
  2015-10-13  0:16 ` vapier at gentoo dot org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-10-12 20:05 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

--- Comment #5 from Egmont Koblinger <egmont at gmail dot com> ---
> please post patches to the libc locale mailing list

Mike,

The bugs filed here are automatically forwarded to that list. Could you please
clarify what would be the point of me sending them again? Sounds like maybe
you're only willing to consider changes you were asked at least twice to
submit?? Doesn't make too much sense to me.

Should I send the entire patch and description, or just a link?

Thanks in advance!


* [Bug localedata/18934] hu_HU: collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
                   ` (5 preceding siblings ...)
  2015-10-12 20:05 ` egmont at gmail dot com
@ 2015-10-13  0:16 ` vapier at gentoo dot org
  2015-10-13 21:34 ` egmont at gmail dot com
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: vapier at gentoo dot org @ 2015-10-13  0:16 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

--- Comment #6 from Mike Frysinger <vapier at gentoo dot org> ---
(In reply to Egmont Koblinger from comment #5)

patches aren't forwarded to the list.  if you want to see how development
happens, see the wiki documentation:
https://sourceware.org/glibc/wiki/Contribution%20checklist


* [Bug localedata/18934] hu_HU: collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
                   ` (6 preceding siblings ...)
  2015-10-13  0:16 ` vapier at gentoo dot org
@ 2015-10-13 21:34 ` egmont at gmail dot com
  2015-10-13 21:39 ` egmont at gmail dot com
  2015-10-13 22:25 ` egmont at gmail dot com
  9 siblings, 0 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-10-13 21:34 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

Egmont Koblinger <egmont at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #8594|0                           |1
        is obsolete|                            |

--- Comment #7 from Egmont Koblinger <egmont at gmail dot com> ---
Created attachment 8710
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8710&action=edit
Fix v4

Same as v3, with a ChangeLog entry added.


* [Bug localedata/18934] hu_HU: collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
                   ` (7 preceding siblings ...)
  2015-10-13 21:34 ` egmont at gmail dot com
@ 2015-10-13 21:39 ` egmont at gmail dot com
  2015-10-13 22:25 ` egmont at gmail dot com
  9 siblings, 0 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-10-13 21:39 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

Egmont Koblinger <egmont at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #8710|0                           |1
        is obsolete|                            |

--- Comment #8 from Egmont Koblinger <egmont at gmail dot com> ---
Created attachment 8711
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8711&action=edit
Fix v4, take 2

Forget the previous, this is Fix v4 :)


* [Bug localedata/18934] hu_HU: collate: fix multiple bugs and add tests
  2015-09-08  8:33 [Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests egmont at gmail dot com
                   ` (8 preceding siblings ...)
  2015-10-13 21:39 ` egmont at gmail dot com
@ 2015-10-13 22:25 ` egmont at gmail dot com
  9 siblings, 0 replies; 11+ messages in thread
From: egmont at gmail dot com @ 2015-10-13 22:25 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=18934

Egmont Koblinger <egmont at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #8711|0                           |1
        is obsolete|                            |

--- Comment #9 from Egmont Koblinger <egmont at gmail dot com> ---
Created attachment 8712
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8712&action=edit
Fix v5

After another round of checking, I realized that I had omitted the compound
letter "ny" from the "alphabet" section of the unittest. This is fixed now. So
the only change from v4 is a slightly more extensive unittest.

