public inbox for glibc-bugs@sourceware.org
* [Bug localedata/1430] New: regression: worse collation for hu_HU
@ 2005-10-06 17:45 egmont at uhulinux dot hu
  2005-10-14 20:25 ` [Bug localedata/1430] " drepper at redhat dot com
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2005-10-06 17:45 UTC (permalink / raw)
  To: glibc-bugs


Please revert libc/localedata/locales/hu_HU revision 1.18, "Better collation".
It is not better, it is worse.

According to the Hungarian rules, aacute, eacute, iacute, oacute and uacute
must be treated the same as their unaccented counterparts; likewise, vowels with
a diaeresis must be treated the same as their counterparts with double acutes.
In other words:
a = á < e = é < i = í < o = ó < ö = ő < u = ú < ü = ű

For example, the following is a correct alphabetical order:
ablak
állat
apa
áru
az

Vowels within one equivalence class only make a difference when they are the
only letters in which two words differ, e.g.:
Eger
egér
éger
eget
éget
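
The two-level rule described above can be sketched in a few lines of Python.
This is only an illustration under simplifying assumptions: it handles single
letters only (Hungarian digraphs such as cs, gy, sz are ignored), it is not the
glibc locale definition, and the names CLASSES, PRIMARY, SECONDARY and hu_key
are made up for this example.

```python
# Illustrative sketch only: this is NOT the glibc hu_HU locale definition.
# Assumption: single letters only; Hungarian digraphs (cs, gy, sz, ...) are ignored.

# Each string is one primary equivalence class, listed in alphabetical order.
CLASSES = ["aá", "b", "c", "d", "eé", "f", "g", "h", "ií", "j", "k", "l",
           "m", "n", "oó", "öő", "p", "q", "r", "s", "t", "uú", "üű",
           "v", "w", "x", "y", "z"]

# Primary weight: position of the class. Secondary weight: position inside it.
PRIMARY = {ch: rank for rank, group in enumerate(CLASSES) for ch in group}
SECONDARY = {ch: pos for group in CLASSES for pos, ch in enumerate(group)}


def hu_key(word):
    """Sort key: base letters first, accents second, case last."""
    letters = word.lower()
    # Characters outside the table sort after the alphabet, by code point.
    primary = tuple(PRIMARY.get(ch, len(CLASSES) + ord(ch)) for ch in letters)
    secondary = tuple(SECONDARY.get(ch, 0) for ch in letters)
    case = tuple(ch.isupper() for ch in word)
    return (primary, secondary, case)


words = ["éget", "az", "egér", "apa", "Eger", "áru", "eget",
         "állat", "éger", "ablak"]
print(sorted(words, key=hu_key))
```

Because the accent weights are compared only after all base letters have tied,
"állat" still sorts between "ablak" and "apa", while "egér" and "éger" are
separated only by their accents.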

This was implemented correctly in the previous version, and it is also described
in some comment lines within this file (the comments are still there, although
they no longer correspond to what's implemented right now).

I don't know who suggested the modifications of 1.18 or why, but he was surely
wrong. If needed, I can scan some pages of dictionaries or phone books and
upload them to prove these sorting rules.

If someone just happens to prefer sorting this way, then he is of course
absolutely free to create his own locale, or set LC_COLLATE=C or
something similar, but there's hardly any place for that work in glibc. Glibc
should follow the national rules, and r1.18 was a move against them.


Ulrich, if I recall correctly, some years ago it was you to whom I sent the
hu_HU sorting rules which fixed some bugs. Then you asked me to manually
sort a lot of words you had previously received from some other Hungarian guy
and test whether glibc sorts them in the same order. Glibc with those
Hungarian collating rules passed that test, but the new rules would obviously
fail it. Do you happen to still have that file? (I don't think I have
it, but I'll take a look.)

I guess it would be a really wise move to put sorted files of this kind into
glibc's source and add a sorting test case for them.
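
Such a check could look roughly like the sketch below. It assumes Python's
standard locale module rather than anything from glibc's own test suite, and
first_misordered_pair and check_with_locale are names invented for this
example. The locale name hu_HU.UTF-8 is also an assumption and may not be
installed on a given system.

```python
import locale


def first_misordered_pair(words, compare):
    """Return the first adjacent pair that `compare` puts in the wrong order.

    `words` is a list that a human has already sorted; `compare` is a
    strcoll-style function returning <0, 0 or >0. Returns None if the
    whole list is consistent with `compare`.
    """
    for left, right in zip(words, words[1:]):
        if compare(left, right) > 0:
            return (left, right)
    return None


def check_with_locale(words, name="hu_HU.UTF-8"):
    """Check a pre-sorted list against the collation of the named locale.

    Returns True or False, or None when the locale cannot be selected.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None  # locale not available here; nothing to check
    try:
        return first_misordered_pair(words, locale.strcoll) is None
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore previous setting
```

Calling check_with_locale() on a hand-sorted word list would then report
whether the installed collation rules agree with the list, and
first_misordered_pair() names the first offending pair when they do not.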


Ps1: a and á, as well as e and é, are different sounds, so it's often argued
whether it's logical to put them in the same group; this is tradition rather than
a logical decision. On the other hand, i and í, o and ó, ö and ő, u and ú, and
finally ü and ű are the same sounds, with the latter of each pair pronounced longer.
Crosswords and similar puzzles treat a and á, and e and é, as different letters, while
the other pairs are interchangeable there. But alphabetical sorting uses different
rules.

Ps2: All the words above in the examples are real Hungarian words.

-- 
           Summary: regression: worse collation for hu_HU
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
        AssignedTo: libc-locales at sources dot redhat dot com
        ReportedBy: egmont at uhulinux dot hu
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=1430

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
@ 2005-10-14 20:25 ` drepper at redhat dot com
  2005-10-17  7:57 ` egmont at uhulinux dot hu
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: drepper at redhat dot com @ 2005-10-14 20:25 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2005-10-14 20:24 -------
Discuss this with the other reporter and get back with the result.  I have no
reason to believe one person over another.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |drepper at redhat dot com
             Status|NEW                         |WAITING



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
  2005-10-14 20:25 ` [Bug localedata/1430] " drepper at redhat dot com
@ 2005-10-17  7:57 ` egmont at uhulinux dot hu
  2005-10-17  8:01 ` jakub at redhat dot com
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2005-10-17  7:57 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From egmont at uhulinux dot hu  2005-10-17 07:57 -------
Who is the other reporter? Please give me some contact info, I couldn't find
such an entry in this bugzilla.



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
  2005-10-14 20:25 ` [Bug localedata/1430] " drepper at redhat dot com
  2005-10-17  7:57 ` egmont at uhulinux dot hu
@ 2005-10-17  8:01 ` jakub at redhat dot com
  2005-10-18 11:06 ` egmont at uhulinux dot hu
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at redhat dot com @ 2005-10-17  8:01 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From jakub at redhat dot com  2005-10-17 08:01 -------
2005-07-26  Ulrich Drepper  <drepper@redhat.com>

        * locales/hu_HU: Better collation.
        Patch by Gyuro Lehel <lehel@freemail.hu>.



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (2 preceding siblings ...)
  2005-10-17  8:01 ` jakub at redhat dot com
@ 2005-10-18 11:06 ` egmont at uhulinux dot hu
  2005-10-18 14:24 ` drepper at redhat dot com
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2005-10-18 11:06 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From egmont at uhulinux dot hu  2005-10-18 11:06 -------
I received a reply from Lehel. He writes (in Hungarian) that he doesn't want to
create an account in bugzilla because he receives twice as much spam since he
registered in redhat bugzilla. On the other hand he asked me to copy/paste
this text here:

Well, I do not argue the point, it was just the customers at my old job
who did not really like this kind of sorting. Maybe the solution could
be to add a locale that contains the alphabetical sorting and let the
users choose their preferred one.



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (3 preceding siblings ...)
  2005-10-18 11:06 ` egmont at uhulinux dot hu
@ 2005-10-18 14:24 ` drepper at redhat dot com
  2005-10-18 14:29 ` egmont at uhulinux dot hu
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: drepper at redhat dot com @ 2005-10-18 14:24 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2005-10-18 14:24 -------
> Maybe the solution could be to add a locale that contains the alphabetical
> sorting and let the users choose their preferred one.

No, creating variant locales is not an option.  There is one and only one locale.


* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (4 preceding siblings ...)
  2005-10-18 14:24 ` drepper at redhat dot com
@ 2005-10-18 14:29 ` egmont at uhulinux dot hu
  2006-04-25 18:28 ` drepper at redhat dot com
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2005-10-18 14:29 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From egmont at uhulinux dot hu  2005-10-18 14:29 -------
> No, creating variant locales is not an option.

I completely agree, and I gave him the same answer. (If there are 2 choices then
in a few minutes there'll be requests for about 2^N choices where N keeps on
growing forever...).



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (5 preceding siblings ...)
  2005-10-18 14:29 ` egmont at uhulinux dot hu
@ 2006-04-25 18:28 ` drepper at redhat dot com
  2006-04-25 18:50 ` egmont at uhulinux dot hu
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: drepper at redhat dot com @ 2006-04-25 18:28 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2006-04-25 18:27 -------
No response in 6+ months.  Closing.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |RESOLVED
         Resolution|                            |WONTFIX



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (6 preceding siblings ...)
  2006-04-25 18:28 ` drepper at redhat dot com
@ 2006-04-25 18:50 ` egmont at uhulinux dot hu
  2006-05-03 14:05 ` egmont at uhulinux dot hu
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2006-04-25 18:50 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From egmont at uhulinux dot hu  2006-04-25 18:50 -------
No response to what? Sorry, but I think that _I_'ve been waiting 6+ months for
_you_ to fix this bug.

I told you that Lehel agreed in private mail that he was wrong and I am right,
unfortunately I couldn't get him to comment here in bugzilla so I cannot prove
this, but I hope you do not think I'm lying; and it's not my fault that he is
not as co-operative as he should be.

In the original report I said "If needed, I can scan some pages of
dictionaries..." It's not easy for me to get access to a scanner, but I am
happy to do this _if_ I know that it's needed to get this bug fixed.
But I still don't know whether that would satisfy you; you haven't replied
with anything like "yes, scanning those pages would be cool".

I'll be back shortly with some scanned pages. If that's not enough then please,
please let me know what to do to prove I'm right.



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (7 preceding siblings ...)
  2006-04-25 18:50 ` egmont at uhulinux dot hu
@ 2006-05-03 14:05 ` egmont at uhulinux dot hu
  2006-05-03 14:07 ` egmont at uhulinux dot hu
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2006-05-03 14:05 UTC (permalink / raw)
  To: glibc-bugs



------- Additional Comments From egmont at uhulinux dot hu  2006-05-03 14:05 -------
Created an attachment (id=999)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=999&action=view)
dictionary

A random page scanned from a Hungarian-German dictionary. Words beginning with
e and é appear in mixed order.



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (8 preceding siblings ...)
  2006-05-03 14:05 ` egmont at uhulinux dot hu
@ 2006-05-03 14:07 ` egmont at uhulinux dot hu
  2006-05-03 14:13 ` egmont at uhulinux dot hu
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2006-05-03 14:07 UTC (permalink / raw)
  To: glibc-bugs



------- Additional Comments From egmont at uhulinux dot hu  2006-05-03 14:07 -------
Created an attachment (id=1000)
 --> (http://sourceware.org/bugzilla/attachment.cgi?id=1000&action=view)
phonebook

The page where ö and ő start, scanned from a quite recent phonebook.



* [Bug localedata/1430] regression: worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (9 preceding siblings ...)
  2006-05-03 14:07 ` egmont at uhulinux dot hu
@ 2006-05-03 14:13 ` egmont at uhulinux dot hu
  2006-11-16 11:52 ` [Bug localedata/1430] [2.4/2.5 regression] " egmont at uhulinux dot hu
  2007-02-18  4:43 ` drepper at redhat dot com
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2006-05-03 14:13 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From egmont at uhulinux dot hu  2006-05-03 14:13 -------
I modified the scanned pictures due to potential privacy or legal problems.
I can send the unmodified versions in private e-mail, if required.

If you need any other proof, please let me know.


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WONTFIX                     |



* [Bug localedata/1430] [2.4/2.5 regression] worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (10 preceding siblings ...)
  2006-05-03 14:13 ` egmont at uhulinux dot hu
@ 2006-11-16 11:52 ` egmont at uhulinux dot hu
  2007-02-18  4:43 ` drepper at redhat dot com
  12 siblings, 0 replies; 14+ messages in thread
From: egmont at uhulinux dot hu @ 2006-11-16 11:52 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From egmont at uhulinux dot hu  2006-11-16 11:52 -------
No response in 6+ months.

Last time you closed this bug with this justification. Now _you_ haven't
replied in half a year, so please let me increase the severity (as requested in
the help pages of this bugzilla -- though I admit this is not a critical bug at
all, but somehow I'd like to draw your attention to it, and anyway your docs
say I should do this).

It is a regression anyway (now already present in 2 consecutive official
releases), and I see no reason why it couldn't be fixed quickly. I hope that
regression bugs are handled with higher priority (as is the case with many
other software projects).

In the meantime I also changed the summary according to the docs; I hope that helps too.


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |critical
            Summary|regression: worse collation |[2.4/2.5 regression] worse
                   |for hu_HU                   |collation for hu_HU
            Version|unspecified                 |2.4



* [Bug localedata/1430] [2.4/2.5 regression] worse collation for hu_HU
  2005-10-06 17:45 [Bug localedata/1430] New: regression: worse collation for hu_HU egmont at uhulinux dot hu
                   ` (11 preceding siblings ...)
  2006-11-16 11:52 ` [Bug localedata/1430] [2.4/2.5 regression] " egmont at uhulinux dot hu
@ 2007-02-18  4:43 ` drepper at redhat dot com
  12 siblings, 0 replies; 14+ messages in thread
From: drepper at redhat dot com @ 2007-02-18  4:43 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2007-02-18 04:43 -------
I reverted the patch.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED


