* [Bug localedata/16527] strxfrm & strcoll broken with Hangul & en_US.UTF-8
From: ju.orth+sourceware at gmail dot com @ 2014-02-04 22:14 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=16527
ju.orth+sourceware at gmail dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |ju.orth+sourceware at gmail dot com
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug localedata/16527] New: strxfrm & strcoll broken with Hangul & en_US.UTF-8
From: ju.orth+sourceware at gmail dot com @ 2014-02-04 22:14 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=16527
Bug ID: 16527
Summary: strxfrm & strcoll broken with Hangul & en_US.UTF-8
Product: glibc
Version: 2.18
Status: NEW
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: ju.orth+sourceware at gmail dot com
CC: libc-locales at sourceware dot org
Consider this program:
=============
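#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
#include <string.h>

/* Print the strxfrm() sort key of a, byte by byte, including the trailing NUL. */
void ps(const char *a)
{
    size_t s, i;
    unsigned char *b;

    s = strxfrm(NULL, a, 0);    /* key length, excluding the trailing NUL */
    b = malloc(s + 1);
    if (b == NULL)
        return;
    strxfrm((char *)b, a, s + 1);
    for (i = 0; i <= s; i++)
        printf("%u ", (unsigned)b[i]);
    printf("\n");
    free(b);
}

int main(void)
{
    /* Default "C" locale: the key is the raw UTF-8 bytes. */
    ps("퍼");
    ps("흐");
    /* Locale from the environment, e.g. LANG=en_US.UTF-8. */
    setlocale(LC_COLLATE, "");
    ps("퍼");
    ps("흐");
    return 0;
}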
=============
On systems with LANG=en_US.UTF-8 the output is
=============
237 141 188 0
237 157 144 0
1 1 1 1 194 182 1 194 182 1 194 182 0
1 1 1 1 194 182 1 194 182 1 194 182 0
=============
The output after setlocale(LC_COLLATE, "") is nonsensical: the two distinct
syllables produce identical sort keys, so they compare as equal. The same
useless output is generated with the locales de_DE.UTF-8, ru_RU.UTF-8, and
ja_JP.UTF-8. ko_KR.UTF-8 seems to be the only working locale.
This can be circumvented by adding the following code to iso14651_t1:
=============
script <HANGUL>
order_start <HANGUL>;forward;forward;forward;forward,position
<UAC00> <UAC00>;IGNORE;IGNORE;IGNORE
.. ..;IGNORE;IGNORE;IGNORE
<UD7A3> <UD7A3>;IGNORE;IGNORE;IGNORE
#
order_end
#
=============
Right below a very similar workaround...
* [Bug localedata/16527] strxfrm & strcoll broken with Hangul & en_US.UTF-8
From: fweimer at redhat dot com @ 2014-06-13 8:46 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=16527
Florian Weimer <fweimer at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Flags| |security-
* [Bug localedata/16527] strxfrm & strcoll broken with Hangul & en_US.UTF-8
From: egmont at gmail dot com @ 2015-09-06 23:12 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=16527
Egmont Koblinger <egmont at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |egmont at gmail dot com
--- Comment #1 from Egmont Koblinger <egmont at gmail dot com> ---
Please see bug 18927 as well.
* [Bug localedata/16527] strxfrm & strcoll broken with Hangul & en_US.UTF-8
From: maiku.fabian at gmail dot com @ 2017-10-21 8:26 UTC
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=16527
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |maiku.fabian at gmail dot com