* [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
[not found] <bug-2253-716@http.sourceware.org/bugzilla/>
@ 2014-02-07 2:56 ` jsm28 at gcc dot gnu.org
2014-06-26 5:15 ` pravin.d.s at gmail dot com
` (4 subsequent siblings)
5 siblings, 0 replies; 6+ messages in thread
From: jsm28 at gcc dot gnu.org @ 2014-02-07 2:56 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=2253
Joseph Myers <jsm28 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |libc-locales at sourceware dot org
Component|libc |localedata
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
From: pravin.d.s at gmail dot com @ 2014-06-26 5:15 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=2253
Pravin S <pravin.d.s at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |pravin.d.s at gmail dot com
* [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
From: maiku.fabian at gmail dot com @ 2015-05-04 20:39 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=2253
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |maiku.fabian at gmail dot com
--- Comment #5 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Samuel Thibault from comment #4)
> Right, thus changing bug title: the transliteration however still produces
> "e", while it could produce "é".
Transliterating to “e” is probably OK in most locales; for
example, in English dropping accents seems to be common usage.
* [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
From: samuel.thibault@ens-lyon.org @ 2015-05-04 23:07 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=2253
--- Comment #6 from Samuel Thibault <samuel.thibault@ens-lyon.org> ---
Err, but here e+combineacute *is* representable in latin1, it's eacute. So
transliteration should not discard the accent.
* [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
From: maiku.fabian at gmail dot com @ 2015-05-05 9:46 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=2253
--- Comment #7 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Samuel Thibault from comment #6)
> Err, but here e+combineacute *is* representable in latin1, it's eacute. So
> transliteration should not discard the accent.
Yes, maybe.
But is this doable with the glibc transliteration system?
All the glibc/localedata/locales/translit_* files just transliterate
a single character to another character or to a list of characters.
A rule never starts from a character sequence, so I guess this is not supported.
As Jungshik Shin suggests in comment #1, iconv could
normalize the input to NFC before attempting a transliteration.
It certainly should not do so without transliteration, as Rich Felker writes in
comment #3, but *if* transliteration is used, normalizing to NFC and
then transliterating might be a reasonable approach.
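The normalize-then-transliterate idea can be illustrated outside glibc. The following is a minimal Python sketch, not the iconv implementation: the function name `translit_to_latin1` and the accent-dropping fallback are assumptions made for illustration, and Python's `unicodedata` module stands in for whatever normalization iconv would use.

```python
import unicodedata


def translit_to_latin1(text: str) -> bytes:
    """Hypothetical helper: compose to NFC first, then fall back to
    dropping combining marks for anything Latin-1 cannot represent."""
    composed = unicodedata.normalize("NFC", text)
    out = bytearray()
    for ch in composed:
        try:
            # Characters that exist in Latin-1 are encoded unchanged.
            out += ch.encode("latin-1")
        except UnicodeEncodeError:
            # Crude stand-in for transliteration: decompose the character
            # and keep only the non-combining base characters.
            base = "".join(
                c
                for c in unicodedata.normalize("NFD", ch)
                if not unicodedata.combining(c)
            )
            out += base.encode("latin-1", errors="replace")
    return bytes(out)


# e + U+0301 composes to U+00E9, which Latin-1 represents as byte 0xE9.
composed_case = translit_to_latin1("e\u0301")
# U+0107 (c with acute) has no Latin-1 code point, so the accent is dropped.
fallback_case = translit_to_latin1("\u0107")
```

With NFC applied first, the accent is lost only when the composed character has no code point in the target charset, which is the behaviour Samuel Thibault asks for in comment #6.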
* [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
From: fweimer at redhat dot com @ 2018-04-19 13:58 UTC (permalink / raw)
To: libc-locales
https://sourceware.org/bugzilla/show_bug.cgi?id=2253
Florian Weimer <fweimer at redhat dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |fweimer at redhat dot com
--- Comment #8 from Florian Weimer <fweimer at redhat dot com> ---
Created attachment 10961
--> https://sourceware.org/bugzilla/attachment.cgi?id=10961&action=edit
e-combining-acute
Attaching file for posterity.
00000000: 65cc 810a e...
Issue is still present.
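The four bytes in the dump can be checked by hand. The sketch below uses Python's own codecs rather than glibc iconv, so it only illustrates what the attachment contains and why a strict conversion to Latin-1 has nothing to map the second character to; it says nothing about iconv's actual output.

```python
import unicodedata

# Bytes from the hex dump above: 65 cc 81 0a
raw = bytes.fromhex("65cc810a")
text = raw.decode("utf-8")

# Look up each character's Unicode name; the trailing newline has no
# name, so its repr is used as the default.
names = [unicodedata.name(ch, repr(ch)) for ch in text]

# U+0301 lies outside the Latin-1 range, so a strict encode fails.
try:
    text.encode("latin-1")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

Here `names` begins with LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT, and `strict_ok` ends up `False`.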