public inbox for glibc-bugs@sourceware.org
* [Bug locale/31859] New: Transliteration rules with two input characters like  "ḌḌ" "DDH" do not work.
@ 2024-06-07 13:44 maiku.fabian at gmail dot com
  2024-06-11 21:28 ` [Bug locale/31859] " maiku.fabian at gmail dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: maiku.fabian at gmail dot com @ 2024-06-07 13:44 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=31859

            Bug ID: 31859
           Summary: Transliteration rules with two input characters like
                    "ḌḌ" "DDH" do not work.
           Product: glibc
           Version: 2.39
            Status: NEW
          Severity: normal
          Priority: P2
         Component: locale
          Assignee: unassigned at sourceware dot org
          Reporter: maiku.fabian at gmail dot com
  Target Milestone: ---

See: https://sourceware.org/pipermail/libc-alpha/2024-May/156769.html

If transliteration rules like this:

translit_start
"ḌḌ" "DDH"
"ḍḍ" "ddh"
"Ḍḍ" "Ddh"
translit_end

are used in the LC_CTYPE section of a locale, they don’t work.

These are in our new scn_IT locale, but commented out for the moment because
they do not work.

If localedata/locales/translit_combining is left unchanged, the rules for the
single characters Ḍ (U+1E0C) and ḍ (U+1E0D) from translit_combining always won
when I tested; the longer input sequences "ḌḌ", "ḍḍ", and "Ḍḍ" were never used.

But when I commented out these short single-character transliteration rules in
translit_combining like this:

diff --git a/localedata/locales/translit_combining b/localedata/locales/translit_combining
index ce2f19eee1..6f879d9caf 100644
--- a/localedata/locales/translit_combining
+++ b/localedata/locales/translit_combining
@@ -2486,9 +2486,9 @@ translit_start
 % LATIN SMALL LETTER D WITH DOT ABOVE
 <U1E0B> <U0064>
 % LATIN CAPITAL LETTER D WITH DOT BELOW
-<U1E0C> <U0044>
+%<U1E0C> <U0044>
 % LATIN SMALL LETTER D WITH DOT BELOW
-<U1E0D> <U0064>
+%<U1E0D> <U0064>
 % LATIN CAPITAL LETTER D WITH LINE BELOW
 <U1E0E> <U0044>
 % LAT


then

bash-5.2# echo 'ḌḌ'|iconv -f UTF-8 -t ASCII//translit
^C
bash-5.2#

uses 100% CPU and never stops until I stop it with Control-C.
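
Because the command above never returns, a reproduction script is easier to use if it enforces a time limit. The following is a minimal sketch, not part of the original report: the helper name `run_with_timeout` and the 5-second limit are assumptions chosen for illustration, and the result depends on which locale and glibc build are active when it runs.

```python
import subprocess


def run_with_timeout(cmd, data, timeout=5.0):
    """Run cmd with data on stdin.

    Returns ("ok", stdout) on exit status 0, ("error", stdout) on a
    non-zero exit status, ("timeout", b"") if the process had to be
    killed, and ("missing", b"") if the command does not exist.
    """
    try:
        proc = subprocess.run(
            cmd,
            input=data,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return ("timeout", b"")
    except FileNotFoundError:
        return ("missing", b"")
    return ("ok" if proc.returncode == 0 else "error", proc.stdout)


if __name__ == "__main__":
    # Same conversion as in the report; U+1E0C twice, then a newline.
    status, out = run_with_timeout(
        ["iconv", "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
        "\u1e0c\u1e0c\n".encode("utf-8"),
    )
    print(status, out)
```

A "timeout" status would correspond to the hang described here; any other status means the conversion terminated.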

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug locale/31859] Transliteration rules with two input characters like  "ḌḌ" "DDH" do not work.
From: maiku.fabian at gmail dot com @ 2024-06-11 21:28 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=31859

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maiku.fabian at gmail dot com

--- Comment #1 from Mike FABIAN <maiku.fabian at gmail dot com> ---
https://sourceware.org/pipermail/libc-alpha/2024-June/157316.html

* [Bug locale/31859] Transliteration rules with two input characters like  "ḌḌ" "DDH" do not work.
From: carlos at redhat dot com @ 2024-08-16 12:41 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=31859

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |2.41
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
                 CC|                            |carlos at redhat dot com

--- Comment #2 from Carlos O'Donell <carlos at redhat dot com> ---
commit 1b0a2062c8938c7333cd118d85d9976c4e7c92af
Author: Andreas Schwab <schwab@suse.de>
Date:   Mon Jun 10 12:19:17 2024 +0200

    iconv: Fix matching of multi-character transliterations (bug 31859)

    Only return __GCONV_INCOMPLETE_INPUT for a partial match when the end of
    the input buffer is reached.  Otherwise it is a non-match, and other
    patterns should be tried.
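
The rule stated in the commit message can be illustrated with a small model. This is only a sketch under stated assumptions, not the glibc code: the names `Status`, `match_rule`, `translit_one`, and `transliterate` are hypothetical, rules are tried longest first as a simplification, and `Status.INCOMPLETE` merely stands in for `__GCONV_INCOMPLETE_INPUT`.

```python
from enum import Enum


class Status(Enum):
    MATCH = "match"
    INCOMPLETE = "incomplete"  # stand-in for __GCONV_INCOMPLETE_INPUT
    NO_MATCH = "no match"


def match_rule(pattern, buf, pos):
    """Compare one rule's input sequence against buf starting at pos."""
    for i, ch in enumerate(pattern):
        if pos + i >= len(buf):
            # Every character so far matched, but the buffer ended first.
            return Status.INCOMPLETE
        if buf[pos + i] != ch:
            # Mismatch inside the buffer: a plain non-match, so the
            # caller should go on to try other rules.
            return Status.NO_MATCH
    return Status.MATCH


def translit_one(rules, buf, pos):
    """Try each rule at buf[pos]; return (status, replacement, consumed)."""
    for pattern, replacement in rules:
        status = match_rule(pattern, buf, pos)
        if status is Status.MATCH:
            return status, replacement, len(pattern)
        if status is Status.INCOMPLETE:
            return status, "", 0
    return Status.NO_MATCH, "", 0


def transliterate(rules, buf):
    """Return (output, unconsumed tail that needs more input)."""
    ordered = sorted(rules, key=lambda r: len(r[0]), reverse=True)
    out, pos = [], 0
    while pos < len(buf):
        status, replacement, consumed = translit_one(ordered, buf, pos)
        if status is Status.INCOMPLETE:
            break  # wait for more input before deciding
        if status is Status.MATCH:
            out.append(replacement)
            pos += consumed
        else:
            out.append(buf[pos])  # no rule applies; pass through
            pos += 1
    return "".join(out), buf[pos:]


# U+1E0C is the capital letter, U+1E0D the small letter.
RULES = [
    ("\u1e0c", "D"),
    ("\u1e0d", "d"),
    ("\u1e0c\u1e0c", "DDH"),
    ("\u1e0d\u1e0d", "ddh"),
    ("\u1e0c\u1e0d", "Ddh"),
]

if __name__ == "__main__":
    print(transliterate(RULES, "\u1e0c\u1e0c \u1e0ca"))  # ('DDH Da', '')
    print(transliterate(RULES, "a\u1e0c"))               # ('a', 'Ḍ')
```

In the second call the single U+1E0C at the end of the buffer is held back, because a following U+1E0C or U+1E0D could still complete one of the longer rules. In the first call the mismatch after the second U+1E0C occurs inside the buffer, so the longer rules are treated as non-matches and the single-character rule applies.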

* [Bug locale/31859] Transliteration rules with two input characters like  "ḌḌ" "DDH" do not work.
From: carlos at redhat dot com @ 2024-08-16 12:50 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=31859

Carlos O'Donell <carlos at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |security-

--- Comment #3 from Carlos O'Donell <carlos at redhat dot com> ---
In general it might have been possible to cause service breakage by building a
custom locale with these transliterations, enabling the locale on a server, and
then attempting to process these conversions with the locale enabled. However,
since glibc didn't ship such a locale, this would be a failure in testing for
the developer using the custom locale. There is no actual, concrete,
non-synthetic scenario reported here, so I'm marking this security- for the
hang in the converter.

* [Bug locale/31859] Transliteration rules with two input characters like  "ḌḌ" "DDH" do not work.
From: fweimer at redhat dot com @ 2024-08-16 12:57 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=31859

Florian Weimer <fweimer at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fweimer at redhat dot com
