From mboxrd@z Thu Jan 1 00:00:00 1970
From: Corinna Vinschen
To: cygwin-cvs@sourceware.org
Subject: [newlib-cygwin/main] Cygwin: is_unicode_equiv: fix normalization
X-Act-Checkin: newlib-cygwin
X-Git-Author: Corinna Vinschen
X-Git-Refname: refs/heads/main
X-Git-Oldrev: e4cc9e48462b538253d62109012b90befaaf7bc5
X-Git-Newrev: f0417a620182083fa787eea90e2e1d9884c8e573
Message-Id: <20230218221456.DFB043858D32@sourceware.org>
Date: Sat, 18 Feb 2023 22:14:56 +0000 (GMT)

https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;h=f0417a620182083fa787eea90e2e1d9884c8e573

commit f0417a620182083fa787eea90e2e1d9884c8e573
Author:     Corinna Vinschen
AuthorDate: Sat Feb 18 23:14:11 2023 +0100
Commit:     Corinna Vinschen
CommitDate: Sat Feb 18 23:14:11 2023 +0100

    Cygwin: is_unicode_equiv: fix normalization

    Change normalization to form KD and make room for longer decomposed
    sequences.

Diff:
---
 winsup/cygwin/nlsfuncs.cc | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/winsup/cygwin/nlsfuncs.cc b/winsup/cygwin/nlsfuncs.cc
index aa7e8434d7cf..d80567737d7b 100644
--- a/winsup/cygwin/nlsfuncs.cc
+++ b/winsup/cygwin/nlsfuncs.cc
@@ -1200,14 +1200,14 @@ __collate_range_cmp (int c1, int c2)
    Note that we only recognize input in Unicode normalization form C, that
    is, we expect all letters to be composed.
    A single character is all we look at.
-   To check equivalence, decompose pattern letter and input letter and check
-   the base character for equality.  Also, convert all digits to the ASCII
-   digits 0 - 9 and compare. */
+   To check equivalence, decompose pattern letter and input letter into
+   normalization form KD and check the base character for equality.  Also,
+   convert all digits to the ASCII digits 0 - 9 and compare. */
 extern "C" int
 is_unicode_equiv (wint_t test, wint_t eqv)
 {
-  wchar_t decomp_testc[5] = { 0 };
-  wchar_t decomp_eqvc[5] = { 0 };
+  wchar_t decomp_testc[24] = { 0 };
+  wchar_t decomp_eqvc[24] = { 0 };
   wchar_t testc[3] = { 0 };
   wchar_t eqvc[3] = { 0 };
 
@@ -1229,8 +1229,10 @@ is_unicode_equiv (wint_t test, wint_t eqv)
     }
   else
     testc[0] = test;
   /* Convert to denormalized form */
-  FoldStringW (MAP_COMPOSITE | MAP_FOLDDIGITS, eqvc, -1, decomp_eqvc, 5);
-  FoldStringW (MAP_COMPOSITE | MAP_FOLDDIGITS, testc, -1, decomp_testc, 5);
+  FoldStringW (MAP_COMPOSITE | MAP_FOLDCZONE | MAP_FOLDDIGITS,
+               eqvc, -1, decomp_eqvc, 24);
+  FoldStringW (MAP_COMPOSITE | MAP_FOLDCZONE | MAP_FOLDDIGITS,
+               testc, -1, decomp_testc, 24);
   /* If they are equivalent, the base char must be the same.  */
   if (decomp_eqvc[0] != decomp_testc[0])
     return 0;
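
For readers without a Windows build at hand, the base-character check in the patch can be sketched portably. This is only an illustration under stated assumptions, not the Cygwin code: kTable is a small hand-written stand-in for what FoldStringW produces with MAP_COMPOSITE | MAP_FOLDCZONE | MAP_FOLDDIGITS, and the names Decomp, kTable, fold and base_equiv are hypothetical. Only the first comparison visible in the diff is modelled; whatever is_unicode_equiv does after that check is not shown here. The buffer size of 24 is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the FoldStringW call in the patch: a few
// hand-picked code points with their compatibility decomposition or
// ASCII digit folding.  This is NOT the Windows data set.
constexpr std::size_t kMaxSeq = 4;

struct Decomp
{
  char32_t cp;            // composed input code point
  char32_t seq[kMaxSeq];  // decomposed sequence, zero-terminated if shorter
};

static const Decomp kTable[] = {
  { 0x00E9, { 0x0065, 0x0301 } },  // e with acute -> 'e' + combining acute
  { 0x00C5, { 0x0041, 0x030A } },  // A with ring  -> 'A' + combining ring
  { 0xFB01, { 0x0066, 0x0069 } },  // fi ligature  -> 'f' 'i' (compatibility)
  { 0x00B2, { 0x0032 } },          // superscript two -> '2' (compatibility)
  { 0x0661, { 0x0031 } },          // Arabic-Indic digit one -> '1' (digit fold)
};

// Write the decomposition of cp into out (capacity cap) and return the
// number of code points written.  Code points missing from the table map
// to themselves.
std::size_t
fold (char32_t cp, char32_t *out, std::size_t cap)
{
  assert (cap > 0);
  for (const Decomp &d : kTable)
    if (d.cp == cp)
      {
        std::size_t n = 0;
        while (n < kMaxSeq && n < cap && d.seq[n] != 0)
          {
            out[n] = d.seq[n];
            ++n;
          }
        return n;
      }
  out[0] = cp;
  return 1;
}

// Return 1 if both inputs decompose to the same base (first) character,
// 0 otherwise.  This mirrors only the first check shown in the diff.
int
base_equiv (char32_t test, char32_t eqv)
{
  char32_t decomp_test[24] = { 0 };
  char32_t decomp_eqv[24] = { 0 };

  fold (test, decomp_test, 24);
  fold (eqv, decomp_eqv, 24);
  return decomp_test[0] == decomp_eqv[0];
}
```

With this table, base_equiv (0x00E9, 0x0065) returns 1 because both sides reduce to 'e', while base_equiv (0x00E9, 0x0061) returns 0. The table entries here are short, but real compatibility decompositions can be much longer than the old five-element buffers allowed, which is presumably why the patch widens them.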