From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 2155)
  id BCCAD3858D37; Tue, 14 Feb 2023 12:09:58 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BCCAD3858D37
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org;
  s=default; t=1676376598;
  bh=Ko0KeQmNY0ay9tu6gO79Q2Fly5umjqMv2Ja9nAdmP8g=;
  h=From:To:Subject:Date:From;
  b=ty2lU9fYQqU1FumJKAvMxN3zQIbdn/bEJYtAYtcao4w6xpaPyHa/slLd61h6ngV6t
   TNzfIQBiCxDAag0LoYrHofp8o4NRRFcVTIdhFLrpXvoo8dLQb6wCpA1/O8aATOVBcj
   xKUrwMUDa2+34QLmi8WlS+V5+GyNDl0UulVD5hoA=
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
From: Corinna Vinschen
To: cygwin-cvs@sourceware.org
Subject: [newlib-cygwin/main] Cygwin: __collate_range_cmp: handle Unicode values >= 0x10000
X-Act-Checkin: newlib-cygwin
X-Git-Author: Corinna Vinschen
X-Git-Refname: refs/heads/main
X-Git-Oldrev: 60c25da90d015f27c5697c6db7ab0557585d09aa
X-Git-Newrev: eac830e0feac1e5f4fbb9637506bd071e7530a1f
Message-Id: <20230214120958.BCCAD3858D37@sourceware.org>
Date: Tue, 14 Feb 2023 12:09:58 +0000 (GMT)
List-Id:

https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;h=eac830e0feac1e5f4fbb9637506bd071e7530a1f

commit eac830e0feac1e5f4fbb9637506bd071e7530a1f
Author:     Corinna Vinschen
AuthorDate: Tue Feb 14 12:22:36 2023 +0100
Commit:     Corinna Vinschen
CommitDate: Tue Feb 14 12:48:26 2023 +0100

    Cygwin: __collate_range_cmp: handle Unicode values >= 0x10000

    So far the input to __collate_range_cmp was handled as a wchar_t.
    Change that to handle it as wint_t holding a UTF-32 value and
    add creating surrogate pairs for the call to wcscoll.

    Signed-off-by: Corinna Vinschen

Diff:
---
 winsup/cygwin/nlsfuncs.cc | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/winsup/cygwin/nlsfuncs.cc b/winsup/cygwin/nlsfuncs.cc
index ddd85bea1647..0d204929d24c 100644
--- a/winsup/cygwin/nlsfuncs.cc
+++ b/winsup/cygwin/nlsfuncs.cc
@@ -1176,8 +1176,20 @@ strcoll (const char *__restrict s1, const char *__restrict s2)
 extern "C" int
 __collate_range_cmp (int c1, int c2)
 {
-  wchar_t s1[2] = { (wchar_t) c1, L'\0' };
-  wchar_t s2[2] = { (wchar_t) c2, L'\0' };
+  wchar_t s1[3] = { (wchar_t) c1, L'\0', L'\0' };
+  wchar_t s2[3] = { (wchar_t) c2, L'\0', L'\0' };
+
+  /* Handle Unicode values >= 0x10000, convert to surrogate pair */
+  if (c1 > 0xffff)
+    {
+      s1[0] = ((c1 - 0x10000) >> 10) + 0xd800;
+      s1[1] = ((c1 - 0x10000) & 0x3ff) + 0xdc00;
+    }
+  if (c2 > 0xffff)
+    {
+      s2[0] = ((c2 - 0x10000) >> 10) + 0xd800;
+      s2[1] = ((c2 - 0x10000) & 0x3ff) + 0xdc00;
+    }
   return wcscoll (s1, s2);
 }
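
The surrogate-pair arithmetic used in the patch can be tried in isolation with the sketch below. It is not code from the Cygwin tree: the helper name `utf32_to_utf16` is hypothetical, `char16_t` is used in place of Cygwin's 16-bit `wchar_t` so the example behaves the same on platforms with a 32-bit `wchar_t`, and the range check on the input is an assumption of the sketch that the patch itself does not make.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Cygwin): encode one UTF-32 code point
// as a NUL-terminated UTF-16 sequence in out[3]. Returns the number of
// UTF-16 code units written, not counting the terminator.
std::size_t
utf32_to_utf16 (std::uint32_t cp, char16_t out[3])
{
  // Assumption of this sketch: the caller passes a valid code point.
  assert (cp <= 0x10ffff);

  out[0] = out[1] = out[2] = u'\0';
  if (cp > 0xffff)
    {
      // Same arithmetic as the patch: subtract 0x10000, then put the
      // upper 10 bits in the high surrogate and the lower 10 bits in
      // the low surrogate.
      std::uint32_t v = cp - 0x10000;
      out[0] = static_cast<char16_t> ((v >> 10) + 0xd800);
      out[1] = static_cast<char16_t> ((v & 0x3ff) + 0xdc00);
      return 2;
    }
  // Values up to 0xffff fit in a single UTF-16 code unit.
  out[0] = static_cast<char16_t> (cp);
  return 1;
}
```

For example, U+1F600 becomes the pair 0xD83D 0xDE00, while U+0041 stays a single unit. This is why the patch grows the buffers from two to three elements: a surrogate pair plus the terminating NUL.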