From: Corinna Vinschen
To: cygwin-cvs@sourceware.org
Subject: [newlib-cygwin/main] Cygwin: regex: wgetnext: Re-add kludge to be more glibc compatible
X-Act-Checkin: newlib-cygwin
X-Git-Author: Corinna Vinschen
X-Git-Refname: refs/heads/main
X-Git-Oldrev: 585e7f9891d68cf14a5fdce70e1f1c613c98bb94
X-Git-Newrev: 0bdc764b421b56ac2961ce54f538d4a71f38b724
Message-Id: <20230316125501.5C60A385843D@sourceware.org>
Date: Thu, 16 Mar 2023 12:55:01 +0000 (GMT)

https://sourceware.org/git/gitweb.cgi?p=newlib-cygwin.git;h=0bdc764b421b56ac2961ce54f538d4a71f38b724

commit 0bdc764b421b56ac2961ce54f538d4a71f38b724
Author:     Corinna Vinschen
AuthorDate: Thu Mar 16 12:44:32 2023 +0100
Commit:     Corinna Vinschen
CommitDate: Thu Mar 16 13:46:01 2023 +0100

    Cygwin: regex: wgetnext: Re-add kludge to be more glibc compatible

    Add comment to explain.

    Signed-off-by: Corinna Vinschen

Diff:
---
 winsup/cygwin/regex/regcomp.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/winsup/cygwin/regex/regcomp.c b/winsup/cygwin/regex/regcomp.c
index 3c735931040f..59da896a90a1 100644
--- a/winsup/cygwin/regex/regcomp.c
+++ b/winsup/cygwin/regex/regcomp.c
@@ -1528,6 +1528,18 @@ wgetnext(struct parse *p)
 	wint_t wc;
 	size_t n;
 
+#ifdef __CYGWIN__
+	/* Kludge for more glibc compatibility.  On Cygwin as well as on
+	   Linux, mbrtowc returns -1 if the current local's codeset is ASCII
+	   and the character is >= 0x80.  Nevertheless, glibc's regcomp allows
+	   any char value, even stuff like [\xc0-\xff], if the locale's codeset
+	   is ASCII, so in regcomp it ignores the fact that chars >= 0x80 are
+	   invalid ASCII chars.  To be more Linux-compatible, we align the
+	   behaviour to glibc here.  Allow any character value if the current
+	   local's codeset is ASCII. */
+	if (*__current_locale_charset () == 'A') /* SCII */
+	  return (wint_t) (unsigned char) *p->next++;
+#endif
 	memset(&mbs, 0, sizeof(mbs));
 	n = mbrtowi(&wc, p->next, p->end - p->next, &mbs);
 	if (n == (size_t)-1 || n == (size_t)-2) {
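
The fast path added by this patch can be tried outside of Cygwin with a minimal sketch like the one below. It is not the code from regcomp.c: `struct sketch_parse` and `sketch_getnext_ascii` are hypothetical names, the charset string is passed in as a parameter instead of coming from `__current_locale_charset ()`, and the bounds check is an addition made only to keep the sketch safe on its own.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the parser state; only the two fields
   used below are modelled on the p->next / p->end seen in the diff. */
struct sketch_parse {
	const char *next;	/* next unread input byte */
	const char *end;	/* one past the last input byte */
};

/* Hypothetical helper, not the real wgetnext().  The charset name is an
   explicit parameter (assumption) so the sketch does not depend on
   Cygwin's __current_locale_charset(). */
static wint_t
sketch_getnext_ascii(struct sketch_parse *p, const char *charset)
{
	/* Bounds check added for the sketch only. */
	if (p->next >= p->end)
		return WEOF;

	/* Same first-letter test as in the patch: 'A' for "ASCII". */
	if (*charset == 'A') {
		/* Go through unsigned char first so that a byte >= 0x80
		   yields a value in 0x80..0xff and is not sign-extended
		   where plain char is signed. */
		return (wint_t) (unsigned char) *p->next++;
	}

	/* The multibyte path (mbrtowi in the patch) is omitted here. */
	return WEOF;
}
```

Feeding the byte 0xc0 through this helper with the charset "ASCII" yields the wide value 0xc0 and advances the read pointer by one, which is the situation a pattern such as [\xc0-\xff] creates. Where plain char is signed, dropping the intermediate cast to unsigned char would sign-extend the byte into a large wint_t value instead.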