Date: Wed, 15 Feb 2023 14:52:23 +0100
From: Corinna Vinschen
To: cygwin@cygwin.com
Cc: Brian Inglis
Subject: Re: [ANNOUNCEMENT] Updated: dash 0.5.12-2
Reply-To: cygwin@cygwin.com
References: <6810586169.20230213204858@yandex.ru> <8a583e14-b413-d1a2-35d9-e76f73a4b338@Shaw.ca>

Hi Brian,

On Feb 13 20:37, Corinna Vinschen via Cygwin wrote:
> On Feb 13 12:03, Brian Inglis via Cygwin wrote:
> > On 2023-02-13 10:43, ASSI via Cygwin wrote:
> > > Corinna Vinschen via Cygwin writes:
> > > > Can you give me an example?  I'm a bit puzzled because fnmatch as well
> > > > as glob in Cygwin support native characters.
> >
> > But not locale dependent named character classes like regexp in paths.
>
> I checked the dash code of current dash git, and while its internal glob
> implementation supports character classes, they are not localized, using
> standard single-byte functions isalnum, isalpha, etc. under the hood.
>
> So, yeah, what you say further down this mail... looks like dash
> supports locale dependent character classes only with glibc.
> [...]
> Either way, I don't care much for what a certain application provides by
> itself.  I'm talking about our libc, that is Cygwin, and what it
> provides to processes calling its implementations of regcomp/regexec,
> glob and fnmatch.
>
> All these functions have been taken from FreeBSD and all three suffer
> shortcomings:
>
> - regcomp/regexec supports POSIX named character classes, collating
>   symbols, and equivalence class expressions, but all of them only work
>   for ASCII chars.
>
> - fnmatch and glob support neither named character classes, nor
>   collating symbols, nor equivalence class expressions.
>
> I checked the upstream code in FreeBSD, OpenBSD and NetBSD and none of
> these functions have been improved to support locales (regcomp) or any of
> the character class stuff (fnmatch/glob).
>
> So, if we want to add this support to Cygwin (and thus, to all
> applications calling the libc implementation of these functions),
> quite a bit of work is required.
>
> Being able to fetch the implementation from some other source
> would reduce the effort enormously :}

I took the liberty of adding [::] support to Cygwin's fnmatch(3) and
glob(3) functions.  They also recognize collating symbols [..] and
equivalence class expressions [==].  But the latter two are not
implemented yet, and fnmatch/glob simply skip them in the pattern.

Given that glob and fnmatch use wide characters internally, the support
for character classes is internationalized by default, albeit in a
slightly different way than in glibc.  The classes a Unicode character
belongs to are not locale dependent in Cygwin/newlib.  All characters
have their classes assigned all the time, so, for instance, the German
character 'ä' is lower and alpha even in the en_US.utf8 locale.

The Cygwin test release 3.5.0-0.174.gd6d4436145b8, which is currently
building, contains the new code.  Would you mind building a dash for
testing so we can see if and how it works?


Thanks,
Corinna