From: Corinna Vinschen <corinna-cygwin@cygwin.com>
To: cygwin@cygwin.com
Cc: Brian Inglis <Brian.Inglis@shaw.ca>
Subject: Re: [ANNOUNCEMENT] Updated: dash 0.5.12-2
Date: Wed, 15 Feb 2023 14:52:23 +0100
Message-ID: <Y+zjl5E4SsUZpQ4Y@calimero.vinschen.de>
In-Reply-To: <Y+qRXYAzPKsSHWAy@calimero.vinschen.de>
Hi Brian,
On Feb 13 20:37, Corinna Vinschen via Cygwin wrote:
> On Feb 13 12:03, Brian Inglis via Cygwin wrote:
> > On 2023-02-13 10:43, ASSI via Cygwin wrote:
> > > Corinna Vinschen via Cygwin writes:
> > > > Can you give me an example? I'm a bit puzzled because fnmatch as well
> > > > as glob in Cygwin support native characters.
> >
> > But not locale dependent named character classes like regexp in paths.
>
> I checked the dash code of current dash git, and while its internal glob
> implementation supports character classes, they are not localized, using
> standard singlebyte functions isalnum, isalpha, etc. under the hood.
>
> So, yeah, what you say further down this mail... looks like dash
> supports locale dependent character classes only with glibc.
> [...]
> Either way, I don't care much for what a certain application provides by
> itself. I'm talking about our libc, that is Cygwin, and what it
> provides to processes calling its implementations of regcomp/regexec,
> glob and fnmatch.
>
> All these functions have been taken from FreeBSD and all three suffer
> shortcomings:
>
> - regcomp/regexec supports POSIX named character classes, collating
> symbols, and equivalence class expressions, but all of them only work
> for ASCII chars.
>
> - fnmatch and glob support neither named character classes nor
> collating symbols nor equivalence class expressions.
>
> I checked the upstream code in FreeBSD, OpenBSD and NetBSD and none of
> these functions have been improved to support locales (regcomp) or any of
> the character class stuff (fnmatch/glob).
>
> So, if we want to add this support to Cygwin (and thus, to all
> applications calling the libc implementation of these functions),
> quite a bit of work is required.
>
> Being able to fetch the implementation from some other source
> would reduce the effort enormously :}

I took the liberty of adding [:<class>:] support to Cygwin's fnmatch(3) and
glob(3) functions. They also recognize collating symbols [.<coll>.] and
equivalence class expressions [=<equiv>=], but the latter two are not
implemented yet, and fnmatch/glob simply skip them in the pattern.
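
A minimal sketch of how an application might exercise the named-class
support described above, assuming a libc whose fnmatch(3) understands
[:<class>:]. The helper name name_matches is hypothetical and chosen for
illustration only; fnmatch() and FNM_NOMATCH are the standard interface
from <fnmatch.h>.

```c
#include <fnmatch.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if `name` matches `pattern`, 0 if it does
 * not, and -1 if fnmatch() reports an error. */
int name_matches(const char *pattern, const char *name)
{
    int rc = fnmatch(pattern, name, 0);

    if (rc == 0)
        return 1;
    if (rc == FNM_NOMATCH)
        return 0;
    fprintf(stderr, "fnmatch failed on pattern '%s'\n", pattern);
    return -1;
}
```

For example, name_matches("[[:digit:]]*.txt", "1file.txt") is expected to
return 1, and name_matches("[[:digit:]]*.txt", "file.txt") to return 0, on
a libc with class support.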

Given that glob and fnmatch use wide characters internally, the support
for character classes is internationalized by default, albeit in a
slightly different way than in glibc. The classes a Unicode character
belongs to are not locale dependent in Cygwin/newlib. All characters
have their classes assigned all the time, so, for instance, the German
character 'ä' is lower and alpha even in the en_US.utf8 locale.
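
The classification described above can be probed with the standard
wide-character functions from <wctype.h>. This is only a sketch: the
helper names is_lower_alpha and is_lower_alpha_in are hypothetical, and
the result for non-ASCII characters depends on the libc and on which
locales are installed.

```c
#include <locale.h>
#include <wctype.h>

/* Hypothetical helper: returns 1 if `wc` is classified as both lower and
 * alpha by the libc's wide-character functions, 0 otherwise. */
int is_lower_alpha(wint_t wc)
{
    return iswlower(wc) && iswalpha(wc);
}

/* Hypothetical helper: switches LC_CTYPE to `locale_name` and then
 * classifies `wc`. Returns -1 if the locale is not available. */
int is_lower_alpha_in(const char *locale_name, wint_t wc)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return is_lower_alpha(wc);
}
```

Going by the description above, is_lower_alpha_in("en_US.utf8", L'\u00e4')
should yield 1 on Cygwin; other libcs may answer differently depending on
the character tables of the selected locale.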

The currently building Cygwin test release 3.5.0-0.174.gd6d4436145b8
contains the new code. Would you mind building a dash for testing so we
can see if and how it works?

Thanks,
Corinna