From: Bruno Haible <bruno@clisp.org>
To: cygwin@cygwin.com, Brian Inglis <Brian.Inglis@shaw.ca>
Subject: Re: character class "alpha"
Date: Mon, 31 Jul 2023 23:37:08 +0200
Message-ID: <18620212.dDkQJl9nhx@nimes>
In-Reply-To: <223e3d56-1a63-57ef-5236-bc1df37716a0@Shaw.ca>
Brian Inglis wrote:
> It seems to me that most application developers needing to support
> non-Western-European languages might want a non-POSIX interpretation of digits.
Sure. GNU libunistring has dedicated APIs for this:
  - https://www.gnu.org/software/libunistring/manual/html_node/Object-oriented-API.html
      UC_DECIMAL_DIGIT_NUMBER
  - https://www.gnu.org/software/libunistring/manual/html_node/Decimal-digit-value.html
  - https://www.gnu.org/software/libunistring/manual/html_node/Digit-value.html
  - https://www.gnu.org/software/libunistring/manual/html_node/Properties-as-objects.html
      UC_PROPERTY_DECIMAL_DIGIT
  - https://www.gnu.org/software/libunistring/manual/html_node/Properties-as-functions.html
      uc_is_property_decimal_digit
I'm sure ICU4C has similar APIs too.
> Are the Unicode character attribute classes supported for those application use
> cases that need more than POSIX limitations allow?
POSIX allows the libc to define additional character classes. But these will be
platform- and locale-dependent, and I don't know of any application that makes
use of such additional character classes via wctype() and iswctype().
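
For reference, a lookup through wctype() and iswctype() could look roughly like the sketch below. The helper name in_named_class is hypothetical. Which class names exist beyond the standard ones ("alpha", "digit", and so on) depends on the platform and on the current LC_CTYPE locale, so a real program would call setlocale(LC_CTYPE, "") first and must be prepared for the lookup to fail.

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: returns true if wc belongs to the character
   class named class_name in the current LC_CTYPE locale.
   Returns false if the locale does not define that class. */
bool in_named_class(const char *class_name, wint_t wc)
{
    /* wctype() returns 0 when the class name is unknown. */
    wctype_t desc = wctype(class_name);
    if (desc == (wctype_t) 0)
        return false;
    return iswctype(wc, desc) != 0;
}
```

The explicit check for an unknown class is the important part: a name that one libc accepts in one locale may simply not exist elsewhere, which is the portability problem described above.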
> I know that I sometimes want to see some alternative numeric digit forms and
> expect to be able to find those with an appropriate grep expression.
I think you can do so with GNU 'grep' if it was built with PCRE support
(option -P). PCRE supports Unicode properties, such as \p{Nd} for decimal
digits.
<https://www.pcre.org/current/doc/html/pcre2pattern.html>
Bruno