public inbox for cygwin@cygwin.com
From: Bruno Haible <bruno@clisp.org>
To: cygwin@cygwin.com, Brian Inglis <Brian.Inglis@shaw.ca>
Subject: Re: character class "alpha"
Date: Mon, 31 Jul 2023 23:37:08 +0200
Message-ID: <18620212.dDkQJl9nhx@nimes>
In-Reply-To: <223e3d56-1a63-57ef-5236-bc1df37716a0@Shaw.ca>

Brian Inglis wrote:
> It seems to me that most application developers needing to support 
> non-Western-European languages might want a non-POSIX interpretation of digits.

Sure. GNU libunistring has a dedicated API for this:
  - https://www.gnu.org/software/libunistring/manual/html_node/Object-oriented-API.html
    UC_DECIMAL_DIGIT_NUMBER.
  - https://www.gnu.org/software/libunistring/manual/html_node/Decimal-digit-value.html
  - https://www.gnu.org/software/libunistring/manual/html_node/Digit-value.html
  - https://www.gnu.org/software/libunistring/manual/html_node/Properties-as-objects.html
    UC_PROPERTY_DECIMAL_DIGIT
  - https://www.gnu.org/software/libunistring/manual/html_node/Properties-as-functions.html
    uc_is_property_decimal_digit

I'm sure ICU4C has similar APIs too.

> Are the Unicode character attribute classes supported for those application use 
> cases that need more than POSIX limitations allow?

POSIX allows the libc to define additional character classes. But these will be
platform and locale dependent, and I don't know of any application which makes
use of such additional character classes via wctype() and iswctype().
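
For illustration, a lookup through wctype() and iswctype() could be sketched
as follows. The helper name wc_in_class is hypothetical. Since any class name
beyond the twelve standard ones depends on the platform and the locale, the
sketch reports an unknown name separately instead of treating it as "no match".

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, not part of any libc.
   Returns 1 if wc belongs to the named class in the current LC_CTYPE locale,
   0 if it does not, and -1 if the locale does not define that class. */
int wc_in_class(wint_t wc, const char *class_name)
{
    /* wctype() returns 0 when the class name is unknown in this locale. */
    wctype_t desc = wctype(class_name);
    if (desc == (wctype_t) 0)
        return -1;
    return iswctype(wc, desc) != 0;
}
```

A caller would normally run setlocale(LC_ALL, "") first, so that the lookup
uses the user's locale rather than the "C" locale. A portable program has to
be prepared for the -1 case with every non-standard class name.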

> I know that I sometimes want to see some alternative numeric digit forms and 
> expect to be able to find those with an appropriate grep expression.

I think you can do so with GNU 'grep', when it was built with PCRE support
(option -P). PCRE includes support for Unicode character properties, such as
\p{Nd} for decimal digits.
<https://www.pcre.org/current/doc/html/pcre2pattern.html>

Bruno



