public inbox for libc-alpha@sourceware.org
From: Mike FABIAN <mfabian@redhat.com>
To: Carlos O'Donell <carlos@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>,
	 GNU C Library <libc-alpha@sourceware.org>,
	 Rich Felker <dalias@aerifal.cx>,  Zorro Lang <zlang@redhat.com>,
	 "Joseph S. Myers" <joseph@codesourcery.com>,
	 Rafal Luzynski <digitalfreak@lingonborough.com>
Subject: Re: Rational Ranges - Rafal and Mike's opinion? (Bug 23393).
Date: Wed, 25 Jul 2018 15:44:00 -0000
Message-ID: <s9dmuufuxqc.fsf@taka.site>
In-Reply-To: <a7b3665a-48e6-1c61-d705-69be173e70b9@redhat.com> (Carlos O'Donell's message of "Mon, 23 Jul 2018 14:09:31 -0400")

Carlos O'Donell <carlos@redhat.com> wrote:

> On 07/23/2018 11:10 AM, Florian Weimer wrote:
>> On 07/20/2018 11:56 PM, Carlos O'Donell wrote:
>>> v2
>>> - Fixed tr_TR by duplicating A-Z rational range.
>>> - Fixed tst-rxspender.
>>> - Fixed bug-regex17.
>>>
>>> Tell me how the new version does.
>> 
>> My tester likes it.  tr_TR.ISO-8859-9 is now fixed.  I added fnmatch
>> support, too, and initial results look good as well.
>
> OK, so we have the capability to deploy rational ranges.
>
> Florian,
>
> Should we do so in 2.28? Avoiding all possible problems in the future
> and making the ranges portable, rational, and safe from a security
> perspective?
>
> Rafal,
>
> As localedata maintainer, what is your opinion of changing the meaning
> of [a-z], [A-Z], and [0-9] to be rational ranges for *all* locales,
> meaning exactly the Latin character sequences you would expect,
> e.g. {a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z} for [a-z],
> [A-Z] likewise, and {0,1,2,3,4,5,6,7,8,9} for [0-9]?
>
> Mike,
>
> Same question to you.

I agree that rational ranges are much more useful.

I cannot imagine any use case for [a-z] matching aAbB...z and not Z.

One never knows what [a-z] would match if it uses the locale sort order;
it is just too confusing.
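
To illustrate the confusion, here is a minimal sketch in Python. It
assumes a toy collation sequence "aAbB...zZ"; that sequence is made up
for illustration and is not taken from any real locale file, and the
function names are hypothetical. It only compares which characters fall
between 'a' and 'z' in collation order versus code point order.

```python
import string

# Toy collation sequence (an assumption, not real locale data):
# lowercase and uppercase letters interleaved as "aAbB...zZ".
TOY_COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def collation_range(start, end, order=TOY_COLLATION):
    """Return the characters from start to end in the given collation order."""
    i, j = order.index(start), order.index(end)
    return set(order[i:j + 1])


def codepoint_range(start, end):
    """Return the characters whose code points lie between start and end."""
    return {chr(cp) for cp in range(ord(start), ord(end) + 1)}


if __name__ == "__main__":
    by_collation = collation_range("a", "z")
    by_codepoint = codepoint_range("a", "z")
    # In the toy collation order, 'B' sorts between 'a' and 'z', but 'Z'
    # sorts after 'z', so the range picks up A..Y and leaves out Z.
    print("collation  [a-z] contains 'B':", "B" in by_collation)
    print("collation  [a-z] contains 'Z':", "Z" in by_collation)
    # By code point, the range is exactly the 26 lowercase letters.
    print("code point [a-z] contains 'B':", "B" in by_codepoint)
```

With the toy sequence, the collation-based range contains every
uppercase letter except Z, while the code point range contains only
a through z.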

In the long run, I think implementing ranges by code points would be
the best solution. It would also make updates of the iso14651_t1_common
file easier, because we would then need to make fewer changes to the
upstream version of that file.

But for 2.28 this cannot be done. Therefore, I think the solution
by Carlos is very good.

> For historical context in gawk:
> https://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html
>
> For context from POSIX:
> http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap09.html
> (see the section on "RE Bracket Expressions").
>
> Support for rational ranges would make [a-z], [A-Z], [0-9] and other subranges
> rational for all locales, and would no longer include mixed case, or accents.
>
> I'd like to hear affirmatives from the localedata maintainers on this issue.
>
> Cheers,
> Carlos.

-- 
Mike FABIAN <mfabian@redhat.com>
Lack of sleep is the enemy of good work.

