From: Carlos O'Donell <carlos@redhat.com>
To: Florian Weimer <fweimer@redhat.com>,
GNU C Library <libc-alpha@sourceware.org>,
Rich Felker <dalias@aerifal.cx>, Mike Fabian <mfabian@redhat.com>,
Zorro Lang <zlang@redhat.com>,
"Joseph S. Myers" <joseph@codesourcery.com>
Subject: Re: [PATCH] Keep expected behaviour for [a-z] and [A-z] (Bug 23393).
Date: Thu, 26 Jul 2018 01:20:00 -0000
Message-ID: <1313f0d2-8c64-8ec0-ef09-cd39bd6d4416@redhat.com>
In-Reply-To: <646a94c8-3b25-b65e-7fc7-0637e58cacc1@redhat.com>
On 07/25/2018 06:50 PM, Florian Weimer wrote:
> On 07/25/2018 11:35 PM, Carlos O'Donell wrote:
>> I have committed only the most conservative fix for this issue,
>> which is to deinterlace the lower and upper case ranges.
>>
>> I think we are too late to commit rational ranges, and we can do
>> that in 2.29 when it opens. Right now I want to remove the blocker
>> that is causing regressions for en_US.UTF-8 scripts that use [a-z],
>> and [A-Z].
>
> How is this the most conservative fix, relative to glibc 2.27
> upstream?
We have two solutions to fix the regression:
* Revert the entire ISO 14651 update.
  - This is 13 commits for just the update.
  - Several more commits for Rafal and Mike's work on locales on top of that.
* Fix the key issue of a-z interleaving with A-Z.
My opinion is that it is most conservative to fix the interleaving.
In 2.27 we accepted 297 characters between A-Z.
In 2.28 we accept 2280 characters between A-Z as part of the ISO 14651 update.
> [a-z] still matches lots of non-ASCII characters, which it did not
> before.
This is not true: we were already matching 297 characters between A-Z
in 2.27. It has always been the case that we accepted non-ASCII characters
in the range. With the ISO 14651 update the *key* issue was that lowercase
and uppercase were now mixed in collation element ordering, resulting in
surprising matches and failures like the reported xfs test failure where
[a-z] matched "Makefile" and broke their test infrastructure.
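To illustrate why interleaving matters, here is a minimal sketch in Python. It is a toy model only: the two orderings below are simplified assumptions covering just the 52 ASCII letters, not glibc's actual collation tables, and the helper names (interleaved_order, deinterlaced_order, range_members, bracket_matches) are hypothetical. It treats a bracket range as "every character whose position in the collation order lies between the two endpoints".

```python
import string


def interleaved_order():
    """Toy order a A b B ... z Z (assumed stand-in for the mixed-case ordering)."""
    return [c for pair in zip(string.ascii_lowercase, string.ascii_uppercase)
            for c in pair]


def deinterlaced_order():
    """Toy order a..z followed by A..Z (assumed stand-in for the fixed ordering)."""
    return list(string.ascii_lowercase + string.ascii_uppercase)


def range_members(order, start, end):
    """Return the characters positioned from start to end, inclusive, in order."""
    lo, hi = order.index(start), order.index(end)
    return set(order[lo:hi + 1])


def bracket_matches(order, start, end, ch):
    """Model a [start-end] bracket expression against a single character."""
    return ch in range_members(order, start, end)


if __name__ == "__main__":
    for name, order in (("interleaved", interleaved_order()),
                        ("deinterlaced", deinterlaced_order())):
        hit = bracket_matches(order, "a", "z", "M")
        print(f"{name}: [a-z] matches 'M' -> {hit}")
```

In this model the interleaved order puts "M" between "a" and "z", so [a-z] accepts the first character of "Makefile"; with the ranges deinterlaced it does not.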
> When I meant that we left regression-fixing territory, I was talking
> about the locales which had iso14651_t1_common customizations.
OK, so to be clear you think we *should* go forward with rational ranges?
I don't think it's too late; we could commit it tomorrow, and it should
not impact machine testing in any way.
My v4 fixes all of the locales that either have customizations on
iso14651_t1_common or have their own custom locales. No more locales
remain to be fixed; I tested all of them with tst-fnmatch.input additions
to catch the ones that needed fixing.
Cheers,
Carlos.