public inbox for libc-alpha@sourceware.org
From: Carlos O'Donell <carlos@redhat.com>
To: Florian Weimer <fweimer@redhat.com>,
	GNU C Library <libc-alpha@sourceware.org>,
	Rich Felker <dalias@aerifal.cx>, Mike Fabian <mfabian@redhat.com>,
	Zorro Lang <zlang@redhat.com>,
	"Joseph S. Myers" <joseph@codesourcery.com>
Subject: Re: [PATCH] Keep expected behaviour for [a-z] and [A-z] (Bug 23393).
Date: Thu, 26 Jul 2018 01:20:00 -0000
Message-ID: <1313f0d2-8c64-8ec0-ef09-cd39bd6d4416@redhat.com>
In-Reply-To: <646a94c8-3b25-b65e-7fc7-0637e58cacc1@redhat.com>

On 07/25/2018 06:50 PM, Florian Weimer wrote:
> On 07/25/2018 11:35 PM, Carlos O'Donell wrote:
>> I have committed only the most conservative fix for this issue,
>> which is to deinterlace the lower and upper case ranges.
>> 
>> I think we are too late to commit rational ranges, and we can do
>> that in 2.29 when it opens. Right now I want to remove the blocker
>> that is causing regressions for en_US.UTF-8 scripts that use [a-z],
>> and [A-Z].
> 
> How is this the most conservative fix, relative to glibc 2.27
> upstream?

We have two solutions to fix the regression:

* Revert the entire ISO 14651 update.
  - This is 13 commits for just the update.
  - Several more commits for Rafal and Mike's work on locales on top of that.

* Fix the key issue of a-z interleaving with A-Z.

My opinion is that it is most conservative to fix the interleaving.

In 2.27 we accepted 297 characters between A-Z.

In 2.28 we accept 2280 characters between A-Z as part of the ISO 14651 update.
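
One way to check counts like these on a given system is to walk every
code point and ask fnmatch whether it falls in the range. The sketch
below is illustrative only: count_range_matches is a hypothetical helper
name, not something from the glibc tree, and the number it reports
depends on the installed glibc version and locale data, so it is not
guaranteed to reproduce the 297 or 2280 figures above.

```c
#include <fnmatch.h>
#include <limits.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: count the code points that PATTERN (for example
   "[A-Z]") matches under LOCALE_NAME.  Returns -1 if the locale cannot
   be selected, e.g. because it is not installed.  */
long
count_range_matches (const char *locale_name, const char *pattern)
{
  if (setlocale (LC_ALL, locale_name) == NULL)
    return -1;

  long count = 0;
  char buf[MB_LEN_MAX + 1];

  /* Put wctomb into its initial shift state.  */
  (void) wctomb (NULL, 0);

  for (unsigned long cp = 1; cp <= 0x10FFFFUL; cp++)
    {
      /* Convert the code point to the locale's multibyte encoding and
         skip anything the locale's charset cannot represent.  */
      int len = wctomb (buf, (wchar_t) cp);
      if (len <= 0)
        continue;
      buf[len] = '\0';

      /* Flags are 0, so no character in the string is special.  */
      if (fnmatch (pattern, buf, 0) == 0)
        count++;
    }
  return count;
}
```

In the C locale, count_range_matches ("C", "[A-Z]") yields 26. Calling
it with "en_US.UTF-8" on 2.27 and on 2.28 would be one way to compare
the two releases, assuming that locale is installed.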
 
> [a-z] still matches lots of non-ASCII characters, which it did not
> before.

This is not true; we were already matching 297 characters between A-Z
in 2.27. It has always been the case that we accepted non-ASCII characters
in the range. With the ISO 14651 update the *key* issue was that lowercase
and uppercase were now mixed in collation element ordering, resulting in
surprising matches and failures like the reported xfs test failure where
[a-z] matched "Makefile" and broke their test infrastructure.
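
A minimal sketch of how the symptom can be observed with fnmatch is
shown here. The helper name matches_in_locale and the pattern "[a-z]*"
are assumptions for illustration; this message does not quote the exact
pattern the xfs test used.

```c
#include <fnmatch.h>
#include <locale.h>

/* Hypothetical helper: return 1 if NAME matches PATTERN under
   LOCALE_NAME, 0 if it does not match, and -1 if the locale cannot be
   selected or fnmatch reports an error.  */
int
matches_in_locale (const char *locale_name, const char *pattern,
                   const char *name)
{
  if (setlocale (LC_ALL, locale_name) == NULL)
    return -1;

  int rc = fnmatch (pattern, name, 0);
  if (rc == 0)
    return 1;
  if (rc == FNM_NOMATCH)
    return 0;
  return -1;
}
```

In the C locale, matches_in_locale ("C", "[a-z]*", "Makefile") returns 0
because 'M' lies outside a-z in code point order. With "en_US.UTF-8"
the result depends on the glibc version and its collation data: when
lowercase and uppercase interleave in the collation order, the same
call can return 1, which is the surprising match described above.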
 
> When I meant that we left regression-fixing territory, I was talking
> about the locales which had iso14651_t1_common customizations.

OK, so to be clear you think we *should* go forward with rational ranges?

I don't think it's too late; we could commit it tomorrow, and it should
not impact machine testing in any way.

My v4 fixes all of the locales that either have customizations on
iso14651_t1_common or have their own custom locales. No more locales
remain to be fixed; I tested all of them with tst-fnmatch.input additions
to catch the ones that needed fixing.

Cheers,
Carlos.

