public inbox for libc-alpha@sourceware.org
From: Carlos O'Donell <carlos@redhat.com>
To: Florian Weimer <fw@deneb.enyo.de>
Cc: Andreas Schwab <schwab@suse.de>, Mike FABIAN <mfabian@redhat.com>,
	libc-alpha@sourceware.org
Subject: Re: Is it OK to write ASCII strings directly into locale source files?
Date: Tue, 25 Jul 2017 05:40:00 -0000
Message-ID: <7fa0552d-c24b-3c5c-cad3-1359eb4dd6bd@redhat.com>
In-Reply-To: <87379lczdi.fsf@mid.deneb.enyo.de>

On 07/24/2017 05:13 PM, Florian Weimer wrote:
>> My only technical objection with writing straight UTF-8 is that it could
>> lead to more mistakes, and Mike just found one in CLDR where an Arabic
>> Farsi character was used incorrectly because it displayed the same glyph.
>> It was caught when harmonizing with glibc where you have to write out the
>> code points (Mike filed a bug upstream with CLDR).
> 
> Wasn't it caught by locale testing which revealed that the locale
> wasn't compatible with ISO-8859-6?  That sanity check would still
> apply to locale definitions written in UTF-8.

My point was that the mistake was made upstream in CLDR, and I can only
presume it was made because the glyphs are identical.

If we had not been using ISO-8859-6, or if we'd had a mapping from
all the UTF-8 characters into ISO-8859-6 (there was no transliteration for
the Farsi character), then we would not have noticed the error in the
original source locale.
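
To illustrate the kind of check described above, here is a minimal Python
sketch. It is not the glibc locale test; it only asks Python's own
iso8859_6 codec whether each character is representable. The helper name
unencodable_chars and the choice of U+064A versus U+06CC are assumptions
made for illustration; the thread does not say which character was
involved in the CLDR report.

```python
import unicodedata


def unencodable_chars(text, charset="iso8859_6"):
    """Return (code point label, Unicode name) for each character in
    text that the given charset cannot represent."""
    problems = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            label = f"U+{ord(ch):04X}"
            problems.append((label, unicodedata.name(ch, "<unnamed>")))
    return problems


# Hypothetical example pair: these two letters can render alike in some
# positions, but only the first one exists in ISO-8859-6.
ARABIC_YEH = "\u064A"
FARSI_YEH = "\u06CC"

for candidate in (ARABIC_YEH, FARSI_YEH):
    print(f"U+{ord(candidate):04X}", unencodable_chars(candidate))
```

A character that looks right on screen but falls outside the target
charset shows up in the returned list with its code point and name, which
is what makes the substitution visible.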

My only argument is that when you are forced to use the <Uxxxx> encoding,
you are empirically less likely to make a mistake. It is like reading a
sentence backwards to catch errors: it prevents your brain from filling in
the missing information.
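
As a rough sketch of what writing out the code points looks like, the
Python snippet below turns a string into <Uxxxx> symbols. The function
name to_ucs_symbols is hypothetical, and the use of eight hex digits for
code points above U+FFFF is an assumption about the notation rather than
something stated in this thread.

```python
def to_ucs_symbols(text):
    """Spell each character as a <Uxxxx> symbol so that look-alike
    characters become distinguishable by code point."""
    symbols = []
    for ch in text:
        cp = ord(ch)
        # Assumption: four hex digits for the BMP, eight digits above it.
        width = 4 if cp <= 0xFFFF else 8
        symbols.append(f"<U{cp:0{width}X}>")
    return "".join(symbols)


# Two characters that may look the same on screen differ once spelled out.
print(to_ucs_symbols("\u064A"))
print(to_ucs_symbols("\u06CC"))
```

Spelled this way, a reviewer compares hex digits rather than glyphs, so a
visually identical substitution can no longer hide behind the rendering.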

> I would still prefer the <U…> encoding for control characters which
> are in the portable character set.  So I have to object to the
> “maximum” part. :)

Yes, I had ignored the control characters, so I agree, not maximally :}

-- 
Cheers,
Carlos.

