From: Carlos O'Donell <carlos@redhat.com>
To: Florian Weimer <fw@deneb.enyo.de>, Andreas Schwab <schwab@suse.de>
Cc: Mike FABIAN <mfabian@redhat.com>, libc-alpha@sourceware.org
Subject: Re: Is it OK to write ASCII strings directly into locale source files?
Date: Mon, 24 Jul 2017 20:07:00 -0000
Message-ID: <e43a088a-cb33-c322-7587-c20d993e7fa6@redhat.com>
In-Reply-To: <87h8y13gvb.fsf@mid.deneb.enyo.de>
On 07/24/2017 01:05 PM, Florian Weimer wrote:
> * Andreas Schwab:
>
>> On Jul 24 2017, Carlos O'Donell <carlos@redhat.com> wrote:
>>
>>> So let us start slowly and agree with 'ASCII - [<>]' where < denotes
>>> the start of a code point and > the end of the code point.
>>
>> POSIX says "character in the portable character set" if you want to keep
>> it portable.
>
> But our locales only have to be compatible with our localedef, right?
Should developers be able to write tools to the POSIX locale spec and parse
our source locale definitions? Supporting more than just GNU/Linux? Do the
BSDs share our locale definitions?
> I know that the FSF does not claim copyright on our locales, so anyone
> is free to take them and use them with their own non-GNU systems (or
> sell them as PDFs/books). But this does not mean we have to make
> their lives easier if it comes at a cost to us (e.g., verifying that
> we only use the portable character set, or refraining from using full
> UTF-8 at a future date).
I agree with your sentiment, and leave it up to Mike to decide what makes
it ultimately easier for him as a subsystem maintainer to work with. There
is certainly a cost/reward balance.
My only technical objection to writing straight UTF-8 is that it could
lead to more mistakes. Mike just found one in CLDR where an Arabic/Farsi
character was used incorrectly because it displays the same glyph as the
correct one. It was caught when harmonizing with glibc, where you have to
write out the code points (Mike filed a bug upstream with CLDR).
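To illustrate why spelled-out code points make this kind of mistake
visible, here is a minimal Python sketch. The character pair (U+064A and
U+06CC) is only an assumed example of two letters that can render
identically in some positions; it is not necessarily the pair from the
CLDR bug. The helper names are hypothetical, and the 8-digit form for
code points above U+FFFF is an assumption about the locale source
notation.

```python
import unicodedata


def to_locale_notation(text):
    """Spell out each character as a <UXXXX> code point symbol."""
    parts = []
    for ch in text:
        cp = ord(ch)
        # Assumption: 4 hex digits inside the BMP, 8 digits above it.
        width = 4 if cp <= 0xFFFF else 8
        parts.append(f"<U{cp:0{width}X}>")
    return "".join(parts)


def describe(text):
    """Return one 'code point + Unicode name' string per character."""
    return [
        f"{to_locale_notation(ch)} {unicodedata.name(ch, 'UNNAMED')}"
        for ch in text
    ]


# Illustrative look-alike pair (an assumption, see the note above).
ARABIC_YEH = "\u064A"
FARSI_YEH = "\u06CC"

for ch in (ARABIC_YEH, FARSI_YEH):
    print(describe(ch)[0])
```

In raw UTF-8 the two letters can be indistinguishable on screen, while
in the spelled-out form they differ as <U064A> versus <U06CC>.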
My preference would be to start small, start using the POSIX portable
character set to its maximum extent for all Latin-based languages, see
how that works out, and then decide if we even need to pursue full UTF-8
and in which form.
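The verification cost mentioned earlier in the thread (checking that
only the portable character set is used) could be kept low with a small
script. The following is a rough Python sketch, not an existing glibc
tool. It assumes the portable character set is the 95 printable ASCII
characters plus NUL, BEL, BS, TAB, LF, VT, FF, and CR; the function name
and the sample input are hypothetical.

```python
import string

# Assumption: printable ASCII plus the eight control characters
# NUL, BEL, BS, TAB, LF, VT, FF, CR.
PORTABLE = set(
    string.ascii_letters + string.digits + string.punctuation + " "
) | set("\0\a\b\t\n\v\f\r")


def non_portable(text):
    """Return (line, column, code point) for each non-portable character."""
    found = []
    # Split on "\n" only; str.splitlines() would also split on (and
    # silently drop) characters such as U+0085 that we want to report.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch not in PORTABLE:
                found.append((line_no, col, f"U+{ord(ch):04X}"))
    return found


# Hypothetical two-line input with one non-portable character.
sample = 'day "Monday"\nname "Fran\u00e7ais"\n'
for line_no, col, cp in non_portable(sample):
    print(f"line {line_no}, column {col}: {cp}")
```

A stricter variant for the 'ASCII - [<>]' proposal would additionally
remove "<" and ">" from the allowed set when checking string contents.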
--
Cheers,
Carlos.