public inbox for libc-alpha@sourceware.org
From: Carlos O'Donell <carlos@redhat.com>
To: Florian Weimer <fw@deneb.enyo.de>, Andreas Schwab <schwab@suse.de>
Cc: Mike FABIAN <mfabian@redhat.com>, libc-alpha@sourceware.org
Subject: Re: Is it OK to write ASCII strings directly into locale source files?
Date: Mon, 24 Jul 2017 20:07:00 -0000
Message-ID: <e43a088a-cb33-c322-7587-c20d993e7fa6@redhat.com>
In-Reply-To: <87h8y13gvb.fsf@mid.deneb.enyo.de>

On 07/24/2017 01:05 PM, Florian Weimer wrote:
> * Andreas Schwab:
> 
>> On Jul 24 2017, Carlos O'Donell <carlos@redhat.com> wrote:
>>
>>> So let us start slowly and agree with 'ASCII - [<>]' where < denotes
>>> the start of a code point and > the end of the code point.
>>
>> POSIX says "character in the portable character set" if you want to keep
>> it portable.
> 
> But our locales only have to be compatible with our localedef, right?

Should developers be able to write tools to the POSIX locale spec and parse
our source locale definitions? Supporting more than just GNU/Linux? Do the
BSDs share our locale definitions?

> I know that the FSF does not claim copyright on our locales, so anyone
> is free to take them and use them with their own non-GNU systems (or
> sell them as PDFs/books).  But this does not mean we have to make
> their lives easier if it comes at a cost to us (e.g., verifying that
> we only use the portable character set, or refraining from using full
> UTF-8 at a future date).

I agree with your sentiment, and leave it up to Mike to decide what is
ultimately easiest for him to work with as a subsystem maintainer. There
is certainly a cost/reward balance.

My only technical objection to writing straight UTF-8 is that it could
lead to more mistakes. Mike just found one in CLDR, where an Arabic
Farsi character was used incorrectly because it displays the same glyph
as the correct one. It was caught when harmonizing with glibc, where you
have to write out the code points (Mike filed a bug upstream with CLDR).
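
To illustrate why writing out code points exposes this kind of mistake,
here is a minimal Python sketch. The pair of characters is an assumption
chosen for illustration (ARABIC LETTER YEH and ARABIC LETTER FARSI YEH);
it is not necessarily the pair involved in the CLDR bug, and `describe`
is a hypothetical helper name.

```python
import unicodedata


def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Hypothetical look-alike pair, picked only for illustration.
arabic_yeh = "\u064A"
farsi_yeh = "\u06CC"

# The two letters can render alike in some fonts and positions,
# but their code points and names differ.
print(describe(arabic_yeh))
print(describe(farsi_yeh))
```

Comparing the spelled-out code points makes the difference visible even
when the rendered glyphs do not.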

My preference would be to start small, start using the POSIX portable
character set to its maximum extent for all Latin-based languages, see
how that works out, and then decide if we even need to pursue full UTF-8
and in which form.
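
As a rough sketch of the 'ASCII - [<>]' starting point, the following
Python keeps printable ASCII characters literal, except '<' and '>', and
spells everything else as a <Uxxxx> code point. The allowed set is an
approximation (printable ASCII only, not the full POSIX portable
character set), `encode_mixed` is a hypothetical helper, and whether a
given localedef accepts the resulting mixed strings is not checked here.

```python
# Printable ASCII (space through '~') minus the code point delimiters.
LITERAL_OK = {chr(c) for c in range(0x20, 0x7F)} - {"<", ">"}


def encode_mixed(text):
    """Keep allowed ASCII literal; write other characters as <Uxxxx>."""
    out = []
    for ch in text:
        if ch in LITERAL_OK:
            out.append(ch)
        else:
            cp = ord(ch)
            # Four hex digits inside the BMP, eight beyond it.
            out.append(f"<U{cp:04X}>" if cp <= 0xFFFF else f"<U{cp:08X}>")
    return "".join(out)


print(encode_mixed("Österreich"))
```

Only the non-ASCII letter is written as a code point; the rest of the
string stays readable.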

-- 
Cheers,
Carlos.


Thread overview: 20+ messages
2017-07-24 13:13 Mike FABIAN
2017-07-24 13:28 ` Carlos O'Donell
2017-07-24 13:32   ` Mike FABIAN
2017-07-24 14:47     ` Carlos O'Donell
2017-07-24 15:03       ` Mike FABIAN
2017-07-24 15:45         ` Carlos O'Donell
2017-07-24 22:39       ` Rafal Luzynski
2017-07-24 22:55         ` Carlos O'Donell
2017-07-24 14:49   ` Andreas Schwab
2017-07-24 15:07     ` Carlos O'Donell
2017-07-24 17:07     ` Florian Weimer
2017-07-24 20:07       ` Carlos O'Donell [this message]
2017-07-24 22:34         ` Florian Weimer
2017-07-24 22:51           ` Rafal Luzynski
2017-07-25  5:40           ` Carlos O'Donell
2017-07-25  6:27             ` Mike FABIAN
2017-07-25 12:48               ` Carlos O'Donell
2017-07-25 14:21                 ` Florian Weimer
2017-07-25 14:37                   ` Carlos O'Donell
2017-07-25 19:05                     ` Florian Weimer
