From: Florian Weimer <fweimer@redhat.com>
To: наб <nabijaczleweli@nabijaczleweli.xyz>
Cc: libc-alpha@sourceware.org, Victor Stinner <vstinner@redhat.com>
Subject: Re: [PATCH v7] POSIX locale covers every byte [BZ# 29511]
Date: Thu, 10 Nov 2022 10:52:10 +0100
Message-ID: <87tu37uofp.fsf@oldenburg.str.redhat.com>
In-Reply-To: <20221109161415.eyqgyrp2jlwzfdmb@tarta.nabijaczleweli.xyz> ("наб"'s message of "Wed, 9 Nov 2022 17:14:15 +0100")
* наб:
>> Not sure what is more important here, musl compatibility or Python
>> compatibility. Cc:ing Victor in case he has comments. I should probably
>> ask on the musl list as well how this divergence came to pass.
> I went for musl because (a) it's a libc not some random programming
> language, (b) putting the end of our domain at the end of the
> surrogates is more aesthetically and ideologically pleasing, and (c)
> there's marginal value of having both musl and glibc produce the same
> characters if you like save them as integers for some reason.
> But the choice of any range therein is pretty much editorial, I think.
Let's wait and see what the musl folks say.
>> This change definitely needs a NEWS entry.
> Something like this?
> Deprecated and removed features, and other changes affecting compatibility:
> * The default/"POSIX"/"C" locale's character set is now "POSIX",
> instead of "ANSI_X3.4-1968"; this is a new fully-reversible
> 8-bit transparent encoding for compatibility with Issue 7 TC 2,
“POSIX Issue 7 TC 2”
> identity-mapping bytes in the ASCII [0, 0x7F] range,
> and mapping [0x80, 0xFF] bytes to [<U+DF80>, <U+DFFF>].
It should go into the major new features section, I think.
I would also say that POSIX no longer allows using UTF-8 for the C/POSIX
locale because the obvious question will be “why this custom encoding
and not UTF-8?”. This new POSIX requirement is still a major
disappointment to me.
No need to repost for now.
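For illustration only, the byte mapping described in the NEWS entry above
can be sketched as follows. The helper names are made up for this sketch
and are not taken from the patch; the actual glibc implementation may look
quite different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of the patch.  */

/* Map one byte to a code point as described in the NEWS entry:
   [0x00, 0x7F] maps to itself, [0x80, 0xFF] maps to [U+DF80, U+DFFF].  */
uint32_t
posix_byte_to_wc (unsigned char b)
{
  return b < 0x80 ? b : 0xDF00u + b;
}

/* Inverse mapping.  Returns the byte, or -1 if WC is outside the
   character set sketched here.  */
int
posix_wc_to_byte (uint32_t wc)
{
  if (wc < 0x80)
    return (int) wc;
  if (wc >= 0xDF80 && wc <= 0xDFFF)
    return (int) (wc - 0xDF00);
  return -1;
}

/* "Fully reversible" means every byte survives a round trip.  */
void
check_round_trip (void)
{
  for (int b = 0; b <= 0xFF; b++)
    assert (posix_wc_to_byte (posix_byte_to_wc ((unsigned char) b)) == b);
}
```

Every one of the 256 byte values has exactly one image, which is what
makes the encoding 8-bit transparent.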
>> > diff --git a/stdio-common/tst-printf-bz25691.c b/stdio-common/tst-printf-bz25691.c
>> > index 44844e71c3..e66242b58f 100644
>> > --- a/stdio-common/tst-printf-bz25691.c
>> > +++ b/stdio-common/tst-printf-bz25691.c
>> > @@ -30,6 +30,8 @@
>> >  static int
>> >  do_test (void)
>> >  {
>> > +  setlocale(LC_CTYPE, "C.UTF-8");
>> > +
>> >    mtrace ();
>> >
>> >    /* For 's' conversion specifier with 'l' modifier the array must be
>>
>> What's the rationale for this change? If it is really required, you
>> must also update stdio-common/Makefile with a new dependency on
>> $(gen-locales).
> The test depends on the locale having a hole at 0xFF, cf. ll. 93-100:
>   /* Same test, but with an invalid multibyte sequence. */
>   mbs[mbssize - 2] = 0xff;
>
>   ret = swprintf (result, resultsize, L"%.65537s", mbs);
>   TEST_COMPARE (ret, -1);
>
>   ret = swprintf (result, resultsize, L"%1$.65537s", mbs);
>   TEST_COMPARE (ret, -1);
> And this is the simplest way to ensure that, I think.
>
> Dependency added.
Right, makes sense.
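As a rough illustration of why C.UTF-8 still has the hole the test needs:
0xFF can never begin (or continue) a valid UTF-8 sequence, whereas the new
POSIX charset assigns it a character. The helper below is hypothetical and
only classifies lead bytes; it is not how glibc decodes UTF-8.

```c
/* Hypothetical helper, for illustration only.  Returns the total length
   of the UTF-8 sequence that lead byte B may begin, or 0 if B can never
   begin one.  */
int
utf8_lead_length (unsigned char b)
{
  if (b <= 0x7F)
    return 1;                   /* ASCII */
  if (b >= 0xC2 && b <= 0xDF)
    return 2;
  if (b >= 0xE0 && b <= 0xEF)
    return 3;
  if (b >= 0xF0 && b <= 0xF4)
    return 4;
  /* 0x80-0xBF are continuation bytes, 0xC0-0xC1 would only encode
     overlong forms, and 0xF5-0xFF (including 0xFF) are never valid.  */
  return 0;
}
```

With this classification, utf8_lead_length (0xff) is 0, so a string
containing that byte cannot be converted in a UTF-8 locale.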
Thanks,
Florian