Date: Wed, 12 Jul 2023 22:22:10 +0200
From: наб
To: Bruno Haible
Cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v16] POSIX locale covers every byte [BZ# 29511]
Message-ID: <6row5r5lyev35lddwxlukyopx5j5faxaqlw2audnolca55zrhc@fvrom2u2wmaj>
In-Reply-To: <4881032.NnENhoQgcM@nimes>

Hi!

On Wed, Jul 12, 2023 at 09:44:26PM +0200, Bruno Haible wrote:
> Regarding the mapping of the bytes 0x80..0xFF:
> > By strategically picking c= we land at the same point of the
> > Unicode Low Surrogate Area at DC00-DCFF, described as
> > > Isolated surrogate code points have no interpretation;
> > > consequently, no character code charts or names lists
> > > are provided for this range.
> > as the Python UTF-8 errors=surrogateescape encoding.
> musl libc maps the bytes 0x80..0xFF to U+DF80..U+DFFF. [1][2]
>
> I think it is more useful to avoid an inconsistency between glibc and
> musl libc, than to be consistent with what a particular user-space
> program (Python) does.
>
> How about mapping the bytes 0x80..0xFF to U+DF80..U+DFFF, like musl libc
> does?

That's what I had done originally (and citing the same exact reasons!),
but changed it in v10
  https://sourceware.org/pipermail/libc-alpha/2023-April/147652.html
because Florian likes it better.

He forwarded it to musl@
  https://www.openwall.com/lists/musl/2022/11/10/1
in v6
  https://sourceware.org/pipermail/libc-alpha/2022-December/143690.html

In short: python uses the DCxx range, musl put it at DFxx for no
particular reason but decided to not move it because that would imply
some sort of stability or semantic meaning.

I personally like DFxx more but don't really care,
so Reviewer's Privilege of Mostly-Arbitrary Design Choice.

I can change it back for the same reason, but I'd rather do it, uh. Once.
Don't wanna be ping-ponging this.
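
For illustration, the two candidate mappings discussed in this message differ only in their base offset. Below is a minimal Python sketch, assuming each byte b in 0x80..0xFF maps to base + b. The names map_high_byte, PYTHON_BASE and MUSL_BASE are made up for this example and appear in neither glibc nor musl; the musl base is taken from the range quoted above, not from musl's source.

```python
# Hypothetical constants for this sketch only.
PYTHON_BASE = 0xDC00  # Python surrogateescape: byte b -> U+DC00 + b
MUSL_BASE = 0xDF00    # musl range as quoted above: byte b -> U+DF00 + b


def map_high_byte(byte: int, base: int) -> int:
    """Map a byte in 0x80..0xFF to the code point base + byte."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("only bytes 0x80..0xFF are remapped")
    return base + byte


def python_surrogateescape(byte: int) -> int:
    """Ask Python itself where surrogateescape puts an undecodable byte."""
    # The ascii codec rejects every byte >= 0x80, so the error handler
    # is invoked for each high byte individually.
    return ord(bytes([byte]).decode("ascii", errors="surrogateescape"))


def is_low_surrogate(code_point: int) -> bool:
    """True if code_point lies in the Low Surrogate Area U+DC00..U+DFFF."""
    return 0xDC00 <= code_point <= 0xDFFF


if __name__ == "__main__":
    for b in (0x80, 0xFF):
        print(f"byte 0x{b:02X}: "
              f"DCxx -> U+{map_high_byte(b, PYTHON_BASE):04X}, "
              f"DFxx -> U+{map_high_byte(b, MUSL_BASE):04X}")
```

Both choices stay inside the Low Surrogate Area, so either way the mapped values are isolated surrogates with no assigned interpretation; only the DCxx variant round-trips through Python's surrogateescape handler unchanged.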