Mailing-List: contact libc-locales-help@sourceware.org; run by ezmlm
Sender: libc-locales-owner@sourceware.org
Date: Thu, 27 Jun 2019 13:25:00 -0000
From: "Diego (Egor) Kobylkin"
Reply-To: "Diego (Egor) Kobylkin"
To: Rafal Luzynski
Cc: Carlos O'Donell, Marko Myllynen, "libc-alpha@sourceware.org", "libc-locales@sourceware.org", Siddhesh Poyarekar, Mike Fabian
Subject: [PING^9][PATCH v12] Locales: Cyrillic -> ASCII transliteration [BZ #2872]
X-SW-Source: 2019-q2/txt/msg00117.txt.bz2

ping

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, June 17, 2019 10:59 AM, Diego (Egor) Kobylkin wrote:
>
>
> Carlos,
>
> we seem to have a consensus of all involved that the patch can be committed as is.
> Do you see it like this on your side as well or are there any more questions or suggestions?
>
> Bests,
> Egor
>
> P.S. Just a clarification to Rafal points below and thanks @Rafal for the intensive "peer review" so far!
> It definitely looks to me like we finally don't have any more divergent points after all the issues discussed.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Tuesday, June 11, 2019 12:40 AM, Rafal Luzynski digitalfreak@lingonborough.com wrote:
> ...
>
> > 7.06.2019 14:59 "Diego (Egor) Kobylkin" egor@kobylkin.com wrote:
> >
> > > But the target system doesn't support Russian locale and so you must
> > > transliterate the filenames.
> >
> > While talking about the filesystem: I think the problem is not
> > that it does not support Russian locale but that it tries to
> > handle it and fails at this. If the filesystem accepted any
> > byte string as a file name wouldn't it accept a byte string which
> > constructs correct Cyrillic characters in UTF-8, without any
> > transliteration?
>
> Just to clarify here - the need to transliterate is the essential part in this example, not the actual cause of that need.
> A lot of "things" don't support UTF-8 or Cyrillic - filesystems, some UNIX power tools, older network appliances, databases, key-value stores etc. We are talking about a situation where you are forced to transliterate to ASCII. So that requirement is a given.
>
> ...
>
> > > In glibc we don't have any framework for an intelligent conversion.
> > > We would have to write specific code to handle this case and add
> > > it into the translit code for special handling in this case.
> >
> > My suggestion was to add such an intelligent conversion. The rule
> > should be simple: if a letter is followed by a lowercase it should
> > be a titlecase (Sh), otherwise it should be uppercase (SH). But
> > this may break Egor's requirement to keep them always uppercase.
>
> Again for the record my "requirement" is to have a minimal patch committed sooner than later. It turned out surprisingly difficult to keep our focus even on a single flat mapping table that the ASCII transliteration really is.
>
> > > I think we should today leave "Ш"->"SH" and "Сх"->"Sh", since it's
> > > the most conservative position that avoids ambiguity, and then we
> > > can discuss the aesthetics of this and the other impacts and solutions.
> > > I appreciate Rafal's position, but I think being conservative here,
> > > even if it's not as pretty as uconv, is a good guiding idea.
> >
> > Just to summarize: if you want to apply the relaxed rules, more
> > technical than linguistic, then I am more willing to accept these
> > patches.
>
> The great thing is that we seem to have a consensus now and can proceed.
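
For readers following the thread, the two positions discussed above can be compared in a small sketch. This is only an illustration in Python, not the glibc patch and not how glibc's locale translit tables are implemented. The table FLAT_TABLE and the functions translit_flat and translit_titlecase are hypothetical names; only the pairs "Ш"->"SH" and "Сх"->"Sh" come from the thread, and the remaining entries are assumed for the example.

```python
# Hypothetical excerpt of a flat Cyrillic -> ASCII table.
# Only "Ш" -> "SH" and "Сх" -> "Sh" are taken from the thread;
# every other entry is an assumption made for this illustration.
FLAT_TABLE = {
    "Ш": "SH", "ш": "sh",
    "С": "S", "с": "s",
    "Х": "H", "х": "h",
    "У": "U", "у": "u",
    "М": "M", "м": "m",
}


def translit_flat(text):
    """Context-free mapping: each character is replaced on its own."""
    return "".join(FLAT_TABLE.get(ch, ch) for ch in text)


def translit_titlecase(text):
    """Sketch of the suggested rule: an uppercase letter with a
    multi-letter replacement becomes titlecase (Sh) when the next
    character is lowercase, and stays uppercase (SH) otherwise."""
    out = []
    for i, ch in enumerate(text):
        repl = FLAT_TABLE.get(ch, ch)
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if len(repl) > 1 and ch.isupper() and nxt.islower():
            repl = repl.capitalize()
        out.append(repl)
    return "".join(out)


if __name__ == "__main__":
    for word in ("Шум", "ШУМ", "Сх"):
        print(word, translit_flat(word), translit_titlecase(word))
```

With the flat table, "Ш" always yields "SH", so it stays distinct from "Сх" -> "Sh". With the title-case rule, "Ш" before a lowercase letter also yields "Sh", which seems to be the kind of ambiguity the conservative position is meant to avoid.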