From: brian.inglis@systematicsw.ab.ca
Reply-To: newlib@sourceware.org
To: newlib@sourceware.org
Subject: Re: wctomb() accepts out-of-range character in C-locale
Date: Mon, 25 Mar 2024 14:18:52 -0600
Message-ID: <000faa1d-91bf-4d90-9e4e-138c4bf889c0@systematicsw.ab.ca>
References: <7028441.Tto2BC3hUo@nimes> <5DC0BA8B-0B0C-4C91-8F35-C11ACE3E9EF9@kba.biglobe.ne.jp>
In-Reply-To: <5DC0BA8B-0B0C-4C91-8F35-C11ACE3E9EF9@kba.biglobe.ne.jp>
Organization: Systematic Software

On 2024-03-25 08:07, Jun. T wrote:
>
>> 2024/03/25 20:26, Bruno Haible wrote:
>>
>>>> But a wide character >= 0x80 can't be converted into a valid
>>>> character in C-locale (7bit), I think.
>>
>> Err. "C" locale, a.k.a. "POSIX" locale, is not 7-bit but 8-bit.
>> Quoting https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap06.html#tag_06_02 :
>> "The POSIX locale shall contain 256 single-byte characters ..."
>
> I still can't understand why it is useful to convert wide char
> in the range 0x80-0xff to an 8bit char in C-locale (for example
> convert wide char 0xe1 (U+00e1) = á to an 8bit char 0xe1).

Before Unicode, UCS, and the UTF encodings, European Single Byte
Character Sets (SBCS) such as ISO-8859-* were used for Latin-script
languages, including most programming languages. They kept the accented
characters mainly in the high half (0x80-0xff) and supported (most of)
the POSIX character set.

Arabic, Cyrillic, Greek, Hebrew, other Asian and Indian scripts, and
CJK Han-script languages used local SBCS, fuller-featured Double Byte
Character Sets (DBCS), or Multi Byte Character Sets (MBCS). Some of
those supported (parts of) the POSIX character set and used shift
characters to switch to characters encoded using the second and
further bytes.

For more info see https://en.wikipedia.org/wiki/SBCS and linked articles.

> But if you say this is THE correct behavior then it's OK.

POSIX says it, so by definition, it's OK! ;^>

-- 
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry
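
P.S. For anyone who wants to see what their own C library does with the
example from the thread (wide char 0xe1 in the "C" locale), here is a
minimal sketch. The helper name c_locale_wctomb is made up for
illustration and is not part of newlib or any other library. With the
8-bit reading quoted above, the call for 0xe1 should return 1 and store
the byte 0xe1; an implementation that treats the "C" locale as
ASCII-only would return -1 instead, so check rather than assume.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper (not from the thread or from any library):
 * convert one wide character with wctomb() in the "C" locale.
 *
 * Returns whatever wctomb() returned: the number of bytes written on
 * success (1 for a single-byte character), or -1 if the library
 * rejects the character. On success the first output byte is stored
 * in *out; on failure *out is left untouched.
 *
 * Side effect: switches the whole program to the "C" locale.
 */
static int c_locale_wctomb(wchar_t wc, unsigned char *out)
{
    char buf[MB_LEN_MAX];
    int n;

    assert(out != NULL);

    /* "C" is already the startup default; set it explicitly anyway. */
    setlocale(LC_ALL, "C");

    n = wctomb(buf, wc);
    if (n > 0)
        *out = (unsigned char)buf[0];
    return n;
}
```

Calling c_locale_wctomb(L'A', &b) should return 1 with b == 'A' on any
implementation. Calling it with (wchar_t)0xe1 is the case discussed
above, and the result depends on the C library in use.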