From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <6427dfd9-9ccd-c313-9251-75b9de8bc0af@redhat.com>
Date: Mon, 13 Feb 2023 10:53:17 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.2
Subject: Re: [PATCH v5 1/5] libcpp: reject codepoints above 0x10FFFF
To: Ben Boeckel, gcc-patches@gcc.gnu.org
Cc: nathan@acm.org, fortran@gcc.gnu.org, gcc@gcc.gnu.org, brad.king@kitware.com
References: <20230125210636.2960049-1-ben.boeckel@kitware.com> <20230125210636.2960049-2-ben.boeckel@kitware.com>
From: Jason Merrill
In-Reply-To: <20230125210636.2960049-2-ben.boeckel@kitware.com>
Content-Type: multipart/mixed; boundary="------------H3o0qCDNqpFadzkdgEyim0ik"
Content-Language: en-US

This is a multi-part message in MIME format.
--------------H3o0qCDNqpFadzkdgEyim0ik
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 1/25/23 13:06, Ben Boeckel wrote:
> Unicode does not support such values because they are unrepresentable in
> UTF-16.
>
> libcpp/
>
> 	* charset.cc: Reject encodings of codepoints above 0x10FFFF.
> 	UTF-16 does not support such codepoints and therefore all
> 	Unicode rejects such values.

It seems that this causes a bunch of testsuite failures from tests that
expect this limit to be checked elsewhere with a different diagnostic,
so I think the easiest thing is to fold this into _cpp_valid_utf8_str
instead, i.e.:

Make sense?

Jason
--------------H3o0qCDNqpFadzkdgEyim0ik
Content-Type: text/x-patch; charset=UTF-8;
 name="0001-libcpp-add-a-function-to-determine-UTF-8-validity-of.patch"
Content-Disposition: attachment;
 filename*0="0001-libcpp-add-a-function-to-determine-UTF-8-validity-of.pa";
 filename*1="tch"
Content-Transfer-Encoding: 7bit

>From 296e9d1e16533979d12bd98db2937e396a0796f3 Mon Sep 17 00:00:00 2001
From: Ben Boeckel <ben.boeckel@kitware.com>
Date: Sat, 10 Dec 2022 17:20:49 -0500
Subject: [PATCH] libcpp: add a function to determine UTF-8 validity of a C
 string
To: gcc-patches@gcc.gnu.org

This simplifies the interface for other UTF-8 validity detections when a
simple "yes" or "no" answer is sufficient.

libcpp/

	* charset.cc: Add `_cpp_valid_utf8_str` which determines whether
	a C string is valid UTF-8 or not.
	* internal.h: Add prototype for `_cpp_valid_utf8_str`.

Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
---
 libcpp/internal.h |  2 ++
 libcpp/charset.cc | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/libcpp/internal.h b/libcpp/internal.h
index 9724676a8cd..48520901b2d 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -834,6 +834,8 @@ extern bool _cpp_valid_utf8 (cpp_reader *pfile,
 			     struct normalize_state *nst,
 			     cppchar_t *cp);
 
+extern bool _cpp_valid_utf8_str (const char *str);
+
 extern void _cpp_destroy_iconv (cpp_reader *);
 extern unsigned char *_cpp_convert_input (cpp_reader *, const char *,
 					  unsigned char *, size_t, size_t,
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index 3c47d4f868b..42a1b596c06 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1864,6 +1864,30 @@ _cpp_valid_utf8 (cpp_reader *pfile,
   return true;
 }
 
+/*  Detect whether a C-string is a valid UTF-8-encoded set of bytes. Returns
+    `false` if any contained byte sequence encodes an invalid Unicode codepoint
+    or is not a valid UTF-8 sequence. Returns `true` otherwise. */
+
+extern bool
+_cpp_valid_utf8_str (const char *name)
+{
+  const uchar* in = (const uchar*)name;
+  size_t len = strlen (name);
+  cppchar_t cp;
+
+  while (*in)
+    {
+      if (one_utf8_to_cppchar (&in, &len, &cp))
+	return false;
+
+      /* one_utf8_to_cppchar doesn't check this limit.  */
+      if (cp > UCS_LIMIT)
+	return false;
+    }
+
+  return true;
+}
+
 /* Subroutine of convert_hex and convert_oct.  N is the representation
    in the execution character set of a numeric escape; write it into the
    string buffer TBUF and update the end-of-string pointer therein.  WIDE
-- 
2.31.1

--------------H3o0qCDNqpFadzkdgEyim0ik--
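
For readers without a libcpp tree at hand, the approach in the attached patch (decode each sequence, then apply the 0x10FFFF limit in the string validator rather than in the decoder) can be sketched standalone. This is only an illustration, not libcpp code: decode_one is a hypothetical stand-in for one_utf8_to_cppchar, assumed here to accept 4-byte sequences without enforcing the Unicode range, and valid_utf8_str and kUcsLimit are made-up names mirroring _cpp_valid_utf8_str and UCS_LIMIT.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative name; mirrors the role of UCS_LIMIT in the patch.
constexpr std::uint32_t kUcsLimit = 0x10FFFF;

// Hypothetical stand-in for a decoder that does not enforce the Unicode
// range: a 4-byte sequence can yield values up to 0x1FFFFF.  Returns true
// on success, stores the codepoint in *cp and advances *in / shrinks *len.
static bool decode_one(const unsigned char **in, std::size_t *len,
                       std::uint32_t *cp)
{
  const unsigned char *p = *in;
  if (*len == 0)
    return false;

  std::size_t n;        // total bytes in this sequence
  std::uint32_t v;      // value accumulated so far
  std::uint32_t lowest; // smallest value that needs n bytes
  if (p[0] < 0x80)                { n = 1; v = p[0];        lowest = 0; }
  else if ((p[0] & 0xE0) == 0xC0) { n = 2; v = p[0] & 0x1F; lowest = 0x80; }
  else if ((p[0] & 0xF0) == 0xE0) { n = 3; v = p[0] & 0x0F; lowest = 0x800; }
  else if ((p[0] & 0xF8) == 0xF0) { n = 4; v = p[0] & 0x07; lowest = 0x10000; }
  else
    return false;       // stray continuation byte or unsupported lead byte

  if (n > *len)
    return false;       // truncated sequence
  for (std::size_t i = 1; i < n; i++)
    {
      if ((p[i] & 0xC0) != 0x80)
        return false;   // not a continuation byte
      v = (v << 6) | (p[i] & 0x3F);
    }

  // Reject overlong encodings and UTF-16 surrogates, but not large values.
  if (v < lowest || (v >= 0xD800 && v <= 0xDFFF))
    return false;

  *cp = v;
  *in += n;
  *len -= n;
  return true;
}

// The range check lives here, in the yes/no validator, not in the decoder.
bool valid_utf8_str(const char *str)
{
  const unsigned char *in = reinterpret_cast<const unsigned char *>(str);
  std::size_t len = std::strlen(str);
  std::uint32_t cp;

  while (len > 0)
    {
      if (!decode_one(&in, &len, &cp))
        return false;
      if (cp > kUcsLimit)
        return false;
    }
  return true;
}
```

With this sketch, "\xF4\x8F\xBF\xBF" (U+10FFFF) is accepted while "\xF4\x90\x80\x80" (which decodes to 0x110000) is rejected by the limit check alone, so callers of the decoder that report the out-of-range case with their own diagnostic are unaffected.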