Message-ID: <66cdd46f6951420cbbee34117ec8870e3ce3e658.camel@redhat.com>
Subject: Re: [PATCH v2 2/3] libcpp: add a function to determine UTF-8 validity of a C string
From: David Malcolm
To: Ben Boeckel, gcc-patches@gcc.gnu.org
Cc: jason@redhat.com, nathan@acm.org, fortran@gcc.gnu.org, gcc@gcc.gnu.org, brad.king@kitware.com, mliska@suse.cz, anlauf@gmx.de
Date: Fri, 28 Oct 2022 08:59:16 -0400
In-Reply-To: <20221027231645.67623-3-ben.boeckel@kitware.com>
References: <20221027231645.67623-1-ben.boeckel@kitware.com> <20221027231645.67623-3-ben.boeckel@kitware.com>

On Thu, 2022-10-27 at 19:16 -0400, Ben Boeckel wrote:
> This simplifies the interface for other UTF-8 validity detections when a
> simple "yes" or "no" answer is sufficient.
>
> Signed-off-by: Ben Boeckel
> ---
>  libcpp/ChangeLog  |  6 ++++++
>  libcpp/charset.cc | 18 ++++++++++++++++++
>  libcpp/internal.h |  2 ++
>  3 files changed, 26 insertions(+)
>
> diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog
> index 4d707277531..4e2c7900ae2 100644
> --- a/libcpp/ChangeLog
> +++ b/libcpp/ChangeLog
> @@ -1,3 +1,9 @@
> +2022-10-27  Ben Boeckel
> +
> +       * include/charset.cc: Add `_cpp_valid_utf8_str` which determines
> +       whether a C string is valid UTF-8 or not.
> +       * include/internal.h: Add prototype for `_cpp_valid_utf8_str`.
> +
>  2022-10-27  Ben Boeckel
>
>          * include/charset.cc: Reject encodings of codepoints above 0x10FFFF.

The patch looks good to me, with the same potential caveat that you
might need to move the ChangeLog entry from the patch "body" to the
leading blurb, to satisfy:

  ./contrib/gcc-changelog/git_check_commit.py

Thanks
Dave