Date: Mon, 16 Oct 2023 17:02:27 +0200
Subject: Re: [PATCH v3] libcpp: add function to check XID properties
From: Arthur Cohen
To: gcc-patches@gcc.gnu.org
Cc: gcc-rust@gcc.gnu.org, dmalcolm@redhat.com, Raiki Tamura, tom@tromey.com
References: <20230908145908.915341-1-arthur.cohen@embecosm.com>
In-Reply-To: <20230908145908.915341-1-arthur.cohen@embecosm.com>

Ping?

Best,

Arthur

On 9/8/23 16:59, Arthur Cohen wrote:
> From: Raiki Tamura
>
> Fixed to include the enum's name which I had forgotten to commit.
>
> Thanks
>
> ----
>
> This commit adds a new function intended for checking the XID properties
> of a possibly unicode character, as well as the accompanying enum
> describing the possible properties.
>
> libcpp/ChangeLog:
>
>   * charset.cc (cpp_check_xid_property): New.
>   * include/cpplib.h
>   (cpp_check_xid_property): New.
>   (enum cpp_xid_property): New.
>
> Signed-off-by: Raiki Tamura
> ---
>  libcpp/charset.cc       | 36 ++++++++++++++++++++++++++++++++++++
>  libcpp/include/cpplib.h |  7 +++++++
>  2 files changed, 43 insertions(+)
>
> diff --git a/libcpp/charset.cc b/libcpp/charset.cc
> index 7b625c9956a..a92ba75539e 100644
> --- a/libcpp/charset.cc
> +++ b/libcpp/charset.cc
> @@ -1256,6 +1256,42 @@ _cpp_uname2c_uax44_lm2 (const char *name, size_t len, char *canon_name)
>    return result;
>  }
>
> +/* Returns flags representing the XID properties of the given codepoint. */
> +unsigned int
> +cpp_check_xid_property (cppchar_t c)
> +{
> +  // fast path for ASCII
> +  if (c < 0x80)
> +    {
> +      if (('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z'))
> +        return CPP_XID_START | CPP_XID_CONTINUE;
> +      if (('0' <= c && c <= '9') || c == '_')
> +        return CPP_XID_CONTINUE;
> +    }
> +
> +  if (c > UCS_LIMIT)
> +    return 0;
> +
> +  int mn, mx, md;
> +  mn = 0;
> +  mx = ARRAY_SIZE (ucnranges) - 1;
> +  while (mx != mn)
> +    {
> +      md = (mn + mx) / 2;
> +      if (c <= ucnranges[md].end)
> +        mx = md;
> +      else
> +        mn = md + 1;
> +    }
> +
> +  unsigned short flags = ucnranges[mn].flags;
> +
> +  if (flags & CXX23)
> +    return CPP_XID_START | CPP_XID_CONTINUE;
> +  if (flags & NXX23)
> +    return CPP_XID_CONTINUE;
> +  return 0;
> +}
>
>  /* Returns 1 if C is valid in an identifier, 2 if C is valid except at
>     the start of an identifier, and 0 if C is not valid in an
> diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> index fcdaf082b09..583e3071e90 100644
> --- a/libcpp/include/cpplib.h
> +++ b/libcpp/include/cpplib.h
> @@ -1606,4 +1606,11 @@ bool cpp_valid_utf8_p (const char *data, size_t num_bytes);
>  bool cpp_is_combining_char (cppchar_t c);
>  bool cpp_is_printable_char (cppchar_t c);
>
> +enum cpp_xid_property {
> +  CPP_XID_START = 1,
> +  CPP_XID_CONTINUE = 2
> +};
> +
> +unsigned int cpp_check_xid_property (cppchar_t c);
> +
>  #endif /* ! LIBCPP_CPPLIB_H */
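
For reviewers who want to see the lookup in isolation, here is a minimal standalone sketch of the same shape as the quoted function: an ASCII fast path, then a binary search over ranges sorted by their inclusive end. Everything in it is a stand-in and an assumption, not libcpp code: `xid_property`, `xid_range`, `toy_ranges`, `check_xid_property` and `is_xid_identifier` are hypothetical names, and the table is a tiny illustrative one that treats everything above U+00B7 as having no XID property. The real function consults `ucnranges` and its `CXX23`/`NXX23` flags.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the enum added by the patch.
enum xid_property { XID_START = 1, XID_CONTINUE = 2 };

// One entry per range; `end` is the last code point of the range (inclusive).
struct xid_range { char32_t end; unsigned flags; };

// Toy table, sorted by `end`, covering the whole code space so the search
// always lands on a valid entry.  Illustrative only: everything above
// U+00B7 is treated as having no XID property here.
static const xid_range toy_ranges[] = {
  { 0x0000A9, 0 },
  { 0x0000AA, XID_START | XID_CONTINUE },  // U+00AA
  { 0x0000B4, 0 },
  { 0x0000B5, XID_START | XID_CONTINUE },  // U+00B5
  { 0x0000B6, 0 },
  { 0x0000B7, XID_CONTINUE },              // U+00B7
  { 0x10FFFF, 0 },
};

// Same structure as the quoted function, but over the toy table.
unsigned check_xid_property (char32_t c)
{
  // Fast path for ASCII letters, digits and underscore.
  if (c < 0x80)
    {
      if ((U'A' <= c && c <= U'Z') || (U'a' <= c && c <= U'z'))
        return XID_START | XID_CONTINUE;
      if ((U'0' <= c && c <= U'9') || c == U'_')
        return XID_CONTINUE;
    }

  // Values beyond the Unicode code space have no properties.
  if (c > 0x10FFFF)
    return 0;

  // Find the first range whose end is >= c.
  std::size_t mn = 0;
  std::size_t mx = sizeof toy_ranges / sizeof toy_ranges[0] - 1;
  while (mx != mn)
    {
      std::size_t md = (mn + mx) / 2;
      if (c <= toy_ranges[md].end)
        mx = md;
      else
        mn = md + 1;
    }
  return toy_ranges[mn].flags;
}

// Hypothetical caller: XID_Start followed by zero or more XID_Continue.
bool is_xid_identifier (const char32_t *s, std::size_t len)
{
  if (len == 0 || !(check_xid_property (s[0]) & XID_START))
    return false;
  for (std::size_t i = 1; i < len; i++)
    if (!(check_xid_property (s[i]) & XID_CONTINUE))
      return false;
  return true;
}
```

Note that `_` only carries the continue flag, so a caller written like `is_xid_identifier` rejects a leading underscore; a front end that accepts one would presumably have to special-case it before consulting the flags.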