From: Raiki Tamura
Date: Sat, 18 Mar 2023 17:31:44 +0900
Subject: Re: [GSoC] gccrs Unicode support
To: Mark Wielaard
Cc: Thomas Schwinge, Jakub Jelinek, Philip Herron, gcc@gcc.gnu.org, gcc-rust@gcc.gnu.org, David Edelsohn, Arthur Cohen, Arsen Arsenović
References: <87lejxujso.fsf@euler.schwinge.homeip.net>

Thank you everyone for your advice.

Some kinds of names in Rust are restricted to Unicode alphabetic/numeric characters. The table currently defined in libcpp/ucnid.h lacks some rows representing which characters are alphabetic/numeric. That is not a problem, though, because it seems easy to add the missing rows to the table and use it in the Rust frontend.

On Thu, 16 Mar 2023 at 21:59, Mark Wielaard wrote:
> You might want to research whether NFC normalization of identifiers is
> required to be done by the lexer or parser in Rust and how it interacts
> with proc macros.

Yes, NFC normalization must be done by the lexer, which may be complex and hard to implement. libunistring can also be used for normalization, so would it be acceptable to use libunistring only for the normalization step?

Raiki Tamura
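
To illustrate the NFC point above, here is a minimal sketch of what normalizing an identifier in the lexer would achieve. It uses Python's standard `unicodedata` module purely for demonstration; it is not the gccrs lexer, not the libunistring API, and not the libcpp table. The function name `normalize_identifier` is hypothetical.

```python
import unicodedata


def normalize_identifier(name: str) -> str:
    """Return the NFC form of an identifier.

    Hypothetical helper: a lexer could apply this step before
    interning or comparing identifier names.
    """
    return unicodedata.normalize("NFC", name)


# Two spellings of the same visible name:
# "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed) ...
decomposed = "cafe\u0301"
# ... and the single precomposed code point U+00E9.
composed = "caf\u00e9"

# Without normalization, the two spellings compare as different names.
same_before = decomposed == composed

# After NFC normalization, both spellings map to the same identifier.
same_after = normalize_identifier(decomposed) == normalize_identifier(composed)

print(same_before, same_after)
```

The sketch only shows why normalization has to happen before identifiers are compared. Whether the step lives in the lexer, and which library provides it, is the open question in this thread.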