Date: Wed, 15 Mar 2023 16:18:18 +0100
From: Jakub Jelinek
To: Philip Herron
Cc: Raiki Tamura, gcc@gcc.gnu.org, gcc-rust@gcc.gnu.org, David Edelsohn, Arthur Cohen
Subject: Re: [GSoC] gccrs Unicode support

On Wed, Mar 15, 2023 at 11:00:19AM +0000, Philip Herron via Gcc wrote:
> Excellent work on getting up to speed on the rust front-end. From my
> perspective I am interested to see what the wider GCC community thinks
> about using https://www.gnu.org/software/libunistring/ library within GCC
> instead of rolling our own, this means it will be another dependency on GCC.
>
> The other option is there is already code in the other front-ends to do
> this so in the worst case it should be possible to extract something out of
> them and possibly make this a shared piece of functionality which we can
> mentor you through.

I don't know what exactly Rust FE needs in this area, but e.g. libcpp
already handles whatever C/C++ need from Unicode support POV and can
handle it without any extra libraries.
So, if we could avoid the extra dependency, it would certainly be
better, unless you really need massive amounts of code from those
libraries. libcpp already e.g.
provides mapping of Unicode character names to code points, determining
which Unicode characters can appear at the start or in the middle of
identifiers, etc.

	Jakub
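
To illustrate why the identifier checks mentioned above need no external library, here is a minimal sketch of the usual technique: a sorted table of code point ranges searched with binary search. It is not libcpp's code or API. The names `ID_START_RANGES`, `ID_CONTINUE_EXTRA_RANGES`, `is_id_start`, `is_id_continue` and `is_identifier` are hypothetical, the ranges are a tiny hand-picked subset rather than real Unicode data, and treating `_` as a valid first character is an assumption made only for this example.

```python
from bisect import bisect_right

# Illustrative subset only (hypothetical tables, NOT libcpp's data).
# A real table would be generated from the Unicode Character Database.
# Each entry is an inclusive (first, last) range, sorted by first.
ID_START_RANGES = [
    (0x41, 0x5A),      # A-Z
    (0x5F, 0x5F),      # underscore (assumption: allowed as first character)
    (0x61, 0x7A),      # a-z
    (0xC0, 0xD6),      # Latin-1 letters
    (0xD8, 0xF6),      # Latin-1 letters
    (0x3B1, 0x3C9),    # Greek small letters alpha..omega
    (0x4E00, 0x9FA5),  # part of the CJK Unified Ideographs block
]

# Characters allowed after the first position in addition to the ones above.
ID_CONTINUE_EXTRA_RANGES = [
    (0x30, 0x39),      # ASCII digits
    (0x300, 0x36F),    # combining diacritical marks
]

# Precomputed lists of range starts, used as binary search keys.
_START_KEYS = [first for first, _ in ID_START_RANGES]
_EXTRA_KEYS = [first for first, _ in ID_CONTINUE_EXTRA_RANGES]


def _in_ranges(ranges, keys, cp):
    """Return True if code point cp falls inside one of the sorted ranges."""
    i = bisect_right(keys, cp) - 1  # last range whose start is <= cp
    return i >= 0 and cp <= ranges[i][1]


def is_id_start(cp):
    """Can code point cp begin an identifier (per the sample table)?"""
    return _in_ranges(ID_START_RANGES, _START_KEYS, cp)


def is_id_continue(cp):
    """Can code point cp appear after the first character of an identifier?"""
    return is_id_start(cp) or _in_ranges(
        ID_CONTINUE_EXTRA_RANGES, _EXTRA_KEYS, cp
    )


def is_identifier(text):
    """Check a whole string against the start/continue rules."""
    if not text:
        return False
    if not is_id_start(ord(text[0])):
        return False
    return all(is_id_continue(ord(ch)) for ch in text[1:])
```

With these sample tables, `is_identifier("foo1")` is accepted and `is_identifier("1foo")` is rejected. The cost of this approach is the table itself, which is generated once from the Unicode data files, so the lookup code stays small.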