From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 87136 invoked by alias); 23 Oct 2019 17:19:31 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 87126 invoked by uid 89); 23 Oct 2019 17:19:31 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 spammy=
X-HELO: gate.crashing.org
Received: from gate.crashing.org (HELO gate.crashing.org) (63.228.1.57)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Wed, 23 Oct 2019 17:19:30 +0000
Received: from gate.crashing.org (localhost.localdomain [127.0.0.1])
 by gate.crashing.org (8.14.1/8.14.1) with ESMTP id x9NHJDcw029916;
 Wed, 23 Oct 2019 12:19:14 -0500
Received: (from segher@localhost)
 by gate.crashing.org (8.14.1/8.14.1/Submit) id x9NHJAXV029913;
 Wed, 23 Oct 2019 12:19:10 -0500
Date: Wed, 23 Oct 2019 17:29:00 -0000
From: Segher Boessenkool
To: Jakub Jelinek
Cc: Alexander Monakov, Eduard-Mihai Burtescu, Ian Lance Taylor, gcc-patches, Ian Lance Taylor
Subject: Re: [PATCH] Refactor rust-demangle to be independent of C++ demangling.
Message-ID: <20191023171910.GP28442@gate.crashing.org>
References: <20191023163726.GO28442@gate.crashing.org> <20191023164614.GG2116@tucnak>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20191023164614.GG2116@tucnak>
User-Agent: Mutt/1.4.2.3i
X-IsSubscribed: yes
X-SW-Source: 2019-10/txt/msg01679.txt.bz2

On Wed, Oct 23, 2019 at 06:46:14PM +0200, Jakub Jelinek wrote:
> On Wed, Oct 23, 2019 at 11:37:26AM -0500, Segher Boessenkool wrote:
> > On Wed, Oct 23, 2019 at 07:22:47PM +0300, Alexander Monakov wrote:
> > > On Wed, 23 Oct 2019, Eduard-Mihai Burtescu wrote:
> > > > @@ -384,6 +384,14 @@ rust_demangle_callback (const char *mangled, int options,
> > > >      return 0;
> > > >    rdm.sym_len--;
> > > >
> > > > +  /* Legacy Rust symbols also always end with a path segment
> > > > +     that encodes a 16 hex digit hash, i.e. '17h[a-f0-9]{16}'.
> > > > +     This early check, before any parse_ident calls, should
> > > > +     quickly filter out most C++ symbols unrelated to Rust. */
> > > > +  if (!(rdm.sym_len > 19
> > > > +        && !strncmp (&rdm.sym[rdm.sym_len - 19], "17h", 3)))
> > >
> > > This can be further optimized by using memcmp in place of strncmp, since from
> > > the length check you know that you won't see the null terminator among the three
> > > chars you're checking.
> > >
> > > The compiler can expand memcmp(buf, "abc", 3) inline as two comparisons against
> > > a 16-bit immediate and an 8-bit immediate. It can't do the same for strncmp.
> >
> > The compiler does not currently do that, but it *could*. Or why not? The
> > compiler is always allowed to load 3 characters here, whether some string
> > has a NUL character earlier or not.
>
> It is valid to call strncmp (mmap(...)+page_size-1, "abc", 3), the reading
> of the string should stop when 0 is seen.

Where does it say that, though? I don't see where it prohibits reading
more characters (up to 3 here), and you can get much better code using that.
I of course know that for e.g. strcmp or strlen we need to be careful of
page crossings; but this is strncmp, which has a size argument saying the
size of the array objects of its arguments!


Segher
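
For reference, the suffix test under discussion can be sketched in C in
both forms. This is only an illustration, not code from the patch: the
function names and the stand-alone (sym, len) signature are hypothetical,
whereas the patch works on rdm.sym and rdm.sym_len inside
rust_demangle_callback. The sketch assumes len was obtained from strlen,
so none of the last 19 bytes is NUL. Under that assumption strncmp and
memcmp give the same answer; the open question in the thread is whether a
compiler may also read all three bytes when it expands strncmp inline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the check in the patch: nonzero if the
   last 19 bytes of sym[0..len) begin with "17h".  */
int
hash_suffix_strncmp (const char *sym, size_t len)
{
  return len > 19 && !strncmp (&sym[len - 19], "17h", 3);
}

/* Same test with memcmp, as suggested in the review.  Assumes that
   sym[0..len) contains no NUL byte, e.g. len == strlen (sym).  */
int
hash_suffix_memcmp (const char *sym, size_t len)
{
  return len > 19 && !memcmp (&sym[len - 19], "17h", 3);
}

/* Hypothetical self-check: asserts that both variants agree on a
   NUL-terminated input and returns the shared result.  */
int
hash_suffix_check (const char *sym)
{
  size_t len = strlen (sym);
  int a = hash_suffix_strncmp (sym, len);
  int b = hash_suffix_memcmp (sym, len);
  assert (a == b);
  return a;
}
```

With these definitions, hash_suffix_check ("3foo17h0123456789abcdef")
returns 1, while the 19-character string "17h0123456789abcdef" returns 0
because the patch requires sym_len > 19, not >= 19.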