From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 4641 invoked by alias); 5 May 2005 18:58:16 -0000 Mailing-List: contact binutils-help@sources.redhat.com; run by ezmlm Precedence: bulk List-Subscribe: List-Archive: List-Post: List-Help: , Sender: binutils-owner@sources.redhat.com Received: (qmail 4460 invoked from network); 5 May 2005 18:58:07 -0000 Received: from unknown (HELO mail.codesourcery.com) (65.74.133.9) by sourceware.org with SMTP; 5 May 2005 18:58:07 -0000 Received: (qmail 28698 invoked from network); 5 May 2005 18:58:06 -0000 Received: from localhost (HELO taltos.codesourcery.com) (zack@127.0.0.1) by mail.codesourcery.com with SMTP; 5 May 2005 18:58:06 -0000 Received: by taltos.codesourcery.com (sSMTP sendmail emulation); Thu, 5 May 2005 11:58:06 -0700 To: "Dave Korn" Cc: "'Nick Clifton'" , "'Alan Modra'" , "'Ben Elliston'" , Subject: Re: PATCH: use hashtab for pseudo op table References: From: Zack Weinberg Date: Thu, 05 May 2005 18:59:00 -0000 In-Reply-To: (Dave Korn's message of "Thu, 5 May 2005 18:54:20 +0100") Message-ID: <87fyx1cxfl.fsf@codesourcery.com> User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-SW-Source: 2005-05/txt/msg00196.txt.bz2 "Dave Korn" writes: >> I'm not just bringing up the indirect function call penalty as a >> hypothetical, though; over in GCC we got measurable speedups by >> switching back to a custom hash table (libcpp/symtab.c) for >> identifier lookups. > > You mentioned this before in the context of iterators, but surely > passing a function pointer into an iterator routine and getting that > function called once per hashtable item is no worse than having an > iterator, calling a function (not through a pointer admittedly) to > advance it once per hashtable item, and then processing the thing > inline? There's still one function call per entry and one block of > code that loops over all entries calling that function..... Um, I don't know how iterators got into it. I'm concerned about the constant-factor cost of individual lookups. The lookup routine in libiberty/hashtab.c does one indirect function call to compute the hash value, plus at one indirect function call to compare the key to each candidate found from the hash value. The lookup routine in gas/hash.c, by contrast, calculates the hash value inline, and makes direct function calls to strncmp. > The complexity, structure, architecture, implementation and > behaviour of a backtracking parser that can make sense of C++ in no > way compares to the simple minded "Every line begins with an > optional label, then has an opcode, then some operands, and might > end with a comment" style parsing of gas! Clearly I should have explained my thinking in detail rather than going for the cheap joke. Let's try this again. I haven't spent much, er, okay, *any* time profiling GAS. I have, however, spent lots of time profiling cc1 and cc1plus. Both of them have parsers which are more complicated than GAS's -- as you say, in the C++ case, absurdly more complicated. And yet, despite that, the lookup function for the identifier hash is *invariably* in the top three on a flat profile. That being the case, I find it very likely that the lookup function for the opcode hash is going to dominate profiles of GAS performance. A crude experiment would seem to confirm this - I created a source file containing 250,000 occurrences of 'mov r0,r1' and compiled it with a profiled ARM gas (from the tree containing my Thumb-2 work). Here's the top five: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls s/call s/call name 11.47 1.13 1.13 85 0.01 0.02 do_scrub_chars 11.17 2.23 1.10 751985 0.00 0.00 hash_lookup 9.14 3.13 0.90 3 0.30 0.30 write_relocs 8.53 3.97 0.84 250003 0.00 0.00 strncpy 7.72 4.73 0.76 1 0.76 7.30 read_a_source_file [The strncpy calls are all from read_a_source_file.] > So, perhaps a slight bit of work on the libiberty hashtab api, to > offer a few interfaces that take a) NUL-terminated strings b) > fixed-size structs and maybe even c) variable sized structs with > parameters specifying an offset and size within the struct where the > sizeof the struct can be found, would make you reconsider? I'd feel better about the interface, but I'd still be concerned about the constant factors. And I want support for not-NUL-terminated strings with length passed in separately, too. zw