From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <binutils-return-40406-listarch-binutils=sources.redhat.com@sources.redhat.com>
Received: (qmail 4641 invoked by alias); 5 May 2005 18:58:16 -0000
Mailing-List: contact binutils-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Subscribe: <mailto:binutils-subscribe@sources.redhat.com>
List-Archive: <http://sources.redhat.com/ml/binutils/>
List-Post: <mailto:binutils@sources.redhat.com>
List-Help: <mailto:binutils-help@sources.redhat.com>, <http://sources.redhat.com/ml/#faqs>
Sender: binutils-owner@sources.redhat.com
Received: (qmail 4460 invoked from network); 5 May 2005 18:58:07 -0000
Received: from unknown (HELO mail.codesourcery.com) (65.74.133.9)
  by sourceware.org with SMTP; 5 May 2005 18:58:07 -0000
Received: (qmail 28698 invoked from network); 5 May 2005 18:58:06 -0000
Received: from localhost (HELO taltos.codesourcery.com) (zack@127.0.0.1)
  by mail.codesourcery.com with SMTP; 5 May 2005 18:58:06 -0000
Received: by taltos.codesourcery.com (sSMTP sendmail emulation); Thu,  5 May 2005 11:58:06 -0700
To: "Dave Korn" <dave.korn@artimi.com>
Cc: "'Nick Clifton'" <nickc@redhat.com>,
	  "'Alan Modra'" <amodra@bigpond.net.au>,
	  "'Ben Elliston'" <bje@au1.ibm.com>,  <binutils@sources.redhat.com>
Subject: Re: PATCH: use hashtab for pseudo op table
References: <SERRANOIA3EEmYDKFJ200000245@SERRANO.CAM.ARTIMI.COM>
From: Zack Weinberg <zack@codesourcery.com>
Date: Thu, 05 May 2005 18:59:00 -0000
In-Reply-To: <SERRANOIA3EEmYDKFJ200000245@SERRANO.CAM.ARTIMI.COM> (Dave Korn's message of "Thu, 5 May 2005 18:54:20 +0100")
Message-ID: <87fyx1cxfl.fsf@codesourcery.com>
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-SW-Source: 2005-05/txt/msg00196.txt.bz2

"Dave Korn" <dave.korn@artimi.com> writes:

>> I'm not just bringing up the indirect function call penalty as a
>> hypothetical, though; over in GCC we got measurable speedups by
>> switching back to a custom hash table (libcpp/symtab.c) for
>> identifier lookups.
>
> You mentioned this before in the context of iterators, but surely
> passing a function pointer into an iterator routine and getting that
> function called once per hashtable item is no worse than having an
> iterator, calling a function (not through a pointer admittedly) to
> advance it once per hashtable item, and then processing the thing
> inline?  There's still one function call per entry and one block of
> code that loops over all entries calling that function.....

Um, I don't know how iterators got into it.  I'm concerned about the
constant-factor cost of individual lookups.  The lookup routine in
libiberty/hashtab.c does one indirect function call to compute the
hash value, plus at one indirect function call to compare the key to
each candidate found from the hash value.  The lookup routine in
gas/hash.c, by contrast, calculates the hash value inline, and makes
direct function calls to strncmp.

> The complexity, structure, architecture, implementation and
> behaviour of a backtracking parser that can make sense of C++ in no
> way compares to the simple minded "Every line begins with an
> optional label, then has an opcode, then some operands, and might
> end with a comment" style parsing of gas!

Clearly I should have explained my thinking in detail rather than
going for the cheap joke.  Let's try this again.

I haven't spent much, er, okay, *any* time profiling GAS.  I have,
however, spent lots of time profiling cc1 and cc1plus.  Both of them
have parsers which are more complicated than GAS's -- as you say, in
the C++ case, absurdly more complicated.  And yet, despite that, the
lookup function for the identifier hash is *invariably* in the top
three on a flat profile.  That being the case, I find it very likely
that the lookup function for the opcode hash is going to dominate
profiles of GAS performance.

A crude experiment would seem to confirm this - I created a source
file containing 250,000 occurrences of 'mov r0,r1' and compiled it
with a profiled ARM gas (from the tree containing my Thumb-2 work).
Here's the top five:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 11.47      1.13     1.13       85     0.01     0.02  do_scrub_chars
 11.17      2.23     1.10   751985     0.00     0.00  hash_lookup
  9.14      3.13     0.90        3     0.30     0.30  write_relocs
  8.53      3.97     0.84   250003     0.00     0.00  strncpy
  7.72      4.73     0.76        1     0.76     7.30  read_a_source_file

[The strncpy calls are all from read_a_source_file.]

> So, perhaps a slight bit of work on the libiberty hashtab api, to
> offer a few interfaces that take a) NUL-terminated strings b)
> fixed-size structs and maybe even c) variable sized structs with
> parameters specifying an offset and size within the struct where the
> sizeof the struct can be found, would make you reconsider?

I'd feel better about the interface, but I'd still be concerned about
the constant factors.  And I want support for not-NUL-terminated
strings with length passed in separately, too.

zw