public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
From: Robert Dewar <dewar@adacore.com>
To: Dave Korn <dave.korn.cygwin@googlemail.com>
Cc: Jae Hyuk Kwak <wrice127@gmail.com>, gcc@gnu.org
Subject: Re: Hash Function for "switch statement"
Date: Fri, 19 Mar 2010 23:28:00 -0000	[thread overview]
Message-ID: <4BA405D8.8020309@adacore.com> (raw)
In-Reply-To: <4BA3FF48.5070703@gmail.com>

Dave Korn wrote:
> On 19/03/2010 22:07, Robert Dewar wrote:
> 
>> You miss my point, doing a mod with 256 is an AWFUL hash algorithm
>> since it only uses the low order 8 bits! 
> 
>   This statement is only true contingent on a number of significant
> assumptions you haven't stated - assumptions which can easily be violated.

I did make the mistake of assuming that since JHK talked about 
efficiency of the mod operation he was assuming mod as part of
the hashing algorithm, rather than just a way of narrowing the
range of the result of an otherwise already pseudo-randomized hash.
> 
>> I think you will find that people on this mailing list know all about
>> hash tables 
> 
>   I think you should get down from that high horse before you come down with
> an embarrassing bump.
> 
>> So this does not get around the possibility of a bad luck worst case.
> 
>   Perfect hashing does exactly that.  That's why it's "popular for hashing
> keywords for compilers", and indeed why it "ought to be popular for optimizing
> switch statements":

I still doubt that it is worth while in practice, it will be interesting 
to see whether actual measurements show otherwise. The experience with
the BCPL compiler was that in practice though clever, this approach was
not actually helpful in real programs.

I can't see perfect hashing EVER being useful for hashing keywords for
compilers, since in practice given good error detection you don't know
whether you are looking for a keyword or an identifier in any context.
So almost always you just incorporate the hash entries for keywords into
the main identifier hash table. Are there really compilers which work
any other way? Given that you have to treat nicely the case of 
programmers using keywords as identifiers accidentally, and misspelling
keywords accidentally:

>      1. pakage X is
>         |
>         >>> incorrect spelling of keyword "package"
> 
>      2.    type Package is range 1 .. 10;
>                 |
>         >>> reserved word "package" cannot be used as identifier
> 
>      3. end X;

I really don't see any other approach

(hmmm, now I have to file a bug report about the inconsistent
terminology of keyword vs reserved word :-))


> 
> http://burtleburtle.net/bob/hash/perfect.html
> 
>     cheers,
>       DaveK

  reply	other threads:[~2010-03-19 23:17 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-14 20:48 Jae Hyuk Kwak
2010-03-15  3:12 ` Basile Starynkevitch
2010-03-15  8:00   ` Jae Hyuk Kwak
2010-03-16  1:41     ` Jae Hyuk Kwak
2010-03-16  7:12       ` Basile Starynkevitch
2010-03-16  6:00     ` Dave Korn
     [not found] ` <20100317200410.GA13807@hungry-tiger.westford.ibm.com>
2010-03-18  6:39   ` Jae Hyuk Kwak
2010-03-18 11:20     ` Andrew Haley
     [not found]     ` <20100318151753.GA4065@hungry-tiger.westford.ibm.com>
2010-03-19  5:26       ` Jae Hyuk Kwak
2010-03-19 12:26         ` Frank Ch. Eigler
2010-03-19 18:11         ` Jae Hyuk Kwak
     [not found]         ` <20100319165443.GA9993@hungry-tiger.westford.ibm.com>
2010-03-19 21:26           ` Jae Hyuk Kwak
2010-03-19 21:30             ` Robert Dewar
2010-03-19 21:53               ` Jae Hyuk Kwak
2010-03-19 22:18                 ` Robert Dewar
2010-03-19 22:43                   ` Dave Korn
2010-03-19 23:28                     ` Robert Dewar [this message]
2010-03-19 22:57                   ` Jae Hyuk Kwak
2010-03-19 23:33                     ` Robert Dewar
2010-03-19 23:33                     ` Robert Dewar
2010-03-19 22:24               ` Dave Korn
2010-03-19 23:09                 ` Robert Dewar
2010-03-19 23:17                   ` Robert Dewar
2010-03-19 19:14     ` Andrew Haley
2010-03-22 12:44 Unruh, Erwin
2010-03-22 13:22 ` Robert Dewar
2010-03-22 16:29   ` Andrew Haley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BA405D8.8020309@adacore.com \
    --to=dewar@adacore.com \
    --cc=dave.korn.cygwin@googlemail.com \
    --cc=gcc@gnu.org \
    --cc=wrice127@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).