From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12031 invoked by alias); 5 Jul 2005 19:58:09 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 11875 invoked by uid 22791); 5 Jul 2005 19:57:56 -0000 Received: from us02smtp1.synopsys.com (HELO vaxjo.synopsys.com) (198.182.60.75) by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Tue, 05 Jul 2005 19:57:56 +0000 Received: from crone.synopsys.com (crone.synopsys.com [146.225.7.23]) by vaxjo.synopsys.com (Postfix) with ESMTP id 9D655DA89; Tue, 5 Jul 2005 12:57:54 -0700 (PDT) Received: from piper.synopsys.com (localhost [127.0.0.1]) by crone.synopsys.com (8.9.1/8.9.1) with ESMTP id MAA26867; Tue, 5 Jul 2005 12:57:54 -0700 (PDT) Received: from piper.synopsys.com (localhost [127.0.0.1]) by piper.synopsys.com (8.12.10/8.12.3) with ESMTP id j65JvsOe002648; Tue, 5 Jul 2005 12:57:54 -0700 Received: (from jbuck@localhost) by piper.synopsys.com (8.12.10/8.12.10/Submit) id j65JvrPq002646; Tue, 5 Jul 2005 12:57:53 -0700 Date: Tue, 05 Jul 2005 19:58:00 -0000 From: Joe Buck To: Paolo Carlini Cc: Michael Veksler , gcc@gcc.gnu.org Subject: Re: tr1::unordered_set bizarre rounding behavior (x86) Message-ID: <20050705195753.GA2603@synopsys.com> References: <42CABDCC.3070508@suse.de> <20050705180759.GA2315@synopsys.com> <42CACDD5.1070002@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <42CACDD5.1070002@suse.de> User-Agent: Mutt/1.4.1i X-SW-Source: 2005-07/txt/msg00191.txt.bz2 On Tue, Jul 05, 2005 at 08:13:41PM +0200, Paolo Carlini wrote: > Hi Joe and Gaby and thanks for your messages, > > >Close, but not quite. > > > >Hash functions are, by nature, many-to-one. A good hash function has > >few collisions for values that frequently appear; the program will preform > >very poorly if many inputs hash to the same value. The existing function > >will make all values over max(size_t) collide (assuming the cast clips). > >Using only the mantissa is better, but if doubles are used to > >represent fixed-point, then common values like 0.25, 0.5, 1, 2, 4 might > >all be in the hash table, and they will collide. > > > >You could do frexp, scale the mantissa to form an integer (e.g. multiply > >by a large integer), then add it (modulo 2**n) to some prime number > >multiplied by the exponent. This should give good spreading for both > >large values and exactly representable values. > > > Indeed, I can find around also more sophisticated solutions similar to > your proposal, perhaps in the python library?!? I'm tempted to go along > this way, but Gaby's idea also seems nice, opinions about it? We have to > make a choice... or we can provide both and the user chooses... still we > have to select a default ;) If I were you I'd be tempted to crib from Python. Because of the centrality of good hashing for Python performance, I'm sure that they've done a good job.