From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 7352 invoked by alias); 5 Jul 2005 18:12:59 -0000 Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Archive: List-Post: List-Help: Sender: gcc-owner@gcc.gnu.org Received: (qmail 7344 invoked by uid 22791); 5 Jul 2005 18:12:56 -0000 Received: from ns.suse.de (HELO mx1.suse.de) (195.135.220.2) by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Tue, 05 Jul 2005 18:12:56 +0000 Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id D0FA6EDBE; Tue, 5 Jul 2005 20:12:53 +0200 (CEST) Message-ID: <42CACDD5.1070002@suse.de> Date: Tue, 05 Jul 2005 18:12:00 -0000 From: Paolo Carlini User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050414 MIME-Version: 1.0 To: Joe Buck Cc: Michael Veksler , gcc@gcc.gnu.org Subject: Re: tr1::unordered_set bizarre rounding behavior (x86) References: <42CABDCC.3070508@suse.de> <20050705180759.GA2315@synopsys.com> In-Reply-To: <20050705180759.GA2315@synopsys.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-SW-Source: 2005-07/txt/msg00178.txt.bz2 Hi Joe and Gaby and thanks for your messages, >Close, but not quite. > >Hash functions are, by nature, many-to-one. A good hash function has >few collisions for values that frequently appear; the program will preform >very poorly if many inputs hash to the same value. The existing function >will make all values over max(size_t) collide (assuming the cast clips). >Using only the mantissa is better, but if doubles are used to >represent fixed-point, then common values like 0.25, 0.5, 1, 2, 4 might >all be in the hash table, and they will collide. > >You could do frexp, scale the mantissa to form an integer (e.g. multiply >by a large integer), then add it (modulo 2**n) to some prime number >multiplied by the exponent. This should give good spreading for both >large values and exactly representable values. > Indeed, I can find around also more sophisticated solutions similar to your proposal, perhaps in the python library?!? I'm tempted to go along this way, but Gaby's idea also seems nice, opinions about it? We have to make a choice... or we can provide both and the user chooses... still we have to select a default ;) Thanks