From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-116825-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 25124 invoked by alias); 5 Jul 2005 19:24:20 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 25080 invoked by uid 22791); 5 Jul 2005 19:24:16 -0000
Received: from smtp-102-tuesday.noc.nerim.net (HELO mallaury.nerim.net) (62.4.17.102)
    by sourceware.org (qpsmtpd/0.30-dev) with ESMTP; Tue, 05 Jul 2005 19:24:16 +0000
Received: from uniton.integrable-solutions.net (gdr.net1.nerim.net [62.212.99.186])
	by mallaury.nerim.net (Postfix) with ESMTP id 7CC064F3AA;
	Tue,  5 Jul 2005 21:24:02 +0200 (CEST)
Received: from uniton.integrable-solutions.net (localhost [127.0.0.1])
	by uniton.integrable-solutions.net (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id j65JN6KY007529;
	Tue, 5 Jul 2005 21:23:06 +0200
Received: (from gdr@localhost)
	by uniton.integrable-solutions.net (8.12.10/8.12.10/Submit) id j65JN6wD007528;
	Tue, 5 Jul 2005 21:23:06 +0200
To: Michael Veksler <VEKSLER@il.ibm.com>
Cc: Joe Buck <Joe.Buck@synopsys.COM>, gcc@gcc.gnu.org
Subject: Re: tr1::unordered_set<double> bizarre rounding behavior (x86)
References: <OFC50D7347.1C4339A9-ONC2257035.0066952B-C2257035.0068790B@il.ibm.com>
From: Gabriel Dos Reis <gdr@integrable-solutions.net>
In-Reply-To: <OFC50D7347.1C4339A9-ONC2257035.0066952B-C2257035.0068790B@il.ibm.com>
Date: Tue, 05 Jul 2005 19:24:00 -0000
Message-ID: <m3k6k5oy5h.fsf@uniton.integrable-solutions.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-SW-Source: 2005-07/txt/msg00189.txt.bz2

Michael Veksler <VEKSLER@il.ibm.com> writes:

| Joe Buck <Joe.Buck@synopsys.COM> wrote on 05/07/2005 21:10:25:
| 
| > On Tue, Jul 05, 2005 at 08:05:39PM +0200, Gabriel Dos Reis wrote:
| > > It is definitely a good thing to use the full bits of value
| > > representation if we ever want to make all "interesting" bits part of
| > > the hash value.  For reasonable or sane representations it suffices to
| > > get your hand on the object representation, e.g.:
| > >
| > >    const int objsize = sizeof (double);
| > >    typedef unsigned char objrep_t[objsize];
| > >    double x = ....;
| > >    objrep_t& p = reinterpret_cast<objrep_t&>(x);
| > >    // ...
| > >
| > > and leave frexp and friends only for less obvious value representations.
| >
| > I disagree; on an ILP32 machine, we pull out only 32 bits for the hash
| > value, and if you aren't careful, your approach will wind up using the
| > least significant bits of the mantissa.  This will cause all values that
| > are exactly representable as floats to collide.
| 
| For that you can do something like (or templated equivalent):
| namespace Impl
| {
|  template <class T>
|  size_t floating_point_hash(T in)
|  {
|    if (sizeof(in) <= sizeof(size_t))
|     Use Gaby's solution, with zero padding;
|    else
|     frexp and friends using Joe Buck's ideas;
|   }
| }
| 
| Gaby's solution should be done with care - to avoid any
| aliasing issues (never go directly from double& to size_t&).

The standard explicitly permits you to regard any object as an
array of unsigned char.  Given that, and given no padding bits
(i.e. the "sane" representation assumption), hashing any object larger
than a size_t is no different from hashing a character string.
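
A minimal sketch of that idea, assuming a representation with no
padding bits.  The name hash_object_bytes is hypothetical, and the
FNV-1a mixing step is only one possible choice of string hash, not
something proposed in this thread.  Every byte feeds into the result,
so no part of the mantissa is simply dropped when the hash is narrowed
to a 32-bit size_t:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: hash the object representation of `value`
// exactly as one would hash a character string of sizeof(T) bytes.
// Assumption: T is trivially copyable and has no padding bits.
template <class T>
std::size_t hash_object_bytes(const T& value)
{
    // Copy the object into an unsigned char array; this sidesteps
    // any aliasing concern about casting between unrelated types.
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));

    // 64-bit FNV-1a (an illustrative choice of mixing function).
    std::uint64_t h = 14695981039346656037ULL;   // FNV offset basis
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        h ^= bytes[i];
        h *= 1099511628211ULL;                   // FNV prime
    }

    // Narrowing keeps the low bits of the mixed value, which depend
    // on all input bytes.
    return static_cast<std::size_t>(h);
}
```

For example, hash_object_bytes(1.0) and hash_object_bytes(1.5) start
from different byte sequences even though both values are exactly
representable as floats.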

Now, the question is how to make sure we do not have padding bits.  For
most targets, that assumption is OK; only the one subject of this
discussion seems to pose problems ;-) 

| Both Gaby's and Joe Buck's solutions do not take
| the strangeness of IEEE (NNN?) into account.
| As I remember it (I don't have the reference at home),
| IEEE FP has many bit-representations for NaN, each
| containing some bit-encoding of errors.

My proposal explicitly takes that into account in the sense that it
looks at all bits of the value representation, therefore the encoding
bits of the NaNs too.
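
To illustrate what "looking at all bits" means for NaNs, here is a
small sketch.  It assumes an IEEE 754 binary64 double whose byte order
matches that of a 64-bit integer; the helper names double_from_bits
and same_object_bytes are hypothetical:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption, checked at compile time: double is IEEE 754 binary64.
static_assert(std::numeric_limits<double>::is_iec559 &&
              sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes IEEE 754 binary64 double");

// Hypothetical helper: build a double from a raw 64-bit pattern.
inline double double_from_bits(std::uint64_t bits)
{
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Hypothetical helper: compare two doubles by object representation,
// byte by byte, rather than by operator==.
inline bool same_object_bytes(const double& a, const double& b)
{
    return std::memcmp(&a, &b, sizeof(double)) == 0;
}
```

With these, double_from_bits(0x7FF8000000000001ULL) and
double_from_bits(0x7FF8000000000002ULL) are both NaNs, yet
same_object_bytes reports them as different, so a byte-wise hash sees
their encoding bits.  Note also that 0.0 and -0.0 compare equal with
operator== while their object representations differ.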


| "There *should* be a specialization for equal_to<double> that
| provides a strict weak ordering for NaNs as well as other
| values." [quoted forwarded mail from P.J. Plauger]
| Doing bit-wise conversions will not address this requirement.

I do not understand what you mean by that.

-- Gaby