From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 58688 invoked by alias); 10 Jan 2020 22:22:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 58671 invoked by uid 89); 10 Jan 2020 22:22:04 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-5.2 required=5.0 tests=AWL,BAYES_00,KAM_SHORT,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 spammy= X-HELO: us-smtp-1.mimecast.com Received: from us-smtp-delivery-1.mimecast.com (HELO us-smtp-1.mimecast.com) (207.211.31.120) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 10 Jan 2020 22:22:02 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1578694921; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=30zmPi4zbSPKMMCEwgij6aKaOQaZZVs80QYsRsaaShw=; b=IlvUtF7VljsWw0Q0ml+43+2/rQaf7oQnelFWgq+4sObuOyW0Mr+r+I8w8cOzJnQ1+acFHP mhWwl7FaRX4zJWkTn4q3OO+Ffz/oIiovSI7UJw35mgKCuPYxDX55b1f3JRmst/HgwBi76K KAZD4xiHUsXjEZHWrRvV2mGJ1RMtB2Y= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-11-jx5IhO08Phql74tcIFUCpw-1; Fri, 10 Jan 2020 17:21:57 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A031A1800D78; Fri, 10 Jan 2020 22:21:56 +0000 (UTC) Received: from localhost (unknown [10.33.36.12]) by smtp.corp.redhat.com (Postfix) with ESMTP id 15EA55C54A; Fri, 10 Jan 2020 22:21:55 +0000 (UTC) Date: Fri, 10 Jan 2020 22:31:00 -0000 From: Jonathan Wakely To: =?iso-8859-1?Q?Fran=E7ois?= Dumont Cc: "libstdc++@gcc.gnu.org" , gcc-patches Subject: Re: [PATCH] libstdc++/91223 Improve unordered containers == operator Message-ID: <20200110222155.GG490107@redhat.com> References: <25edbd87-809b-9c75-48a0-294852ea40f2@gmail.com> <20200110220128.GF490107@redhat.com> MIME-Version: 1.0 In-Reply-To: <20200110220128.GF490107@redhat.com> X-Clacks-Overhead: GNU Terry Pratchett X-Mimecast-Spam-Score: 0 Content-Type: text/plain; charset=iso-8859-1; format=flowed Content-Transfer-Encoding: quoted-printable Content-Disposition: inline X-SW-Source: 2020-01/txt/msg00645.txt.bz2 On 10/01/20 22:01 +0000, Jonathan Wakely wrote: >On 10/01/20 18:54 +0100, Fran=E7ois Dumont wrote: >>Hi >> >>=A0=A0=A0 Here is my attempt to improve =3D=3D operator. >> >>=A0=A0=A0 There is a small optimization for the std::unordered_mutiXXX=20 >>containers but the main enhancement rely on some partial template=20 >>specialization of the _Equality type. I limit it to usage of=20 >>unordered containers with std::equal_to to be sure that the=20 >>container _Equal functor is like the key type =3D=3D. > >I think we can assume that for any _Equal, not just std::equal_to: >http://eel.is/c++draft/unord.req#12.sentence-5 > >However ... > >>=A0=A0=A0 Do I need to also consider user partial template specialization= =20 >>of std::equal_to ? It is a well know bad practice so I hope the=20 >>Standard says that such a partial specialization leads to undefined=20 >>behavior. > >It's certainly not undefined to specialize equal_to, and I'm not sure >how to make your optimisations valid in that case. > >Consider: > >struct X >{ > int i; > int rounded() const { return i - (i % 10); } > bool operator=3D=3D(X x) const { return i =3D=3D x.i; } >}; > >template<> struct std::equal_to >{ > bool operator()(X l, X r) const > { return l.rounded() =3D=3D r.rounded(); } >}; > >template<> std::hash >{ > bool operator()(X x) const { return hash()(x.rounded()); } >}; > >std::unordered_multiset u{ X{10}, X{11}, X{12} }; >std::unordered_multiset v{ X{15}, X{16}, X{17} }; >bool b1 =3D u =3D=3D v; >bool b2 =3D std::is_permutation(u.begin(), u.end(), v.begin()); >assert(b1 =3D=3D b2); > >I think the last new specialization in your patch would be used for >this case, and because __x_count =3D=3D v.count(*u.begin()) it will say >they're equal. But the is_permutation call says they're not. > >So I think the assertion would fail, but the standard says it should >pass. Am I mistaken? I believe the optimization would still be valid if you do not use __other.count(*__itx) to check for equivalent keys in the other container. Your patch does: + if (__x_count !=3D __other.count(*__itx)) + return false; This uses the _Equal predicate to count the equivalent elements in __other. Instead you need to use operator=3D=3D to count the **equal** elements. I think there's a similar problem in the _Equality specialization for unordered_map (i.e. key-value pairs, unique keys): + const auto __ity =3D __other.find(__itx->first); + if (__ity =3D=3D __other.end() || !bool(__ity->second =3D=3D __itx->sec= ond)) + return false; The call to __other.find(__itx->first) will return an element with equivalent key, but that's not guaranteed to be equal. I think you could fix this either by still using =3D=3D to compare the keys after __other.find(*__itx) returns an element (which doesn't fix the PR91263 bug) or by replacing find with a similar operation that looks up the hash code and then uses =3D=3D to test for equality (instead of using _Equal pred to test for equivalent keys). Basically, you can't use functions like find and count that rely on equivalence of keys, you need to use handwritten lookups using =3D=3D. And if you do that, then it doesn't matter whether _Equal is a specialization of std::equal_to or not, and it doesn't matter whether the user has defined their own specialization of std::equal_to. You can do the optimizations for any _Equal, because you won't actually be using it to test for equality. Does that make sense?