public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Jonathan Wakely <jwakely@redhat.com>
To: "François Dumont" <frs.dumont@gmail.com>
Cc: "libstdc++@gcc.gnu.org" <libstdc++@gcc.gnu.org>,
	gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: Hashtable Small size optimization
Date: Fri, 31 May 2019 11:56:00 -0000	[thread overview]
Message-ID: <20190531114431.GY2599@redhat.com> (raw)
In-Reply-To: <efa88a3a-efc3-aade-85d6-697a7228d66c@gmail.com>

Re https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00903.html


On 15/10/18 22:46 +0200, François Dumont wrote:
>I started considering PR libstdc++/68303.
>
>First thing was to write a dedicated performance test case, it is the 
>unordered_small_size.cc I'd like to add with this patch.

Great, more performance tests are always good.

>The first runs show a major difference between tr1 and std 
>implementations, tr1 being much better:
>
>std::tr1::unordered_set<int> without hash code cached: 1st insert    
>   9r    9u    1s  14725920mem    0pf
>std::tr1::unordered_set<int> with hash code cached: 1st insert       
>8r    9u    0s  14719680mem    0pf
>std::unordered_set<int> without hash code cached: 1st insert      
>17r   17u    0s  16640080mem    0pf
>std::unordered_set<int> with hash code cached: 1st insert      14r   
>14u    0s  16638656mem    0pf
>
>I had a look in gdb to find out why and the answer was quite obvious. 
>For 20 insertions tr1 implementation bucket count goes through [11, 
>23] whereas for std it is [2, 5, 11, 23], so 2 more expensive rehash.
>
>As unordered containers are dedicated to rather important number of 
>elements I propose to review the rehash policy with this patch so that 
>std also starts at 11 on the 1st insertion. After the patch figures 
>are:
>
>std::tr1::unordered_set<int> without hash code cached: 1st insert    
>   9r    9u    0s  14725920mem    0pf
>std::tr1::unordered_set<int> with hash code cached: 1st insert       
>8r    8u    0s  14719680mem    0pf
>std::unordered_set<int> without hash code cached: 1st insert      
>15r   15u    0s  16640128mem    0pf
>std::unordered_set<int> with hash code cached: 1st insert      12r   
>12u    0s  16638688mem    0pf

OK, that seems worthwhile then.

>Moreover I noticed that performance tests are built with -O2, is that 
>intentional ? The std implementation uses more abstractions than tr1, 
>looks like building with -O3 optimizes away most of those abstractions 
>making tr1 and std implementation much closer:

Yes, I think it's intentional that -O2 is used, because I think
that's the most commonly-used optimisation level. We don't want to 

>std::tr1::unordered_set<int> without hash code cached: 1st insert    
>   2r    1u    1s  14725952mem    0pf
>std::tr1::unordered_set<int> with hash code cached: 1st insert       
>2r    1u    0s  14719536mem    0pf
>std::unordered_set<int> without hash code cached: 1st insert       
>2r    2u    0s  16640064mem    0pf
>std::unordered_set<int> with hash code cached: 1st insert       2r    
>2u    0s  16638608mem    0pf

Hmm, interesting. I wonder if we can determine what is not being
optimized at -O2, and either tweak the code or improve the compiler.

>Note that this patch also rework the alternative rehash policy based 
>on powers of 2 so that it also starts with a larger number of bucket 
>(16) and respects LWG2156.
>
>Last I had to wider the memory column so that alignment is preserved 
>even when memory diff is negative.
>
>Tested under Linux x86_64.
>
>Ok to commit ?

Does this patch still apply cleanly? I think it would be good to
commit it now, early in stage 1.


  reply	other threads:[~2019-05-31 11:44 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-15 22:17 François Dumont
2019-05-31 11:56 ` Jonathan Wakely [this message]
2019-06-03 17:18   ` François Dumont
2019-06-03 22:25     ` Jonathan Wakely

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190531114431.GY2599@redhat.com \
    --to=jwakely@redhat.com \
    --cc=frs.dumont@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=libstdc++@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).