From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 82444 invoked by alias); 10 May 2018 17:24:54 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 82345 invoked by uid 89); 10 May 2018 17:24:39 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=weighted, commonly X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 10 May 2018 17:24:38 +0000 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5DD6830C0953; Thu, 10 May 2018 17:24:25 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.36.118.110]) by smtp.corp.redhat.com (Postfix) with ESMTPS id E589E5FCAF; Thu, 10 May 2018 17:24:24 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id w4AHOM2Y026438; Thu, 10 May 2018 19:24:23 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id w4AHOKQ9026437; Thu, 10 May 2018 19:24:20 +0200 Date: Thu, 10 May 2018 17:35:00 -0000 From: Jakub Jelinek To: Alexander Monakov Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH 0/2] Introduce gcc_qsort Message-ID: <20180510172420.GK8577@tucnak> Reply-To: Jakub Jelinek References: <20180510155641.2950-1-amonakov@ispras.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180510155641.2950-1-amonakov@ispras.ru> User-Agent: Mutt/1.9.2 (2017-12-15) X-IsSubscribed: yes X-SW-Source: 2018-05/txt/msg00485.txt.bz2 On Thu, May 10, 2018 at 06:56:39PM +0300, Alexander Monakov wrote: > Overall the new implementation is roughly 30% faster compared to Glibc qsort, > with 2x or more speedup for cases with tiny element count. I see one instance > where the new approach is significantly (1.5x) slower: it is ipa-icf.c: > sort_congruence_class_groups_by_decl_uid. It sorts a big array (1500 entries) > and needs 14 indirect loads just to reach values to compare, so when branch > prediction manages to guess correctly, it allows to overlap execution of two > comparators and better hide their cache miss latency. > > Overall GCC spends about 0.3% time under qsort, but this doesn't automatically > mean that this brings a 0.1% speed improvement: it may be larger or smaller > depending on how new code affects cache behavior and branch predictors in > other code, and it's not trivial to measure precisely. > > I can go into more detail about measured stats if there's interest :) > > Makefile.in changes are separated to patch 2 in the hope it'd make review > easier, but the two patches will need to be applied together. > > Bootstrapped/regtested on x86-64, OK for trunk? Have you gathered some statistics on the element sizes and how often they appear in qsort calls (perhaps weighted by n*log(n) of the element count) across bootstrap+regtest? glibc uses indirect sorting (sorts pointers to the elements or indexes and just reshuffles at the end) and has special case for the most commonly used small element size (4/8). With C++ templates you could achieve that even without macros, just by instantiating the mergesort and its helpers for the few common cases. Or is that not worth it (as in, we never sort really large (say > 32 bytes) element sizes and the 4 or 8 bytes long element sizes aren't common enough to see benefit by using constant size memcpy for those cases? Jakub