Subject: Re: RFC on a new optimization
To: Gary Oblock, "gcc@gcc.gnu.org"
From: Jeff Law
Message-ID: <8ff1dafa-9f61-8f03-e09f-018c30e87001@redhat.com>
Date: Mon, 01 Jul 2019 22:08:00 -0000

On 7/1/19 3:58 PM, Gary Oblock wrote:
> I've been looking at trying to optimize the performance of code for
> programs that use functions like qsort where a function is passed the
> name of a function and some constant parameter(s).
>
> The function qsort itself is an excellent example of what I want to
> do, except for being in a library, so please ignore that while I
> proceed assuming that qsort is not in a library. In qsort the user
> passes in the size of the array elements and the comparison function
> name in addition to the location of the array to be sorted. I noticed
> that for a given call site the first two are always the same, so why
> not create a specialized version of qsort that eliminates them and
> internally uses a constant value for the size parameter and does a
> direct call instead of an indirect call. The latter lets the
> comparison function code be inlined.
>
> This seems to me to be a very useful optimization where heavy use is
> made of this programming idiom. I saw a 30%+ overall improvement when
> I specialized a function like this by hand in an application.
>
> My question is does anything inside gcc do something similar? I don't
> want to reinvent the wheel and I want to do something that plays
> nicely with the rest of gcc so it makes it into the real world. Note, I
> should mention that I'm an experienced compiler developer and I'm
> planning on adding this optimization unless it's obvious from the
> ensuing discussion that either it's a bad idea or that it's a matter
> of simply tweaking gcc a bit to get this optimization to occur.

Jan is the expert in this space, but yes, GCC has devirtualization and
function specialization. See ipa-devirt.c and ipa-cp.c.

You can use the -fdump-ipa-all-details option to produce debugging dumps
for the IPA passes. That might help guide you a bit.

jeff
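
For readers of the archive, the hand specialization described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions: generic_sort is a hypothetical stand-in for a non-library qsort (a plain insertion sort, limited to elements of at most 64 bytes), and sort_int is the clone one would write by hand for a call site that always passes sizeof(int) and cmp_int. None of these names come from the thread or from GCC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical qsort-style routine: element size and comparison function
   are run-time parameters, so every comparison is an indirect call. */
void generic_sort(void *base, size_t n, size_t size,
                  int (*cmp)(const void *, const void *))
{
    unsigned char *a = base;
    unsigned char tmp[64];              /* sketch assumes size <= 64 */

    for (size_t i = 1; i < n; i++) {
        memcpy(tmp, a + i * size, size);
        size_t j = i;
        while (j > 0 && cmp(a + (j - 1) * size, tmp) > 0) {
            memcpy(a + j * size, a + (j - 1) * size, size);
            j--;
        }
        memcpy(a + j * size, tmp, size);
    }
}

/* Comparison function passed at the call site being specialized. */
int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Hand-written clone for generic_sort(v, n, sizeof(int), cmp_int):
   the size is a compile-time constant and the comparison is inlined,
   so the indirect call and the memcpy traffic disappear. */
void sort_int(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int tmp = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > tmp) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = tmp;
    }
}
```

Whether the existing IPA passes produce an equivalent clone for a particular program is something the -fdump-ipa-all-details dumps would have to confirm; the sketch only shows the shape of the transformation being asked about.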