Date: Wed, 20 May 2015 12:13:00 -0000
From: Michael Matz
To: Richard Henderson
cc: Rich Felker, "H.J. Lu", Jan Hubicka, Alexander Monakov, GCC Patches, Uros Bizjak
Subject: Re: [PATCH i386] Allow sibcalls in no-PLT PIC
In-Reply-To: <555B87F4.30908@redhat.com>

Hi,

On Tue, 19 May 2015, Richard Henderson wrote:

> It is.  The relaxation that HJ is working on requires that the reads
> from the got not be hoisted.  I'm not especially convinced that what
> he's working on is a win.
>
> With LTO, the compiler can do the same job that he's attempting in the
> linker, without an extra nop.  Without LTO, leaving it to the linker
> means that you can't hoist the load and hide the memory latency.

Well, hoisting always needs a register, and if hoisted out of a loop (which
you all seem to be after) that register is live through the whole loop body.
You need a register for each different called function in such a loop,
trading the one GOT pointer for N other registers.  For register-starved
machines this is a real problem; even x86-64 doesn't have that many.

I.e. I'm not convinced that this hoisting will really be much of a win
that often, outside toy examples.  Sure, the compiler can hoist function
addresses trivially, but I think it will lead to spilling more often than
not, or alternatively the hoisting will be undone by the register
allocator's rematerialization.  Of course, this would have to be measured
for real, not hand-waved, but, well, I'd be surprised if it's not so.


Ciao,
Michael.
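
To make the register trade-off above concrete, here is a minimal C sketch. It is only an illustration under stated assumptions, not code from the patch or from GCC: the table got_like and the functions f1, f2, f3, loop_unhoisted, loop_hoisted and check_equivalent are all hypothetical names, and a volatile array of function pointers merely imitates per-call loads from the GOT. The first loop reloads each address at every call; the second hoists the loads, so three pointer values (one per callee) are live across the whole loop body. Whether those values stay in registers, get spilled, or get rematerialized is up to the compiler and target.

```c
#include <assert.h>

typedef int (*fn_t)(int);

/* Hypothetical callees standing in for external functions. */
static int f1(int x) { return x + 1; }
static int f2(int x) { return x + 2; }
static int f3(int x) { return x + 3; }

/* Assumption: a volatile table imitates GOT slots; every access is a
   real load from memory that the compiler may not cache on its own. */
static fn_t volatile got_like[3] = { f1, f2, f3 };

/* Addresses are loaded at each call site, so no extra values are
   live across the loop body. */
int loop_unhoisted(int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++) {
        acc = got_like[0](acc);
        acc = got_like[1](acc);
        acc = got_like[2](acc);
    }
    return acc;
}

/* Addresses are loaded once before the loop; a, b and c are then
   live through the whole loop body, one value per distinct callee. */
int loop_hoisted(int n)
{
    fn_t a = got_like[0];
    fn_t b = got_like[1];
    fn_t c = got_like[2];
    int acc = 0;
    for (int i = 0; i < n; i++) {
        acc = a(acc);
        acc = b(acc);
        acc = c(acc);
    }
    return acc;
}

/* Both variants compute the same result; only the liveness differs. */
void check_equivalent(int n)
{
    assert(loop_unhoisted(n) == loop_hoisted(n));
}
```

Comparing the assembly a compiler emits for the two loops (for instance with -O2 -S) while increasing the number of distinct callees would be one way to see when the hoisted values start to spill. That is the kind of measurement the message asks for; no result is claimed here.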