From: Rich Felker
To: "H.J. Lu"
Cc: Jan Hubicka, Alexander Monakov, GCC Patches, Uros Bizjak
Date: Fri, 15 May 2015 20:45:00 -0000
Subject: Re: [PATCH i386] Allow sibcalls in no-PLT PIC
Message-ID: <20150515204237.GF17573@brightrain.aerifal.cx>
References: <1430757479-14241-1-git-send-email-amonakov@ispras.ru> <1430757479-14241-5-git-send-email-amonakov@ispras.ru> <20150515194824.GB14415@kam.mff.cuni.cz> <20150515202319.GE17573@brightrain.aerifal.cx>

On Fri, May 15, 2015 at 01:35:14PM -0700, H.J. Lu wrote:
> On Fri, May 15, 2015 at 1:23 PM, Rich Felker wrote:
> > On Fri, May 15, 2015 at 01:08:15PM -0700, H.J. Lu wrote:
> >> With relax branch in 32-bit, there are 2 cases:
> >>
> >> 1. PIC or PIE: We generate
> >>
> >>     set up EBX
> >>     relax call foo@PLT
> >>
> >> It is almost the same as we do now, except for the relax prefix.
> >> If foo is defined in another shared library or may be preempted,
> >> linker will generate
> >>
> >>     call *foo@GOTPLT(%ebx)
> >>
> >> If foo turns out local, linker will output
> >>
> >>     relax call foo
> >
> > This does not address the initial and primary motivation for no-plt on
> > 32-bit: eliminating the awful codegen constraint costs of the
> > GOT-register (ebx, and equivalent on other targets) ABI for calling
> > PLT entries. If instead you generated code that sets up an expression
> > for the GOT slot using arbitrary registers, and relaxed it to a direct
> > call (possibly rendering the register setup useless), it would be
> > comparable to the no-plt approach. So for example:
> >
> >     set up ecx (or whatever register)
> >     relax call *foo@GOT(%ecx)
> >
> > and relax to:
> >
> >     set up ecx (or whatever register; now useless)
> >     relax call foo
> >
> > But the no-plt approach is still superior in that the address load
> > from the GOT can be hoisted out of loops, etc., resulting in something
> > like:
> >
> >     call *%esi
> >
> > This could be valuable in loops calling a math function repeatedly,
> > for example.
> >
> > Overall I'm still not a fan of the relaxation approach. There are very
> > few places it would actually help that couldn't already be improved
> > better with use of visibility, and it can't give codegen as good as
> > no-plt option.
>
> With no-plt option, compiler has to know if a function is external
> or may be preempted.

I still don't see significant practical cases where the linker would
know this but the compiler can't. If you use visibility properly, the
compiler knows, and if you do LTO and -Bsymbolic[-functions], the
compiler should have that information available at LTO time (this is
an enhancement that needs to be made, though).

> If compiler guessed wrong, the generated
> DSO or executable will always go through indirect branch even
> though the target is local.

The only way this is avoided now is with -Bsymbolic[-functions], which
is not widely used. Otherwise interposition is always allowed for
default-visibility functions, so I don't see how the indirect branch
here is suboptimal.

> With relax branch, the decision is left
> to linker. Of course, EBX must be used unless we add a new PLT
> relocation for each register used to hold GOT base, like
>
>     relax call foo@PLT_ECX
>     relax call foo@PLT_EDX

No, that's not needed. If the linker doesn't make the relaxation, the
instruction the compiler generated remains in place, and has the
effective address expression using whichever register it wanted:

    relax call *foo@GOT(%ecx)
    relax call *foo@GOT(%edx)
    etc.

If the linker chooses to relax it to a direct call, no register at all
is needed, so the linker can just throw this away and use:

    call foo

for all of them.

Rich
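
For illustration only, here is a rough source-level sketch in C of the two
points argued above: visibility lets the compiler know a call target is
local, and the address of an external function can be loaded once and
reused across a loop. The names twice and sum_abs_twice are hypothetical,
and labs merely stands in for the "math function" of the example so that
nothing beyond libc is needed. The explicit function pointer is an analogy
for the hoisted GOT load, not what a compiler would literally require you
to write.

```c
#include <stdlib.h>

/* Hypothetical DSO-internal helper. Hidden visibility tells the compiler
   that the symbol cannot be interposed, so a call to it can be emitted
   as a direct call with no PLT entry or GOT load. */
__attribute__((visibility("hidden")))
long twice(long x)
{
    return 2 * x;
}

/* Sums |2 * v[i]| over the array. labs is an external, default-visibility
   function; its address is loaded once, before the loop, which mirrors
   the GOT load being hoisted out of the loop. */
long sum_abs_twice(const long *v, size_t n)
{
    long (*fn)(long) = labs;        /* single address load */
    long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += fn(twice(v[i]));     /* indirect call, like call *%esi */

    return sum;
}
```

Whether the generated code actually keeps the address in a call-saved
register depends on the compiler, target and options; comparing the
assembly from a 32-bit PIC build with and without a no-plt option is the
way to check.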