From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 49789 invoked by alias); 19 May 2015 20:54:32 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 49774 invoked by uid 89); 19 May 2015 20:54:31 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.3 required=5.0 tests=AWL,BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RDNS_DYNAMIC,TVD_RCVD_IP autolearn=no version=3.3.2
X-HELO: brightrain.aerifal.cx
Received: from 216-12-86-13.cv.mvl.ntelos.net (HELO brightrain.aerifal.cx) (216.12.86.13) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 19 May 2015 20:54:30 +0000
Received: from dalias by brightrain.aerifal.cx with local (Exim 3.15 #2) id 1YuoWq-0003mI-00; Tue, 19 May 2015 20:54:20 +0000
Date: Tue, 19 May 2015 21:28:00 -0000
From: Rich Felker
To: "H.J. Lu"
Cc: Richard Henderson, Michael Matz, Jan Hubicka, Alexander Monakov, GCC Patches, Uros Bizjak
Subject: Re: [PATCH i386] Allow sibcalls in no-PLT PIC
Message-ID: <20150519205420.GL17573@brightrain.aerifal.cx>
References: <20150515234403.GG17573@brightrain.aerifal.cx>
 <20150519180659.GG17573@brightrain.aerifal.cx>
 <555B87F4.30908@redhat.com>
 <555B8ACD.20503@redhat.com>
 <20150519201557.GK17573@brightrain.aerifal.cx>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
X-SW-Source: 2015-05/txt/msg01758.txt.bz2

On Tue, May 19, 2015 at 01:27:06PM -0700, H.J. Lu wrote:
> On Tue, May 19, 2015 at 1:15 PM, Rich Felker wrote:
> > On Tue, May 19, 2015 at 12:17:18PM -0700, H.J. Lu wrote:
> >> On Tue, May 19, 2015 at 12:11 PM, Richard Henderson wrote:
> >> > On 05/19/2015 12:06 PM, H.J. Lu wrote:
> >> >> On Tue, May 19, 2015 at 11:59 AM, Richard Henderson wrote:
> >> >>> On 05/19/2015 11:06 AM, Rich Felker wrote:
> >> >>>> I'm still mildly worried that concerns for supporting
> >> >>>> relaxation might lead to decisions not to optimize code in ways that
> >> >>>> would be difficult to relax (e.g. certain types of address load
> >> >>>> reordering or hoisting) but I don't understand GCC internals
> >> >>>> sufficiently to know if this concern is warranted or not.
> >> >>>
> >> >>> It is. The relaxation that HJ is working on requires that the reads from the
> >> >>> got not be hoisted. I'm not especially convinced that what he's working on is
> >> >>> a win.
> >> >>>
> >> >>> With LTO, the compiler can do the same job that he's attempting in the linker,
> >> >>> without an extra nop. Without LTO, leaving it to the linker means that you
> >> >>> can't hoist the load and hide the memory latency.
> >> >>>
> >> >>
> >> >> My relax approach won't take away any optimization done by compiler.
> >> >> It simply turns indirect branch into direct branch with a nop prefix at
> >> >> link-time. I am having a hard time to understand why we shouldn't do it.
> >> >
> >> > I well understand what you're doing.
> >> >
> >> > But my point is that the only time the compiler should present you with the
> >> > form of indirect branch you're looking for is when there's no place to hoist
> >> > the load.
> >> >
> >> > At which point, is it really worth adding a new relocation to the ABI? Is it
> >> > really worth adding new code to the linker that won't be exercised often?
> >>
> >> I believe there are plenty of indirect branches via GOT when compiling
> >> PIE/PIC with -fno-plt:
> >>
> >> [hjl@gnu-6 gcc]$ cat /tmp/x.c
> >> extern void foo (void);
> >>
> >> void
> >> bar (void)
> >> {
> >>   foo ();
> >> }
> >> [hjl@gnu-6 gcc]$ ./xgcc -B./ -fPIC -O3 -S /tmp/x.c -fno-plt
> >> [hjl@gnu-6 gcc]$ cat x.s
> >>         .file "x.c"
> >>         .section .text.unlikely,"ax",@progbits
> >> .LCOLDB0:
> >>         .text
> >> .LHOTB0:
> >>         .p2align 4,,15
> >>         .globl bar
> >>         .type bar, @function
> >> bar:
> >> .LFB0:
> >>         .cfi_startproc
> >>         jmp *foo@GOTPCREL(%rip)
> >>         .cfi_endproc
> >> .LFE0:
> >>         .size bar, .-bar
> >
> > I agree these exist. What I question is whether the savings from the
> > linker being able to relax this to a direct call in the case where the
> > programmer failed to let the compiler make it a direct call to begin
> > with (by using hidden or protected visibility) are worth the cost of
> > not being able to hoist the load out of loops or schedule it earlier
> > in cases where relaxation is not possible because the call target is
> > not defined in the same DSO.
>
> Just for fun. I compiled binutils as PIE with -fno-plt -flto:
>
> [hjl@gnu-mic-2 gas]$ file as-new
> as-new: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV),
> dynamically linked (uses shared libs), for GNU/Linux 2.6.32, not
> stripped
> [hjl@gnu-mic-2 gas]$
>
> There are 43:
>
> ff 25 21 93 2d 00 jmpq *0x2d9321(%rip) # 3d5f58 <_DYNAMIC+0x1e8>
>
> and 1983
>
> ff 15 eb f4 38 00 callq *0x38f4eb(%rip) # 3d60e0 <_DYNAMIC+0x370>

How many of those would be relaxed? I suspect it depends a lot on
whether libbfd is static or shared.

Rich
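
To make the hoisting trade-off discussed above concrete, here is a
minimal C sketch. It is only a model, not GCC output: a volatile
function pointer stands in for foo's GOT entry, and the names got_slot,
call_reloading and call_hoisted are hypothetical, introduced purely for
illustration. The first loop re-reads the slot on every call, which
roughly corresponds to the `call *foo@GOTPCREL(%rip)` form the linker
relaxation needs to see; the second reads it once before the loop, which
is the hoisted form a linker could no longer rewrite into a direct call.

```c
/* Hypothetical stand-in for a function defined in another DSO. */
static int calls;
static void foo(void) { calls++; }

/* Assumption for illustration: this pointer models foo's GOT entry.
   volatile forces a fresh load on every use, mimicking an indirect
   call through memory. */
static void (*volatile got_slot)(void) = foo;

/* Reloads the slot for every call (the relaxable form). */
int call_reloading(int n)
{
    calls = 0;
    for (int i = 0; i < n; i++)
        got_slot();              /* load from the slot, then call */
    return calls;
}

/* Loads the slot once, outside the loop (the hoisted form). */
int call_hoisted(int n)
{
    void (*f)(void) = got_slot;  /* single load, kept in a local */
    calls = 0;
    for (int i = 0; i < n; i++)
        f();                     /* indirect call through the local */
    return calls;
}
```

Both functions make the same n calls; the only difference is where the
address load happens, which is the whole point of contention when the
target turns out not to be defined in the same DSO.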