Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Date: Thu, 26 Jun 2014 17:55:00 -0000
Subject: Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
From:
Sriraman Tallam
To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov, Uros Bizjak, Jan Hubicka
X-SW-Source: 2014-06/txt/msg02155.txt.bz2

Hi Uros,

   Could you please review this patch?

Thanks
Sri

On Fri, Jun 20, 2014 at 5:17 PM, Sriraman Tallam wrote:
> Patch Updated.
>
> Sri
>
> On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam wrote:
>> Ping.
>>
>> On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam wrote:
>>> Ping.
>>>
>>> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam wrote:
>>>> Optimize access to globals with -fpie, x86_64 only:
>>>>
>>>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the
>>>> module using the GOT. This takes two instructions: one to load the address
>>>> of the global from the GOT and another to load its value. If it turns out
>>>> that the global gets defined in the executable at link-time, it still needs
>>>> to go through the GOT, as it is too late then to generate a direct access.
>>>>
>>>> Examples:
>>>>
>>>> foo.cc
>>>> ------
>>>> int a_glob;
>>>> int main () {
>>>>   return a_glob; // defined in this file
>>>> }
>>>>
>>>> With -O2 -fpie -pie, the generated code directly accesses the global via
>>>> a PC-relative insn:
>>>>
>>>> 5e0 <main>:
>>>>   mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>>>
>>>> foo.cc
>>>> ------
>>>>
>>>> extern int a_glob;
>>>> int main () {
>>>>   return a_glob; // defined outside this file
>>>> }
>>>>
>>>> With -O2 -fpie -pie, the generated code accesses the global via the GOT
>>>> using two memory loads:
>>>>
>>>> 6f0 <main>:
>>>>   mov    0x1609(%rip),%rax        # 1d00 <_DYNAMIC+0x230>
>>>>   mov    (%rax),%eax
>>>>
>>>> This is true even if, in the latter case, the global was defined in the
>>>> executable through a different file.
>>>>
>>>> Some experiments on Google benchmarks show that the extra memory loads
>>>> affect performance by 1% to 5%.
>>>>
>>>>
>>>> Solution - Copy Relocations:
>>>>
>>>> When the linker supports copy relocations, GCC can always assume that the
>>>> global will be defined in the executable. For globals that are truly extern
>>>> (come from shared objects), the linker will create copy relocations and have
>>>> them defined in the executable. The result is that no global access needs
>>>> to go through the GOT, which improves performance.
>>>>
>>>> This recently submitted patch to the gold linker:
>>>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>> allows gold to generate copy relocations for -pie mode when necessary.
>>>>
>>>> I have added the option -mld-pie-copyrelocs which, when combined with
>>>> -fpie, enables this. Note that the BFD linker does not support pie
>>>> copyrelocs yet and this option cannot be used there.
>>>>
>>>> Please review.
>>>>
>>>>
>>>> ChangeLog:
>>>>
>>>> * config/i386/i386.opt (mld-pie-copyrelocs): New option.
>>>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>>>  address is still legitimate in the presence of copy relocations
>>>>  and -fpie.
>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>>>
>>>>
>>>>
>>>> Patch attached.
>>>> Thanks
>>>> Sri