From: Sriraman Tallam
To: GCC Patches, David Li, Cary Coutant, Ian Lance Taylor, Paul Pluzhnikov, Uros Bizjak, Jan Hubicka, Jakub Jelinek
Date: Fri, 11 Jul 2014 17:42:00 -0000
Subject: Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm

Ping.

On Thu, Jun 26, 2014 at 10:54 AM, Sriraman Tallam wrote:
> Hi Uros,
>
> Could you please review this patch?
>
> Thanks
> Sri
>
> On Fri, Jun 20, 2014 at 5:17 PM, Sriraman Tallam wrote:
>> Patch Updated.
>>
>> Sri
>>
>> On Mon, Jun 9, 2014 at 3:55 PM, Sriraman Tallam wrote:
>>> Ping.
>>>
>>> On Mon, May 19, 2014 at 11:11 AM, Sriraman Tallam wrote:
>>>> Ping.
>>>>
>>>> On Thu, May 15, 2014 at 11:34 AM, Sriraman Tallam wrote:
>>>>> Optimize access to globals with -fpie, x86_64 only:
>>>>>
>>>>> Currently, with -fPIE/-fpie, GCC accesses globals that are extern to the module
>>>>> using the GOT. This is two instructions, one to get the address of the global
>>>>> from the GOT and the other to get the value. If it turns out that the global
>>>>> gets defined in the executable at link-time, it still needs to go through the
>>>>> GOT as it is too late then to generate a direct access.
>>>>>
>>>>> Examples:
>>>>>
>>>>> foo.cc
>>>>> ------
>>>>> int a_glob;
>>>>> int main () {
>>>>>   return a_glob; // defined in this file
>>>>> }
>>>>>
>>>>> With -O2 -fpie -pie, the generated code directly accesses the global via a
>>>>> PC-relative insn:
>>>>>
>>>>> 5e0 <main>:
>>>>>   mov 0x165a(%rip),%eax # 1c40
>>>>>
>>>>> foo.cc
>>>>> ------
>>>>>
>>>>> extern int a_glob;
>>>>> int main () {
>>>>>   return a_glob; // declared extern, defined elsewhere
>>>>> }
>>>>>
>>>>> With -O2 -fpie -pie, the generated code accesses the global via the GOT using
>>>>> two memory loads:
>>>>>
>>>>> 6f0 <main>:
>>>>>   mov 0x1609(%rip),%rax # 1d00 <_DYNAMIC+0x230>
>>>>>   mov (%rax),%eax
>>>>>
>>>>> This is true even if in the latter case the global was defined in the
>>>>> executable through a different file.
>>>>>
>>>>> Some experiments on Google benchmarks show that the extra memory loads degrade
>>>>> performance by 1% to 5%.
>>>>>
>>>>>
>>>>> Solution - Copy Relocations:
>>>>>
>>>>> When the linker supports copy relocations, GCC can always assume that the
>>>>> global will be defined in the executable. For globals that are truly extern
>>>>> (come from shared objects), the linker will create copy relocations and have
>>>>> them defined in the executable. The result is that no global access needs to
>>>>> go through the GOT, which improves performance.
>>>>>
>>>>> This recently submitted patch to the gold linker:
>>>>> https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>> allows gold to generate copy relocations for -pie mode when necessary.
>>>>>
>>>>> I have added the option -mld-pie-copyrelocs, which does this when combined
>>>>> with -fpie. Note that the BFD linker does not support pie copyrelocs yet, so
>>>>> this option cannot be used there.
>>>>>
>>>>> Please review.
>>>>>
>>>>>
>>>>> ChangeLog:
>>>>>
>>>>> * config/i386/i386.opt (mld-pie-copyrelocs): New option.
>>>>> * config/i386/i386.c (legitimate_pic_address_disp_p): Check if this
>>>>> address is still legitimate in the presence of copy relocations
>>>>> and -fpie.
>>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-1.c: New test.
>>>>> * testsuite/gcc.target/i386/ld-pie-copyrelocs-2.c: New test.
>>>>>
>>>>>
>>>>>
>>>>> Patch attached.
>>>>> Thanks
>>>>> Sri
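
For readers who want to see the two access shapes from the quoted message side by side, here is a minimal sketch in portable C. It only models the difference: a_glob_slot is a hypothetical stand-in for a GOT entry (a real GOT is laid out by the linker and filled in by the dynamic loader), and the function names are made up for illustration. The volatile qualifier is there only to keep the compiler from folding the indirection away.

```c
#include <assert.h>

/* Stand-in for a global that another module might define. */
int a_glob = 42;

/* Hypothetical model of a GOT entry: one pointer-sized slot holding the
   address of a_glob. This variable only imitates the shape of a real
   GOT slot; it is not how the linker or loader implements one. */
int *volatile a_glob_slot = &a_glob;

/* Direct access: the shape of a single PC-relative load. */
int read_direct(void)
{
    return a_glob;
}

/* GOT-style access: load the address from the slot, then load the value. */
int read_via_slot(void)
{
    int *addr = a_glob_slot;  /* first load: address of the global */
    return *addr;             /* second load: the value itself */
}

/* Both paths must observe the same object, before and after a store. */
void check_same_object(void)
{
    assert(read_direct() == read_via_slot());
    a_glob = 7;
    assert(read_direct() == 7 && read_via_slot() == 7);
}
```

Calling check_same_object() from a main of your own confirms that the two paths read the same object. The extra dependent load in read_via_slot() is the cost that the quoted message aims to remove for globals that end up defined in the executable.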