From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 13736 invoked by alias); 27 Feb 2015 23:26:09 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 13727 invoked by uid 89); 27 Feb 2015 23:26:08 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=4.0 required=5.0 tests=AWL,BAYES_50,FREEMAIL_FROM,KAM_FROM_URIBL_PCCC,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=no version=3.3.2
X-HELO: mail-ob0-f171.google.com
Received: from mail-ob0-f171.google.com (HELO mail-ob0-f171.google.com) (209.85.214.171) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 27 Feb 2015 23:26:07 +0000
Received: by mail-ob0-f171.google.com with SMTP id gq1so21623676obb.2 for ; Fri, 27 Feb 2015 15:26:05 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.183.24.162 with SMTP id ij2mr11856384obd.18.1425079564979; Fri, 27 Feb 2015 15:26:04 -0800 (PST)
Received: by 10.76.134.102 with HTTP; Fri, 27 Feb 2015 15:26:04 -0800 (PST)
In-Reply-To:
References:
Date: Fri, 27 Feb 2015 23:46:00 -0000
Message-ID:
Subject: Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
From: "H.J. Lu"
To: Uros Bizjak
Cc: "gcc-patches@gcc.gnu.org", Sriraman Tallam, Jakub Jelinek
Content-Type: text/plain; charset=UTF-8
X-IsSubscribed: yes
X-SW-Source: 2015-02/txt/msg01735.txt.bz2

On Fri, Feb 27, 2015 at 3:23 PM, H.J. Lu wrote:
> On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu wrote:
>> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak wrote:
>>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu wrote:
>>>
>>>>>>>>> It would probably help reviewers if you pointed to the actual patch
>>>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>>>
>>>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>>>
>>>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>>>> appropriate.
>>>>>>>>
>>>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>>>> change linker.
>>>>>>>
>>>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>>>> functionality is concerned. In this case, the optimization would be
>>>>>>> auto-configured, and would fire automatically, without any user
>>>>>>> intervention.
>>>>>>>
>>>>>>
>>>>>> Let's do it. I implemented the same feature in bfd linker on both
>>>>>> master and 2.25 branch.
>>>>>>
>>>>>
>>>>> +bool
>>>>> +i386_binds_local_p (const_tree exp)
>>>>> +{
>>>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>>>> +     support is available with -f{pie|PIE}.  */
>>>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>>> +      && TREE_CODE (exp) == VAR_DECL
>>>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>>>> +    return true;
>>>>> +  return default_binds_local_p (exp);
>>>>> +}
>>>>> +
>>>>>
>>>>> It returns true with -fPIE and false without -fPIE. It is lying to compiler.
>>>>> Maybe legitimate_pic_address_disp_p is a better place.
>>>
>>> Agreed.
>>>
>>>> Something like this?
>>>
>>> Yes.
>>>
>>> OK, if Jakub doesn't have any objections here. Please also add
>>> Sriraman as author to ChangeLog entry.
>>>
>>> Thanks,
>>> Uros.
>>
>> Here is the patch. OK to install?
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT. This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value. If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0 <main>:
>>    mov    0x165a(%rip),%eax        # 1c40
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern; defined elsewhere
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses the global via the GOT
>> using two memory loads:
>>
>> 6f0 <main>:
>>    mov    0x1609(%rip),%rax        # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable. For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable. The result is that no global
>> access needs to go through the GOT, which improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>>
>> This patch checks at configure time whether the linker supports PIE with
>> copy reloc, which is enabled in gold and the bfd linker in binutils 2.25,
>> and enables this optimization if the linker support is available.
>>
>> gcc/
>>
>>	* configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>>	Linux/x86-64 linker supports PIE with copy reloc.
>>	* config.in: Regenerated.
>>	* configure: Likewise.
>>
>>	* config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>>	pc-relative address for undefined, non-weak, non-function
>>	symbol reference in 64-bit PIE if linker supports PIE with
>>	copy reloc.
>>
>>	* doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>>	* gcc.target/i386/pie-copyrelocs-1.c: New test.
>>	* gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>>	* gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>>	* gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>>
>>	* lib/target-supports.exp (check_effective_target_pie_copyreloc):
>>	New procedure.
>
> This caused:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65248
>
> Should we turn it off by default?
>

Or we can provide a command line option to turn it off.

--
H.J.
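
For illustration only, the rule stated in the ChangeLog entry above can be
modelled as a small decision function in C. This is a simplified sketch, not
the code in config/i386/i386.c: struct build_cfg, struct data_sym and
pie_data_access are hypothetical names, ld_pie_copyreloc merely stands in for
the HAVE_LD_PIE_COPYRELOC configure result, and the model assumes a PIE build
and covers data symbols only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types for illustration; these are NOT GCC data
   structures.  */
struct build_cfg
{
  bool target_64bit;      /* compiling for x86-64 */
  bool pie;               /* -fpie or -fPIE in effect */
  bool ld_pie_copyreloc;  /* stands in for HAVE_LD_PIE_COPYRELOC */
};

struct data_sym
{
  bool defined_here;  /* defined in the current translation unit */
  bool is_weak;       /* declared weak */
};

enum access_kind { ACCESS_PCREL, ACCESS_GOT };

/* Decide how a PIE build would reach a global data symbol under the
   rules described in the mail.  Functions are out of scope.  */
static enum access_kind
pie_data_access (const struct build_cfg *cfg, const struct data_sym *sym)
{
  assert (cfg != NULL && sym != NULL);

  /* Defined in this file: direct PC-relative access (first example).  */
  if (sym->defined_here)
    return ACCESS_PCREL;

  /* Undefined, non-weak data in 64-bit PIE: the linker can fall back to
     a copy relocation, so PC-relative access is allowed.  */
  if (cfg->target_64bit && cfg->pie && cfg->ld_pie_copyreloc
      && !sym->is_weak)
    return ACCESS_PCREL;

  /* Everything else, including undefined weak data, goes via the GOT.  */
  return ACCESS_GOT;
}
```

In this model, an undefined non-weak symbol falls back to ACCESS_GOT when
ld_pie_copyreloc is false, which corresponds to the two-load sequence shown
in the second example.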