Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
In-Reply-To: <20141204221900.DE5E5105@mailhost.lps.ens.fr>
References: <20141204221900.DE5E5105@mailhost.lps.ens.fr>
Date: Thu, 04 Dec 2014 23:54:00 -0000
Subject: Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
From: "H.J. Lu"
To: Dominique Dhumieres
Cc: GCC Patches

On Thu, Dec 4, 2014 at 2:19 PM, Dominique Dhumieres wrote:
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT. This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.
>> If it turns out that the global gets defined in the executable at
>> link-time, it still needs to go through the GOT as it is too late then
>> to generate a direct access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> a PC-relative insn:
>>
>> 5e0:
>>   mov 0x165a(%rip),%eax # 1c40
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern; defined outside this file
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses the global via the GOT
>> using two memory loads:
>>
>> 6f0:
>>   mov 0x1609(%rip),%rax # 1d00 <_DYNAMIC+0x230>
>>   mov (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable. For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable. The result is that no global
>> access needs to go through the GOT, which improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>>
>> This patch checks if the linker supports PIE with copy reloc, which is
>> enabled in gold and the bfd linker in binutils 2.25, at configure time
>> and enables this optimization if the linker support is available.
>>
>> gcc/
>>
>> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>> Linux/x86-64 linker supports PIE with copy reloc.
>> * config.in: Regenerated.
>> * configure: Likewise.
>>
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>> pc-relative address for undefined, non-weak, non-function
>> symbol reference in 64-bit PIE if linker supports PIE with
>> copy reloc.
>>
>> * doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pie-copyrelocs-1.c: New test.
>> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>>
>> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
>> New procedure.
>
> It caused pr64189.
>

I checked this in as an obvious fix. Sorry for the inconvenience.

-- 
H.J.
---
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 218407)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2014-12-04  H.J. Lu
+
+	PR bootstrap/64189
+	* configure.ac (HAVE_LD_PIE_COPYRELOC): Always define.
+	* configure: Regenerated.
+
 2014-12-04  Manuel López-Ibáñez
 
 	* diagnostic.c (diagnostic_color_init): New.
Index: configure
===================================================================
--- configure	(revision 218407)
+++ configure	(working copy)
@@ -27063,12 +27063,12 @@ EOF
       ;;
     esac
   fi
+fi
 
 cat >>confdefs.h <<_ACEOF
 #define HAVE_LD_PIE_COPYRELOC `if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`
 _ACEOF
 
-fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_ld_pie_copyreloc" >&5
 $as_echo "$gcc_cv_ld_pie_copyreloc" >&6; }
 
Index: configure.ac
===================================================================
--- configure.ac	(revision 218407)
+++ configure.ac	(working copy)
@@ -4730,10 +4730,10 @@ EOF
       ;;
     esac
   fi
-  AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
-    [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
-    [Define 0/1 if your linker supports -pie option with copy reloc.])
 fi
+AC_DEFINE_UNQUOTED(HAVE_LD_PIE_COPYRELOC,
+  [`if test x"$gcc_cv_ld_pie_copyreloc" = xyes; then echo 1; else echo 0; fi`],
+  [Define 0/1 if your linker supports -pie option with copy reloc.])
 AC_MSG_RESULT($gcc_cv_ld_pie_copyreloc)
 
 AC_MSG_CHECKING(linker EH-compatible garbage collection of sections)
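
For anyone wondering why the define has to move outside the target-specific "if": a macro that is consumed as a plain 0/1 value inside a C expression must exist on every host, otherwise the file that uses it does not compile. The sketch below is only an illustration of that pattern, modeled on the ChangeLog wording for legitimate_pic_address_disp_p. It is not the i386.c code. The struct sym_info, the function allow_pc_relative and the hard-coded macro value are all hypothetical stand-ins, and the link to the bootstrap failure is my reading of the "Always define" entry rather than something spelled out in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stands in for the 0/1 value that configure writes out.
   With the fix it is defined on every host; it is hard-coded to 1 here
   so the sketch compiles on its own. */
#define HAVE_LD_PIE_COPYRELOC 1

/* Hypothetical, simplified view of a symbol reference. */
struct sym_info {
    bool local;     /* defined in the current module */
    bool weak;      /* weak symbol */
    bool function;  /* refers to a function rather than data */
};

/* Hypothetical helper: may this reference use a PC-relative address
   instead of going through the GOT?  The macro is used as an ordinary
   value, so an undefined macro would be a compile error here, not a
   silent "false". */
static bool allow_pc_relative(const struct sym_info *s, bool pie64)
{
    if (s->local)
        return true;
    return HAVE_LD_PIE_COPYRELOC && pie64 && !s->weak && !s->function;
}

/* Small self-check of the cases named in the ChangeLog entry. */
static void example_checks(void)
{
    struct sym_info data = { false, false, false };
    struct sym_info weak = { false, true,  false };
    struct sym_info func = { false, false, true  };

    assert(allow_pc_relative(&data, true));   /* undefined, non-weak data */
    assert(!allow_pc_relative(&weak, true));  /* weak data still uses the GOT */
    assert(!allow_pc_relative(&func, true));  /* functions are excluded */
    assert(!allow_pc_relative(&data, false)); /* not a 64-bit PIE build */
}
```

Calling example_checks() from a test driver exercises the four cases. Changing the macro to 0 makes the first assertion fail, which is the behavior one would expect on a linker without PIE copy reloc support.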