public inbox for gcc-patches@gcc.gnu.org
From: Sriraman Tallam <tmsriram@google.com>
To: "H.J. Lu" <hjl.tools@gmail.com>
Cc: Uros Bizjak <ubizjak@gmail.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
		Jakub Jelinek <jakub@redhat.com>, David Li <davidxl@google.com>,
		Cary Coutant <ccoutant@google.com>
Subject: Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations
Date: Tue, 03 Feb 2015 19:26:00 -0000	[thread overview]
Message-ID: <CAAs8Hmxb1PKeRAzwJ+i+tEY_PTdPDiED6Rw1sAnjRyA9Ark3hg@mail.gmail.com> (raw)
In-Reply-To: <CAAs8Hmw0-+jN9BKwPB2BkGhTZwrc9V3zsAW2QTCuKnZKz7NVAQ@mail.gmail.com>

+davidxl +ccoutant

On Tue, Feb 3, 2015 at 11:25 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>
>>>>>>>>> It would probably help reviewers if you pointed to actual patch
>>>>>>>>> submission [1], which unfortunately contains the explanation in the
>>>>>>>>> patch itself [2], which further explains that this functionality is
>>>>>>>>> currently only supported with gold, patched with [3].
>>>>>>>>>
>>>>>>>>> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
>>>>>>>>> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
>>>>>>>>> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>>>>>>>>>
>>>>>>>>> After a bit of the above detective work, I think that new gcc option
>>>>>>>>> is not necessary. The configure should detect if new functionality is
>>>>>>>>> supported in the linker, and auto-configure gcc to use it when
>>>>>>>>> appropriate.
>>>>>>>>
>>>>>>>> I think GCC option is needed since one can use -fuse-ld= to
>>>>>>>> change linker.
>>>>>>>
>>>>>>> IMO, nobody will use this highly special x86_64-only option. It would
>>>>>>> be best for gnu-ld to reach feature parity with gold as far as this
>>>>>>> functionality is concerned. In this case, the optimization would be
>>>>>>> auto-configured, and would fire automatically, without any user
>>>>>>> intervention.
>>>>>>>
>>>>>>
>>>>>> Let's do it.  I implemented the same feature in bfd linker on both
>>>>>> master and 2.25 branch.
>>>>>>
>>>>>
>>>>> +bool
>>>>> +i386_binds_local_p (const_tree exp)
>>>>> +{
>>>>> +  /* Globals marked extern are treated as local when linker copy relocations
>>>>> +     support is available with -f{pie|PIE}.  */
>>>>> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
>>>>> +      && TREE_CODE (exp) == VAR_DECL
>>>>> +      && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
>>>>> +    return true;
>>>>> +  return default_binds_local_p (exp);
>>>>> +}
>>>>> +
>>>>>
>>>>> It returns true with -fPIE and false without -fPIE.  It is lying to compiler.
>>>>> Maybe legitimate_pic_address_disp_p is a better place.
>>>
>>> Agreed.
>>>
>>>> Something like this?
>>>
>>> Yes.
>>>
>>> OK, if Jakub doesn't have any objections here. Please also add
>>> Sriraman as author to ChangeLog entry.
>>>
>>> Thanks,
>>> Uros.
>>
>> Here is the patch.   OK to install?
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT.  This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.  If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> ------
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   <main>:
>>    mov    0x165a(%rip),%eax        # 1c40 <a_glob>
>>
>> foo.cc
>> ------
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern here; defined elsewhere
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses the global via the GOT
>> using two memory loads:
>>
>> 6f0  <main>:
>>    mov    0x1609(%rip),%rax   # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable.  For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable.  The result is that no global
>> access needs to go through the GOT, which improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>
> Hi H.J.,
>
> This was the original patch to i386.c to let global accesses take
> advantage of copy relocations and avoid the GOT.
>
>
> @@ -13113,7 +13113,11 @@ legitimate_pic_address_disp_p (rtx disp)
>   return true;
>      }
>    else if (!SYMBOL_REF_FAR_ADDR_P (op0)
> -   && SYMBOL_REF_LOCAL_P (op0)
> +   && (SYMBOL_REF_LOCAL_P (op0)
> +       || (HAVE_LD_PIE_COPYRELOC
> +   && flag_pie
> +   && !SYMBOL_REF_WEAK (op0)
> +   && !SYMBOL_REF_FUNCTION_P (op0)))
>     && ix86_cmodel != CM_LARGE_PIC)
>
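The condition in the hunk above can be restated as a standalone predicate. This is only a sketch for reading the logic: the structs, field names, and the function `pc_relative_ok` are hypothetical stand-ins for the GCC macros and variables named in the diff, not GCC code.

```c
#include <stdbool.h>

/* Hypothetical model of the per-symbol properties tested in the hunk.  */
struct symbol_info
{
  bool far_addr;     /* stands in for SYMBOL_REF_FAR_ADDR_P */
  bool binds_local;  /* stands in for SYMBOL_REF_LOCAL_P */
  bool is_weak;      /* stands in for SYMBOL_REF_WEAK */
  bool is_function;  /* stands in for SYMBOL_REF_FUNCTION_P */
};

/* Hypothetical model of the build-wide settings tested in the hunk.  */
struct build_config
{
  bool ld_pie_copyreloc;  /* stands in for HAVE_LD_PIE_COPYRELOC */
  bool pie;               /* stands in for flag_pie */
  bool large_pic_model;   /* stands in for ix86_cmodel == CM_LARGE_PIC */
};

/* True when the modeled condition would accept a PC-relative access.  */
bool
pc_relative_ok (const struct symbol_info *sym, const struct build_config *cfg)
{
  /* New case: non-local data symbol the linker can satisfy with a
     copy relocation in a PIE.  */
  bool copyreloc_candidate = cfg->ld_pie_copyreloc && cfg->pie
                             && !sym->is_weak && !sym->is_function;

  return !sym->far_addr
         && (sym->binds_local || copyreloc_candidate)
         && !cfg->large_pic_model;
}
```

In this model, a non-local, non-weak data symbol is accepted only when both the linker support and PIE flags are set; dropping the weak check would make the weak case behave the same way.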
> I do not understand why weak global data access must go through the
> GOT here and cannot use copy relocations. Ultimately, there is only
> going to be one copy of the global, either defined in the executable
> or in the shared object, right?
>
> Can we remove the check for SYMBOL_REF_WEAK?
>
> Thanks
> Sri
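For context on the question above, here is a small sketch of the usual idiom for weak data references. It is offered as an assumption about why the check exists, since this message does not answer the question: a weak reference may remain undefined after linking, in which case its address is null and code is allowed to test for that, whereas a copy relocation presumes a definition to copy from. The symbol name `hypothetical_optional_counter` is invented for illustration, and `__attribute__ ((weak))` is a GCC/Clang extension; the sketch assumes an ELF target such as Linux.

```c
#include <stddef.h>

/* Hypothetical symbol: nothing in this program is expected to define it.  */
extern int hypothetical_optional_counter __attribute__ ((weak));

/* Returns 1 if some object supplied a definition, 0 if the weak
   reference stayed undefined and its address resolved to null.  */
int
optional_counter_present (void)
{
  return &hypothetical_optional_counter != NULL;
}

/* Reads the counter only when it exists; otherwise returns FALLBACK.  */
int
read_optional_counter (int fallback)
{
  if (optional_counter_present ())
    return hypothetical_optional_counter;
  return fallback;
}
```

If the access were forced to be PC-relative inside a PIE, the null-address case would have no obvious encoding, which is one plausible reading of why the original patch keeps weak data on the GOT path.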
>
>
>
>>
>> This patch checks at configure time whether the linker supports PIE
>> with copy reloc, which is enabled in gold and the bfd linker in
>> binutils 2.25, and enables this optimization if the support is available.
>>
>> gcc/
>>
>> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>> Linux/x86-64 linker supports PIE with copy reloc.
>> * config.in: Regenerated.
>> * configure: Likewise.
>>
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>> pc-relative address for undefined, non-weak, non-function
>> symbol reference in 64-bit PIE if linker supports PIE with
>> copy reloc.
>>
>> * doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pie-copyrelocs-1.c: New test.
>> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
>> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>>
>> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
>> New procedure.
