public inbox for gcc-patches@gcc.gnu.org
From: "H.J. Lu" <hjl.tools@gmail.com>
To: Michael Matz <matz@suse.de>
Cc: Sriraman Tallam <tmsriram@google.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
		David Li <davidxl@google.com>
Subject: Re: [RFC][PATCH][X86_64] Eliminate PLT stubs for specified external functions via -fno-plt=
Date: Sun, 10 May 2015 15:19:00 -0000
Message-ID: <CAMe9rOpF996wsqCjCUtApvSh8tNcf--=+OnfnNuSyKm0VPoUKQ@mail.gmail.com>

On Sat, May 9, 2015 at 9:34 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, May 4, 2015 at 7:45 AM, Michael Matz <matz@suse.de> wrote:
>> Hi,
>>
>> On Thu, 30 Apr 2015, Sriraman Tallam wrote:
>>
>>> We noticed that one of our benchmarks sped up by ~1% when we eliminated
>>> PLT stubs for some of the hot external library functions like memcmp,
>>> pow.  The win was from better icache and itlb performance. The main
>>> reason was that the PLT stubs had no spatial locality with the
>>> call-sites. I have started looking at ways to tell the compiler to
>>> eliminate PLT stubs (in effect, inline them) for specified external
>>> functions, for x86_64. I have a proposal and a patch and I would like to
>>> hear what you think.
>>>
>>> This comes with caveats.  This cannot be generally done for all
>>> functions marked extern as it is impossible for the compiler to say if a
>>> function is "truly extern" (defined in a shared library). If a function
>>> is not truly extern (ends up defined in the final executable), then
>>> calling it indirectly is a performance penalty as it could have been a
>>> direct call.
>>
>> This can be fixed by Alan's idea.
>>
>>> Further, the newly created GOT entries are fixed up at
>>> start-up and do not get lazily bound.
>>
>> And this can be fixed by some enhancements in the linker and dynamic
>> linker.  The idea is to still generate a PLT stub and make its GOT entry
>> point to it initially (like a normal got.plt slot).  Then the first
>> indirect call will use the address of the PLT entry (starting lazy resolution)
>> and update the GOT slot with the real address, so further indirect calls
>> will directly go to the function.
>>
>> This requires a new asm marker (and hence a new reloc), as normally if
>> there's a GOT slot it's filled by the real symbol's address, unlike if
>> there's only a got.plt slot.  E.g. a
>>
>>   call *foo@GOTPLT(%rip)
>>
>> would generate a GOT slot (and fill its address into above call insn), but
>> generate a JUMP_SLOT reloc in the final executable, not a GLOB_DAT one.
>>
>
> I added "relax" prefix support to the x86 assembler on the users/hjl/relax
> branch at
>
> https://sourceware.org/git/?p=binutils-gdb.git;a=summary
>
> [hjl@gnu-tools-1 relax-3]$ cat r.S
> .text
> relax jmp foo
> relax call foo
> relax jmp foo@plt
> relax call foo@plt
> [hjl@gnu-tools-1 relax-3]$ ./as -o r.o r.S
> [hjl@gnu-tools-1 relax-3]$ ./objdump -drw r.o
>
> r.o:     file format elf64-x86-64
>
>
> Disassembly of section .text:
>
> 0000000000000000 <.text>:
>    0: 66 e9 00 00 00 00     data16 jmpq 0x6 2: R_X86_64_RELAX_PC32 foo-0x4
>    6: 66 e8 00 00 00 00     data16 callq 0xc 8: R_X86_64_RELAX_PC32 foo-0x4
>    c: 66 e9 00 00 00 00     data16 jmpq 0x12 e: R_X86_64_RELAX_PLT32 foo-0x4
>   12: 66 e8 00 00 00 00     data16 callq 0x18 14: R_X86_64_RELAX_PLT32 foo-0x4
> [hjl@gnu-tools-1 relax-3]$
>
> Right now, the relax relocations are treated as PC32/PLT32 relocations.
> I am working on linker support.
>
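
For readers following the quoted proposal, the lazily bound GOT slot that
Michael describes can be modelled with a small simulation. This is only a
toy sketch: ToyLoader, add_lazy_slot and call_via_got are hypothetical
names, and a Python dict stands in for the GOT. It is not how ld or ld.so
implement binding. It only shows the intended behaviour, where the first
indirect call goes through the PLT stub and every later call goes straight
to the function.

```python
# Toy model of a lazily bound GOT slot. All names here are hypothetical.

class ToyLoader:
    def __init__(self, symbols):
        self.symbols = symbols   # name -> real function (stands in for shared libs)
        self.got = {}            # name -> current call target (the GOT slot)
        self.resolutions = 0     # how many times the slow resolver path ran

    def add_lazy_slot(self, name):
        # Initially the slot points at the PLT stub, not at the real function.
        def plt_stub(*args):
            self.resolutions += 1
            real = self.symbols[name]   # stands in for the dynamic symbol lookup
            self.got[name] = real       # patch the slot with the real address
            return real(*args)
        self.got[name] = plt_stub

    def call_via_got(self, name, *args):
        # Models `call *foo@GOTPLT(%rip)`: always jump through the slot.
        return self.got[name](*args)


loader = ToyLoader({"foo": lambda x: x + 1})
loader.add_lazy_slot("foo")
first = loader.call_via_got("foo", 41)    # goes through the stub and resolves
second = loader.call_via_got("foo", 41)   # goes straight to the real function
print(first, second, loader.resolutions)  # prints "42 42 1"
```

After the first call the slot holds the real function, so the resolver
runs exactly once per symbol.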

I implemented the linker support for x86-64:

00000000 <main>:
   0: 48 83 ec 08           sub    $0x8,%rsp
   4: e8 00 00 00 00       callq  9 <main+0x9> 5: R_X86_64_PC32 plt-0x4
   9: e8 00 00 00 00       callq  e <main+0xe> a: R_X86_64_PLT32 plt-0x4
   e: e8 00 00 00 00       callq  13 <main+0x13> f: R_X86_64_PC32 bar-0x4
  13: 66 e8 00 00 00 00     data16 callq 19 <main+0x19> 15: R_X86_64_RELAX_PC32 bar-0x4
  19: 66 e8 00 00 00 00     data16 callq 1f <main+0x1f> 1b: R_X86_64_RELAX_PLT32 bar-0x4
  1f: 66 e8 00 00 00 00     data16 callq 25 <main+0x25> 21: R_X86_64_RELAX_PC32 foo-0x4
  25: 66 e8 00 00 00 00     data16 callq 2b <main+0x2b> 27: R_X86_64_RELAX_PLT32 foo-0x4
  2b: 31 c0                 xor    %eax,%eax
  2d: 48 83 c4 08           add    $0x8,%rsp
  31: c3                   retq

00400460 <main>:
  400460: 48 83 ec 08           sub    $0x8,%rsp
  400464: e8 d7 ff ff ff       callq  400440 <plt@plt>
  400469: e8 d2 ff ff ff       callq  400440 <plt@plt>
  40046e: e8 ad ff ff ff       callq  400420 <bar@plt>
  400473: ff 15 ff 03 20 00     callq  *0x2003ff(%rip)        # 600878 <_DYNAMIC+0xf8>
  400479: ff 15 f9 03 20 00     callq  *0x2003f9(%rip)        # 600878 <_DYNAMIC+0xf8>
  40047f: 66 e8 f3 00 00 00     data16 callq 400578 <foo>
  400485: 66 e8 ed 00 00 00     data16 callq 400578 <foo>
  40048b: 31 c0                 xor    %eax,%eax
  40048d: 48 83 c4 08           add    $0x8,%rsp
  400491: c3                   retq
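
The addresses in the linked listing can be cross-checked by hand. As a
minimal sketch (rel32_target is a hypothetical helper, not part of
binutils), the snippet below decodes the trailing 32-bit displacement of
each call and adds it to the address of the next instruction. That is the
base for both direct rel32 calls and RIP-relative operands, and it is also
why the relocations above carry a -0x4 addend.

```python
import struct

def rel32_target(insn_addr, insn_bytes):
    """Return the address a trailing disp32 refers to (hypothetical helper)."""
    disp = struct.unpack("<i", insn_bytes[-4:])[0]  # signed little-endian disp32
    return insn_addr + len(insn_bytes) + disp       # relative to the next instruction

# (address, encoding) pairs copied from the linked <main> listing above.
rows = [
    (0x400464, "e8 d7 ff ff ff"),     # callq plt@plt
    (0x40046e, "e8 ad ff ff ff"),     # callq bar@plt
    (0x400473, "ff 15 ff 03 20 00"),  # callq *slot(%rip), relaxed bar call
    (0x400479, "ff 15 f9 03 20 00"),  # callq *slot(%rip), relaxed bar call
    (0x40047f, "66 e8 f3 00 00 00"),  # data16 callq foo
    (0x400485, "66 e8 ed 00 00 00"),  # data16 callq foo
]

targets = [rel32_target(addr, bytes.fromhex(enc)) for addr, enc in rows]
for (addr, _), target in zip(rows, targets):
    print(f"{addr:#x} -> {target:#x}")
```

The results match the listing: 0x400440 and 0x400420 for the PLT calls,
0x600878 for both indirect calls to bar, and 0x400578 for both direct
calls to foo.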

Sriraman, can you give it a try?

-- 
H.J.
