From: "H.J. Lu" <hjl.tools@gmail.com>
To: Rich Felker <dalias@libc.org>
Cc: Alexander Monakov <amonakov@ispras.ru>,
Jakub Jelinek <jakub@redhat.com>, Jeff Law <law@redhat.com>,
GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Expand PIC calls without PLT with -fno-plt
Date: Wed, 06 May 2015 18:45:00 -0000
Message-ID: <CAMe9rOoxbn33SwXCmHcwYL87t4_JgxgHtT6HSECtFguyLQpxKw@mail.gmail.com>
In-Reply-To: <20150506183735.GK17573@brightrain.aerifal.cx>

On Wed, May 6, 2015 at 11:37 AM, Rich Felker <dalias@libc.org> wrote:
> On Wed, May 06, 2015 at 11:26:29AM -0700, H.J. Lu wrote:
>> On Wed, May 6, 2015 at 10:35 AM, Rich Felker <dalias@libc.org> wrote:
>> > On Wed, May 06, 2015 at 07:43:58PM +0300, Alexander Monakov wrote:
>> >> On Wed, 6 May 2015, Jakub Jelinek wrote:
>> >> > The linker would know very well what kind of relocations are used for
>> >> > particular PLT slot, and for the new relocations which would resolve to the
>> >> > address of the .got.plt slot it could just tweak corresponding 3rd insn
>> >> > in the slot, to not jump to first plt slot - 16, but a few bytes before that
>> >> > that would just load the address of _G_O_T_ into %ebx and then fallthru
>> >> > into the 0x4c2b7310 snippet above. The lazy binding would be a few ticks
>> >> > slower in that case, but no requirement on %ebx to contain _G_O_T_.
>> >>
>> >> No, %ebx is callee-saved, so you can't outright overwrite it in the PLT stub.
>> >
>> > Indeed. And the situation is the same on almost all targets. The only
>> > exceptions are those with direct PC-relative addressing (like x86_64)
>> > and those with reserved inter-procedural linkage registers and
>> > efficient PC-relative address loading via them (like ARM and AArch64).
>> > MIPS (o32) is also an interesting exception in that the normal ABI is
>> > already PLT-free, and while callees need a PIC register loaded, it's a
>> > call-clobbered register, not a call-saved one, so it doesn't make the
>> > same kind of trouble.
>> >
>> > I really don't see a need to make no-PLT code gen support lazy binding
>> > when it's necessarily going to be costly to do so, and precludes most
>> > of the benefits of the no-PLT approach. Anyone still wanting/needing
>> > lazy binding semantics can use PLT, and can even choose on a per-TU
>> > basis (or maybe even more fine-grained with pragmas/attributes?).
>> > Those of us who are suffering the cost of PLT with no benefits
>> > (because we use -Wl,-z,relro -Wl,-z,now) can just be rid of it (by
>> > adding -fno-plt) and enjoy something like a 10% performance boost in
>> > PIC/PIE.
>> >
>>
>> There are things compiler can do for performance and correctness
>> if it is told what options will be passed to linker. -z now is one and
>> -Bsymbolic is another one:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65886
>>
>> I think we should add -fnow and -fsymbolic. Together with LTO,
>> we can generate faster executables as well as shared libraries.
>
> I don't see how knowing about -Bsymbolic can help the compiler
> optimize. Without visibility, it can't know whether the symbols will
> be defined in the same DSO. With visibility, it can already do the
> equivalent hints. Perhaps it helps in the case where the symbol is
> already defined (and non-weak) in the same TU, but I think in this
> case it should already be optimizing the reference. Symbol
> interposition over top of a non-weak symbol from the same TU is always
> invalid and the compiler should not be pessimizing code to make it
> work.

-Bsymbolic binds all references in a shared library to its local
definitions, with or without visibility, weak or non-weak. The compiler
can use this in binds_tls_local_p, and we can generate much better code
in shared libraries.
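
To make the binding difference concrete, here is a minimal sketch in
plain C. It is a toy model of symbol lookup, not linker or GCC code:
the module names, the scope table, and resolve() are all hypothetical,
and real dynamic linkers also handle versioning, visibility, and more.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of dynamic symbol lookup; all names are illustrative. */
struct def    { const char *name; int value; };
struct module { const char *soname; const struct def *defs; size_t ndefs; };

static const struct def exe_defs[] = { { "foo", 1 } };
static const struct def lib_defs[] = { { "foo", 2 }, { "bar", 3 } };

/* Global scope in load order: the executable first, then the library. */
static const struct module scope[] = {
    { "main",       exe_defs, 1 },
    { "libdemo.so", lib_defs, 2 },
};
enum { NMODS = 2, EXE = 0, LIB = 1 };

/* Look for `name` among the definitions of a single module. */
static const struct def *find_in(const struct module *m, const char *name)
{
    for (size_t i = 0; i < m->ndefs; i++)
        if (strcmp(m->defs[i].name, name) == 0)
            return &m->defs[i];
    return NULL;
}

/* Resolve `name` as referenced from module `from`.
   symbolic == 0: walk the global scope in load order, so an earlier
                  module can interpose on a later one.
   symbolic != 0: try the referencing module's own definition first,
                  which models what -Bsymbolic asks for. */
static const struct def *resolve(size_t from, const char *name, int symbolic)
{
    const struct def *d;

    if (symbolic && (d = find_in(&scope[from], name)) != NULL)
        return d;
    for (size_t i = 0; i < NMODS; i++)
        if ((d = find_in(&scope[i], name)) != NULL)
            return d;
    return NULL;
}
```

In this model, resolve(LIB, "foo", 0) finds the executable's definition
(the library's own foo is interposed), while resolve(LIB, "foo", 1)
finds the library's own. A compiler told about the second behaviour
could treat such references as local.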
> As for -fnow, I haven't thought about it much but I also don't see
> many places where it could help. The only benefit that comes to mind
> is on targets with weak memory order, where it would eliminate some of
> the cost of synchronizing TLSDESC lazy bindings (see Szabolcs Nagy's
> work on AArch64). It might also benefit PLT calls on such targets, but
> you would get a lot more benefit from -fno-plt, and in that case -fnow
> would not allow any further optimization.
>

-fno-plt does not work with lazy binding. -fnow tells the compiler that
lazy binding is not used, so it can optimize without the PLT. With
-flto -fnow, the compiler can make much better choices.
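
As a rough illustration of lazy versus immediate binding, here is a
small sketch in plain C. It only models the idea: got_slot, lazy_stub,
bind_now, and call_via_got are hypothetical names, and a function
pointer stands in for a real GOT entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not real PLT/GOT machinery: one "GOT slot" holding
   the address to call, plus a counter so the behaviour can be observed. */
typedef int (*fn_t)(int);

static int resolver_runs;   /* how many times the "resolver" has run */

static int real_target(int x) { return x + 1; }

static int lazy_stub(int x);          /* forward declaration */
static fn_t got_slot = lazy_stub;     /* lazy: the slot starts at the stub */

/* The first call lands here, "resolves" the symbol, patches the slot,
   and forwards the call; later calls go straight to real_target. */
static int lazy_stub(int x)
{
    resolver_runs++;
    got_slot = real_target;
    return got_slot(x);
}

/* Models -z now: the slot is resolved before any call is made. */
static void bind_now(void)
{
    got_slot = real_target;
}

/* Models a no-PLT call site: load the slot and call through it. */
static int call_via_got(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}
```

Without bind_now(), the first call_via_got() goes through lazy_stub and
bumps resolver_runs once; after that, or after bind_now(), every call
reaches real_target directly. A no-PLT call site has no stub to fall
back on, which is why it depends on the slot already being resolved.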
--
H.J.