public inbox for binutils@sourceware.org
* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: H.J. Lu @ 2012-08-31  1:10 UTC
  To: Andreas Krebbel; +Cc: David Miller, aj, libc-alpha, Binutils

On Thu, Aug 30, 2012 at 6:09 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> The linker has to optimize the GOT reference into a relative reloc if
>>> you want IFUNC to work properly, sparc does this as does x86.
>>
>> It would only work if ld would be able to get rid of the runtime relocations entirely. In order to
>> do this ld would need to rewrite the code accessing the GOT slots to use pc or got relative
>> addressing. Interesting, but I don't think x86 is already doing this. At least ld didn't in the
>> testcase I'm discussing with H.J.Lu.
>>
>
> I have no plans to edit the code sequence for this.
>

Here is the x86 change to convert MOV to LEA.

-- 
H.J.
---
diff --git a/bfd/elf32-i386.c b/bfd/elf32-i386.c
index 7d3652d..7dc6fb7 100644
--- a/bfd/elf32-i386.c
+++ b/bfd/elf32-i386.c
@@ -3470,6 +3470,25 @@ elf_i386_relocate_section (bfd *output_bfd,
 	  if (off >= (bfd_vma) -2)
 	    abort ();

+	  if (h != NULL
+	      && h->def_regular
+	      && info->shared
+	      && SYMBOL_REFERENCES_LOCAL (info, h)
+	      && bfd_get_8 (input_bfd,
+			    contents + rel->r_offset - 2) == 0x8b)
+	    {
+	      /* Convert
+		 movl	foo@GOT(%reg), %reg
+		 to
+		 leal	foo@GOTOFF(%reg), %reg
+	       */
+	      bfd_put_8 (output_bfd, 0x8d,
+			 contents + rel->r_offset - 2);
+	      relocation -= (htab->elf.sgotplt->output_section->vma
+			     + htab->elf.sgotplt->output_offset);
+	      break;
+	    }
+
 	  relocation = htab->elf.sgot->output_section->vma
 		       + htab->elf.sgot->output_offset + off
 		       - htab->elf.sgotplt->output_section->vma
diff --git a/bfd/elf64-x86-64.c b/bfd/elf64-x86-64.c
index a29ba8a..c291662 100644
--- a/bfd/elf64-x86-64.c
+++ b/bfd/elf64-x86-64.c
@@ -3460,6 +3460,24 @@ elf_x86_64_relocate_section (bfd *output_bfd,
 	  if (off >= (bfd_vma) -2)
 	    abort ();

+	  if (r_type == R_X86_64_GOTPCREL
+	      && info->shared
+	      && h != NULL
+	      && h->def_regular
+	      && SYMBOL_REFERENCES_LOCAL (info, h)
+	      && bfd_get_8 (input_bfd,
+			    contents + rel->r_offset - 2) == 0x8b)
+	    {
+	      /* Convert
+		 movl foo@GOTPCREL(%rip), %reg
+		 to
+		 leal foo(%rip), %reg
+	       */
+	      bfd_put_8 (output_bfd, 0x8d,
+			 contents + rel->r_offset - 2);
+	      break;
+	    }
+
 	  relocation = base_got->output_section->vma
 		       + base_got->output_offset + off;
 	  if (r_type != R_X86_64_GOTPCREL && r_type != R_X86_64_GOTPCREL64)
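
As a minimal, self-contained illustration of the byte rewrite both hunks
perform: the MOV opcode 0x8b, sitting two bytes before the relocated
32-bit field, is replaced with the LEA opcode 0x8d, and the ModRM byte
and displacement are left alone.  This is not BFD code; the helper name
convert_mov_to_lea and its bounds checks are assumptions made for this
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of BFD.  R_OFFSET is the offset of the
   32-bit relocated field inside CONTENTS.  If the opcode byte two bytes
   before that field is MOV r/m -> reg (0x8b), rewrite it to LEA (0x8d)
   and return 1.  Otherwise leave the buffer untouched and return 0.  */
int
convert_mov_to_lea (uint8_t *contents, size_t size, size_t r_offset)
{
  assert (contents != NULL);

  /* Need the opcode and ModRM bytes before the field, and the 4-byte
     field itself, to lie inside the buffer.  */
  if (r_offset < 2 || size < 4 || r_offset > size - 4)
    return 0;

  if (contents[r_offset - 2] != 0x8b)
    return 0;

  contents[r_offset - 2] = 0x8d;
  return 1;
}
```

For example, on the bytes 48 8b 05 <disp32> (movq disp32(%rip), %rax)
with r_offset 3, the helper returns 1 and the sequence becomes
48 8d 05 <disp32> (leaq disp32(%rip), %rax).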


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: Richard Henderson @ 2012-08-31 14:43 UTC
  To: H.J. Lu; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On 08/30/2012 06:02 PM, H.J. Lu wrote:
> @@ -3470,6 +3470,25 @@ elf_i386_relocate_section (bfd *output_bfd,

Don't do these sorts of things in relocate_section.

Do them in relax_section where you still have time
to remove the got entry if it becomes unused.


r~


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: H.J. Lu @ 2012-08-31 19:16 UTC
  To: Richard Henderson; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On Fri, Aug 31, 2012 at 7:40 AM, Richard Henderson <rth@twiddle.net> wrote:
> On 08/30/2012 06:02 PM, H.J. Lu wrote:
>> @@ -3470,6 +3470,25 @@ elf_i386_relocate_section (bfd *output_bfd,
>
> Don't do these sorts of things in relocate_section.
>
> Do them in relax_section where you still have time
> to remove the got entry if it becomes unused.
>

This is an excellent idea.  But relax_section is too late for
x86 and x86-64.  I need to do it in size_dynamic_sections
so that we can get proper GOT entries.


-- 
H.J.


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: Richard Henderson @ 2012-09-01 16:51 UTC
  To: H.J. Lu; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On 2012-08-31 11:09, H.J. Lu wrote:
> This is an excellent idea.  But relax_section is too late for
> x86 and x86-64.  I need to do it in size_dynamic_sections
> so that we can get proper GOT entries.

It's not too late.  Look at some of the other ports.


r~


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: H.J. Lu @ 2012-09-01 17:22 UTC
  To: Richard Henderson; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On Sat, Sep 1, 2012 at 9:51 AM, Richard Henderson <rth@twiddle.net> wrote:
> On 2012-08-31 11:09, H.J. Lu wrote:
>> This is an excellent idea.  But relax_section is too late for
>> x86 and x86-64.  I need to do it in size_dynamic_sections
>> so that we can get proper GOT entries.
>
> It's not too late.  Look at some of the other ports.
>

It may work for other targets.  But x86 doesn't have a relax
pass.  I believe doing it in size_dynamic_sections is more
sensible.


-- 
H.J.


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: Richard Henderson @ 2012-09-01 21:25 UTC
  To: H.J. Lu; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On 2012-09-01 10:21, H.J. Lu wrote:
> It may work for other targets.  But x86 doesn't have a relax
> pass.  I believe doing it in size_dynamic_sections is more
> sensible.

It *should* have a relax pass.  Especially as that allows one
to turn it all off just in case a bug is encountered.

r~


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: H.J. Lu @ 2012-09-02 14:50 UTC
  To: Richard Henderson; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On Sat, Sep 1, 2012 at 2:24 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 2012-09-01 10:21, H.J. Lu wrote:
>> It may work for other targets.  But x86 doesn't have a relax
>> pass.  I believe doing it in size_dynamic_sections is more
>> sensible.
>
> It *should* have a relax pass.  Especially as that allows one

A relax pass won't work properly for x86 without some surgery
since lang_relax_sections is called after bfd_elf_size_dynamic_sections
which calls elf_x86_64_allocate_dynrelocs to allocate GOT entries
and dynamic relocations. After it is done, it is not too easy to undo
the damage.

> to turn it all off just in case a bug is encountered.
>

This is a separate issue.  X86 backends also optimize
TLS relocations.  Some compilers generate bad TLS
sequences which lead to corrupted output:

http://sourceware.org/bugzilla/show_bug.cgi?id=4928

But there is no way to turn off the TLS optimization.  It
would be nice to move disable_target_specific_optimizations
from command_line to link_info so that a backend
can check it.


-- 
H.J.


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: Richard Henderson @ 2012-09-02 19:41 UTC
  To: H.J. Lu; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On 2012-09-02 07:50, H.J. Lu wrote:
> A relax pass won't work properly for x86 without some surgery
> since lang_relax_sections is called after bfd_elf_size_dynamic_sections
> which calls elf_x86_64_allocate_dynrelocs to allocate GOT entries
> and dynamic relocations. After it is done, it is not too easy to undo
> the damage.

Other targets keep usage counts of GOT entries live throughout relaxation.
It becomes easy to adjust the GOT *after* bfd_elf_size_dynamic_sections
has been called.  Indeed, many targets require this ordering since some
of the optimizations applied are true relaxations, and require knowledge
of true displacements.  And indeed, the reduction in GOT size may enable
another relaxation pass, enabling further optimizations to succeed.
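
The usage-count idea can be sketched in a few lines.  This is only an
illustration under stated assumptions: the toy_* names are hypothetical,
the 8-byte entry size is assumed, and none of it is BFD's actual data
structure or API.  Each symbol carries a count of remaining GOT
references; relaxation drops a reference when it rewrites a GOT load,
and the GOT is re-sized from whatever counts are still non-zero.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_GOT_ENTRY_SIZE 8	/* Assumed size of one GOT slot.  */

/* Hypothetical per-symbol bookkeeping kept alive during relaxation.  */
struct toy_got_sym
{
  unsigned int got_refcount;	/* GOT references still in the code.  */
};

/* Called when relaxation rewrites one GOT load so that it no longer
   needs the slot.  */
void
toy_got_unref (struct toy_got_sym *sym)
{
  assert (sym != NULL && sym->got_refcount > 0);
  sym->got_refcount--;
}

/* Re-size the GOT: only symbols that still have references get a
   slot.  This can run after every relaxation pass.  */
size_t
toy_got_size (const struct toy_got_sym *syms, size_t nsyms)
{
  size_t size = 0;
  size_t i;

  for (i = 0; i < nsyms; i++)
    if (syms[i].got_refcount > 0)
      size += TOY_GOT_ENTRY_SIZE;

  return size;
}
```

Because the size is recomputed from the counts rather than fixed once,
a pass that removes the last reference to a symbol shrinks the GOT, and
a following pass sees the smaller layout.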

> This is a separate issue.  X86 backends also optimize
> TLS relocations.  Some compilers generate bad TLS
> sequences which lead to corrupted output:
> 
> http://sourceware.org/bugzilla/show_bug.cgi?id=4928
> 
> But there is no way to turn off the TLS optimization.

... because it was done in the wrong place.  An excellent opportunity
to move the TLS relaxations to a relaxation pass, wouldn't you agree?


r~


* Re: [PATCH] S/390: Fix two issues with the IFUNC optimized mem* routines
From: H.J. Lu @ 2012-09-02 20:06 UTC
  To: Richard Henderson; +Cc: Andreas Krebbel, David Miller, aj, libc-alpha, Binutils

On Sun, Sep 2, 2012 at 12:41 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 2012-09-02 07:50, H.J. Lu wrote:
>> A relax pass won't work properly for x86 without some surgery
>> since lang_relax_sections is called after bfd_elf_size_dynamic_sections
>> which calls elf_x86_64_allocate_dynrelocs to allocate GOT entries
>> and dynamic relocations. After it is done, it is not too easy to undo
>> the damage.
>
> Other targets keep usage counts of GOT entries live throughout relaxation.
> It becomes easy to adjust the GOT *after* bfd_elf_size_dynamic_sections
> has been called.  Indeed, many targets require this ordering since some
> of the optimizations applied are true relaxations, and require knowledge
> of true displacements.  And indeed, the reduction in GOT size may enable
> another relaxation pass, enabling further optimizations to succeed.

That is true.  IA64 needs more than one pass to do it properly.

>> This is a separate issue.  X86 backends also optimize
>> TLS relocations.  Some compilers generate bad TLS
>> sequences which lead to corrupted output:
>>
>> http://sourceware.org/bugzilla/show_bug.cgi?id=4928
>>
>> But there is no way to turn off the TLS optimization.
>
> ... because it was done in the wrong place.  An excellent opportunity
> to move the TLS relaxations to a relaxation pass, wouldn't you agree?
>

I think it is overkill for x86 since all relocation targets can be reached.
I don't want to make big changes in the relaxation code for this.


-- 
H.J.


