public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* Avoiding unnecessary jump relocations in gas?
@ 2015-05-07  6:02 Andy Lutomirski
  2015-05-07 11:52 ` Jan Beulich
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2015-05-07  6:02 UTC (permalink / raw)
  To: binutils, linux-kernel

AFAICT gas will produce relocations for jumps to global labels in the
same file.  This doesn't seem directly harmful to me, except that, on
x86, it forces five-byte jumps instead of two-byte jumps.

This seems especially unfortunate, since even hidden and protected
symbols have this problem.

Given that many users don't want interposition support (especially the
kernel and anyone using .hidden or .protected), it would be nice to
have a command-line option to turn this off and probably also to turn
it off by default for hidden and protected symbols.  Can gas do this?
I looked at the code, and this seems to be EXTERN_FORCE_RELOC, which
isn't configurable at runtime.

One safer way to fix this would be to add a new type of relocation
that represents a non-relocatable relative reference.  This could be
emitted for short jumps to local symbols, would not actually be
relocated, but would generate an error if a user did something awful
in a linker script that would otherwise generate incorrect linker
output.

--Andy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-07  6:02 Avoiding unnecessary jump relocations in gas? Andy Lutomirski
@ 2015-05-07 11:52 ` Jan Beulich
  2015-05-07 16:21   ` H.J. Lu
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2015-05-07 11:52 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: binutils, linux-kernel

>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
> AFAICT gas will produce relocations for jumps to global labels in the
> same file.  This doesn't seem directly harmful to me, except that, on
> x86, it forces five-byte jumps instead of two-byte jumps.
> 
> This seems especially unfortunate, since even hidden and protected
> symbols have this problem.
> 
> Given that many users don't want interposition support (especially the
> kernel and anyone using .hidden or .protected), it would be nice to
> have a command-line option to turn this off and probably also to turn
> it off by default for hidden and protected symbols.  Can gas do this?

I've been running with the below changes (taken off of a bigger set
of changes, so the line numbers may look a little odd) for the last
couple of years. I never tried to submit this change because so far
I couldn't find the time to check whether this would have any
unwanted side effects on cases I don't normally use.

--- binutils-2.25/gas/config/tc-i386.c
+++ 2.25/gas/config/tc-i386.c
@@ -8872,6 +8899,22 @@ i386_frag_max_var (fragS *frag)
 int
 md_estimate_size_before_relax (fragS *fragP, segT segment)
 {
+#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
+  asymbol *bfdsym = NULL;
+  elf_symbol_type *elfsym = NULL;
+
+  if (IS_ELF)
+    {
+      bfdsym = symbol_get_bfdsym (fragP->fr_symbol);
+
+      if (S_IS_EXTERNAL (fragP->fr_symbol))
+	{
+	  elfsym = elf_symbol_from (bfd_asymbol_bfd (bfdsym), bfdsym);
+	  gas_assert (elfsym);
+	}
+    }
+#endif
+
   /* We've already got fragP->fr_subtype right;  all we have to do is
      check for un-relaxable symbols.  On an ELF system, we can't relax
      an externally visible symbol, because it may be overridden by a
@@ -8879,10 +8922,12 @@ md_estimate_size_before_relax (fragS *fr
   if (S_GET_SEGMENT (fragP->fr_symbol) != segment
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
       || (IS_ELF
-	  && (S_IS_EXTERNAL (fragP->fr_symbol)
+	  && ((S_IS_EXTERNAL (fragP->fr_symbol)
+	       && elfsym->internal_elf_sym.st_other != STV_PROTECTED
+	       && elfsym->internal_elf_sym.st_other != STV_HIDDEN
+	       && elfsym->internal_elf_sym.st_other != STV_INTERNAL)
 	      || S_IS_WEAK (fragP->fr_symbol)
-	      || ((symbol_get_bfdsym (fragP->fr_symbol)->flags
-		   & BSF_GNU_INDIRECT_FUNCTION))))
+	      || (bfdsym->flags & BSF_GNU_INDIRECT_FUNCTION)))
 #endif
 #if defined (OBJ_COFF) && defined (TE_PE)
       || (OUTPUT_FLAVOR == bfd_target_coff_flavour

But then I also have this (intentionally compiled out)

--- binutils-2.25/gas/symbols.c
+++ 2.25/gas/symbols.c
@@ -2075,13 +2075,43 @@ S_IS_DEFINED (symbolS *s)
 int
 S_FORCE_RELOC (symbolS *s, int strict)
 {
+#if 0/*defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)*/
+  While generally correct, this and the similarly framed code fragments
+  below cause, if enabled, relocations to be dropped that should not be (i.e.
+  BFD_RELOC_X86_64_GOTPCREL) because the side effects of using a certain
+  relocation type are not accounted for in generic_force_reloc (in write.c)
+  except for the vtable relocations. GOT references, at least, would have
+  to fall into this category too, but there does not seem to be a flag in
+  struct fix to indicate such side effects (maybe fx_plt really is, but then
+  it is grossly misnamed), nor does it seem reasonable to enumerate all such
+  relocations in generic_force_reloc.
+  elf_symbol_type *elfsym = NULL;
+#endif
+
   if (LOCAL_SYMBOL_CHECK (s))
     return ((struct local_symbol *) s)->lsy_section == undefined_section;
 
+#if 0/*defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)*/
+  if (IS_ELF && (s->bsym->flags & BSF_GLOBAL))
+    {
+      asymbol *bfdsym = symbol_get_bfdsym (s);
+
+      elfsym = elf_symbol_from (bfd_asymbol_bfd (bfdsym), bfdsym);
+      gas_assert (elfsym);
+    }
+#endif
+
   return ((strict
 	   && ((s->bsym->flags & BSF_WEAK) != 0
 	       || (EXTERN_FORCE_RELOC
-		   && (s->bsym->flags & BSF_GLOBAL) != 0)))
+		   && (s->bsym->flags & BSF_GLOBAL) != 0
+#if 0/*defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)*/
+		   && (elfsym == NULL
+		       || (elfsym->internal_elf_sym.st_other != STV_PROTECTED
+		           && elfsym->internal_elf_sym.st_other != STV_HIDDEN
+			   && elfsym->internal_elf_sym.st_other != STV_INTERNAL))
+#endif
+		   )))
 	  || (s->bsym->flags & BSF_GNU_INDIRECT_FUNCTION) != 0
 	  || s->bsym->section == undefined_section
 	  || bfd_is_com_section (s->bsym->section));

Jan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-07 11:52 ` Jan Beulich
@ 2015-05-07 16:21   ` H.J. Lu
  2015-05-08  3:23     ` Andy Lutomirski
  0 siblings, 1 reply; 21+ messages in thread
From: H.J. Lu @ 2015-05-07 16:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andy Lutomirski, Binutils, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1553 bytes --]

On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>> AFAICT gas will produce relocations for jumps to global labels in the
>> same file.  This doesn't seem directly harmful to me, except that, on
>> x86, it forces five-byte jumps instead of two-byte jumps.
>>
>> This seems especially unfortunate, since even hidden and protected
>> symbols have this problem.
>>
>> Given that many users don't want interposition support (especially the
>> kernel and anyone using .hidden or .protected), it would be nice to
>> have a command-line option to turn this off and probably also to turn
>> it off by default for hidden and protected symbols.  Can gas do this?
>
> I've been running with the below changes (taken off of a bigger set
> of changes, so the line numbers may look a little odd) for the last
> couple of years. I never tried to submit this change because so far
> I couldn't find the time to check whether this would have any
> unwanted side effects on cases I don't normally use.
>

This is the patch I checked in.

Thanks.

-- 
H.J.
---
Branches to global non-weak symbols defined in the same segment with
non-default visibility can be optimized the same way as branches to
local symbols.

gas/

* config/tc-i386.c (elf_symbol_resolved_in_segment_p): New.
(md_estimate_size_before_relax): Use it.

gas/testsuite/

* gas/i386/i386.exp: Run relax-3 and x86-64-relax-2.
* gas/i386/relax-3.d: New file.
* gas/i386/relax-3.s: Likewise.
* gas/i386/x86-64-relax-2.d: Likewise.

[-- Attachment #2: 0001-Optimize-branches-to-non-weak-symbols-with-visibilit.patch --]
[-- Type: text/x-patch, Size: 7688 bytes --]

From b084df0b8d1262fb1e969c74bcc5c61e262a6199 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Thu, 7 May 2015 09:13:39 -0700
Subject: [PATCH] Optimize branches to non-weak symbols with visibility

Branches to global non-weak symbols defined in the same segment with
non-default visibility can be optimized the same way as branches to
local symbols.

gas/

	* config/tc-i386.c (elf_symbol_resolved_in_segment_p): New.
	(md_estimate_size_before_relax): Use it.

gas/testsuite/

	* gas/i386/i386.exp: Run relax-3 and x86-64-relax-2.
	* gas/i386/relax-3.d: New file.
	* gas/i386/relax-3.s: Likewise.
	* gas/i386/x86-64-relax-2.d: Likewise.
---
 gas/ChangeLog                           |  5 +++++
 gas/config/tc-i386.c                    | 24 ++++++++++++++++----
 gas/testsuite/ChangeLog                 |  7 ++++++
 gas/testsuite/gas/i386/i386.exp         |  4 ++++
 gas/testsuite/gas/i386/relax-3.d        | 30 +++++++++++++++++++++++++
 gas/testsuite/gas/i386/relax-3.s        | 39 +++++++++++++++++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-relax-2.d | 32 +++++++++++++++++++++++++++
 7 files changed, 137 insertions(+), 4 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/relax-3.d
 create mode 100644 gas/testsuite/gas/i386/relax-3.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-relax-2.d

diff --git a/gas/ChangeLog b/gas/ChangeLog
index 7f42a64..9758e72 100644
--- a/gas/ChangeLog
+++ b/gas/ChangeLog
@@ -1,3 +1,8 @@
+2015-05-07  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* config/tc-i386.c (elf_symbol_resolved_in_segment_p): New.
+	(md_estimate_size_before_relax): Use it.
+
 2015-05-06  Jose E. Marchesi  <jose.marchesi@oracle.com>
 
 	* config/tc-sparc.c: Typo in comment fixed.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 50f9cb4..c4ba13d 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -8772,6 +8772,25 @@ i386_frag_max_var (fragS *frag)
   return TYPE_FROM_RELAX_STATE (frag->fr_subtype) == UNCOND_JUMP ? 4 : 5;
 }
 
+#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
+static int
+elf_symbol_resolved_in_segment_p (symbolS *fr_symbol)
+{
+  /* STT_GNU_IFUNC symbol must go through PLT.  */
+  if ((symbol_get_bfdsym (fr_symbol)->flags
+       & BSF_GNU_INDIRECT_FUNCTION) != 0)
+    return 0;
+
+  if (!S_IS_EXTERNAL (fr_symbol))
+    /* Symbol may be weak or local.  */
+    return !S_IS_WEAK (fr_symbol);
+
+  /* Global symbols with default visibility in a shared library may be
+     preempted by another definition.  */
+  return ELF_ST_VISIBILITY (S_GET_OTHER (fr_symbol)) != STV_DEFAULT;
+}
+#endif
+
 /* md_estimate_size_before_relax()
 
    Called just before relax() for rs_machine_dependent frags.  The x86
@@ -8795,10 +8814,7 @@ md_estimate_size_before_relax (fragS *fragP, segT segment)
   if (S_GET_SEGMENT (fragP->fr_symbol) != segment
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
       || (IS_ELF
-	  && (S_IS_EXTERNAL (fragP->fr_symbol)
-	      || S_IS_WEAK (fragP->fr_symbol)
-	      || ((symbol_get_bfdsym (fragP->fr_symbol)->flags
-		   & BSF_GNU_INDIRECT_FUNCTION))))
+	  && !elf_symbol_resolved_in_segment_p (fragP->fr_symbol))
 #endif
 #if defined (OBJ_COFF) && defined (TE_PE)
       || (OUTPUT_FLAVOR == bfd_target_coff_flavour
diff --git a/gas/testsuite/ChangeLog b/gas/testsuite/ChangeLog
index 6339bff..1cfa577 100644
--- a/gas/testsuite/ChangeLog
+++ b/gas/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2015-05-07  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* gas/i386/i386.exp: Run relax-3 and x86-64-relax-2.
+	* gas/i386/relax-3.d: New file.
+	* gas/i386/relax-3.s: Likewise.
+	* gas/i386/x86-64-relax-2.d: Likewise.
+
 2015-05-06  Jose E. Marchesi  <jose.marchesi@oracle.com>
 
 	* gas/sparc/natural-32.d: Test ldn, ldna, stn, stna, slln, srln,
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index e1fdd18..af56c26 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -394,6 +394,8 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
 	run_dump_test "size-4"
 
 	run_dump_test "note"
+
+	run_dump_test "relax-3"
     }
 
     # This is a PE specific test.
@@ -748,6 +750,8 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
 	run_dump_test "x86-64-size-4"
 	run_dump_test "x86-64-size-5"
 	run_list_test "x86-64-size-inval-1" "-al"
+
+	run_dump_test "x86-64-relax-2"
     }
 
     set ASFLAGS "$old_ASFLAGS"
diff --git a/gas/testsuite/gas/i386/relax-3.d b/gas/testsuite/gas/i386/relax-3.d
new file mode 100644
index 0000000..8aa94e9
--- /dev/null
+++ b/gas/testsuite/gas/i386/relax-3.d
@@ -0,0 +1,30 @@
+#objdump: -dwr
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+[ 	]*[a-f0-9]+:	eb 1f                	jmp    21 <local>
+[ 	]*[a-f0-9]+:	eb 19                	jmp    1d <hidden_def>
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    5 <foo\+0x5>	5: (R_386_PC)?(DISP)?32	global_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    a <foo\+0xa>	a: (R_386_PC)?(DISP)?32	weak_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    f <foo\+0xf>	f: (R_386_PC)?(DISP)?32	weak_hidden_undef
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    14 <foo\+0x14>	14: (R_386_PC)?(DISP)?32	weak_hidden_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    19 <foo\+0x19>	19: (R_386_PC)?(DISP)?32	hidden_undef
+
+0+1d <hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+1e <weak_hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+1f <global_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+20 <weak_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+21 <local>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+#pass
diff --git a/gas/testsuite/gas/i386/relax-3.s b/gas/testsuite/gas/i386/relax-3.s
new file mode 100644
index 0000000..ab52185
--- /dev/null
+++ b/gas/testsuite/gas/i386/relax-3.s
@@ -0,0 +1,39 @@
+	.text
+	.global foo
+foo:
+	jmp local
+	jmp hidden_def
+	jmp global_def
+	jmp weak_def
+	jmp weak_hidden_undef
+	jmp weak_hidden_def
+	jmp hidden_undef
+
+	.hidden hidden_undef
+
+	.global hidden_def
+	.hidden hidden_def
+hidden_def:
+	ret
+
+	.global weak_hidden_def
+	.hidden weak_hidden_def
+	.weak weak_hidden_def
+weak_hidden_def:
+	ret
+
+	.global global_def
+global_def:
+	ret
+
+	.global weak_def
+	.weak weak_def
+weak_def:
+	ret
+
+local:
+	ret
+
+	.global weak_hidden_undef
+	.weak weak_hidden_undef
+	.hidden weak_hidden_undef
diff --git a/gas/testsuite/gas/i386/x86-64-relax-2.d b/gas/testsuite/gas/i386/x86-64-relax-2.d
new file mode 100644
index 0000000..7b0bd56
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-relax-2.d
@@ -0,0 +1,32 @@
+#source: relax-3.s
+#objdump: -dwr
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <foo>:
+[ 	]*[a-f0-9]+:	eb 1f                	jmp    21 <local>
+[ 	]*[a-f0-9]+:	eb 19                	jmp    1d <hidden_def>
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   9 <foo\+0x9>	5: R_X86_64_PC32	global_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   e <foo\+0xe>	a: R_X86_64_PC32	weak_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   13 <foo\+0x13>	f: R_X86_64_PC32	weak_hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   18 <foo\+0x18>	14: R_X86_64_PC32	weak_hidden_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1d <hidden_def>	19: R_X86_64_PC32	hidden_undef-0x4
+
+0+1d <hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+1e <weak_hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+1f <global_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+20 <weak_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+21 <local>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+#pass
-- 
1.9.3


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-07 16:21   ` H.J. Lu
@ 2015-05-08  3:23     ` Andy Lutomirski
  2015-05-08 12:09       ` H.J. Lu
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2015-05-08  3:23 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jan Beulich, Binutils, linux-kernel

On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>> AFAICT gas will produce relocations for jumps to global labels in the
>>> same file.  This doesn't seem directly harmful to me, except that, on
>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>
>>> This seems especially unfortunate, since even hidden and protected
>>> symbols have this problem.
>>>
>>> Given that many users don't want interposition support (especially the
>>> kernel and anyone using .hidden or .protected), it would be nice to
>>> have a command-line option to turn this off and probably also to turn
>>> it off by default for hidden and protected symbols.  Can gas do this?
>>
>> I've been running with the below changes (taken off of a bigger set
>> of changes, so the line numbers may look a little odd) for the last
>> couple of years. I never tried to submit this change because so far
>> I couldn't find the time to check whether this would have any
>> unwanted side effects on cases I don't normally use.
>>
>
> This is the patch I checked in.
>
> Thanks.
>
> --
> H.J.
> ---
> Branches to global non-weak symbols defined in the same segment with
> non-default visibility can be optimized the same way as branches to
> local symbols.

Would it make sense to also add a command line option along the lines
of gcc's -fno-semantic-interposition or some way to override the
default visibility?  AFAICS this patch helps but only if asm code gets
liberally sprinkled with .hidden or .protected directives.

--Andy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-08  3:23     ` Andy Lutomirski
@ 2015-05-08 12:09       ` H.J. Lu
  2015-05-08 20:16         ` H.J. Lu
  0 siblings, 1 reply; 21+ messages in thread
From: H.J. Lu @ 2015-05-08 12:09 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Jan Beulich, Binutils, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2558 bytes --]

On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>>> AFAICT gas will produce relocations for jumps to global labels in the
>>>> same file.  This doesn't seem directly harmful to me, except that, on
>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>
>>>> This seems especially unfortunate, since even hidden and protected
>>>> symbols have this problem.
>>>>
>>>> Given that many users don't want interposition support (especially the
>>>> kernel and anyone using .hidden or .protected), it would be nice to
>>>> have a command-line option to turn this off and probably also to turn
>>>> it off by default for hidden and protected symbols.  Can gas do this?
>>>
>>> I've been running with the below changes (taken off of a bigger set
>>> of changes, so the line numbers may look a little odd) for the last
>>> couple of years. I never tried to submit this change because so far
>>> I couldn't find the time to check whether this would have any
>>> unwanted side effects on cases I don't normally use.
>>>
>>
>> This is the patch I checked in.
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> Branches to global non-weak symbols defined in the same segment with
>> non-default visibility can be optimized the same way as branches to
>> local symbols.
>
> Would it make sense to also add a command line option along the lines
> of gcc's -fno-semantic-interposition or some way to override the
> default visibility?  AFAICS this patch helps but only if asm code gets
> liberally sprinkled with .hidden or .protected directives.
>

This is what I checked in.  With

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 2fda005..186e6f7 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -107,6 +107,10 @@ else
         KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
 endif

+NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
+KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
+KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
+
 # Make sure compiler does not have buggy stack-protector support.
 ifdef CONFIG_CC_STACKPROTECTOR
   cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh

On kernel master branch, I got

   text   data    bss    dec    hex filename
10934167 2275232 1609728 14819127 e21f37 vmlinux.old
10934119 2275232 1609728 14819079 e21f07 vmlinux

It saves 48 bytes.

-- 
H.J.

[-- Attachment #2: 0001-Add-mno-shared-to-x86-assembler.patch --]
[-- Type: text/x-patch, Size: 9130 bytes --]

From 7ae909be2dce6c47045e66fe94bd1a8261db1761 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 8 May 2015 05:04:12 -0700
Subject: [PATCH] Add -mno-shared to x86 assembler

On ELF target, the assembler normally generates code which can go into a
shared library where non-weak symbols can be preempted.  The -mno-shared
option tells the assembler to generate code not for a shared library,
where non-weak symbols won't be preempted.  The resulting code is slightly
smaller.  This option mainly affects the handling of branch instructions.

gas/

	* config/tc-i386.c (no_shared): New.
	(OPTION_MNO_SHARED): Likewise.
	(elf_symbol_resolved_in_segment_p): Check no_shared.
	(md_longopts): Add mno-shared.
	(md_parse_option): Handle OPTION_MNO_SHARED.
	(md_show_usage): Add -mno-shared.
	* doc/c-i386.texi: Document -mno-shared.

gas/testsuite/

	* gas/i386/i386.exp: Run relax-4 and x86-64-relax-3.
	* gas/i386/relax-4.d: New file.
	* gas/i386/x86-64-relax-3.d: Likewise.
---
 gas/ChangeLog                           | 10 ++++++++++
 gas/config/tc-i386.c                    | 17 +++++++++++++++++
 gas/doc/c-i386.texi                     | 10 ++++++++++
 gas/testsuite/ChangeLog                 |  6 ++++++
 gas/testsuite/gas/i386/i386.exp         |  2 ++
 gas/testsuite/gas/i386/relax-4.d        | 32 ++++++++++++++++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-relax-3.d | 33 +++++++++++++++++++++++++++++++++
 7 files changed, 110 insertions(+)
 create mode 100644 gas/testsuite/gas/i386/relax-4.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-relax-3.d

diff --git a/gas/ChangeLog b/gas/ChangeLog
index 9758e72..9bf6931 100644
--- a/gas/ChangeLog
+++ b/gas/ChangeLog
@@ -1,3 +1,13 @@
+2015-05-08  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* config/tc-i386.c (no_shared): New.
+	(OPTION_MNO_SHARED): Likewise.
+	(elf_symbol_resolved_in_segment_p): Check no_shared.
+	(md_longopts): Add mno-shared.
+	(md_parse_option): Handle OPTION_MNO_SHARED.
+	(md_show_usage): Add -mno-shared.
+	* doc/c-i386.texi: Document -mno-shared.
+
 2015-05-07  H.J. Lu  <hongjiu.lu@intel.com>
 
 	* config/tc-i386.c (elf_symbol_resolved_in_segment_p): New.
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index c4ba13d..bbc0969 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -524,6 +524,11 @@ static enum x86_elf_abi x86_elf_abi = I386_ABI;
 static int use_big_obj = 0;
 #endif
 
+#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
+/* 1 if not generating code for a shared library.  */
+static int no_shared = 0;
+#endif
+
 /* 1 for intel syntax,
    0 if att syntax.  */
 static int intel_syntax = 0;
@@ -8785,6 +8790,10 @@ elf_symbol_resolved_in_segment_p (symbolS *fr_symbol)
     /* Symbol may be weak or local.  */
     return !S_IS_WEAK (fr_symbol);
 
+  /* Non-weak symbols won't be preempted.  */
+  if (no_shared)
+    return 1;
+
   /* Global symbols with default visibility in a shared library may be
      preempted by another definition.  */
   return ELF_ST_VISIBILITY (S_GET_OTHER (fr_symbol)) != STV_DEFAULT;
@@ -9484,6 +9493,7 @@ const char *md_shortopts = "qn";
 #define OPTION_MBIG_OBJ (OPTION_MD_BASE + 18)
 #define OPTION_OMIT_LOCK_PREFIX (OPTION_MD_BASE + 19)
 #define OPTION_MEVEXRCIG (OPTION_MD_BASE + 20)
+#define OPTION_MNO_SHARED (OPTION_MD_BASE + 21)
 
 struct option md_longopts[] =
 {
@@ -9494,6 +9504,7 @@ struct option md_longopts[] =
 #endif
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
   {"x32", no_argument, NULL, OPTION_X32},
+  {"mno-shared", no_argument, NULL, OPTION_MNO_SHARED},
 #endif
   {"divide", no_argument, NULL, OPTION_DIVIDE},
   {"march", required_argument, NULL, OPTION_MARCH},
@@ -9554,6 +9565,10 @@ md_parse_option (int c, char *arg)
       /* -s: On i386 Solaris, this tells the native assembler to use
 	 .stab instead of .stab.excl.  We always use .stab anyhow.  */
       break;
+
+    case OPTION_MNO_SHARED:
+      no_shared = 1;
+      break;
 #endif
 #if (defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF) \
      || defined (TE_PE) || defined (TE_PEP) || defined (OBJ_MACH_O))
@@ -9980,6 +9995,8 @@ md_show_usage (FILE *stream)
   -mold-gcc               support old (<= 2.8.1) versions of gcc\n"));
   fprintf (stream, _("\
   -madd-bnd-prefix        add BND prefix for all valid branches\n"));
+  fprintf (stream, _("\
+  -mno-shared             enable branch optimization for non shared code\n"));
 # if defined (TE_PE) || defined (TE_PEP)
   fprintf (stream, _("\
   -mbig-obj               generate big object files\n"));
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 7f0e79f..eb6790c 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -297,6 +297,16 @@ The @code{.att_syntax} and @code{.intel_syntax} directives will take precedent.
 This option forces the assembler to add BND prefix to all branches, even
 if such prefix was not explicitly specified in the source code.
 
+@cindex @samp{-mno-shared} option, i386
+@cindex @samp{-mno-shared} option, x86-64
+@item -mno-shared
+On ELF target, the assembler normally generates code which can go into a
+shared library where non-weak symbols can be preempted.  The
+@samp{-mno-shared} option tells the assembler to generate code not for
+a shared library, where non-weak symbols won't be preempted.  The
+resulting code is slightly smaller.  This option mainly affects the
+handling of branch instructions.
+
 @cindex @samp{-mbig-obj} option, x86-64
 @item -mbig-obj
 On x86-64 PE/COFF target this option forces the use of big object file
diff --git a/gas/testsuite/ChangeLog b/gas/testsuite/ChangeLog
index 1cfa577..d63f48a 100644
--- a/gas/testsuite/ChangeLog
+++ b/gas/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2015-05-08  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* gas/i386/i386.exp: Run relax-4 and x86-64-relax-3.
+	* gas/i386/relax-4.d: New file.
+	* gas/i386/x86-64-relax-3.d: Likewise.
+
 2015-05-07  H.J. Lu  <hongjiu.lu@intel.com>
 
 	* gas/i386/i386.exp: Run relax-3 and x86-64-relax-2.
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index af56c26..bedd84c 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -396,6 +396,7 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
 	run_dump_test "note"
 
 	run_dump_test "relax-3"
+	run_dump_test "relax-4"
     }
 
     # This is a PE specific test.
@@ -752,6 +753,7 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
 	run_list_test "x86-64-size-inval-1" "-al"
 
 	run_dump_test "x86-64-relax-2"
+	run_dump_test "x86-64-relax-3"
     }
 
     set ASFLAGS "$old_ASFLAGS"
diff --git a/gas/testsuite/gas/i386/relax-4.d b/gas/testsuite/gas/i386/relax-4.d
new file mode 100644
index 0000000..b188841
--- /dev/null
+++ b/gas/testsuite/gas/i386/relax-4.d
@@ -0,0 +1,32 @@
+#source: relax-3.s
+#as: -mno-shared
+#objdump: -dwr
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+[ 	]*[a-f0-9]+:	eb 1c                	jmp    1e <local>
+[ 	]*[a-f0-9]+:	eb 16                	jmp    1a <hidden_def>
+[ 	]*[a-f0-9]+:	eb 16                	jmp    1c <global_def>
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    7 <foo\+0x7>	7: (R_386_PC)?(DISP)?32	weak_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    c <foo\+0xc>	c: (R_386_PC)?(DISP)?32	weak_hidden_undef
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    11 <foo\+0x11>	11: (R_386_PC)?(DISP)?32	weak_hidden_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    16 <foo\+0x16>	16: (R_386_PC)?(DISP)?32	hidden_undef
+
+0+1a <hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+1b <weak_hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+1c <global_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+1d <weak_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+1e <local>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-relax-3.d b/gas/testsuite/gas/i386/x86-64-relax-3.d
new file mode 100644
index 0000000..d0c7ee4
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-relax-3.d
@@ -0,0 +1,33 @@
+#source: relax-3.s
+#as: -mno-shared
+#objdump: -dwr
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <foo>:
+[ 	]*[a-f0-9]+:	eb 1c                	jmp    1e <local>
+[ 	]*[a-f0-9]+:	eb 16                	jmp    1a <hidden_def>
+[ 	]*[a-f0-9]+:	eb 16                	jmp    1c <global_def>
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   b <foo\+0xb>	7: R_X86_64_PC32	weak_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   10 <foo\+0x10>	c: R_X86_64_PC32	weak_hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   15 <foo\+0x15>	11: R_X86_64_PC32	weak_hidden_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1a <hidden_def>	16: R_X86_64_PC32	hidden_undef-0x4
+
+0+1a <hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+1b <weak_hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+1c <global_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+1d <weak_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+1e <local>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+#pass
-- 
2.1.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-08 12:09       ` H.J. Lu
@ 2015-05-08 20:16         ` H.J. Lu
  2015-05-18 19:37           ` Andy Lutomirski
  0 siblings, 1 reply; 21+ messages in thread
From: H.J. Lu @ 2015-05-08 20:16 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Jan Beulich, Binutils, linux-kernel

On Fri, May 8, 2015 at 5:09 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>>>> AFAICT gas will produce relocations for jumps to global labels in the
>>>>> same file.  This doesn't seem directly harmful to me, except that, on
>>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>>
>>>>> This seems especially unfortunate, since even hidden and protected
>>>>> symbols have this problem.
>>>>>
>>>>> Given that many users don't want interposition support (especially the
>>>>> kernel and anyone using .hidden or .protected), it would be nice to
>>>>> have a command-line option to turn this off and probably also to turn
>>>>> it off by default for hidden and protected symbols.  Can gas do this?
>>>>
>>>> I've been running with the below changes (taken off of a bigger set
>>>> of changes, so the line numbers may look a little odd) for the last
>>>> couple of years. I never tried to submit this change because so far
>>>> I couldn't find the time to check whether this would have any
>>>> unwanted side effects on cases I don't normally use.
>>>>
>>>
>>> This is the patch I checked in.
>>>
>>> Thanks.
>>>
>>> --
>>> H.J.
>>> ---
>>> Branches to global non-weak symbols defined in the same segment with
>>> non-default visibility can be optimized the same way as branches to
>>> local symbols.
>>
>> Would it make sense to also add a command line option along the lines
>> of gcc's -fno-semantic-interposition or some way to override the
>> default visibility?  AFAICS this patch helps but only if asm code gets
>> liberally sprinkled with .hidden or .protected directives.
>>
>
> This is what I checked in.  With
>
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 2fda005..186e6f7 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -107,6 +107,10 @@ else
>          KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
>  endif
>
> +NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
> +KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
> +KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
> +
>  # Make sure compiler does not have buggy stack-protector support.
>  ifdef CONFIG_CC_STACKPROTECTOR
>    cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>
> On kernel master branch, I got
>
>    text   data    bss    dec    hex filename
> 10934167 2275232 1609728 14819127 e21f37 vmlinux.old
> 10934119 2275232 1609728 14819079 e21f07 vmlinux
>
> It saves 48 bytes.

This is before I fixed:

/* This is global to keep gas from relaxing the jumps */
ENTRY(early_idt_handler)
        cld

in arch/x86/kernel/head_64.S.  With -mno-shared, we must
make early_idt_handler weak to keep gas from relaxing the jumps.


-- 
H.J.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-08 20:16         ` H.J. Lu
@ 2015-05-18 19:37           ` Andy Lutomirski
  2015-05-18 20:02             ` H.J. Lu
  2015-05-18 20:07             ` H. Peter Anvin
  0 siblings, 2 replies; 21+ messages in thread
From: Andy Lutomirski @ 2015-05-18 19:37 UTC (permalink / raw)
  To: H.J. Lu, Borislav Petkov, H. Peter Anvin
  Cc: Jan Beulich, Binutils, linux-kernel

On Fri, May 8, 2015 at 1:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, May 8, 2015 at 5:09 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>>>>> AFAICT gas will produce relocations for jumps to global labels in the
>>>>>> same file.  This doesn't seem directly harmful to me, except that, on
>>>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>>>
>>>>>> This seems especially unfortunate, since even hidden and protected
>>>>>> symbols have this problem.
>>>>>>
>>>>>> Given that many users don't want interposition support (especially the
>>>>>> kernel and anyone using .hidden or .protected), it would be nice to
>>>>>> have a command-line option to turn this off and probably also to turn
>>>>>> it off by default for hidden and protected symbols.  Can gas do this?
>>>>>
>>>>> I've been running with the below changes (taken off of a bigger set
>>>>> of changes, so the line numbers may look a little odd) for the last
>>>>> couple of years. I never tried to submit this change because so far
>>>>> I couldn't find the time to check whether this would have any
>>>>> unwanted side effects on cases I don't normally use.
>>>>>
>>>>
>>>> This is the patch I checked in.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> H.J.
>>>> ---
>>>> Branches to global non-weak symbols defined in the same segment with
>>>> non-default visibility can be optimized the same way as branches to
>>>> local symbols.
>>>
>>> Would it make sense to also add a command line option along the lines
>>> of gcc's -fno-semantic-interposition or some way to override the
>>> default visibility?  AFAICS this patch helps but only if asm code gets
>>> liberally sprinkled with .hidden or .protected directives.
>>>
>>
>> This is what I checked in.  With
>>
>> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>> index 2fda005..186e6f7 100644
>> --- a/arch/x86/Makefile
>> +++ b/arch/x86/Makefile
>> @@ -107,6 +107,10 @@ else
>>          KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
>>  endif
>>
>> +NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
>> +KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
>> +KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
>> +
>>  # Make sure compiler does not have buggy stack-protector support.
>>  ifdef CONFIG_CC_STACKPROTECTOR
>>    cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>>
>> On kernel master branch, I got
>>
>>    text   data    bss    dec    hex filename
>> 10934167 2275232 1609728 14819127 e21f37 vmlinux.old
>> 10934119 2275232 1609728 14819079 e21f07 vmlinux
>>
>> It saves 48 bytes.
>
> This is before I fixed:
>
> /* This is global to keep gas from relaxing the jumps */
> ENTRY(early_idt_handler)
>         cld
>
> in arch/x86/kernel/head_64.S.  With -mno-shared, we must
> make early_idt_handler weak to keep gas from relaxing the jumps.
>

I wonder if it would make sense to have explicit mnemonics for the
one-byte offset and four-byte offset jump variants.  Sometimes users
want a jump with a 32-bit offset for reasons that have nothing to do
with link-time or load-time relocations.

--Andy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 19:37           ` Andy Lutomirski
@ 2015-05-18 20:02             ` H.J. Lu
  2015-05-18 20:06               ` H. Peter Anvin
  2015-05-18 20:07             ` H. Peter Anvin
  1 sibling, 1 reply; 21+ messages in thread
From: H.J. Lu @ 2015-05-18 20:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, H. Peter Anvin, Jan Beulich, Binutils, linux-kernel

On Mon, May 18, 2015 at 12:36 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Fri, May 8, 2015 at 1:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Fri, May 8, 2015 at 5:09 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>>>>>> AFAICT gas will produce relocations for jumps to global labels in the
>>>>>>> same file.  This doesn't seem directly harmful to me, except that, on
>>>>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>>>>
>>>>>>> This seems especially unfortunate, since even hidden and protected
>>>>>>> symbols have this problem.
>>>>>>>
>>>>>>> Given that many users don't want interposition support (especially the
>>>>>>> kernel and anyone using .hidden or .protected), it would be nice to
>>>>>>> have a command-line option to turn this off and probably also to turn
>>>>>>> it off by default for hidden and protected symbols.  Can gas do this?
>>>>>>
>>>>>> I've been running with the below changes (taken off of a bigger set
>>>>>> of changes, so the line numbers may look a little odd) for the last
>>>>>> couple of years. I never tried to submit this change because so far
>>>>>> I couldn't find the time to check whether this would have any
>>>>>> unwanted side effects on cases I don't normally use.
>>>>>>
>>>>>
>>>>> This is the patch I checked in.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> H.J.
>>>>> ---
>>>>> Branches to global non-weak symbols defined in the same segment with
>>>>> non-default visibility can be optimized the same way as branches to
>>>>> local symbols.
>>>>
>>>> Would it make sense to also add a command line option along the lines
>>>> of gcc's -fno-semantic-interposition or some way to override the
>>>> default visibility?  AFAICS this patch helps but only if asm code gets
>>>> liberally sprinkled with .hidden or .protected directives.
>>>>
>>>
>>> This is what I checked in.  With
>>>
>>> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>>> index 2fda005..186e6f7 100644
>>> --- a/arch/x86/Makefile
>>> +++ b/arch/x86/Makefile
>>> @@ -107,6 +107,10 @@ else
>>>          KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
>>>  endif
>>>
>>> +NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
>>> +KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
>>> +KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
>>> +
>>>  # Make sure compiler does not have buggy stack-protector support.
>>>  ifdef CONFIG_CC_STACKPROTECTOR
>>>    cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>>>
>>> On kernel master branch, I got
>>>
>>>    text   data    bss    dec    hex filename
>>> 10934167 2275232 1609728 14819127 e21f37 vmlinux.old
>>> 10934119 2275232 1609728 14819079 e21f07 vmlinux
>>>
>>> It saves 48 bytes.
>>
>> This is before I fixed:
>>
>> /* This is global to keep gas from relaxing the jumps */
>> ENTRY(early_idt_handler)
>>         cld
>>
>> in arch/x86/kernel/head_64.S.  With -mno-shared, we must
>> make early_idt_handler weak to keep gas from relaxing the jumps.
>>
>
> I wonder if it would make sense to have explicit mnemonics for the
> one-byte offset and four-byte offset jump variants.  Sometimes users
> want a jump with a 32-bit offset for reasons that have nothing to do
> with link-time or load-time relocations.
>

There is:

jmp.d32 foo


-- 
H.J.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 20:02             ` H.J. Lu
@ 2015-05-18 20:06               ` H. Peter Anvin
  2015-05-18 20:25                 ` H.J. Lu
  0 siblings, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-18 20:06 UTC (permalink / raw)
  To: H.J. Lu, Andy Lutomirski
  Cc: Borislav Petkov, Jan Beulich, Binutils, linux-kernel

On 05/18/2015 01:02 PM, H.J. Lu wrote:
>>
>> I wonder if it would make sense to have explicit mnemonics for the
>> one-byte offset and four-byte offset jump variants.  Sometimes users
>> want a jump with a 32-bit offset for reasons that have nothing to do
>> with link-time or load-time relocations.
>>
> 
> There is:
> 
> jmp.d32 foo
> 

How far back does that syntax work?

	-hpa


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 19:37           ` Andy Lutomirski
  2015-05-18 20:02             ` H.J. Lu
@ 2015-05-18 20:07             ` H. Peter Anvin
  2015-05-18 20:11               ` Borislav Petkov
  1 sibling, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-18 20:07 UTC (permalink / raw)
  To: Andy Lutomirski, H.J. Lu, Borislav Petkov
  Cc: Jan Beulich, Binutils, linux-kernel

We can also just do a macro:

.byte 0xe9 ; .long \target - 4 - .

On May 18, 2015 12:36:44 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Fri, May 8, 2015 at 1:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Fri, May 8, 2015 at 5:09 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski
><luto@amacapital.net> wrote:
>>>> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com>
>wrote:
>>>>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com>
>wrote:
>>>>>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>>>>>> AFAICT gas will produce relocations for jumps to global labels
>in the
>>>>>>> same file.  This doesn't seem directly harmful to me, except
>that, on
>>>>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>>>>
>>>>>>> This seems especially unfortunate, since even hidden and
>protected
>>>>>>> symbols have this problem.
>>>>>>>
>>>>>>> Given that many users don't want interposition support
>(especially the
>>>>>>> kernel and anyone using .hidden or .protected), it would be nice
>to
>>>>>>> have a command-line option to turn this off and probably also to
>turn
>>>>>>> it off by default for hidden and protected symbols.  Can gas do
>this?
>>>>>>
>>>>>> I've been running with the below changes (taken off of a bigger
>set
>>>>>> of changes, so the line numbers may look a little odd) for the
>last
>>>>>> couple of years. I never tried to submit this change because so
>far
>>>>>> I couldn't find the time to check whether this would have any
>>>>>> unwanted side effects on cases I don't normally use.
>>>>>>
>>>>>
>>>>> This is the patch I checked in.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> H.J.
>>>>> ---
>>>>> Branches to global non-weak symbols defined in the same segment
>with
>>>>> non-default visibility can be optimized the same way as branches
>to
>>>>> local symbols.
>>>>
>>>> Would it make sense to also add a command line option along the
>lines
>>>> of gcc's -fno-semantic-interposition or some way to override the
>>>> default visibility?  AFAICS this patch helps but only if asm code
>gets
>>>> liberally sprinkled with .hidden or .protected directives.
>>>>
>>>
>>> This is what I checked in.  With
>>>
>>> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>>> index 2fda005..186e6f7 100644
>>> --- a/arch/x86/Makefile
>>> +++ b/arch/x86/Makefile
>>> @@ -107,6 +107,10 @@ else
>>>          KBUILD_CFLAGS += $(call
>cc-option,-maccumulate-outgoing-args)
>>>  endif
>>>
>>> +NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
>>> +KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
>>> +KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
>>> +
>>>  # Make sure compiler does not have buggy stack-protector support.
>>>  ifdef CONFIG_CC_STACKPROTECTOR
>>>    cc_has_sp :=
>$(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>>>
>>> On kernel master branch, I got
>>>
>>>    text   data    bss    dec    hex filename
>>> 10934167 2275232 1609728 14819127 e21f37 vmlinux.old
>>> 10934119 2275232 1609728 14819079 e21f07 vmlinux
>>>
>>> It saves 48 bytes.
>>
>> This is before I fixed:
>>
>> /* This is global to keep gas from relaxing the jumps */
>> ENTRY(early_idt_handler)
>>         cld
>>
>> in arch/x86/kernel/head_64.S.  With -mno-shared, we must
>> make early_idt_handler weak to keep gas from relaxing the jumps.
>>
>
>I wonder if it would make sense to have explicit mnemonics for the
>one-byte offset and four-byte offset jump variants.  Sometimes users
>want a jump with a 32-bit offset for reasons that have nothing to do
>with link-time or load-time relocations.
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 20:07             ` H. Peter Anvin
@ 2015-05-18 20:11               ` Borislav Petkov
  0 siblings, 0 replies; 21+ messages in thread
From: Borislav Petkov @ 2015-05-18 20:11 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andy Lutomirski, H.J. Lu, Jan Beulich, Binutils, linux-kernel

On Mon, May 18, 2015 at 01:01:15PM -0700, H. Peter Anvin wrote:
> We can also just do a macro:
> 
> .byte 0xe9 ; .long \target - 4 - .

We used to do that fun in some alternatives which got removed
recently... (by me). :-)

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 20:06               ` H. Peter Anvin
@ 2015-05-18 20:25                 ` H.J. Lu
  2015-05-18 20:28                   ` H. Peter Anvin
  0 siblings, 1 reply; 21+ messages in thread
From: H.J. Lu @ 2015-05-18 20:25 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andy Lutomirski, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

On Mon, May 18, 2015 at 1:06 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 05/18/2015 01:02 PM, H.J. Lu wrote:
>>>
>>> I wonder if it would make sense to have explicit mnemonics for the
>>> one-byte offset and four-byte offset jump variants.  Sometimes users
>>> want a jump with a 32-bit offset for reasons that have nothing to do
>>> with link-time or load-time relocations.
>>>
>>
>> There is:
>>
>> jmp.d32 foo
>>
>
> How far back does that syntax work?
>

.d32 support was added by

commit f8a5c266971d7b5b96f973805551c6e88669cada
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Oct 14 13:31:13 2010 +0000

    Add .d32 encoding suffix.

and .d8 supported was added by

commit a501d77eeba717f6d54dce44f286f9e3aad83144
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Jan 20 20:53:50 2012 +0000

    Add .d8 suffix support to x86 assembler

I don't believe .d16 is supported.

-- 
H.J.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 20:25                 ` H.J. Lu
@ 2015-05-18 20:28                   ` H. Peter Anvin
  2015-05-18 20:34                     ` H. Peter Anvin
  0 siblings, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-18 20:28 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Andy Lutomirski, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1151 bytes --]

On 05/18/2015 01:25 PM, H.J. Lu wrote:
> On Mon, May 18, 2015 at 1:06 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 05/18/2015 01:02 PM, H.J. Lu wrote:
>>>>
>>>> I wonder if it would make sense to have explicit mnemonics for the
>>>> one-byte offset and four-byte offset jump variants.  Sometimes users
>>>> want a jump with a 32-bit offset for reasons that have nothing to do
>>>> with link-time or load-time relocations.
>>>>
>>>
>>> There is:
>>>
>>> jmp.d32 foo
>>>
>>
>> How far back does that syntax work?
>>
> 
> .d32 support was added by
> 
> commit f8a5c266971d7b5b96f973805551c6e88669cada
> Author: H.J. Lu <hjl.tools@gmail.com>
> Date:   Thu Oct 14 13:31:13 2010 +0000
> 
>     Add .d32 encoding suffix.
> 
> and .d8 supported was added by
> 
> commit a501d77eeba717f6d54dce44f286f9e3aad83144
> Author: H.J. Lu <hjl.tools@gmail.com>
> Date:   Fri Jan 20 20:53:50 2012 +0000
> 
>     Add .d8 suffix support to x86 assembler
> 

OK, that is probably too recent.  The simplest answer I think is just to
.balign 16 each vector.  This is init space... some extra padding really
doesn't matter.

Patch attached (still in compile test).

	-hpa


[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 2264 bytes --]

diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
index 5a9856e..f11621f 100644
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -233,7 +233,7 @@
 #ifdef __KERNEL__
 #ifndef __ASSEMBLY__
 
-extern const char early_idt_handlers[NUM_EXCEPTION_VECTORS][2+2+5];
+extern const char early_idt_handlers[NUM_EXCEPTION_VECTORS][16];
 #ifdef CONFIG_TRACING
 # define trace_early_idt_handlers early_idt_handlers
 #endif
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 02d2572..1c18826 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -524,6 +524,7 @@ setup_once:
 	andl $0,setup_once_ref	/* Once is enough, thanks */
 	ret
 
+	.balign 16
 ENTRY(early_idt_handlers)
 	# 36(%esp) %eflags
 	# 32(%esp) %cs
@@ -531,9 +532,8 @@ ENTRY(early_idt_handlers)
 	# 24(%rsp) error code
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
-	.if (EXCEPTION_ERRCODE_MASK >> i) & 1
-	ASM_NOP2
-	.else
+	.balign 16
+	.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
 	pushl $0		# Dummy error code, to make stack frame uniform
 	.endif
 	pushl $i		# 20(%esp) Vector number
@@ -542,8 +542,7 @@ ENTRY(early_idt_handlers)
 	.endr
 ENDPROC(early_idt_handlers)
 	
-	/* This is global to keep gas from relaxing the jumps */
-ENTRY(early_idt_handler)
+early_idt_handler:
 	cld
 
 	cmpl $2,(%esp)		# X86_TRAP_NMI
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 43eafc8..2d80a09 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -321,7 +321,8 @@ bad_address:
 	jmp bad_address
 
 	__INIT
-	.globl early_idt_handlers
+	.balign 16
+ENTRY(early_idt_handlers)
 early_idt_handlers:
 	# 104(%rsp) %rflags
 	#  96(%rsp) %cs
@@ -329,18 +330,17 @@ early_idt_handlers:
 	#  80(%rsp) error code
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
-	.if (EXCEPTION_ERRCODE_MASK >> i) & 1
-	ASM_NOP2
-	.else
+	.balign 16
+	.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
 	pushq $0		# Dummy error code, to make stack frame uniform
 	.endif
 	pushq $i		# 72(%rsp) Vector number
 	jmp early_idt_handler
 	i = i + 1
 	.endr
+ENDPROC(early_idt_handlers)
 
-/* This is global to keep gas from relaxing the jumps */
-ENTRY(early_idt_handler)
+early_idt_handler:
 	cld
 
 	cmpl $2,(%rsp)		# X86_TRAP_NMI

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 20:28                   ` H. Peter Anvin
@ 2015-05-18 20:34                     ` H. Peter Anvin
  2015-05-20 20:54                       ` Andy Lutomirski
  0 siblings, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-18 20:34 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Andy Lutomirski, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 286 bytes --]

On 05/18/2015 01:28 PM, H. Peter Anvin wrote:
> 
> OK, that is probably too recent.  The simplest answer I think is just to
> .balign 16 each vector.  This is init space... some extra padding really
> doesn't matter.
> 
> Patch attached (still in compile test).
> 

Corrected.

	-hpa



[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 2456 bytes --]

diff --git a/arch/x86/include/asm/segment.h b/arch/x86/include/asm/segment.h
index 5a9856e..f11621f 100644
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -233,7 +233,7 @@
 #ifdef __KERNEL__
 #ifndef __ASSEMBLY__
 
-extern const char early_idt_handlers[NUM_EXCEPTION_VECTORS][2+2+5];
+extern const char early_idt_handlers[NUM_EXCEPTION_VECTORS][16];
 #ifdef CONFIG_TRACING
 # define trace_early_idt_handlers early_idt_handlers
 #endif
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 02d2572..e5afae9 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -492,7 +492,7 @@ setup_once:
 	movl %eax,4(%edi)
 	/* interrupt gate, dpl=0, present */
 	movl $(0x8E000000 + __KERNEL_CS),2(%edi)
-	addl $9,%eax
+	addl $16,%eax
 	addl $8,%edi
 	loop 1b
 
@@ -524,6 +524,7 @@ setup_once:
 	andl $0,setup_once_ref	/* Once is enough, thanks */
 	ret
 
+	.balign 16
 ENTRY(early_idt_handlers)
 	# 36(%esp) %eflags
 	# 32(%esp) %cs
@@ -531,9 +532,8 @@ ENTRY(early_idt_handlers)
 	# 24(%rsp) error code
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
-	.if (EXCEPTION_ERRCODE_MASK >> i) & 1
-	ASM_NOP2
-	.else
+	.balign 16
+	.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
 	pushl $0		# Dummy error code, to make stack frame uniform
 	.endif
 	pushl $i		# 20(%esp) Vector number
@@ -542,8 +542,7 @@ ENTRY(early_idt_handlers)
 	.endr
 ENDPROC(early_idt_handlers)
 	
-	/* This is global to keep gas from relaxing the jumps */
-ENTRY(early_idt_handler)
+early_idt_handler:
 	cld
 
 	cmpl $2,(%esp)		# X86_TRAP_NMI
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 43eafc8..2d80a09 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -321,7 +321,8 @@ bad_address:
 	jmp bad_address
 
 	__INIT
-	.globl early_idt_handlers
+	.balign 16
+ENTRY(early_idt_handlers)
 early_idt_handlers:
 	# 104(%rsp) %rflags
 	#  96(%rsp) %cs
@@ -329,18 +330,17 @@ early_idt_handlers:
 	#  80(%rsp) error code
 	i = 0
 	.rept NUM_EXCEPTION_VECTORS
-	.if (EXCEPTION_ERRCODE_MASK >> i) & 1
-	ASM_NOP2
-	.else
+	.balign 16
+	.if ((EXCEPTION_ERRCODE_MASK >> i) & 1) == 0
 	pushq $0		# Dummy error code, to make stack frame uniform
 	.endif
 	pushq $i		# 72(%rsp) Vector number
 	jmp early_idt_handler
 	i = i + 1
 	.endr
+ENDPROC(early_idt_handlers)
 
-/* This is global to keep gas from relaxing the jumps */
-ENTRY(early_idt_handler)
+early_idt_handler:
 	cld
 
 	cmpl $2,(%rsp)		# X86_TRAP_NMI

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-18 20:34                     ` H. Peter Anvin
@ 2015-05-20 20:54                       ` Andy Lutomirski
  2015-05-20 21:00                         ` H. Peter Anvin
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2015-05-20 20:54 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: H.J. Lu, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

On Mon, May 18, 2015 at 1:34 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 05/18/2015 01:28 PM, H. Peter Anvin wrote:
>>
>> OK, that is probably too recent.  The simplest answer I think is just to
>> .balign 16 each vector.  This is init space... some extra padding really
>> doesn't matter.
>>
>> Patch attached (still in compile test).
>>
>
> Corrected.

Egads.  Now I understand what that code is.  I don't like the balign,
since this has nothing to do with alignment -- we're creating an array
of functions.

Can't we make it explicit?

#define EARLY_IDT_HANDLER_STRIDE 9

...

    .rept NUM_EXCEPTION_VECTORS
    . = early_idt_handlers + i * EARLY_IDT_HANDLER_STRIDE
    .if (EXCEPTION_ERRCODE_MASK >> i) & 1
    ASM_NOP2
    .else
    pushl $0        # Dummy error code, to make stack frame uniform
    .endif
    pushl $i        # 20(%esp) Vector number
    jmp early_idt_handler
    i = i + 1
    .endr

gas will error out if we try to move . backwards, so this should be safe.

--Andy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-20 20:54                       ` Andy Lutomirski
@ 2015-05-20 21:00                         ` H. Peter Anvin
  2015-05-20 21:47                           ` Andy Lutomirski
  0 siblings, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-20 21:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: H.J. Lu, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

On 05/20/2015 01:53 PM, Andy Lutomirski wrote:
> Egads.  Now I understand what that code is.  I don't like the balign,
> since this has nothing to do with alignment -- we're creating an array
> of functions.

Actually it does... we align to the beginning of each slot.  If .balign
could be something other than a power of 2 that would work, too.

I was mostly looking to minimize the amount of gas magic we rely on.

> Can't we make it explicit?
> 
> #define EARLY_IDT_HANDLER_STRIDE 9
> 
> ...
> 
>     .rept NUM_EXCEPTION_VECTORS
>     . = early_idt_handlers + i * EARLY_IDT_HANDLER_STRIDE
>     .if (EXCEPTION_ERRCODE_MASK >> i) & 1
>     ASM_NOP2
>     .else
>     pushl $0        # Dummy error code, to make stack frame uniform
>     .endif
>     pushl $i        # 20(%esp) Vector number
>     jmp early_idt_handler
>     i = i + 1
>     .endr
> 
> gas will error out if we try to move . backwards, so this should be safe.

If that works too with all the versions of gas we care about, that would
be fine (and I do appreciate the explicitness.)  However, .[b]align is
something that will have been well exercised in every version of gas, so
I do feel slightly safer with it.

	-hpa

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-20 21:00                         ` H. Peter Anvin
@ 2015-05-20 21:47                           ` Andy Lutomirski
  2015-05-20 22:00                             ` H. Peter Anvin
  2015-05-20 22:09                             ` H. Peter Anvin
  0 siblings, 2 replies; 21+ messages in thread
From: Andy Lutomirski @ 2015-05-20 21:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: H.J. Lu, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

On Wed, May 20, 2015 at 1:59 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 05/20/2015 01:53 PM, Andy Lutomirski wrote:
>> Egads.  Now I understand what that code is.  I don't like the balign,
>> since this has nothing to do with alignment -- we're creating an array
>> of functions.
>
> Actually it does... we align to the beginning of each slot.  If .balign
> could be something other than a power of 2 that would work, too.

When I see "align", I think that we want to align to a multiple of X
but we don't particularly care which multiple of X.  Here we want a
specific address and any other address would be an error.

>
> I was mostly looking to minimize the amount of gas magic we rely on.
>
>> Can't we make it explicit?
>>
>> #define EARLY_IDT_HANDLER_STRIDE 9
>>
>> ...
>>
>>     .rept NUM_EXCEPTION_VECTORS
>>     . = early_idt_handlers + i * EARLY_IDT_HANDLER_STRIDE
>>     .if (EXCEPTION_ERRCODE_MASK >> i) & 1
>>     ASM_NOP2
>>     .else
>>     pushl $0        # Dummy error code, to make stack frame uniform
>>     .endif
>>     pushl $i        # 20(%esp) Vector number
>>     jmp early_idt_handler
>>     i = i + 1
>>     .endr
>>
>> gas will error out if we try to move . backwards, so this should be safe.
>
> If that works too with all the versions of gas we care about, that would
> be fine (and I do appreciate the explicitness.)  However, .[b]align is
> something that will have been well exercised in every version of gas, so
> I do feel slightly safer with it.

I have no idea how to easily test my approach on really old binutils
versions, but I'm reasonably confident that assigning to . is common.

--Andy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-20 21:47                           ` Andy Lutomirski
@ 2015-05-20 22:00                             ` H. Peter Anvin
  2015-05-20 22:09                             ` H. Peter Anvin
  1 sibling, 0 replies; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-20 22:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: H.J. Lu, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

Okay, let's go with it. that also let us merge this constant with the length of the array in the C header file.

On May 20, 2015 2:47:10 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Wed, May 20, 2015 at 1:59 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 05/20/2015 01:53 PM, Andy Lutomirski wrote:
>>> Egads.  Now I understand what that code is.  I don't like the
>balign,
>>> since this has nothing to do with alignment -- we're creating an
>array
>>> of functions.
>>
>> Actually it does... we align to the beginning of each slot.  If
>.balign
>> could be something other than a power of 2 that would work, too.
>
>When I see "align", I think that we want to align to a multiple of X
>but we don't particularly care which multiple of X.  Here we want a
>specific address and any other address would be an error.
>
>>
>> I was mostly looking to minimize the amount of gas magic we rely on.
>>
>>> Can't we make it explicit?
>>>
>>> #define EARLY_IDT_HANDLER_STRIDE 9
>>>
>>> ...
>>>
>>>     .rept NUM_EXCEPTION_VECTORS
>>>     . = early_idt_handlers + i * EARLY_IDT_HANDLER_STRIDE
>>>     .if (EXCEPTION_ERRCODE_MASK >> i) & 1
>>>     ASM_NOP2
>>>     .else
>>>     pushl $0        # Dummy error code, to make stack frame uniform
>>>     .endif
>>>     pushl $i        # 20(%esp) Vector number
>>>     jmp early_idt_handler
>>>     i = i + 1
>>>     .endr
>>>
>>> gas will error out if we try to move . backwards, so this should be
>safe.
>>
>> If that works too with all the versions of gas we care about, that
>would
>> be fine (and I do appreciate the explicitness.)  However, .[b]align
>is
>> something that will have been well exercised in every version of gas,
>so
>> I do feel slightly safer with it.
>
>I have no idea how to easily test my approach on really old binutils
>versions, but I'm reasonably confident that assigning to . is common.
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-20 21:47                           ` Andy Lutomirski
  2015-05-20 22:00                             ` H. Peter Anvin
@ 2015-05-20 22:09                             ` H. Peter Anvin
  2015-05-20 22:18                               ` Andy Lutomirski
  1 sibling, 1 reply; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-20 22:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: H.J. Lu, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

Also, obviously, let's get rid of the now unnecessary ASM_NOP2.

On May 20, 2015 2:47:10 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Wed, May 20, 2015 at 1:59 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 05/20/2015 01:53 PM, Andy Lutomirski wrote:
>>> Egads.  Now I understand what that code is.  I don't like the
>balign,
>>> since this has nothing to do with alignment -- we're creating an
>array
>>> of functions.
>>
>> Actually it does... we align to the beginning of each slot.  If
>.balign
>> could be something other than a power of 2 that would work, too.
>
>When I see "align", I think that we want to align to a multiple of X
>but we don't particularly care which multiple of X.  Here we want a
>specific address and any other address would be an error.
>
>>
>> I was mostly looking to minimize the amount of gas magic we rely on.
>>
>>> Can't we make it explicit?
>>>
>>> #define EARLY_IDT_HANDLER_STRIDE 9
>>>
>>> ...
>>>
>>>     .rept NUM_EXCEPTION_VECTORS
>>>     . = early_idt_handlers + i * EARLY_IDT_HANDLER_STRIDE
>>>     .if (EXCEPTION_ERRCODE_MASK >> i) & 1
>>>     ASM_NOP2
>>>     .else
>>>     pushl $0        # Dummy error code, to make stack frame uniform
>>>     .endif
>>>     pushl $i        # 20(%esp) Vector number
>>>     jmp early_idt_handler
>>>     i = i + 1
>>>     .endr
>>>
>>> gas will error out if we try to move . backwards, so this should be
>safe.
>>
>> If that works too with all the versions of gas we care about, that
>would
>> be fine (and I do appreciate the explicitness.)  However, .[b]align
>is
>> something that will have been well exercised in every version of gas,
>so
>> I do feel slightly safer with it.
>
>I have no idea how to easily test my approach on really old binutils
>versions, but I'm reasonably confident that assigning to . is common.
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-20 22:09                             ` H. Peter Anvin
@ 2015-05-20 22:18                               ` Andy Lutomirski
  2015-05-20 22:22                                 ` H. Peter Anvin
  0 siblings, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2015-05-20 22:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: H.J. Lu, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

On Wed, May 20, 2015 at 3:08 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Also, obviously, let's get rid of the now unnecessary ASM_NOP2.

Will you write the patch or should I?

--Andy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Avoiding unnecessary jump relocations in gas?
  2015-05-20 22:18                               ` Andy Lutomirski
@ 2015-05-20 22:22                                 ` H. Peter Anvin
  0 siblings, 0 replies; 21+ messages in thread
From: H. Peter Anvin @ 2015-05-20 22:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: H.J. Lu, Borislav Petkov, Jan Beulich, Binutils, linux-kernel

If you could I'd appreciate it.

On May 20, 2015 3:17:53 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Wed, May 20, 2015 at 3:08 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> Also, obviously, let's get rid of the now unnecessary ASM_NOP2.
>
>Will you write the patch or should I?
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-05-20 22:22 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-07  6:02 Avoiding unnecessary jump relocations in gas? Andy Lutomirski
2015-05-07 11:52 ` Jan Beulich
2015-05-07 16:21   ` H.J. Lu
2015-05-08  3:23     ` Andy Lutomirski
2015-05-08 12:09       ` H.J. Lu
2015-05-08 20:16         ` H.J. Lu
2015-05-18 19:37           ` Andy Lutomirski
2015-05-18 20:02             ` H.J. Lu
2015-05-18 20:06               ` H. Peter Anvin
2015-05-18 20:25                 ` H.J. Lu
2015-05-18 20:28                   ` H. Peter Anvin
2015-05-18 20:34                     ` H. Peter Anvin
2015-05-20 20:54                       ` Andy Lutomirski
2015-05-20 21:00                         ` H. Peter Anvin
2015-05-20 21:47                           ` Andy Lutomirski
2015-05-20 22:00                             ` H. Peter Anvin
2015-05-20 22:09                             ` H. Peter Anvin
2015-05-20 22:18                               ` Andy Lutomirski
2015-05-20 22:22                                 ` H. Peter Anvin
2015-05-18 20:07             ` H. Peter Anvin
2015-05-18 20:11               ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).