public inbox for binutils@sourceware.org
* RFC: Add -mshared option to x86 ELF assembler
@ 2015-05-13  0:14 H.J. Lu
  2015-05-13 11:50 ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2015-05-13  0:14 UTC
  To: Andy Lutomirski, H. Peter Anvin; +Cc: Jan Beulich, Binutils, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4735 bytes --]

On Fri, May 8, 2015 at 1:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, May 8, 2015 at 5:09 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>>>>> AFAICT gas will produce relocations for jumps to global labels in the
>>>>>> same file.  This doesn't seem directly harmful to me, except that, on
>>>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>>>
>>>>>> This seems especially unfortunate, since even hidden and protected
>>>>>> symbols have this problem.
>>>>>>
>>>>>> Given that many users don't want interposition support (especially the
>>>>>> kernel and anyone using .hidden or .protected), it would be nice to
>>>>>> have a command-line option to turn this off and probably also to turn
>>>>>> it off by default for hidden and protected symbols.  Can gas do this?
>>>>>
>>>>> I've been running with the below changes (taken off of a bigger set
>>>>> of changes, so the line numbers may look a little odd) for the last
>>>>> couple of years. I never tried to submit this change because so far
>>>>> I couldn't find the time to check whether this would have any
>>>>> unwanted side effects on cases I don't normally use.
>>>>>
>>>>
>>>> This is the patch I checked in.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> H.J.
>>>> ---
>>>> Branches to global non-weak symbols defined in the same segment with
>>>> non-default visibility can be optimized the same way as branches to
>>>> local symbols.
>>>
>>> Would it make sense to also add a command line option along the lines
>>> of gcc's -fno-semantic-interposition or some way to override the
>>> default visibility?  AFAICS this patch helps but only if asm code gets
>>> liberally sprinkled with .hidden or .protected directives.
>>>
>>
>> This is what I checked in.  With
>>
>> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>> index 2fda005..186e6f7 100644
>> --- a/arch/x86/Makefile
>> +++ b/arch/x86/Makefile
>> @@ -107,6 +107,10 @@ else
>>          KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
>>  endif
>>
>> +NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
>> +KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
>> +KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
>> +
>>  # Make sure compiler does not have buggy stack-protector support.
>>  ifdef CONFIG_CC_STACKPROTECTOR
>>    cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>>
>> On kernel master branch, I got
>>
>>    text   data    bss    dec    hex filename
>> 10934167 2275232 1609728 14819127 e21f37 vmlinux.old
>> 10934119 2275232 1609728 14819079 e21f07 vmlinux
>>
>> It saves 48 bytes.
>
> This is before I fixed:
>
> /* This is global to keep gas from relaxing the jumps */
> ENTRY(early_idt_handler)
>         cld
>
> in arch/x86/kernel/head_64.S.  With -mno-shared, we must
> make early_idt_handler weak to keep gas from relaxing the jumps.
>

Here is a patch to change the assembler default to optimize out
relocations to defined non-weak global branch targets with default
visibility.  It will generate slightly smaller object files, but the Linux
kernel will be broken unless early_idt_handler is marked weak.
I am a little uncomfortable with -mshared and I don't like -mno-shared
very much either.  I may simply remove -mno-shared.


-- 
H.J.
---
This patch removes the newly added -mno-shared option from x86 ELF
assembler and adds -mshared option to x86 ELF assembler.  By default,
assembler will optimize out relocations to defined non-weak global
branch targets with default visibility.  The -mshared option tells
the assembler to generate code which may go into a shared library
where all non-weak global branch targets with default visibility can
be preempted.  The resulting code is slightly bigger.  This option
only affects the handling of branch instructions.

gas/

* config/tc-i386.c (no_shared): Renamed to ...
(shared): This.
(elf_symbol_resolved_in_segment_p): Add relocation argument.
Check PLT relocations and shared.
(md_estimate_size_before_relax): Pass fragP->fr_var to
elf_symbol_resolved_in_segment_p.
(OPTION_MNO_SHARED): Renamed to ...
(OPTION_MSHARED): This.
(md_longopts): Renamed -mno-shared to -mshared.
(md_show_usage): Likewise.
* doc/c-i386.texi: Likewise.

gas/testsuite/

* gas/i386/pcrel.d: Pass -mshared to assembler.
* gas/i386/relax-3.d: Likewise.  Updated.
* gas/i386/x86-64-relax-2.d: Likewise.
* gas/i386/relax-3.s: Add test for PLT relocation.
* gas/i386/relax-4.d: Remove -mno-shared.  Updated.
* gas/i386/x86-64-relax-3.d: Likewise.

[-- Attachment #2: 0001-Add-mshared-option-to-x86-ELF-assembler.patch --]
[-- Type: text/x-patch, Size: 15717 bytes --]

From 180c819ed8be897f02ef15a201ce4ea47899749f Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 12 May 2015 16:52:11 -0700
Subject: [PATCH] Add -mshared option to x86 ELF assembler

This patch removes the newly added -mno-shared option from x86 ELF
assembler and adds -mshared option to x86 ELF assembler.  By default,
assembler will optimize out relocations to defined non-weak global
branch targets with default visibility.  The -mshared option tells
the assembler to generate code which may go into a shared library
where all non-weak global branch targets with default visibility can
be preempted.  The resulting code is slightly bigger.  This option
only affects the handling of branch instructions.

gas/

	* config/tc-i386.c (no_shared): Renamed to ...
	(shared): This.
	(elf_symbol_resolved_in_segment_p): Add relocation argument.
	Check PLT relocations and shared.
	(md_estimate_size_before_relax): Pass fragP->fr_var to
	elf_symbol_resolved_in_segment_p.
	(OPTION_MNO_SHARED): Renamed to ...
	(OPTION_MSHARED): This.
	(md_longopts): Renamed -mno-shared to -mshared.
	(md_show_usage): Likewise.
	* doc/c-i386.texi: Likewise.

gas/testsuite/

	* gas/i386/pcrel.d: Pass -mshared to assembler.
	* gas/i386/relax-3.d: Likewise.  Updated.
	* gas/i386/x86-64-relax-2.d: Likewise.
	* gas/i386/relax-3.s: Add test for PLT relocation.
	* gas/i386/relax-4.d: Remove -mno-shared.  Updated.
	* gas/i386/x86-64-relax-3.d: Likewise.
---
 gas/config/tc-i386.c                    | 36 ++++++++++++++++++++++-----------
 gas/doc/c-i386.texi                     | 17 ++++++++--------
 gas/testsuite/gas/i386/pcrel.d          |  1 +
 gas/testsuite/gas/i386/relax-3.d        | 28 +++++++++++++------------
 gas/testsuite/gas/i386/relax-3.s        |  1 +
 gas/testsuite/gas/i386/relax-4.d        | 28 ++++++++++++-------------
 gas/testsuite/gas/i386/x86-64-relax-2.d | 24 ++++++++++++----------
 gas/testsuite/gas/i386/x86-64-relax-3.d | 28 ++++++++++++-------------
 8 files changed, 91 insertions(+), 72 deletions(-)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index db263ee..254548f 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -525,8 +525,8 @@ static int use_big_obj = 0;
 #endif
 
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
-/* 1 if not generating code for a shared library.  */
-static int no_shared = 0;
+/* 1 if generating code for a shared library.  */
+static int shared = 0;
 #endif
 
 /* 1 for intel syntax,
@@ -8823,7 +8823,7 @@ i386_frag_max_var (fragS *frag)
 
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
 static int
-elf_symbol_resolved_in_segment_p (symbolS *fr_symbol)
+elf_symbol_resolved_in_segment_p (symbolS *fr_symbol, offsetT fr_var)
 {
   /* STT_GNU_IFUNC symbol must go through PLT.  */
   if ((symbol_get_bfdsym (fr_symbol)->flags
@@ -8834,13 +8834,24 @@ elf_symbol_resolved_in_segment_p (symbolS *fr_symbol)
     /* Symbol may be weak or local.  */
     return !S_IS_WEAK (fr_symbol);
 
-  /* Non-weak symbols won't be preempted.  */
-  if (no_shared)
+  /* Global symbols with non-default visibility can't be preempted. */
+  if (ELF_ST_VISIBILITY (S_GET_OTHER (fr_symbol)) != STV_DEFAULT)
     return 1;
 
+  if (fr_var != NO_RELOC)
+    switch ((enum bfd_reloc_code_real) fr_var)
+      {
+      case BFD_RELOC_386_PLT32:
+      case BFD_RELOC_X86_64_PLT32:
+	/* Symbol with PLT relocation may be preempted. */
+	return 0;
+      default:
+	abort ();
+      }
+
   /* Global symbols with default visibility in a shared library may be
      preempted by another definition.  */
-  return ELF_ST_VISIBILITY (S_GET_OTHER (fr_symbol)) != STV_DEFAULT;
+  return !shared;
 }
 #endif
 
@@ -8867,7 +8878,8 @@ md_estimate_size_before_relax (fragS *fragP, segT segment)
   if (S_GET_SEGMENT (fragP->fr_symbol) != segment
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
       || (IS_ELF
-	  && !elf_symbol_resolved_in_segment_p (fragP->fr_symbol))
+	  && !elf_symbol_resolved_in_segment_p (fragP->fr_symbol,
+						fragP->fr_var))
 #endif
 #if defined (OBJ_COFF) && defined (TE_PE)
       || (OUTPUT_FLAVOR == bfd_target_coff_flavour
@@ -9537,7 +9549,7 @@ const char *md_shortopts = "qn";
 #define OPTION_MBIG_OBJ (OPTION_MD_BASE + 18)
 #define OPTION_OMIT_LOCK_PREFIX (OPTION_MD_BASE + 19)
 #define OPTION_MEVEXRCIG (OPTION_MD_BASE + 20)
-#define OPTION_MNO_SHARED (OPTION_MD_BASE + 21)
+#define OPTION_MSHARED (OPTION_MD_BASE + 21)
 
 struct option md_longopts[] =
 {
@@ -9548,7 +9560,7 @@ struct option md_longopts[] =
 #endif
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
   {"x32", no_argument, NULL, OPTION_X32},
-  {"mno-shared", no_argument, NULL, OPTION_MNO_SHARED},
+  {"mshared", no_argument, NULL, OPTION_MSHARED},
 #endif
   {"divide", no_argument, NULL, OPTION_DIVIDE},
   {"march", required_argument, NULL, OPTION_MARCH},
@@ -9610,8 +9622,8 @@ md_parse_option (int c, char *arg)
 	 .stab instead of .stab.excl.  We always use .stab anyhow.  */
       break;
 
-    case OPTION_MNO_SHARED:
-      no_shared = 1;
+    case OPTION_MSHARED:
+      shared = 1;
       break;
 #endif
 #if (defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF) \
@@ -10043,7 +10055,7 @@ md_show_usage (FILE *stream)
   fprintf (stream, _("\
   -madd-bnd-prefix        add BND prefix for all valid branches\n"));
   fprintf (stream, _("\
-  -mno-shared             enable branch optimization for non shared code\n"));
+  -mshared                disable branch optimization for shared code\n"));
 # if defined (TE_PE) || defined (TE_PEP)
   fprintf (stream, _("\
   -mbig-obj               generate big object files\n"));
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 47bcbbb..a1997f5 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -298,15 +298,16 @@ The @code{.att_syntax} and @code{.intel_syntax} directives will take precedent.
 This option forces the assembler to add BND prefix to all branches, even
 if such prefix was not explicitly specified in the source code.
 
-@cindex @samp{-mno-shared} option, i386
-@cindex @samp{-mno-shared} option, x86-64
+@cindex @samp{-mshared} option, i386
+@cindex @samp{-mshared} option, x86-64
 @item -mno-shared
-On ELF target, the assembler normally generates code which can go into a
-shared library where non-weak symbols can be preempted.  The
-@samp{-mno-shared} option tells the assembler to generate code not for
-a shared library, where non-weak symbols won't be preempted.  The
-resulting code is slightly smaller.  This option mainly affects the
-handling of branch instructions.
+On ELF target, the assembler normally optimizes out relocations to
+defined non-weak global branch targets with default visibility.  The
+@samp{-mshared} option tells the assembler to generate code which
+may go into a shared library where all non-weak global branch targets
+with default visibility can be preempted.  The resulting code is
+slightly bigger.  This option only affects the handling of branch
+instructions.
 
 @cindex @samp{-mbig-obj} option, x86-64
 @item -mbig-obj
diff --git a/gas/testsuite/gas/i386/pcrel.d b/gas/testsuite/gas/i386/pcrel.d
index 5b61c23..8a91a1a 100644
--- a/gas/testsuite/gas/i386/pcrel.d
+++ b/gas/testsuite/gas/i386/pcrel.d
@@ -1,4 +1,5 @@
 #objdump: -drw
+#as: -mshared
 #name: i386 pcrel reloc
 
 .*: +file format .*i386.*
diff --git a/gas/testsuite/gas/i386/relax-3.d b/gas/testsuite/gas/i386/relax-3.d
index 8aa94e9..4610553 100644
--- a/gas/testsuite/gas/i386/relax-3.d
+++ b/gas/testsuite/gas/i386/relax-3.d
@@ -1,3 +1,4 @@
+#as: -mshared
 #objdump: -dwr
 
 .*: +file format .*
@@ -5,26 +6,27 @@
 Disassembly of section .text:
 
 0+ <foo>:
-[ 	]*[a-f0-9]+:	eb 1f                	jmp    21 <local>
-[ 	]*[a-f0-9]+:	eb 19                	jmp    1d <hidden_def>
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    5 <foo\+0x5>	5: (R_386_PC)?(DISP)?32	global_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    a <foo\+0xa>	a: (R_386_PC)?(DISP)?32	weak_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    f <foo\+0xf>	f: (R_386_PC)?(DISP)?32	weak_hidden_undef
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    14 <foo\+0x14>	14: (R_386_PC)?(DISP)?32	weak_hidden_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    19 <foo\+0x19>	19: (R_386_PC)?(DISP)?32	hidden_undef
-
-0+1d <hidden_def>:
+[ 	]*[a-f0-9]+:	eb 24                	jmp    26 <local>
+[ 	]*[a-f0-9]+:	eb 1e                	jmp    22 <hidden_def>
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    5 <foo\+0x5>	5: R_386_PC32	global_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    a <foo\+0xa>	a: R_386_PLT32	global_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    f <foo\+0xf>	f: R_386_PC32	weak_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    14 <foo\+0x14>	14: R_386_PC32	weak_hidden_undef
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    19 <foo\+0x19>	19: R_386_PC32	weak_hidden_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    1e <foo\+0x1e>	1e: R_386_PC32	hidden_undef
+
+0+22 <hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1e <weak_hidden_def>:
+0+23 <weak_hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1f <global_def>:
+0+24 <global_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+20 <weak_def>:
+0+25 <weak_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+21 <local>:
+0+26 <local>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 #pass
diff --git a/gas/testsuite/gas/i386/relax-3.s b/gas/testsuite/gas/i386/relax-3.s
index ab52185..48ea917 100644
--- a/gas/testsuite/gas/i386/relax-3.s
+++ b/gas/testsuite/gas/i386/relax-3.s
@@ -4,6 +4,7 @@ foo:
 	jmp local
 	jmp hidden_def
 	jmp global_def
+	jmp global_def@PLT
 	jmp weak_def
 	jmp weak_hidden_undef
 	jmp weak_hidden_def
diff --git a/gas/testsuite/gas/i386/relax-4.d b/gas/testsuite/gas/i386/relax-4.d
index b188841..2039251 100644
--- a/gas/testsuite/gas/i386/relax-4.d
+++ b/gas/testsuite/gas/i386/relax-4.d
@@ -1,5 +1,4 @@
 #source: relax-3.s
-#as: -mno-shared
 #objdump: -dwr
 
 .*: +file format .*
@@ -7,26 +6,27 @@
 Disassembly of section .text:
 
 0+ <foo>:
-[ 	]*[a-f0-9]+:	eb 1c                	jmp    1e <local>
-[ 	]*[a-f0-9]+:	eb 16                	jmp    1a <hidden_def>
-[ 	]*[a-f0-9]+:	eb 16                	jmp    1c <global_def>
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    7 <foo\+0x7>	7: (R_386_PC)?(DISP)?32	weak_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    c <foo\+0xc>	c: (R_386_PC)?(DISP)?32	weak_hidden_undef
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    11 <foo\+0x11>	11: (R_386_PC)?(DISP)?32	weak_hidden_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    16 <foo\+0x16>	16: (R_386_PC)?(DISP)?32	hidden_undef
-
-0+1a <hidden_def>:
+[ 	]*[a-f0-9]+:	eb 21                	jmp    23 <local>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    1f <hidden_def>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    21 <global_def>
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    7 <foo\+0x7>	7: R_386_PLT32	global_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    c <foo\+0xc>	c: R_386_PC32	weak_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    11 <foo\+0x11>	11: R_386_PC32	weak_hidden_undef
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    16 <foo\+0x16>	16: R_386_PC32	weak_hidden_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    1b <foo\+0x1b>	1b: R_386_PC32	hidden_undef
+
+0+1f <hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1b <weak_hidden_def>:
+0+20 <weak_hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1c <global_def>:
+0+21 <global_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1d <weak_def>:
+0+22 <weak_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1e <local>:
+0+23 <local>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-relax-2.d b/gas/testsuite/gas/i386/x86-64-relax-2.d
index 7b0bd56..c124102 100644
--- a/gas/testsuite/gas/i386/x86-64-relax-2.d
+++ b/gas/testsuite/gas/i386/x86-64-relax-2.d
@@ -1,4 +1,5 @@
 #source: relax-3.s
+#as: -mshared
 #objdump: -dwr
 
 .*: +file format .*
@@ -7,26 +8,27 @@
 Disassembly of section .text:
 
 0+ <foo>:
-[ 	]*[a-f0-9]+:	eb 1f                	jmp    21 <local>
-[ 	]*[a-f0-9]+:	eb 19                	jmp    1d <hidden_def>
+[ 	]*[a-f0-9]+:	eb 24                	jmp    26 <local>
+[ 	]*[a-f0-9]+:	eb 1e                	jmp    22 <hidden_def>
 [ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   9 <foo\+0x9>	5: R_X86_64_PC32	global_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   e <foo\+0xe>	a: R_X86_64_PC32	weak_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   13 <foo\+0x13>	f: R_X86_64_PC32	weak_hidden_undef-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   18 <foo\+0x18>	14: R_X86_64_PC32	weak_hidden_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1d <hidden_def>	19: R_X86_64_PC32	hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   e <foo\+0xe>	a: R_X86_64_PLT32	global_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   13 <foo\+0x13>	f: R_X86_64_PC32	weak_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   18 <foo\+0x18>	14: R_X86_64_PC32	weak_hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1d <foo\+0x1d>	19: R_X86_64_PC32	weak_hidden_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   22 <hidden_def>	1e: R_X86_64_PC32	hidden_undef-0x4
 
-0+1d <hidden_def>:
+0+22 <hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1e <weak_hidden_def>:
+0+23 <weak_hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1f <global_def>:
+0+24 <global_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+20 <weak_def>:
+0+25 <weak_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+21 <local>:
+0+26 <local>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-relax-3.d b/gas/testsuite/gas/i386/x86-64-relax-3.d
index d0c7ee4..98fd28d 100644
--- a/gas/testsuite/gas/i386/x86-64-relax-3.d
+++ b/gas/testsuite/gas/i386/x86-64-relax-3.d
@@ -1,5 +1,4 @@
 #source: relax-3.s
-#as: -mno-shared
 #objdump: -dwr
 
 .*: +file format .*
@@ -8,26 +7,27 @@
 Disassembly of section .text:
 
 0+ <foo>:
-[ 	]*[a-f0-9]+:	eb 1c                	jmp    1e <local>
-[ 	]*[a-f0-9]+:	eb 16                	jmp    1a <hidden_def>
-[ 	]*[a-f0-9]+:	eb 16                	jmp    1c <global_def>
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   b <foo\+0xb>	7: R_X86_64_PC32	weak_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   10 <foo\+0x10>	c: R_X86_64_PC32	weak_hidden_undef-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   15 <foo\+0x15>	11: R_X86_64_PC32	weak_hidden_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1a <hidden_def>	16: R_X86_64_PC32	hidden_undef-0x4
-
-0+1a <hidden_def>:
+[ 	]*[a-f0-9]+:	eb 21                	jmp    23 <local>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    1f <hidden_def>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    21 <global_def>
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   b <foo\+0xb>	7: R_X86_64_PLT32	global_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   10 <foo\+0x10>	c: R_X86_64_PC32	weak_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   15 <foo\+0x15>	11: R_X86_64_PC32	weak_hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1a <foo\+0x1a>	16: R_X86_64_PC32	weak_hidden_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1f <hidden_def>	1b: R_X86_64_PC32	hidden_undef-0x4
+
+0+1f <hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1b <weak_hidden_def>:
+0+20 <weak_hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1c <global_def>:
+0+21 <global_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1d <weak_def>:
+0+22 <weak_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1e <local>:
+0+23 <local>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 #pass
-- 
1.9.3



* Re: RFC: Add -mshared option to x86 ELF assembler
  2015-05-13  0:14 RFC: Add -mshared option to x86 ELF assembler H.J. Lu
@ 2015-05-13 11:50 ` H.J. Lu
  2015-05-13 12:59   ` H.J. Lu
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2015-05-13 11:50 UTC
  To: Andy Lutomirski, H. Peter Anvin; +Cc: Jan Beulich, Binutils, linux-kernel

On Tue, May 12, 2015 at 5:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> Here is a patch to change the assembler default to optimize out
> relocations to defined non-weak global branch targets with default
> visibility.  It will generate slightly smaller object files, but the Linux
> kernel will be broken unless early_idt_handler is marked weak.
> I am a little uncomfortable with -mshared and I don't like -mno-shared
> very much either.  I may simply remove -mno-shared.
>

I reverted the -mno-shared change.

-- 
H.J.


* Re: RFC: Add -mshared option to x86 ELF assembler
  2015-05-13 11:50 ` H.J. Lu
@ 2015-05-13 12:59   ` H.J. Lu
  2015-05-20 20:02     ` Andy Lutomirski
  0 siblings, 1 reply; 6+ messages in thread
From: H.J. Lu @ 2015-05-13 12:59 UTC
  To: Andy Lutomirski, H. Peter Anvin; +Cc: Jan Beulich, Binutils, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5822 bytes --]

On Wed, May 13, 2015 at 4:50 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> I reverted the -mno-shared change.
>

Here is a patch to add -mshared, which is off by default.  On the Linux
kernel, with this change:

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index a468c0a..9a10e05 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -339,8 +339,8 @@ early_idt_handlers:
  i = i + 1
  .endr

-/* This is global to keep gas from relaxing the jumps */
-ENTRY(early_idt_handler)
+/* This is weak to keep gas from relaxing the jumps */
+WEAK(early_idt_handler)
  cld

  cmpl $2,(%rsp) # X86_TRAP_NMI
-- 
2.1.0

I got

[hjl@gnu-tools-1 kernel.org]$ readelf -r old/vmlinux.o | head -5

Relocation section '.rela.text' at offset 0xafea2f0 contains 205717 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
[hjl@gnu-tools-1 kernel.org]$ readelf -r new/vmlinux.o | head -5

Relocation section '.rela.text' at offset 0xafea280 contains 205711 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
[hjl@gnu-tools-1 kernel.org]$

It removes 6 relocations.  On gcc master branch,

[hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1
   text   data    bss    dec    hex filename
21529621  62256 1348312 22940189 15e0a1d build-x86_64-linux.branch/gcc/cc1
21529749  62256 1348312 22940317 15e0a9d build-x86_64-linux/gcc/cc1
[hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1plus
   text   data    bss    dec    hex filename
23713509  62400 1372760 25148669 17fbcfd build-x86_64-linux.branch/gcc/cc1plus
23713669  62400 1372760 25148829 17fbd9d build-x86_64-linux/gcc/cc1plus
[hjl@gnu-tools-1 gcc-misc]$

It is more effective on GCC than on the kernel.  I will run more tests.


-- 
H.J.

[-- Attachment #2: binutils-mshared.patch --]
[-- Type: text/x-patch, Size: 14613 bytes --]

From cf1509c67f6c2ca0919c60220489e8d10bb4963a Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Tue, 12 May 2015 16:52:11 -0700
Subject: [PATCH] Add -mshared option to x86 ELF assembler

This patch adds the -mshared option to the x86 ELF assembler.  By default,
the assembler optimizes out relocations to defined non-weak global
branch targets with default visibility.  The -mshared option tells
the assembler to generate code which may go into a shared library,
where all non-weak global branch targets with default visibility can
be preempted.  The resulting code is slightly bigger.  This option
only affects the handling of branch instructions.

gas/

	* config/tc-i386.c (shared): New.
	(OPTION_MSHARED): Likewise.
	(elf_symbol_resolved_in_segment_p): Add relocation argument.
	Check PLT relocations and shared.
	(md_estimate_size_before_relax): Pass fragP->fr_var to
	elf_symbol_resolved_in_segment_p.
	(md_longopts): Add -mshared.
	(md_show_usage): Likewise.
	(md_parse_option): Handle OPTION_MSHARED.
	* doc/c-i386.texi: Document -mshared.

gas/testsuite/

	* gas/i386/i386.exp: Run relax-4 and x86-64-relax-3.
	* gas/i386/pcrel.d: Pass -mshared to assembler.
	* gas/i386/relax-3.d: Likewise.  Updated.
	* gas/i386/x86-64-relax-2.d: Likewise.
	* gas/i386/relax-3.s: Add test for PLT relocation.
	* gas/i386/relax-4.d: New file.
	* gas/i386/x86-64-relax-3.d: Likewise.
---
 gas/config/tc-i386.c                    | 35 ++++++++++++++++++++++++++++++---
 gas/doc/c-i386.texi                     | 11 +++++++++++
 gas/testsuite/gas/i386/i386.exp         |  2 ++
 gas/testsuite/gas/i386/pcrel.d          |  1 +
 gas/testsuite/gas/i386/relax-3.d        | 28 ++++++++++++++------------
 gas/testsuite/gas/i386/relax-3.s        |  1 +
 gas/testsuite/gas/i386/relax-4.d        | 32 ++++++++++++++++++++++++++++++
 gas/testsuite/gas/i386/x86-64-relax-2.d | 24 +++++++++++-----------
 gas/testsuite/gas/i386/x86-64-relax-3.d | 33 +++++++++++++++++++++++++++++++
 9 files changed, 140 insertions(+), 27 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/relax-4.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-relax-3.d

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 75f268f..254548f 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -524,6 +524,11 @@ static enum x86_elf_abi x86_elf_abi = I386_ABI;
 static int use_big_obj = 0;
 #endif
 
+#if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
+/* 1 if generating code for a shared library.  */
+static int shared = 0;
+#endif
+
 /* 1 for intel syntax,
    0 if att syntax.  */
 static int intel_syntax = 0;
@@ -8818,7 +8823,7 @@ i386_frag_max_var (fragS *frag)
 
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
 static int
-elf_symbol_resolved_in_segment_p (symbolS *fr_symbol)
+elf_symbol_resolved_in_segment_p (symbolS *fr_symbol, offsetT fr_var)
 {
   /* STT_GNU_IFUNC symbol must go through PLT.  */
   if ((symbol_get_bfdsym (fr_symbol)->flags
@@ -8829,9 +8834,24 @@ elf_symbol_resolved_in_segment_p (symbolS *fr_symbol)
     /* Symbol may be weak or local.  */
     return !S_IS_WEAK (fr_symbol);
 
+  /* Global symbols with non-default visibility can't be preempted. */
+  if (ELF_ST_VISIBILITY (S_GET_OTHER (fr_symbol)) != STV_DEFAULT)
+    return 1;
+
+  if (fr_var != NO_RELOC)
+    switch ((enum bfd_reloc_code_real) fr_var)
+      {
+      case BFD_RELOC_386_PLT32:
+      case BFD_RELOC_X86_64_PLT32:
+	/* Symbol with PLT relocation may be preempted. */
+	return 0;
+      default:
+	abort ();
+      }
+
   /* Global symbols with default visibility in a shared library may be
      preempted by another definition.  */
-  return ELF_ST_VISIBILITY (S_GET_OTHER (fr_symbol)) != STV_DEFAULT;
+  return !shared;
 }
 #endif
 
@@ -8858,7 +8878,8 @@ md_estimate_size_before_relax (fragS *fragP, segT segment)
   if (S_GET_SEGMENT (fragP->fr_symbol) != segment
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
       || (IS_ELF
-	  && !elf_symbol_resolved_in_segment_p (fragP->fr_symbol))
+	  && !elf_symbol_resolved_in_segment_p (fragP->fr_symbol,
+						fragP->fr_var))
 #endif
 #if defined (OBJ_COFF) && defined (TE_PE)
       || (OUTPUT_FLAVOR == bfd_target_coff_flavour
@@ -9528,6 +9549,7 @@ const char *md_shortopts = "qn";
 #define OPTION_MBIG_OBJ (OPTION_MD_BASE + 18)
 #define OPTION_OMIT_LOCK_PREFIX (OPTION_MD_BASE + 19)
 #define OPTION_MEVEXRCIG (OPTION_MD_BASE + 20)
+#define OPTION_MSHARED (OPTION_MD_BASE + 21)
 
 struct option md_longopts[] =
 {
@@ -9538,6 +9560,7 @@ struct option md_longopts[] =
 #endif
 #if defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF)
   {"x32", no_argument, NULL, OPTION_X32},
+  {"mshared", no_argument, NULL, OPTION_MSHARED},
 #endif
   {"divide", no_argument, NULL, OPTION_DIVIDE},
   {"march", required_argument, NULL, OPTION_MARCH},
@@ -9598,6 +9621,10 @@ md_parse_option (int c, char *arg)
       /* -s: On i386 Solaris, this tells the native assembler to use
 	 .stab instead of .stab.excl.  We always use .stab anyhow.  */
       break;
+
+    case OPTION_MSHARED:
+      shared = 1;
+      break;
 #endif
 #if (defined (OBJ_ELF) || defined (OBJ_MAYBE_ELF) \
      || defined (TE_PE) || defined (TE_PEP) || defined (OBJ_MACH_O))
@@ -10027,6 +10054,8 @@ md_show_usage (FILE *stream)
   -mold-gcc               support old (<= 2.8.1) versions of gcc\n"));
   fprintf (stream, _("\
   -madd-bnd-prefix        add BND prefix for all valid branches\n"));
+  fprintf (stream, _("\
+  -mshared                disable branch optimization for shared code\n"));
 # if defined (TE_PE) || defined (TE_PEP)
   fprintf (stream, _("\
   -mbig-obj               generate big object files\n"));
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index 1645c8c..a1997f5 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -298,6 +298,17 @@ The @code{.att_syntax} and @code{.intel_syntax} directives will take precedent.
 This option forces the assembler to add BND prefix to all branches, even
 if such prefix was not explicitly specified in the source code.
 
+@cindex @samp{-mshared} option, i386
+@cindex @samp{-mshared} option, x86-64
+@item -mshared
+On ELF targets, the assembler normally optimizes out relocations to
+defined non-weak global branch targets with default visibility.  The
+@samp{-mshared} option tells the assembler to generate code which
+may go into a shared library where all non-weak global branch targets
+with default visibility can be preempted.  The resulting code is
+slightly bigger.  This option only affects the handling of branch
+instructions.
+
 @cindex @samp{-mbig-obj} option, x86-64
 @item -mbig-obj
 On x86-64 PE/COFF target this option forces the use of big object file
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index b6f2810..c66dbc5 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -396,6 +396,7 @@ if [expr ([istarget "i*86-*-*"] ||  [istarget "x86_64-*-*"]) && [gas_32_check]]
 	run_dump_test "note"
 
 	run_dump_test "relax-3"
+	run_dump_test "relax-4"
 
 	if {![istarget "*-*-nacl*"]} then {
 	    run_dump_test "iamcu-1"
@@ -763,6 +764,7 @@ if [expr ([istarget "i*86-*-*"] || [istarget "x86_64-*-*"]) && [gas_64_check]] t
 	run_list_test "x86-64-size-inval-1" "-al"
 
 	run_dump_test "x86-64-relax-2"
+	run_dump_test "x86-64-relax-3"
 
 	run_dump_test "x86-64-jump"
     }
diff --git a/gas/testsuite/gas/i386/pcrel.d b/gas/testsuite/gas/i386/pcrel.d
index 5b61c23..8a91a1a 100644
--- a/gas/testsuite/gas/i386/pcrel.d
+++ b/gas/testsuite/gas/i386/pcrel.d
@@ -1,4 +1,5 @@
 #objdump: -drw
+#as: -mshared
 #name: i386 pcrel reloc
 
 .*: +file format .*i386.*
diff --git a/gas/testsuite/gas/i386/relax-3.d b/gas/testsuite/gas/i386/relax-3.d
index 8aa94e9..4610553 100644
--- a/gas/testsuite/gas/i386/relax-3.d
+++ b/gas/testsuite/gas/i386/relax-3.d
@@ -1,3 +1,4 @@
+#as: -mshared
 #objdump: -dwr
 
 .*: +file format .*
@@ -5,26 +6,27 @@
 Disassembly of section .text:
 
 0+ <foo>:
-[ 	]*[a-f0-9]+:	eb 1f                	jmp    21 <local>
-[ 	]*[a-f0-9]+:	eb 19                	jmp    1d <hidden_def>
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    5 <foo\+0x5>	5: (R_386_PC)?(DISP)?32	global_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    a <foo\+0xa>	a: (R_386_PC)?(DISP)?32	weak_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    f <foo\+0xf>	f: (R_386_PC)?(DISP)?32	weak_hidden_undef
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    14 <foo\+0x14>	14: (R_386_PC)?(DISP)?32	weak_hidden_def
-[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    19 <foo\+0x19>	19: (R_386_PC)?(DISP)?32	hidden_undef
-
-0+1d <hidden_def>:
+[ 	]*[a-f0-9]+:	eb 24                	jmp    26 <local>
+[ 	]*[a-f0-9]+:	eb 1e                	jmp    22 <hidden_def>
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    5 <foo\+0x5>	5: R_386_PC32	global_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    a <foo\+0xa>	a: R_386_PLT32	global_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    f <foo\+0xf>	f: R_386_PC32	weak_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    14 <foo\+0x14>	14: R_386_PC32	weak_hidden_undef
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    19 <foo\+0x19>	19: R_386_PC32	weak_hidden_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    1e <foo\+0x1e>	1e: R_386_PC32	hidden_undef
+
+0+22 <hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1e <weak_hidden_def>:
+0+23 <weak_hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+1f <global_def>:
+0+24 <global_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+20 <weak_def>:
+0+25 <weak_def>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 
-0+21 <local>:
+0+26 <local>:
 [ 	]*[a-f0-9]+:	c3                   	ret    
 #pass
diff --git a/gas/testsuite/gas/i386/relax-3.s b/gas/testsuite/gas/i386/relax-3.s
index ab52185..48ea917 100644
--- a/gas/testsuite/gas/i386/relax-3.s
+++ b/gas/testsuite/gas/i386/relax-3.s
@@ -4,6 +4,7 @@ foo:
 	jmp local
 	jmp hidden_def
 	jmp global_def
+	jmp global_def@PLT
 	jmp weak_def
 	jmp weak_hidden_undef
 	jmp weak_hidden_def
diff --git a/gas/testsuite/gas/i386/relax-4.d b/gas/testsuite/gas/i386/relax-4.d
new file mode 100644
index 0000000..2039251
--- /dev/null
+++ b/gas/testsuite/gas/i386/relax-4.d
@@ -0,0 +1,32 @@
+#source: relax-3.s
+#objdump: -dwr
+
+.*: +file format .*
+
+Disassembly of section .text:
+
+0+ <foo>:
+[ 	]*[a-f0-9]+:	eb 21                	jmp    23 <local>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    1f <hidden_def>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    21 <global_def>
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    7 <foo\+0x7>	7: R_386_PLT32	global_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    c <foo\+0xc>	c: R_386_PC32	weak_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    11 <foo\+0x11>	11: R_386_PC32	weak_hidden_undef
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    16 <foo\+0x16>	16: R_386_PC32	weak_hidden_def
+[ 	]*[a-f0-9]+:	e9 fc ff ff ff       	jmp    1b <foo\+0x1b>	1b: R_386_PC32	hidden_undef
+
+0+1f <hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+20 <weak_hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+21 <global_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+22 <weak_def>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+
+0+23 <local>:
+[ 	]*[a-f0-9]+:	c3                   	ret    
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-relax-2.d b/gas/testsuite/gas/i386/x86-64-relax-2.d
index 7b0bd56..c124102 100644
--- a/gas/testsuite/gas/i386/x86-64-relax-2.d
+++ b/gas/testsuite/gas/i386/x86-64-relax-2.d
@@ -1,4 +1,5 @@
 #source: relax-3.s
+#as: -mshared
 #objdump: -dwr
 
 .*: +file format .*
@@ -7,26 +8,27 @@
 Disassembly of section .text:
 
 0+ <foo>:
-[ 	]*[a-f0-9]+:	eb 1f                	jmp    21 <local>
-[ 	]*[a-f0-9]+:	eb 19                	jmp    1d <hidden_def>
+[ 	]*[a-f0-9]+:	eb 24                	jmp    26 <local>
+[ 	]*[a-f0-9]+:	eb 1e                	jmp    22 <hidden_def>
 [ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   9 <foo\+0x9>	5: R_X86_64_PC32	global_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   e <foo\+0xe>	a: R_X86_64_PC32	weak_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   13 <foo\+0x13>	f: R_X86_64_PC32	weak_hidden_undef-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   18 <foo\+0x18>	14: R_X86_64_PC32	weak_hidden_def-0x4
-[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1d <hidden_def>	19: R_X86_64_PC32	hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   e <foo\+0xe>	a: R_X86_64_PLT32	global_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   13 <foo\+0x13>	f: R_X86_64_PC32	weak_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   18 <foo\+0x18>	14: R_X86_64_PC32	weak_hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1d <foo\+0x1d>	19: R_X86_64_PC32	weak_hidden_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   22 <hidden_def>	1e: R_X86_64_PC32	hidden_undef-0x4
 
-0+1d <hidden_def>:
+0+22 <hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1e <weak_hidden_def>:
+0+23 <weak_hidden_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+1f <global_def>:
+0+24 <global_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+20 <weak_def>:
+0+25 <weak_def>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 
-0+21 <local>:
+0+26 <local>:
 [ 	]*[a-f0-9]+:	c3                   	retq   
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-relax-3.d b/gas/testsuite/gas/i386/x86-64-relax-3.d
new file mode 100644
index 0000000..98fd28d
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-relax-3.d
@@ -0,0 +1,33 @@
+#source: relax-3.s
+#objdump: -dwr
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <foo>:
+[ 	]*[a-f0-9]+:	eb 21                	jmp    23 <local>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    1f <hidden_def>
+[ 	]*[a-f0-9]+:	eb 1b                	jmp    21 <global_def>
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   b <foo\+0xb>	7: R_X86_64_PLT32	global_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   10 <foo\+0x10>	c: R_X86_64_PC32	weak_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   15 <foo\+0x15>	11: R_X86_64_PC32	weak_hidden_undef-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1a <foo\+0x1a>	16: R_X86_64_PC32	weak_hidden_def-0x4
+[ 	]*[a-f0-9]+:	e9 00 00 00 00       	jmpq   1f <hidden_def>	1b: R_X86_64_PC32	hidden_undef-0x4
+
+0+1f <hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+20 <weak_hidden_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+21 <global_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+22 <weak_def>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+
+0+23 <local>:
+[ 	]*[a-f0-9]+:	c3                   	retq   
+#pass
-- 
2.1.0


* Re: RFC: Add -mshared option to x86 ELF assembler
  2015-05-13 12:59   ` H.J. Lu
@ 2015-05-20 20:02     ` Andy Lutomirski
  2015-05-20 20:32       ` H. Peter Anvin
  0 siblings, 1 reply; 6+ messages in thread
From: Andy Lutomirski @ 2015-05-20 20:02 UTC (permalink / raw)
  To: H.J. Lu; +Cc: H. Peter Anvin, Jan Beulich, Binutils, linux-kernel

On Wed, May 13, 2015 at 5:59 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, May 13, 2015 at 4:50 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, May 12, 2015 at 5:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>> On Fri, May 8, 2015 at 1:16 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>> On Fri, May 8, 2015 at 5:09 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Thu, May 7, 2015 at 8:22 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>>> On Thu, May 7, 2015 at 9:21 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>>>> On Thu, May 7, 2015 at 4:52 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>>>>>>>> On 07.05.15 at 08:02, <luto@amacapital.net> wrote:
>>>>>>>>> AFAICT gas will produce relocations for jumps to global labels in the
>>>>>>>>> same file.  This doesn't seem directly harmful to me, except that, on
>>>>>>>>> x86, it forces five-byte jumps instead of two-byte jumps.
>>>>>>>>>
>>>>>>>>> This seems especially unfortunate, since even hidden and protected
>>>>>>>>> symbols have this problem.
>>>>>>>>>
>>>>>>>>> Given that many users don't want interposition support (especially the
>>>>>>>>> kernel and anyone using .hidden or .protected), it would be nice to
>>>>>>>>> have a command-line option to turn this off and probably also to turn
>>>>>>>>> it off by default for hidden and protected symbols.  Can gas do this?
>>>>>>>>
>>>>>>>> I've been running with the below changes (taken off of a bigger set
>>>>>>>> of changes, so the line numbers may look a little odd) for the last
>>>>>>>> couple of years. I never tried to submit this change because so far
>>>>>>>> I couldn't find the time to check whether this would have any
>>>>>>>> unwanted side effects on cases I don't normally use.
>>>>>>>>
>>>>>>>
>>>>>>> This is the patch I checked in.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> --
>>>>>>> H.J.
>>>>>>> ---
>>>>>>> Branches to global non-weak symbols defined in the same segment with
>>>>>>> non-default visibility can be optimized the same way as branches to
>>>>>>> local symbols.
>>>>>>
>>>>>> Would it make sense to also add a command line option along the lines
>>>>>> of gcc's -fno-semantic-interposition or some way to override the
>>>>>> default visibility?  AFAICS this patch helps but only if asm code gets
>>>>>> liberally sprinkled with .hidden or .protected directives.
>>>>>>
>>>>>
>>>>> This is what I checked in.  With
>>>>>
>>>>> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
>>>>> index 2fda005..186e6f7 100644
>>>>> --- a/arch/x86/Makefile
>>>>> +++ b/arch/x86/Makefile
>>>>> @@ -107,6 +107,10 @@ else
>>>>>          KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
>>>>>  endif
>>>>>
>>>>> +NO_SHARED_CFLAGS = $(call as-option,-Wa$(comma)-mno-shared)
>>>>> +KBUILD_CFLAGS += $(NO_SHARED_CFLAGS)
>>>>> +KBUILD_AFLAGS += $(NO_SHARED_CFLAGS)
>>>>> +
>>>>>  # Make sure compiler does not have buggy stack-protector support.
>>>>>  ifdef CONFIG_CC_STACKPROTECTOR
>>>>>    cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>>>>>
>>>>> On kernel master branch, I got
>>>>>
>>>>>    text   data    bss    dec    hex filename
>>>>> 10934167 2275232 1609728 14819127 e21f37 vmlinux.old
>>>>> 10934119 2275232 1609728 14819079 e21f07 vmlinux
>>>>>
>>>>> It saves 48 bytes.
>>>>
>>>> This is before I fixed:
>>>>
>>>> /* This is global to keep gas from relaxing the jumps */
>>>> ENTRY(early_idt_handler)
>>>>         cld
>>>>
>>>> in arch/x86/kernel/head_64.S.  With -mno-shared, we must
>>>> make early_idt_handler weak to keep gas from relaxing the jumps.
>>>>
>>>
>>> Here is a patch to change the assembler default to optimize out
>>> relocations to defined non-weak global branch targets with default
>>> visibility.  It will generate slightly smaller object files.  But Linux
>>> kernel will be broken unless early_idt_handler is marked weak.
>>> I am a little uncomfortable with -mshared and I don't like -mno-shared
>>> very much either.  I may just simply remove -mno-shared.
>>>
>>
>> I reverted the -mno-shared change.
>>
>
> Here is a patch to add -mshared, which is off by default.  On Linux kernel
> with this change:
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index a468c0a..9a10e05 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -339,8 +339,8 @@ early_idt_handlers:
>   i = i + 1
>   .endr
>
> -/* This is global to keep gas from relaxing the jumps */
> -ENTRY(early_idt_handler)
> +/* This is weak to keep gas from relaxing the jumps */
> +WEAK(early_idt_handler)
>   cld
>
>   cmpl $2,(%rsp) # X86_TRAP_NMI
> --
> 2.1.0
>
> I got
>
> [hjl@gnu-tools-1 kernel.org]$ readelf -r old/vmlinux.o | head -5
>
> Relocation section '.rela.text' at offset 0xafea2f0 contains 205717 entries:
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
> 000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
> [hjl@gnu-tools-1 kernel.org]$ readelf -r new/vmlinux.o | head -5
>
> Relocation section '.rela.text' at offset 0xafea280 contains 205711 entries:
>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
> 000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
> 000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
> [hjl@gnu-tools-1 kernel.org]$
>
> It removes 6 relocations.  On gcc master branch,
>
> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1
>    text   data    bss    dec    hex filename
> 21529621  62256 1348312 22940189 15e0a1d build-x86_64-linux.branch/gcc/cc1
> 21529749  62256 1348312 22940317 15e0a9d build-x86_64-linux/gcc/cc1
> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1plus
>    text   data    bss    dec    hex filename
> 23713509  62400 1372760 25148669 17fbcfd build-x86_64-linux.branch/gcc/cc1plus
> 23713669  62400 1372760 25148829 17fbd9d build-x86_64-linux/gcc/cc1plus
> [hjl@gnu-tools-1 gcc-misc]$
>
> It is more effective.  I will run more tests.

This seems like a sensible idea, but I can imagine it breaking some
weird use cases (like that one Linux thing).  Is that okay?
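
To make the concern concrete: relaxing a jump shrinks it from five bytes
(e9 rel32) to two (eb rel8), so every label after it moves.  A rough
sketch of that arithmetic, using the relax-3.s layout from the patch's
tests (eight jumps followed by hidden_def), is below.  The helper names
end_of_jumps and short_jmp_target are made up for illustration and are
not part of gas.

```c
/* Encoded sizes of the two x86 jmp forms that appear in the tests.  */
enum { SHORT_JMP_SIZE = 2, NEAR_JMP_SIZE = 5 };

/* Offset of the first byte after a run of n_short short jumps and
   n_near near jumps that starts at offset 0.  */
static long end_of_jumps(long n_short, long n_near)
{
    return n_short * SHORT_JMP_SIZE + n_near * NEAR_JMP_SIZE;
}

/* Target of a short jump located at offset `at`: the rel8 displacement
   is relative to the end of the two-byte instruction.  */
static long short_jmp_target(long at, signed char rel8)
{
    return at + SHORT_JMP_SIZE + rel8;
}
```

Without -mshared, three of the eight jumps are short, so hidden_def
lands at end_of_jumps(3, 5) = 0x1f as in relax-4.d.  With -mshared only
two are short and it lands at end_of_jumps(2, 6) = 0x22 as in relax-3.d.
Code that depends on a jump keeping its five-byte form sees exactly this
kind of shift.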

--Andy

* Re: RFC: Add -mshared option to x86 ELF assembler
  2015-05-20 20:02     ` Andy Lutomirski
@ 2015-05-20 20:32       ` H. Peter Anvin
  2015-05-20 20:54         ` Andy Lutomirski
  0 siblings, 1 reply; 6+ messages in thread
From: H. Peter Anvin @ 2015-05-20 20:32 UTC (permalink / raw)
  To: Andy Lutomirski, H.J. Lu; +Cc: Jan Beulich, Binutils, linux-kernel

On 05/20/2015 01:02 PM, Andy Lutomirski wrote:
>>
>> I got
>>
>> [hjl@gnu-tools-1 kernel.org]$ readelf -r old/vmlinux.o | head -5
>>
>> Relocation section '.rela.text' at offset 0xafea2f0 contains 205717 entries:
>>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
>> 000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
>> 000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
>> [hjl@gnu-tools-1 kernel.org]$ readelf -r new/vmlinux.o | head -5
>>
>> Relocation section '.rela.text' at offset 0xafea280 contains 205711 entries:
>>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
>> 000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
>> 000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
>> [hjl@gnu-tools-1 kernel.org]$
>>
>> It removes 6 relocations.  On gcc master branch,
>>
>> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1
>>    text   data    bss    dec    hex filename
>> 21529621  62256 1348312 22940189 15e0a1d build-x86_64-linux.branch/gcc/cc1
>> 21529749  62256 1348312 22940317 15e0a9d build-x86_64-linux/gcc/cc1
>> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1plus
>>    text   data    bss    dec    hex filename
>> 23713509  62400 1372760 25148669 17fbcfd build-x86_64-linux.branch/gcc/cc1plus
>> 23713669  62400 1372760 25148829 17fbd9d build-x86_64-linux/gcc/cc1plus
>> [hjl@gnu-tools-1 gcc-misc]$
>>
>> It is more effective.  I will run more tests.
> 
> This seems like a sensible idea, but I can imagine it breaking some
> weird use cases (like that one Linux thing).  Is that okay?
> 

What about the patch I posted recently?

	-hpa


* Re: RFC: Add -mshared option to x86 ELF assembler
  2015-05-20 20:32       ` H. Peter Anvin
@ 2015-05-20 20:54         ` Andy Lutomirski
  0 siblings, 0 replies; 6+ messages in thread
From: Andy Lutomirski @ 2015-05-20 20:54 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: H.J. Lu, Jan Beulich, Binutils, linux-kernel

On Wed, May 20, 2015 at 1:32 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 05/20/2015 01:02 PM, Andy Lutomirski wrote:
>>>
>>> I got
>>>
>>> [hjl@gnu-tools-1 kernel.org]$ readelf -r old/vmlinux.o | head -5
>>>
>>> Relocation section '.rela.text' at offset 0xafea2f0 contains 205717 entries:
>>>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
>>> 000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
>>> 000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
>>> [hjl@gnu-tools-1 kernel.org]$ readelf -r new/vmlinux.o | head -5
>>>
>>> Relocation section '.rela.text' at offset 0xafea280 contains 205711 entries:
>>>   Offset          Info           Type           Sym. Value    Sym. Name + Addend
>>> 000000000001  1253100000002 R_X86_64_PC32     0000000000001e70 __fentry__ - 4
>>> 000000000009  1c8c00000002 R_X86_64_PC32     0000000000000000 .data + 51bc
>>> [hjl@gnu-tools-1 kernel.org]$
>>>
>>> It removes 6 relocations.  On gcc master branch,
>>>
>>> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1
>>>    text   data    bss    dec    hex filename
>>> 21529621  62256 1348312 22940189 15e0a1d build-x86_64-linux.branch/gcc/cc1
>>> 21529749  62256 1348312 22940317 15e0a9d build-x86_64-linux/gcc/cc1
>>> [hjl@gnu-tools-1 gcc-misc]$ size build-x86_64-linux*/gcc/cc1plus
>>>    text   data    bss    dec    hex filename
>>> 23713509  62400 1372760 25148669 17fbcfd build-x86_64-linux.branch/gcc/cc1plus
>>> 23713669  62400 1372760 25148829 17fbd9d build-x86_64-linux/gcc/cc1plus
>>> [hjl@gnu-tools-1 gcc-misc]$
>>>
>>> It is more effective.  I will run more tests.
>>
>> This seems like a sensible idea, but I can imagine it breaking some
>> weird use cases (like that one Linux thing).  Is that okay?
>>
>
> What about the patch I posted recently?
>

I replied in that thread.

--Andy

>         -hpa
>
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


Thread overview: 6+ messages
2015-05-13  0:14 RFC: Add -mshared option to x86 ELF assembler H.J. Lu
2015-05-13 11:50 ` H.J. Lu
2015-05-13 12:59   ` H.J. Lu
2015-05-20 20:02     ` Andy Lutomirski
2015-05-20 20:32       ` H. Peter Anvin
2015-05-20 20:54         ` Andy Lutomirski