public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym.
@ 2022-05-25  6:42 Tatsuyuki Ishi
  2022-05-25  6:42 ` [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  6:42 UTC (permalink / raw)
  To: binutils

This patch adds support for the relevant LLVM assembler directives,
which is a prerequisite for GCC support of the feature.

[1]: https://llvm.org/docs/Extensions.html#sht-llvm-addrsig-section-address-significance-table



^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG.
  2022-05-25  6:42 [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Tatsuyuki Ishi
@ 2022-05-25  6:42 ` Tatsuyuki Ishi
  2022-05-26 10:03   ` Jan Beulich
  2022-05-25  6:42 ` [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF Tatsuyuki Ishi
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  6:42 UTC (permalink / raw)
  To: binutils; +Cc: Tatsuyuki Ishi

---
 include/elf/common.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/elf/common.h b/include/elf/common.h
index e4bc53e35b4..fd31eb0c001 100644
--- a/include/elf/common.h
+++ b/include/elf/common.h
@@ -534,6 +534,9 @@
 #define SHT_LOOS	0x60000000	/* First of OS specific semantics */
 #define SHT_HIOS	0x6fffffff	/* Last of OS specific semantics */
 
+/* LLVM extension */
+#define SHT_LLVM_ADDRSIG  0x6fff4c03    /* Address significance table */
+
 #define SHT_GNU_INCREMENTAL_INPUTS 0x6fff4700   /* incremental build data */
 #define SHT_GNU_ATTRIBUTES 0x6ffffff5	/* Object attributes */
 #define SHT_GNU_HASH	0x6ffffff6	/* GNU style symbol hash table */
-- 
2.36.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
  2022-05-25  6:42 [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Tatsuyuki Ishi
  2022-05-25  6:42 ` [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
@ 2022-05-25  6:42 ` Tatsuyuki Ishi
  2022-05-25  7:23   ` Alan Modra
  2022-05-25  7:49   ` Jan Beulich
  2022-05-25  6:42 ` [PATCH 3/4] bfd: Output SH_LINK to .symtab for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  6:42 UTC (permalink / raw)
  To: binutils; +Cc: Tatsuyuki Ishi

These are LLVM extensions for specifying address significance, i.e.
if the symbols have their address taken, which is in turn used for
the Identical Code Folding link-time optimization.

For now, it's only implemented for the ELF platform; LLVM also supports
COFF, but the format is not well documented and usage is not widespread.

The addrsig directive signifies the assembler to emit the .llvm_addrsig
section, which signals the linker that it's possible to do (safe) ICF on
this object file.

The addrsig_sym directive marks a symbol as address significant. In this
patch, it's recorded with a boolean flag on the symbol. Later when the
symtab is ready, we loop over the symbols and construct the addrsig
section with indices of those symbols.
---
 gas/config/obj-elf.c | 62 ++++++++++++++++++++++++++++++++++++++++++++
 gas/symbols.c        | 23 +++++++++++++++-
 gas/symbols.h        |  2 ++
 3 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/gas/config/obj-elf.c b/gas/config/obj-elf.c
index c02d26ba453..a1c9ee6922c 100644
--- a/gas/config/obj-elf.c
+++ b/gas/config/obj-elf.c
@@ -75,6 +75,8 @@ static void obj_elf_symver (int);
 static void obj_elf_subsection (int);
 static void obj_elf_popsection (int);
 static void obj_elf_gnu_attribute (int);
+static void obj_elf_llvm_addrsig (int);
+static void obj_elf_llvm_addrsig_sym (int);
 static void obj_elf_tls_common (int);
 static void obj_elf_lcomm (int);
 static void obj_elf_struct (int);
@@ -121,6 +123,9 @@ static const pseudo_typeS elf_pseudo_table[] =
   /* A GNU extension for object attributes.  */
   {"gnu_attribute", obj_elf_gnu_attribute, 0},
 
+  {"addrsig", obj_elf_llvm_addrsig, 0},
+  {"addrsig_sym", obj_elf_llvm_addrsig_sym, 0},
+
   /* These are used for dwarf2.  */
   { "file", dwarf2_directive_file, 0 },
   { "loc",  dwarf2_directive_loc,  0 },
@@ -191,6 +196,7 @@ static const pseudo_typeS ecoff_debug_pseudo_table[] =
 /* This is called when the assembler starts.  */
 
 asection *elf_com_section_ptr;
+static bool addrsig_enabled;
 
 void
 elf_begin (void)
@@ -205,6 +211,8 @@ elf_begin (void)
   s = bfd_get_section_by_name (stdoutput, BSS_SECTION_NAME);
   symbol_table_insert (section_symbol (s));
   elf_com_section_ptr = bfd_com_section_ptr;
+
+  addrsig_enabled = false;
 }
 
 void
@@ -2104,6 +2112,58 @@ obj_elf_gnu_attribute (int ignored ATTRIBUTE_UNUSED)
   obj_elf_vendor_attribute (OBJ_ATTR_GNU);
 }
 
+static void obj_elf_llvm_addrsig (int ignored ATTRIBUTE_UNUSED)
+{
+  demand_empty_rest_of_line ();
+  addrsig_enabled = true;
+}
+
+static void obj_elf_llvm_addrsig_sym (int ignored ATTRIBUTE_UNUSED)
+{
+  char *name;
+  symbolS *sym;
+
+  get_symbol_name (&name);
+  demand_empty_rest_of_line ();
+
+  sym = symbol_find_or_make (name);
+  S_SET_ADDRSIG (sym);
+  symbol_mark_used_in_reloc (sym);
+}
+
+/* Emit an unsigned "little-endian base 128" number.  */
+
+static unsigned int
+out_uleb128 (valueT value)
+{
+  return output_leb128 (frag_more (sizeof_leb128 (value, 0)), value, 0);
+}
+
+static void
+obj_elf_out_llvm_addrsig (void)
+{
+  segT addrsig_seg;
+  symbolS *symp;
+  int symtab_index;
+  unsigned int len = 0;
+
+  if (!addrsig_enabled)
+    return;
+
+  addrsig_seg = subseg_new (".llvm_addrsig", 0);
+  elf_section_type (addrsig_seg) = SHT_LLVM_ADDRSIG;
+  bfd_set_section_flags (addrsig_seg, SEC_HAS_CONTENTS | SEC_EXCLUDE);
+  for (symtab_index = 0, symp = symbol_rootP; symp; symp = symbol_next (symp))
+    {
+      if (S_GET_ADDRSIG (symp))
+        len += out_uleb128 (symtab_index);
+      if (symbol_written_p (symp))
+        symtab_index++;
+    }
+  frag_now->fr_fix = frag_now_fix_octets ();
+  bfd_set_section_size(addrsig_seg, len);
+}
+
 void
 elf_obj_read_begin_hook (void)
 {
@@ -2899,6 +2959,8 @@ elf_frob_file (void)
 {
   bfd_map_over_sections (stdoutput, adjust_stab_sections, NULL);
 
+  obj_elf_out_llvm_addrsig();
+
 #ifdef elf_tc_final_processing
   elf_tc_final_processing ();
 #endif
diff --git a/gas/symbols.c b/gas/symbols.c
index fb480be6f21..dda37143a54 100644
--- a/gas/symbols.c
+++ b/gas/symbols.c
@@ -88,6 +88,10 @@ struct symbol_flags
   /* Set when a warning about the symbol containing multibyte characters
      is generated.  */
   unsigned int multibyte_warned : 1;
+
+  /* Set when a symbol is marked as address significant, i.e. its address
+     is taken.  */
+  unsigned int addr_sig : 1;
 };
 
 /* A pointer in the symbol may point to either a complete symbol
@@ -204,7 +208,7 @@ static void *
 symbol_entry_find (htab_t table, const char *name)
 {
   hashval_t hash = htab_hash_string (name);
-  symbol_entry_t needle = { { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
+  symbol_entry_t needle = { { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
 			      hash, name, 0, 0, 0 } };
   return htab_find_with_hash (table, &needle, hash);
 }
@@ -2578,6 +2582,23 @@ S_SET_FORWARD_REF (symbolS *s)
   s->flags.forward_ref = 1;
 }
 
+bool
+S_GET_ADDRSIG (symbolS *s)
+{
+  if (s->flags.local_symbol)
+    abort();
+  return s->flags.addr_sig;
+}
+
+
+void
+S_SET_ADDRSIG (symbolS *s)
+{
+  if (s->flags.local_symbol)
+    s = local_symbol_convert (s);
+  s->flags.addr_sig = 1;
+}
+
 /* Return the previous symbol in a chain.  */
 
 symbolS *
diff --git a/gas/symbols.h b/gas/symbols.h
index 19eb658ca68..e7802ba8734 100644
--- a/gas/symbols.h
+++ b/gas/symbols.h
@@ -115,6 +115,8 @@ extern void S_SET_THREAD_LOCAL (symbolS *);
 extern void S_SET_VOLATILE (symbolS *);
 extern void S_CLEAR_VOLATILE (symbolS *);
 extern void S_SET_FORWARD_REF (symbolS *);
+extern bool S_GET_ADDRSIG (symbolS *s);
+extern void S_SET_ADDRSIG (symbolS *);
 
 #ifndef WORKING_DOT_WORD
 struct broken_word
-- 
2.36.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 3/4] bfd: Output SH_LINK to .symtab for SHT_LLVM_ADDRSIG.
  2022-05-25  6:42 [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Tatsuyuki Ishi
  2022-05-25  6:42 ` [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
  2022-05-25  6:42 ` [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF Tatsuyuki Ishi
@ 2022-05-25  6:42 ` Tatsuyuki Ishi
  2022-05-25  7:46   ` Jan Beulich
  2022-05-25  6:42 ` [PATCH 4/4] gas: Add basic test for addrsig Tatsuyuki Ishi
  2022-05-25  7:19 ` [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Fangrui Song
  4 siblings, 1 reply; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  6:42 UTC (permalink / raw)
  To: binutils; +Cc: Tatsuyuki Ishi

To mimick LLVM behavior.
---
 bfd/elf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/bfd/elf.c b/bfd/elf.c
index c493aa5b172..63f7fd9f586 100644
--- a/bfd/elf.c
+++ b/bfd/elf.c
@@ -3988,6 +3988,7 @@ assign_section_numbers (bfd *abfd, struct bfd_link_info *link_info)
 	  break;
 
 	case SHT_GROUP:
+    case SHT_LLVM_ADDRSIG:
 	  d->this_hdr.sh_link = elf_onesymtab (abfd);
 	}
     }
-- 
2.36.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 4/4] gas: Add basic test for addrsig.
  2022-05-25  6:42 [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Tatsuyuki Ishi
                   ` (2 preceding siblings ...)
  2022-05-25  6:42 ` [PATCH 3/4] bfd: Output SH_LINK to .symtab for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
@ 2022-05-25  6:42 ` Tatsuyuki Ishi
  2022-05-25  7:30   ` Alan Modra
  2022-05-25  7:19 ` [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Fangrui Song
  4 siblings, 1 reply; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  6:42 UTC (permalink / raw)
  To: binutils; +Cc: Tatsuyuki Ishi

These are very basic tests to ensure that the addrsig directives are
correctly parsed and something is emitted.
---
 gas/testsuite/gas/elf/addrsig.d |  7 +++++++
 gas/testsuite/gas/elf/addrsig.s | 10 ++++++++++
 gas/testsuite/gas/elf/elf.exp   |  1 +
 3 files changed, 18 insertions(+)
 create mode 100644 gas/testsuite/gas/elf/addrsig.d
 create mode 100644 gas/testsuite/gas/elf/addrsig.s

diff --git a/gas/testsuite/gas/elf/addrsig.d b/gas/testsuite/gas/elf/addrsig.d
new file mode 100644
index 00000000000..faa6481b200
--- /dev/null
+++ b/gas/testsuite/gas/elf/addrsig.d
@@ -0,0 +1,7 @@
+# name: .llvm_addrsig
+# objdump: -sj .llvm_addrsig
+
+tmpdir/addrsig.o:     file format elf64-x86-64
+
+Contents of section \.llvm_addrsig:
+ 0000 00020304                             .*
\ No newline at end of file
diff --git a/gas/testsuite/gas/elf/addrsig.s b/gas/testsuite/gas/elf/addrsig.s
new file mode 100644
index 00000000000..db822861d8d
--- /dev/null
+++ b/gas/testsuite/gas/elf/addrsig.s
@@ -0,0 +1,10 @@
+.addrsig
+.addrsig_sym g1
+.globl g2
+.addrsig_sym g3
+.addrsig_sym local
+.addrsig_sym .Llocal
+
+local:
+.Llocal:
+.Llocal_hidden:
diff --git a/gas/testsuite/gas/elf/elf.exp b/gas/testsuite/gas/elf/elf.exp
index 07f08a00a28..369fed64fa8 100644
--- a/gas/testsuite/gas/elf/elf.exp
+++ b/gas/testsuite/gas/elf/elf.exp
@@ -355,4 +355,5 @@ if { [is_elf_format] } then {
     run_dump_test "bignums"
     run_dump_test "section-symbol-redef"
     run_dump_test "pr27228"
+    run_dump_test "addrsig"
 }
-- 
2.36.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym.
  2022-05-25  6:42 [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Tatsuyuki Ishi
                   ` (3 preceding siblings ...)
  2022-05-25  6:42 ` [PATCH 4/4] gas: Add basic test for addrsig Tatsuyuki Ishi
@ 2022-05-25  7:19 ` Fangrui Song
  2022-05-25  7:34   ` Alan Modra
  4 siblings, 1 reply; 22+ messages in thread
From: Fangrui Song @ 2022-05-25  7:19 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On 2022-05-25, Tatsuyuki Ishi via Binutils wrote:
>This patch adds support for the relevant LLVM assembler directives,
>which is a prerequisite for GCC support of the feature.
>
>[1]: https://llvm.org/docs/Extensions.html#sht-llvm-addrsig-section-address-significance-table
>
>

I have to be fair, the design of SHT_LLVM_ADDRSIG has a major problem
that it does not work with objcopy and ld -r:
https://sourceware.org/bugzilla/show_bug.cgi?id=23817
That said, using relocations would greatly increase the size of the
section and the current uleb128 does a good job on decreasing the file
size...

If ld supports SHT_LLVM_ADDRSIG, it would probably make it easier
to assemble `clang -S` output with `as`. Currently, on many targets/Linux
distributions, `clang -S` defaults to -faddrsig which makes the output
not-assembable with `as`.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
  2022-05-25  6:42 ` [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF Tatsuyuki Ishi
@ 2022-05-25  7:23   ` Alan Modra
  2022-05-25  7:49   ` Jan Beulich
  1 sibling, 0 replies; 22+ messages in thread
From: Alan Modra @ 2022-05-25  7:23 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On Wed, May 25, 2022 at 03:42:50PM +0900, Tatsuyuki Ishi via Binutils wrote:
> +  for (symtab_index = 0, symp = symbol_rootP; symp; symp = symbol_next (symp))
> +    {
> +      if (S_GET_ADDRSIG (symp))
> +        len += out_uleb128 (symtab_index);
> +      if (symbol_written_p (symp))
> +        symtab_index++;
> +    }

This looks odd, seemingly allowing multiple uleb output for the same
index.  Wouldn't the following be better?

      if (symbol_written_p (symp))
	{
	  if (S_GET_ADDRSIG (symp))
	    len += out_uleb128 (symtab_index);
	  symtab_index++;
	}

> +bool
> +S_GET_ADDRSIG (symbolS *s)
> +{
> +  if (s->flags.local_symbol)
> +    abort();

"return false" rather than "abort()" please.

> +  return s->flags.addr_sig;
> +}

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/4] gas: Add basic test for addrsig.
  2022-05-25  6:42 ` [PATCH 4/4] gas: Add basic test for addrsig Tatsuyuki Ishi
@ 2022-05-25  7:30   ` Alan Modra
  0 siblings, 0 replies; 22+ messages in thread
From: Alan Modra @ 2022-05-25  7:30 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On Wed, May 25, 2022 at 03:42:52PM +0900, Tatsuyuki Ishi via Binutils wrote:
> These are very basic tests to ensure that the addrsig directives are
> correctly parsed and something is emitted.

New tests should be checked on multiple targets.

> +tmpdir/addrsig.o:     file format elf64-x86-64

This will obviously fail on anything other than x86.

> +++ b/gas/testsuite/gas/elf/addrsig.s
> @@ -0,0 +1,10 @@
> +.addrsig
> +.addrsig_sym g1
> +.globl g2
> +.addrsig_sym g3
> +.addrsig_sym local
> +.addrsig_sym .Llocal
> +
> +local:
> +.Llocal:
> +.Llocal_hidden:

Please indent directives with a space or tab.  Some targets assume
that text starting in the first column is a label.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym.
  2022-05-25  7:19 ` [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Fangrui Song
@ 2022-05-25  7:34   ` Alan Modra
  2022-05-25  7:59     ` Fangrui Song
  2022-05-25  8:01     ` Tatsuyuki Ishi
  0 siblings, 2 replies; 22+ messages in thread
From: Alan Modra @ 2022-05-25  7:34 UTC (permalink / raw)
  To: Fangrui Song; +Cc: Tatsuyuki Ishi, binutils

On Wed, May 25, 2022 at 12:19:24AM -0700, Fangrui Song wrote:
> I have to be fair, the design of SHT_LLVM_ADDRSIG has a major problem
> that it does not work with objcopy and ld -r:

So, discard the section on ld -r and objcopy if symbols change?

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/4] bfd: Output SH_LINK to .symtab for SHT_LLVM_ADDRSIG.
  2022-05-25  6:42 ` [PATCH 3/4] bfd: Output SH_LINK to .symtab for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
@ 2022-05-25  7:46   ` Jan Beulich
  0 siblings, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2022-05-25  7:46 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On 25.05.2022 08:42, Tatsuyuki Ishi via Binutils wrote:
> --- a/bfd/elf.c
> +++ b/bfd/elf.c
> @@ -3988,6 +3988,7 @@ assign_section_numbers (bfd *abfd, struct bfd_link_info *link_info)
>  	  break;
>  
>  	case SHT_GROUP:
> +    case SHT_LLVM_ADDRSIG:
>  	  d->this_hdr.sh_link = elf_onesymtab (abfd);

Nit: You want to properly indent the line you're adding.

Jan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
  2022-05-25  6:42 ` [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF Tatsuyuki Ishi
  2022-05-25  7:23   ` Alan Modra
@ 2022-05-25  7:49   ` Jan Beulich
  2022-05-25  7:58     ` Tatsuyuki Ishi
  1 sibling, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-05-25  7:49 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On 25.05.2022 08:42, Tatsuyuki Ishi via Binutils wrote:
> These are LLVM extensions for specifying address significance, i.e.
> if the symbols have their address taken, which is in turn used for
> the Identical Code Folding link-time optimization.
> 
> For now, it's only implemented for the ELF platform; LLVM also supports
> COFF, but the format is not well documented and usage is not widespread.
> 
> The addrsig directive signifies the assembler to emit the .llvm_addrsig
> section, which signals the linker that it's possible to do (safe) ICF on
> this object file.
> 
> The addrsig_sym directive marks a symbol as address significant. In this
> patch, it's recorded with a boolean flag on the symbol. Later when the
> symtab is ready, we loop over the symbols and construct the addrsig
> section with indices of those symbols.

Why two directives? IOW what's the point of silently ignoring
.addrsig_sym when there's no .addrsig anywhere? And such a toggle-on
directive likely would want to allow expressing also (or even only)
via a command line option.

Jan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
  2022-05-25  7:49   ` Jan Beulich
@ 2022-05-25  7:58     ` Tatsuyuki Ishi
  2022-05-25  8:07       ` Fangrui Song
  2022-05-25  8:34       ` Jan Beulich
  0 siblings, 2 replies; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  7:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

> Why two directives? IOW what's the point of silently ignoring
> .addrsig_sym when there's no .addrsig anywhere?

Because addrsig is an opt-in mechanism. A file with only .addrsig means that
none of the symbols are address significant, while a file without is treated
as if all symbols were.

The silent ignore behavior is not ideal, but it could be added later. The
primary target of this patch is compiler generated output.

> And such a toggle-on
> directive likely would want to allow expressing also (or even only)
> via a command line option.

How would that work? That leads to two kind of end results, 1. mark *all*
symbols as address-insignificant (if there's no addrsig_sym directives)
or maybe 2. toggle off llvm_addrsig support (but why?).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym.
  2022-05-25  7:34   ` Alan Modra
@ 2022-05-25  7:59     ` Fangrui Song
  2022-05-25  8:01     ` Tatsuyuki Ishi
  1 sibling, 0 replies; 22+ messages in thread
From: Fangrui Song @ 2022-05-25  7:59 UTC (permalink / raw)
  To: Alan Modra; +Cc: Tatsuyuki Ishi, binutils

On Wed, May 25, 2022 at 12:34 AM Alan Modra <amodra@gmail.com> wrote:
>
> On Wed, May 25, 2022 at 12:19:24AM -0700, Fangrui Song wrote:
> > I have to be fair, the design of SHT_LLVM_ADDRSIG has a major problem
> > that it does not work with objcopy and ld -r:
>
> So, discard the section on ld -r and objcopy if symbols change?

In llvm-project, llvm-objcopy does not recognize/discard SHT_LLVM_ADDRSIG.
Both GNU objcopy and llvm-objcopy will set sh_link to 0 for the
unrecognized section.

ld.lld --icf=safe prints a warning [1] for SHT_LLVM_ADDRSIG when
sh_link==0 (--icf=all does not need SHT_LLVM_ADDRSIG).
The users I need to serve use -Wl,--fatal-warnings to convert warnings
to errors, so we need to do   `objcopy --remove-section=.llvm_addrsig`
in a few places.

objcopy discarding SHT_LLVM_ADDRSIG would be more ergonomic, but it
raises a question whether it is better for the user to be aware of the
problem and discard the section by themselves.

[1]: https://github.com/llvm/llvm-project/blob/main/lld/test/ELF/icf-safe.s#L92

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym.
  2022-05-25  7:34   ` Alan Modra
  2022-05-25  7:59     ` Fangrui Song
@ 2022-05-25  8:01     ` Tatsuyuki Ishi
  1 sibling, 0 replies; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  8:01 UTC (permalink / raw)
  To: Alan Modra; +Cc: Fangrui Song, binutils

> So, discard the section on ld -r and objcopy if symbols change?

Actually, this is something I need to take care of. LLD currently uses sh_link=0
as a heuristic for such corrupted entries:

https://github.com/llvm/llvm-project/blob/09c2b7c35af8c4bad39f03e9f60df8bd07323028/lld/ELF/InputFiles.cpp#L540-L554

But if we add the sh_link marking into bfd (patch #3), then it will
silently break again, if I understand correctly.

Does discarding sound good to you?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
  2022-05-25  7:58     ` Tatsuyuki Ishi
@ 2022-05-25  8:07       ` Fangrui Song
  2022-05-25  8:34       ` Jan Beulich
  1 sibling, 0 replies; 22+ messages in thread
From: Fangrui Song @ 2022-05-25  8:07 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: Jan Beulich, binutils

On Wed, May 25, 2022 at 12:58 AM Tatsuyuki Ishi via Binutils
<binutils@sourceware.org> wrote:
>
> > Why two directives? IOW what's the point of silently ignoring
> > .addrsig_sym when there's no .addrsig anywhere?
>
> Because addrsig is an opt-in mechanism. A file with only .addrsig means that
> none of the symbols are address significant, while a file without is treated
> as if all symbols were.
>
> The silent ignore behavior is not ideal, but it could be added later. The
> primary target of this patch is compiler generated output.

The idea is that if no symbol is address significant, clang -S just
emits `.addrsig` with no `.addrsig_sym $sym`.
The .llvm_addrsig section is empty but it conveys information to
ld.lld --icf=safe.

In llvm-mc, `.addrsig_sym` accepts exactly one symbol. Technically, if
the directive were designed to support zero or any number of symbols,
`.addrsig` could be removed.

> > And such a toggle-on
> > directive likely would want to allow expressing also (or even only)
> > via a command line option.
>
> How would that work? That leads to two kind of end results, 1. mark *all*
> symbols as address-insignificant (if there's no addrsig_sym directives)
> or maybe 2. toggle off llvm_addrsig support (but why?).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
  2022-05-25  7:58     ` Tatsuyuki Ishi
  2022-05-25  8:07       ` Fangrui Song
@ 2022-05-25  8:34       ` Jan Beulich
  2022-05-25  8:41         ` Tatsuyuki Ishi
  1 sibling, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-05-25  8:34 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On 25.05.2022 09:58, Tatsuyuki Ishi wrote:
>> Why two directives? IOW what's the point of silently ignoring
>> .addrsig_sym when there's no .addrsig anywhere?
> 
> Because addrsig is an opt-in mechanism. A file with only .addrsig means that
> none of the symbols are address significant, while a file without is treated
> as if all symbols were.

Oh, right.

> The silent ignore behavior is not ideal, but it could be added later. The
> primary target of this patch is compiler generated output.

Is there a reason .addrsig_sym can't suffice to enable emission of
the section without any .addrsig?

>> And such a toggle-on
>> directive likely would want to allow expressing also (or even only)
>> via a command line option.
> 
> How would that work? That leads to two kind of end results, 1. mark *all*
> symbols as address-insignificant (if there's no addrsig_sym directives)
> or maybe 2. toggle off llvm_addrsig support (but why?).

I find the question odd. How would a command line option be different
from use of the directive? (But yes, with the directive alone being
meaningful I can see that "only" via command line option isn't very
nice. But for assembler source having the alternative of a command
line option would look useful to me, as then one wouldn't need to
repeat the directive in every source file.)

Jan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
  2022-05-25  8:34       ` Jan Beulich
@ 2022-05-25  8:41         ` Tatsuyuki Ishi
  0 siblings, 0 replies; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-25  8:41 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

> Is there a reason .addrsig_sym can't suffice to enable emission of
> the section without any .addrsig?

There isn't, but well, it basically mirrors the behavior (or a bug if you want
to call that) of llvm-mc.

> But for assembler source having the alternative of a command
> line option would look useful to me, as then one wouldn't need to
> repeat the directive in every source file.

That's fair, although addrsig is only relevant when you have something that
duplicates code (=compiler), and I don't think there's much audience that uses
it in hand-written assembly?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG.
  2022-05-25  6:42 ` [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
@ 2022-05-26 10:03   ` Jan Beulich
  2022-05-26 10:13     ` Tatsuyuki Ishi
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-05-26 10:03 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On 25.05.2022 08:42, Tatsuyuki Ishi via Binutils wrote:
> --- a/include/elf/common.h
> +++ b/include/elf/common.h
> @@ -534,6 +534,9 @@
>  #define SHT_LOOS	0x60000000	/* First of OS specific semantics */
>  #define SHT_HIOS	0x6fffffff	/* Last of OS specific semantics */
>  
> +/* LLVM extension */
> +#define SHT_LLVM_ADDRSIG  0x6fff4c03    /* Address significance table */

Mind me asking what OS value(s) this and other SHT_LLVM_* values are
intended to be qualified by?

Jan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG.
  2022-05-26 10:03   ` Jan Beulich
@ 2022-05-26 10:13     ` Tatsuyuki Ishi
  2022-05-26 10:19       ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-26 10:13 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

> > +/* LLVM extension */
> > +#define SHT_LLVM_ADDRSIG  0x6fff4c03    /* Address significance table */
>
> Mind me asking what OS value(s) this and other SHT_LLVM_* values are
> intended to be qualified by?

Pardon me but I'm not sure I understand your question. What do you
mean by "qualify"?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG.
  2022-05-26 10:13     ` Tatsuyuki Ishi
@ 2022-05-26 10:19       ` Jan Beulich
  2022-05-26 10:36         ` Tatsuyuki Ishi
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2022-05-26 10:19 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: binutils

On 26.05.2022 12:13, Tatsuyuki Ishi wrote:
>>> +/* LLVM extension */
>>> +#define SHT_LLVM_ADDRSIG  0x6fff4c03    /* Address significance table */
>>
>> Mind me asking what OS value(s) this and other SHT_LLVM_* values are
>> intended to be qualified by?
> 
> Pardon me but I'm not sure I understand your question. What do you
> mean by "qualify"?

Values in [..._LOOS,..._HIOS] pertain to particular OSes (i.e.
ELFOSABI_* values in the header). LLVM isn't really an OS by itself
(and I'm also unaware of ELFOSABI_LLVM existing), so I expect there
are certain ELFOSABI_* values to which SHT_LLVM_* apply. Otherwise
how are collisions with other OSes' SHT_<OS>_... values avoided?

Jan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG.
  2022-05-26 10:19       ` Jan Beulich
@ 2022-05-26 10:36         ` Tatsuyuki Ishi
  2022-05-27 17:28           ` Peter Collingbourne
  0 siblings, 1 reply; 22+ messages in thread
From: Tatsuyuki Ishi @ 2022-05-26 10:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: binutils

> Values in [..._LOOS,..._HIOS] pertain to particular OSes (i.e.
> ELFOSABI_* values in the header). LLVM isn't really an OS by itself
> (and I'm also unaware of ELFOSABI_LLVM existing), so I expect there
> are certain ELFOSABI_* values to which SHT_LLVM_* apply. Otherwise
> how are collisions with other OSes' SHT_<OS>_... values avoided?

I'm not a LLVM contributor, but unfortunately it seems like the answer
is that they do not follow the practice of constraining these values
based on EI_OSABI, and they are basically treated as same as non-OS
specific values [1].

[1] https://github.com/llvm/llvm-project/blob/09c2b7c35af8c4bad39f03e9f60df8bd07323028/llvm/lib/Object/ELF.cpp#L232-L307

---
Tatsuyuki

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG.
  2022-05-26 10:36         ` Tatsuyuki Ishi
@ 2022-05-27 17:28           ` Peter Collingbourne
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Collingbourne @ 2022-05-27 17:28 UTC (permalink / raw)
  To: Tatsuyuki Ishi; +Cc: Jan Beulich, Binutils

On Thu, May 26, 2022 at 3:37 AM Tatsuyuki Ishi via Binutils
<binutils@sourceware.org> wrote:
>
> > Values in [..._LOOS,..._HIOS] pertain to particular OSes (i.e.
> > ELFOSABI_* values in the header). LLVM isn't really an OS by itself
> > (and I'm also unaware of ELFOSABI_LLVM existing), so I expect there
> > are certain ELFOSABI_* values to which SHT_LLVM_* apply. Otherwise
> > how are collisions with other OSes' SHT_<OS>_... values avoided?
>
> I'm not a LLVM contributor, but unfortunately it seems like the answer
> is that they do not follow the practice of constraining these values
> based on EI_OSABI, and they are basically treated as same as non-OS
> specific values [1].
>
> [1] https://github.com/llvm/llvm-project/blob/09c2b7c35af8c4bad39f03e9f60df8bd07323028/llvm/lib/Object/ELF.cpp#L232-L307

Hi,

The LLVM section types are in a carveout of the GNU OSABI range, as
previously agreed on the GNU gABI list:
https://sourceware.org/pipermail/gnu-gabi/2017q1/000157.html

If other OSs want to adopt the LLVM section types, they will need to
create the same carveout in their OSABI.

From the GNU binutils perspective, I think these should be treated the
same as the SHT_GNU_* section types. I'm not sure of the reason why
the LLVM ELF parser doesn't consider the OSABI for the GNU/LLVM
section types, but it may just be that there hasn't yet been a need to
do so.

Peter

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-05-27 17:28 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-25  6:42 [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Tatsuyuki Ishi
2022-05-25  6:42 ` [PATCH 1/4] elf: Add definition for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
2022-05-26 10:03   ` Jan Beulich
2022-05-26 10:13     ` Tatsuyuki Ishi
2022-05-26 10:19       ` Jan Beulich
2022-05-26 10:36         ` Tatsuyuki Ishi
2022-05-27 17:28           ` Peter Collingbourne
2022-05-25  6:42 ` [PATCH 2/4] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF Tatsuyuki Ishi
2022-05-25  7:23   ` Alan Modra
2022-05-25  7:49   ` Jan Beulich
2022-05-25  7:58     ` Tatsuyuki Ishi
2022-05-25  8:07       ` Fangrui Song
2022-05-25  8:34       ` Jan Beulich
2022-05-25  8:41         ` Tatsuyuki Ishi
2022-05-25  6:42 ` [PATCH 3/4] bfd: Output SH_LINK to .symtab for SHT_LLVM_ADDRSIG Tatsuyuki Ishi
2022-05-25  7:46   ` Jan Beulich
2022-05-25  6:42 ` [PATCH 4/4] gas: Add basic test for addrsig Tatsuyuki Ishi
2022-05-25  7:30   ` Alan Modra
2022-05-25  7:19 ` [PATCH 0/4] gas: Add support for LLVM addrsig and addrsig_sym Fangrui Song
2022-05-25  7:34   ` Alan Modra
2022-05-25  7:59     ` Fangrui Song
2022-05-25  8:01     ` Tatsuyuki Ishi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).