From: Tatsuyuki Ishi
To: binutils@sourceware.org
Cc: amodra@gmail.com, jbeulich@suse.com, i@maskray.me, Tatsuyuki Ishi
Subject: [PATCH v2 5/6] gas: Add support for LLVM addrsig and addrsig_sym directives on ELF.
Date: Fri, 24 Jun 2022 00:13:53 +0900
Message-Id: <20220623151353.62139-6-ishitatsuyuki@gmail.com>
In-Reply-To: <20220623151353.62139-1-ishitatsuyuki@gmail.com>
References: <20220623151353.62139-1-ishitatsuyuki@gmail.com>

These are LLVM extensions for specifying address significance, i.e.
whether a symbol has its address taken. The linker in turn uses this
information for the Identical Code Folding (ICF) link-time optimization.

For now, this is only implemented for ELF. LLVM also supports COFF, but
the format is not well documented and usage is not widespread.

The addrsig directive tells the assembler to emit the .llvm_addrsig
section, which signals to the linker that (safe) ICF can be performed on
this object file.

The addrsig_sym directive marks a symbol as address significant. In
this patch, that is recorded with a boolean flag on the symbol. Later,
when the symtab is ready, we loop over the symbols and construct the
.llvm_addrsig section from the indices of the marked symbols.
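For readers unfamiliar with the section layout, the loop described above
can be sketched outside of gas as follows. This is only an illustration
and not the code in this patch: struct demo_sym, demo_uleb128 and
demo_build_addrsig are hypothetical names, and the sketch assumes, like
the loop in obj_elf_out_llvm_addrsig, that indices are counted from 0
over the symbols that are actually written out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a gas symbol; only the two properties the
   loop cares about are modelled.  */
struct demo_sym
{
  int written;   /* nonzero if the symbol ends up in the symbol table */
  int addr_sig;  /* nonzero if .addrsig_sym named this symbol */
};

/* Encode VALUE as ULEB128 into OUT; return the number of bytes written.  */
static size_t
demo_uleb128 (uint64_t value, unsigned char *out)
{
  size_t n = 0;

  assert (out != NULL);
  do
    {
      unsigned char byte = value & 0x7f;

      value >>= 7;
      if (value != 0)
        byte |= 0x80;  /* more bytes follow */
      out[n++] = byte;
    }
  while (value != 0);
  return n;
}

/* Walk SYMS in symbol-table order and append the index of every
   address-significant symbol to OUT.  Return the resulting size.  */
static size_t
demo_build_addrsig (const struct demo_sym *syms, size_t count,
                    unsigned char *out)
{
  size_t len = 0;
  uint64_t index = 0;
  size_t i;

  for (i = 0; i < count; i++)
    if (syms[i].written)
      {
        if (syms[i].addr_sig)
          len += demo_uleb128 (index, out + len);
        index++;  /* symbols that are not written do not consume an index */
      }
  return len;
}
```

The caller is expected to provide a buffer large enough for the encoded
indices. A symbol that is marked but never written contributes nothing
and does not advance the index.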
---
 gas/config/obj-elf.c | 62 ++++++++++++++++++++++++++++++++++++++++++++
 gas/symbols.c        | 23 +++++++++++++++-
 gas/symbols.h        |  2 ++
 3 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/gas/config/obj-elf.c b/gas/config/obj-elf.c
index e5ab8514de7..7f2f28ed9ee 100644
--- a/gas/config/obj-elf.c
+++ b/gas/config/obj-elf.c
@@ -75,6 +75,8 @@ static void obj_elf_symver (int);
 static void obj_elf_subsection (int);
 static void obj_elf_popsection (int);
 static void obj_elf_gnu_attribute (int);
+static void obj_elf_llvm_addrsig (int);
+static void obj_elf_llvm_addrsig_sym (int);
 static void obj_elf_tls_common (int);
 static void obj_elf_lcomm (int);
 static void obj_elf_struct (int);
@@ -121,6 +123,9 @@ static const pseudo_typeS elf_pseudo_table[] =
   /* A GNU extension for object attributes. */
   {"gnu_attribute", obj_elf_gnu_attribute, 0},
 
+  {"addrsig", obj_elf_llvm_addrsig, 0},
+  {"addrsig_sym", obj_elf_llvm_addrsig_sym, 0},
+
   /* These are used for dwarf2. */
   { "file", dwarf2_directive_file, 0 },
   { "loc", dwarf2_directive_loc, 0 },
@@ -191,6 +196,7 @@ static const pseudo_typeS ecoff_debug_pseudo_table[] =
 /* This is called when the assembler starts. */
 
 asection *elf_com_section_ptr;
+static bool addrsig_enabled;
 
 void
 elf_begin (void)
@@ -205,6 +211,8 @@ elf_begin (void)
   s = bfd_get_section_by_name (stdoutput, BSS_SECTION_NAME);
   symbol_table_insert (section_symbol (s));
   elf_com_section_ptr = bfd_com_section_ptr;
+
+  addrsig_enabled = false;
 }
 
 void
@@ -2108,6 +2116,58 @@ obj_elf_gnu_attribute (int ignored ATTRIBUTE_UNUSED)
   obj_elf_vendor_attribute (OBJ_ATTR_GNU);
 }
 
+static void obj_elf_llvm_addrsig (int ignored ATTRIBUTE_UNUSED)
+{
+  demand_empty_rest_of_line ();
+  addrsig_enabled = true;
+}
+
+static void obj_elf_llvm_addrsig_sym (int ignored ATTRIBUTE_UNUSED)
+{
+  char *name;
+  symbolS *sym;
+
+  get_symbol_name (&name);
+  demand_empty_rest_of_line ();
+
+  sym = symbol_find_or_make (name);
+  S_SET_ADDRSIG (sym);
+  symbol_mark_used_in_reloc (sym);
+}
+
+/* Emit an unsigned "little-endian base 128" number. */
+
+static unsigned int
+out_uleb128 (valueT value)
+{
+  return output_leb128 (frag_more (sizeof_leb128 (value, 0)), value, 0);
+}
+
+static void
+obj_elf_out_llvm_addrsig (void)
+{
+  segT addrsig_seg;
+  symbolS *symp;
+  int symtab_index;
+  unsigned int len = 0;
+
+  if (!addrsig_enabled)
+    return;
+
+  addrsig_seg = subseg_new (".llvm_addrsig", 0);
+  elf_section_type (addrsig_seg) = SHT_LLVM_ADDRSIG;
+  bfd_set_section_flags (addrsig_seg, SEC_HAS_CONTENTS | SEC_EXCLUDE);
+  for (symtab_index = 0, symp = symbol_rootP; symp; symp = symbol_next (symp))
+    if (symbol_written_p (symp))
+      {
+        if (S_GET_ADDRSIG (symp))
+          len += out_uleb128 (symtab_index);
+        symtab_index++;
+      }
+  frag_now->fr_fix = frag_now_fix_octets ();
+  bfd_set_section_size(addrsig_seg, len);
+}
+
 void
 elf_obj_read_begin_hook (void)
 {
@@ -2903,6 +2963,8 @@ elf_frob_file (void)
 {
   bfd_map_over_sections (stdoutput, adjust_stab_sections, NULL);
 
+  obj_elf_out_llvm_addrsig();
+
 #ifdef elf_tc_final_processing
   elf_tc_final_processing ();
 #endif
diff --git a/gas/symbols.c b/gas/symbols.c
index e3fddee8c79..881d6d08851 100644
--- a/gas/symbols.c
+++ b/gas/symbols.c
@@ -88,6 +88,10 @@ struct symbol_flags
   /* Set when a warning about the symbol containing multibyte characters
      is generated. */
   unsigned int multibyte_warned : 1;
+
+  /* Set when a symbol is marked as address significant, i.e. its address
+     is taken. */
+  unsigned int addr_sig : 1;
 };
 
 /* A pointer in the symbol may point to either a complete symbol
@@ -204,7 +208,7 @@ static void *
 symbol_entry_find (htab_t table, const char *name)
 {
   hashval_t hash = htab_hash_string (name);
-  symbol_entry_t needle = { { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
+  symbol_entry_t needle = { { { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
                               hash, name, 0, 0, 0 } };
   return htab_find_with_hash (table, &needle, hash);
 }
@@ -2578,6 +2582,23 @@ S_SET_FORWARD_REF (symbolS *s)
   s->flags.forward_ref = 1;
 }
 
+bool
+S_GET_ADDRSIG (symbolS *s)
+{
+  if (s->flags.local_symbol)
+    return false;
+  return s->flags.addr_sig;
+}
+
+
+void
+S_SET_ADDRSIG (symbolS *s)
+{
+  if (s->flags.local_symbol)
+    s = local_symbol_convert (s);
+  s->flags.addr_sig = 1;
+}
+
 /* Return the previous symbol in a chain. */
 
 symbolS *
diff --git a/gas/symbols.h b/gas/symbols.h
index 19eb658ca68..e7802ba8734 100644
--- a/gas/symbols.h
+++ b/gas/symbols.h
@@ -115,6 +115,8 @@ extern void S_SET_THREAD_LOCAL (symbolS *);
 extern void S_SET_VOLATILE (symbolS *);
 extern void S_CLEAR_VOLATILE (symbolS *);
 extern void S_SET_FORWARD_REF (symbolS *);
+extern bool S_GET_ADDRSIG (symbolS *s);
+extern void S_SET_ADDRSIG (symbolS *);
 
 #ifndef WORKING_DOT_WORD
 struct broken_word
-- 
2.36.1