public inbox for gdb-cvs@sourceware.org
From: Nick Alcock <nix@sourceware.org>
To: bfd-cvs@sourceware.org, gdb-cvs@sourceware.org
Subject: [binutils-gdb] include, libctf, ld: extend variable section to contain functions too
Date: Wed, 23 Mar 2022 13:53:21 +0000 (GMT)
Message-ID: <20220323135321.5DDC63857C49@sourceware.org>

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=203bfa2f6bd275df4089131bac0a17c278c37a1a

commit 203bfa2f6bd275df4089131bac0a17c278c37a1a
Author: Nick Alcock <nick.alcock@oracle.com>
Date:   Wed Mar 16 15:29:25 2022 +0000

    include, libctf, ld: extend variable section to contain functions too

    The CTF variable section is an optional (usually-not-present) section in
    the CTF dict which contains name -> type mappings corresponding to data
    symbols that are present in the linker input but not in the output
    symbol table: the idea is that programs that use their own symbol-
    resolution mechanisms can use this section to look up the types of
    symbols they have found using their own mechanism.

    Because these removed symbols (mostly static variables, functions, etc) all
    have names that are unlikely to appear in the ELF symtab and because very
    few programs have their own symbol-resolution mechanisms, a special
    linker flag (--ctf-variables) is needed to emit this section.

    Historically, we emitted only removed data symbols into the variable
    section.  This seemed to make sense at the time, but in hindsight it
    really doesn't: functions are symbols too, and a C program can look them
    up just like any other type.  So extend the variable section so that it
    contains all static function symbols too (if it is emitted at all), with
    types of kind CTF_K_FUNCTION.

    This is a little fiddly.  We relied on compiler assistance for data
    symbols: the compiler simply emits all data symbols twice, once into the
    symtypetab as an indexed symbol and once into the variable section.
    Rather than wait for a suitably adjusted compiler that does the same for
    function symbols, we can pluck unreported function symbols out of the
    symtab and add them to the variable section ourselves.  While we're at
    it, we do the same with data symbols: this is redundant right now because
    the compiler does it, but it costs very little time and lets the compiler
    drop this kludge and save a little space in .o files.

    include/
            * ctf.h: Mention the new things we can see in the variable section.

    ld/
            * testsuite/ld-ctf/data-func-conflicted-vars.d: New test.

    libctf/
            * ctf-link.c (ctf_link_deduplicating_variables): Duplicate
            symbols into the variable section too.
            * ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename
            to...
            (symtypetab_delete_nonstatics): ... this.  Check the funchash
            when pruning redundant variables.
            (ctf_symtypetab_sect_sizes): Adjust accordingly.
            * NEWS: Describe this change.

Diff:
---
 include/ctf.h                                   |  8 +--
 ld/testsuite/ld-ctf/data-func-conflicted-vars.d | 69 +++++++++++++++++++++++++
 libctf/NEWS                                     |  9 ++++
 libctf/ctf-link.c                               | 37 ++++++++++++-
 libctf/ctf-serialize.c                          | 23 +++++----
 5 files changed, 130 insertions(+), 16 deletions(-)

diff --git a/include/ctf.h b/include/ctf.h
index 6db2742d5fb..698aab3eab6 100644
--- a/include/ctf.h
+++ b/include/ctf.h
@@ -89,13 +89,13 @@ extern "C"
    entries and reorder them accordingly (dropping the indexes in the process).
 
    Variable records (as distinct from data objects) provide a modicum of support
-   for non-ELF systems, mapping a variable name to a CTF type ID.  The variable
-   names are sorted into ASCIIbetical order, permitting binary searching.  We do
-   not define how the consumer maps these variable names to addresses or
+   for non-ELF systems, mapping a variable or function name to a CTF type ID.
+   The names are sorted into ASCIIbetical order, permitting binary searching.
+   We do not define how the consumer maps these variable names to addresses or
    anything else, or indeed what these names represent: they might be names
    looked up at runtime via dlsym() or names extracted at runtime by a debugger
    or anything else the consumer likes.  Variable records with identically-
-   named entries in the data object section are removed.
+   named entries in the data object or function index section are removed.
 
    The data types section is a list of variable size records that represent each
    type, in order by their ID.  The types themselves form a directed graph,
diff --git a/ld/testsuite/ld-ctf/data-func-conflicted-vars.d b/ld/testsuite/ld-ctf/data-func-conflicted-vars.d
new file mode 100644
index 00000000000..b278dfe5d84
--- /dev/null
+++ b/ld/testsuite/ld-ctf/data-func-conflicted-vars.d
@@ -0,0 +1,69 @@
+#as:
+#source: data-func-1.c
+#source: data-func-2.c
+#objdump: --ctf
+#ld: -shared -s --ctf-variables
+#name: Conflicted data syms, partially indexed, stripped, with variables
+
+.*: +file format .*
+
+Contents of CTF section \.ctf:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+    Data object section:	.* \(0x[1-9a-f][0-9a-f]* bytes\)
+    Function info section:	.* \(0x[1-9a-f][0-9a-f]* bytes\)
+    Object index section:	.* \(0xc bytes\)
+    Variable section:	.* \(0x10 bytes\)
+    Type section:	.* \(0x118 bytes\)
+    String section:	.*
+#...
+  Data objects:
+    bar -> 0x[0-9a-f]*: \(kind 6\) struct var_3 \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+    var_1 -> 0x[0-9a-f]*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_666 -> 0x[0-9a-f]*: \(kind 3\) foo_t \* \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+
+  Function objects:
+    func_[0-9]* -> 0x[0-9a-f]*: \(kind 5\) void \*\(\*\) \(const char \*restrict, int \(\*\)\(\*\) \(const char \*\)\) \(aligned at 0x[0-9a-f]*\)
+#...
+  Variables:
+    funcs -> .*
+    other_func -> .*
+#...
+  Types:
+#...
+    .*: \(kind 6\) struct var_3 .*
+#...
+CTF archive member: .*/data-func-1\.c:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+    Parent name: \.ctf
+    Compilation unit name: .*/data-func-1\.c
+    Data object section:	.* \(0x[1-9a-f][0-9a-f]* bytes\)
+    Type section:	.* \(0xc bytes\)
+    String section:	.*
+
+  Labels:
+
+  Data objects:
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+#...
+  Function objects:
+
+  Variables:
+
+  Types:
+    0x80000001: \(kind 10\) foo_t .* -> .* int .*
+#...
diff --git a/libctf/NEWS b/libctf/NEWS
index 956cca8473e..f4e59734639 100644
--- a/libctf/NEWS
+++ b/libctf/NEWS
@@ -1,5 +1,14 @@
 -*- text -*-
 
+Changes in 2.39:
+
+* New features
+
+** The CTF variable section (if generated via ld --ctf-variables) now contains
+   entries for static functions, hidden functions, and other functions with
+   no associated symbol.  The associated type is of kind CTF_K_FUNCTION.
+   (No change if --ctf-variables is not specified, which is the default.)
+
 Changes in 2.37:
 
 * New features
diff --git a/libctf/ctf-link.c b/libctf/ctf-link.c
index ee836054463..d92a6930dd0 100644
--- a/libctf/ctf-link.c
+++ b/libctf/ctf-link.c
@@ -807,7 +807,12 @@ ctf_link_deduplicating_close_inputs (ctf_dict_t *fp, ctf_dynhash_t *cu_names,
   return 0;
 }
 
-/* Do a deduplicating link of all variables in the inputs.  */
+/* Do a deduplicating link of all variables in the inputs.
+
+   Also, if we are not omitting the variable section, integrate all symbols from
+   the symtypetabs into the variable section too.  (Duplication with the
+   symtypetab section in the output will be eliminated at serialization time.)  */
+
 static int
 ctf_link_deduplicating_variables (ctf_dict_t *fp, ctf_dict_t **inputs,
                                   size_t ninputs, int cu_mapped)
@@ -820,6 +825,8 @@ ctf_link_deduplicating_variables (ctf_dict_t *fp, ctf_dict_t **inputs,
       ctf_id_t type;
       const char *name;
 
+      /* First the variables on the inputs.  */
+
       while ((type = ctf_variable_next (inputs[i], &it, &name)) != CTF_ERR)
         {
           if (ctf_link_one_variable (fp, inputs[i], name, type, cu_mapped) < 0)
@@ -830,6 +837,34 @@ ctf_link_deduplicating_variables (ctf_dict_t *fp, ctf_dict_t **inputs,
         }
       if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
         return ctf_set_errno (fp, ctf_errno (inputs[i]));
+
+      /* Next the symbols.  We integrate data symbols even though the compiler
+         is currently doing the same, to allow the compiler to stop in
+         future.  */
+
+      while ((type = ctf_symbol_next (inputs[i], &it, &name, 0)) != CTF_ERR)
+        {
+          if (ctf_link_one_variable (fp, inputs[i], name, type, 1) < 0)
+            {
+              ctf_next_destroy (it);
+              return -1;        /* errno is set for us.  */
+            }
+        }
+      if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
+        return ctf_set_errno (fp, ctf_errno (inputs[i]));
+
+      /* Finally the function symbols.  */
+
+      while ((type = ctf_symbol_next (inputs[i], &it, &name, 1)) != CTF_ERR)
+        {
+          if (ctf_link_one_variable (fp, inputs[i], name, type, 1) < 0)
+            {
+              ctf_next_destroy (it);
+              return -1;        /* errno is set for us.  */
+            }
+        }
+      if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
+        return ctf_set_errno (fp, ctf_errno (inputs[i]));
     }
   return 0;
 }
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 89f1ac01aa1..cc9e59d4836 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -431,12 +431,12 @@ emit_symtypetab_index (ctf_dict_t *fp, ctf_dict_t *symfp, uint32_t *dp,
   return 0;
 }
 
-/* Delete data symbols that have been assigned names from the variable section.
-   Must be called from within ctf_serialize, because that is the only place
-   you can safely delete variables without messing up ctf_rollback.  */
+/* Delete symbols that have been assigned names from the variable section.  Must
+   be called from within ctf_serialize, because that is the only place you can
+   safely delete variables without messing up ctf_rollback.  */
 
 static int
-symtypetab_delete_nonstatic_vars (ctf_dict_t *fp, ctf_dict_t *symfp)
+symtypetab_delete_nonstatics (ctf_dict_t *fp, ctf_dict_t *symfp)
 {
   ctf_dvdef_t *dvd, *nvd;
   ctf_id_t type;
@@ -445,8 +445,10 @@ symtypetab_delete_nonstatic_vars (ctf_dict_t *fp, ctf_dict_t *symfp)
     {
       nvd = ctf_list_next (dvd);
 
-      if (((type = (ctf_id_t) (uintptr_t)
-            ctf_dynhash_lookup (fp->ctf_objthash, dvd->dvd_name)) > 0)
+      if ((((type = (ctf_id_t) (uintptr_t)
+             ctf_dynhash_lookup (fp->ctf_objthash, dvd->dvd_name)) > 0)
+           || (type = (ctf_id_t) (uintptr_t)
+               ctf_dynhash_lookup (fp->ctf_funchash, dvd->dvd_name)) > 0)
           && ctf_dynhash_lookup (symfp->ctf_dynsyms, dvd->dvd_name) != NULL
           && type == dvd->dvd_type)
         ctf_dvd_delete (fp, dvd);
@@ -560,13 +562,12 @@ ctf_symtypetab_sect_sizes (ctf_dict_t *fp, emit_symtypetab_state_t *s,
 
   /* If we are filtering symbols out, those symbols that the linker has not
      reported have now been removed from the ctf_objthash and ctf_funchash.
-     Delete entries from the variable section that duplicate newly-added data
-     symbols.  There's no need to migrate new ones in, because the compiler
-     always emits both a variable and a data symbol simultaneously, and
-     filtering only happens at final link time.  */
+     Delete entries from the variable section that duplicate newly-added
+     symbols.  There's no need to migrate new ones in: we do that (if necessary)
+     in ctf_link_deduplicating_variables.  */
 
   if (s->filter_syms && s->symfp->ctf_dynsyms &&
-      symtypetab_delete_nonstatic_vars (fp, s->symfp) < 0)
+      symtypetab_delete_nonstatics (fp, s->symfp) < 0)
     return -1;
 
   return 0;
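
For readers who want to see the pruning rule from symtypetab_delete_nonstatics in isolation, here is a minimal sketch in plain C. It does not use libctf at all: sym_entry_t, sym_lookup, name_reported, var_is_redundant and prune_vars are hypothetical names invented for this illustration, and simple arrays stand in for ctf_objthash, ctf_funchash and ctf_dynsyms. The rule it models is the one in the hunk above: a variable entry is dropped only if its name is found among the data objects or, failing that, the functions, the linker has reported the symbol, and the recorded type ID matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one name -> type ID mapping.  This is not a
   libctf type.  */
typedef struct
{
  const char *name;
  unsigned long type;
} sym_entry_t;

/* Linear search standing in for a hash lookup.  Returns 0 when NAME is
   absent; 0 is never used as a type ID in this sketch.  */
static unsigned long
sym_lookup (const sym_entry_t *tab, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (tab[i].name, name) == 0)
      return tab[i].type;
  return 0;
}

/* Nonzero if NAME is among the symbols the linker reported.  */
static int
name_reported (const char *const *syms, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (syms[i], name) == 0)
      return 1;
  return 0;
}

/* Nonzero if VAR duplicates a data-object or function entry: the name is
   found (data objects are tried first, then functions), the symbol was
   reported, and the type IDs agree.  */
static int
var_is_redundant (const sym_entry_t *var,
                  const sym_entry_t *objs, size_t nobjs,
                  const sym_entry_t *funcs, size_t nfuncs,
                  const char *const *dynsyms, size_t ndynsyms)
{
  unsigned long type;

  assert (var != NULL);
  type = sym_lookup (objs, nobjs, var->name);
  if (type == 0)
    type = sym_lookup (funcs, nfuncs, var->name);

  return type != 0
    && name_reported (dynsyms, ndynsyms, var->name)
    && type == var->type;
}

/* Compact VARS in place, dropping redundant entries and preserving the
   order of the rest.  Returns the number of entries kept.  */
static size_t
prune_vars (sym_entry_t *vars, size_t nvars,
            const sym_entry_t *objs, size_t nobjs,
            const sym_entry_t *funcs, size_t nfuncs,
            const char *const *dynsyms, size_t ndynsyms)
{
  size_t kept = 0;

  for (size_t i = 0; i < nvars; i++)
    if (!var_is_redundant (&vars[i], objs, nobjs, funcs, nfuncs,
                           dynsyms, ndynsyms))
      vars[kept++] = vars[i];
  return kept;
}
```

Calling prune_vars on variables named bar, func_1, other_func and funcs, where only bar and func_1 are reported symbols with matching types, leaves other_func and funcs. Those names are borrowed from the new test's expected "Variables:" listing; the type IDs in any such call would be made up. An entry whose type ID differs from the one in the symbol table is kept, mirroring the "type == dvd->dvd_type" check.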