public inbox for binutils@sourceware.org
* [PATCH 0/4] various libctf-specific fixes
@ 2022-03-21 18:24 Nick Alcock
  2022-03-21 18:24 ` [PATCH 1/4] ld, testsuite: improve CTF-availability test Nick Alcock
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Nick Alcock @ 2022-03-21 18:24 UTC
  To: binutils

These are a bunch of things that have accumulated over the last few
months.  One of them could in theory be considered a format change
(since it extends the meaning of the variable section), but the
variable section is rarely used and is not emitted with the default
linker flags.  In practice no users of this feature are negatively
affected, and all of them work slightly better with this change.

The most important change is a fix to sourceware bug 28933, a buffer
overrun when libctf reads in uncompressed (i.e., probably very small)
CTF that was generated on a machine of the opposite endianness.
To prevent this happening again we generalize the byteswapper so that
it can be used to swap out of the native endianness as well as into
it, and add an environment variable that can be used when debugging
to force writeout of the CTF dict in the non-native endianness
instead, so we can finally properly test the code paths that only
kick in when byteswapping.
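
As a rough illustration of why swapping in both directions needs care
(this is a sketch, not the libctf code: the record layout and the names
swap_u32 and flip_record are invented for the example), consider a
record whose first word is a count of the words that follow.  The count
only makes sense in native byte order, so it has to be read before the
flip when converting to foreign endianness and after it when converting
to native:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value.  */
static uint32_t
swap_u32 (uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
	 | ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Flip a hypothetical record in place: buf[0] is a count, followed by
   that many 32-bit items.  If TO_FOREIGN, the input is native, so the
   count is usable as-is; otherwise the input is foreign and the count
   must be swapped before we can interpret it.  Returns the number of
   words flipped, or 0 (leaving BUF untouched) if the record does not
   fit in NWORDS.  */
static size_t
flip_record (uint32_t *buf, size_t nwords, int to_foreign)
{
  uint32_t count;
  size_t i;

  assert (buf != NULL);
  if (nwords == 0)
    return 0;

  count = to_foreign ? buf[0] : swap_u32 (buf[0]);
  if (count > nwords - 1)
    return 0;

  for (i = 0; i <= count; i++)
    buf[i] = swap_u32 (buf[i]);
  return (size_t) count + 1;
}
```

Calling flip_record twice on the same buffer, once with to_foreign set
and once without, should give back the original contents.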

Nick Alcock (4):
  ld, testsuite: improve CTF-availability test
  include, libctf, ld: extend variable section to contain functions too
  libctf, ld: diagnose corrupted CTF header cth_strlen
  libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option

 include/ctf.h                                 |   8 +-
 .../ld-ctf/data-func-conflicted-vars.d        |  69 ++++++
 ld/testsuite/ld-ctf/diag-cttname-invalid.s    |   2 +-
 ld/testsuite/ld-ctf/diag-cttname-null.s       |   2 +-
 ld/testsuite/ld-ctf/diag-cuname.s             |   2 +-
 ld/testsuite/ld-ctf/diag-parlabel.s           |   2 +-
 ld/testsuite/ld-ctf/diag-parname.s            |   2 +-
 ld/testsuite/ld-ctf/diag-strlen-invalid.d     |   5 +
 ...ttname-invalid.s => diag-strlen-invalid.s} |   0
 ld/testsuite/lib/ld-lib.exp                   |  24 +-
 libctf/ctf-impl.h                             |   2 +
 libctf/ctf-link.c                             |  37 ++-
 libctf/ctf-open.c                             | 102 +++++---
 libctf/ctf-serialize.c                        | 219 +++++++++---------
 14 files changed, 322 insertions(+), 154 deletions(-)
 create mode 100644 ld/testsuite/ld-ctf/data-func-conflicted-vars.d
 create mode 100644 ld/testsuite/ld-ctf/diag-strlen-invalid.d
 copy ld/testsuite/ld-ctf/{diag-cttname-invalid.s => diag-strlen-invalid.s} (100%)


base-commit: bb368aad297fe3ad40cf397e6fc85aa471429a28
prerequisite-patch-id: 92872e00ca2b73dacdf3560166e14946e5c8c9e8
prerequisite-patch-id: a56ae1e4c6217f978cfab173162d19c5bec6b66e
prerequisite-patch-id: 98681ad6dfb609b75fb99faa6a4607b4dbaa9ff4
prerequisite-patch-id: fb4fb6c34271b9c1109234f05b948b8165a68270
prerequisite-patch-id: ea5af70db40bbb20e8588cf89fec3164dc3ba1d3
prerequisite-patch-id: 4a7283db1d6c4ac508cf624411945fe6276a42f2
prerequisite-patch-id: 9024b50dd0e396370f91b6a1e07c9b39e09a8871
prerequisite-patch-id: e114d6610e5a5e96766ffc47909040715e63ec4b
prerequisite-patch-id: 58f5630d18a1e32e8b7dd0ccc734d0046c9ebdb5
prerequisite-patch-id: 74a9849f23d902f5cb6d99ec7041dac407499534
prerequisite-patch-id: 0edd30d8a73394dd8bf17eaba84390be8c49cdc5
prerequisite-patch-id: ee024384bddb95ccbed56c4e877c477b90aecd09
prerequisite-patch-id: bb2b236d705a047c77111434e808b951c17e07a4
prerequisite-patch-id: bc798478702505d71177ad369f60ed0d2f6690a1
prerequisite-patch-id: 380d4bb4907763fff91d79f0410298436b8d57d0
prerequisite-patch-id: f36a3a087a64d3a65d95952c5c4f45adab672c07
prerequisite-patch-id: 779750a965e8163d6a989edf2c09fe06a49d5f08
prerequisite-patch-id: b6bdba26b904756e0481c55ad719687dd5022fbe
prerequisite-patch-id: 4e03d099a59d6e67632e1dbbb5c971746096d932
prerequisite-patch-id: 0c9657495a39a5560fe8eb87464de59f56ab73d7
prerequisite-patch-id: 0b6a82625015ae62f622c5e89966c8213088a260
prerequisite-patch-id: 25c94c705bef9d15eace9ac3bcb57e126c03b6f0
prerequisite-patch-id: cd4fcf984a99b38332d9917990170835d95a027f
prerequisite-patch-id: 0352b9de77fac90515c24377ad63d8204e8245bb
prerequisite-patch-id: abf93ac65c0621c1557c5376780a8b0b43d26e50
prerequisite-patch-id: f353a2ab037d28b0fcc3d4f3d9bd139317140960
prerequisite-patch-id: 0984334f5518e857340283f6b9cfe728b9baa563
prerequisite-patch-id: e943518b884c6754a669a963d7018030b1f420d9
prerequisite-patch-id: c974dd4ed0d8bff8a1e41d58f4387bb294b527a2
prerequisite-patch-id: b2cfa9d031946885120abc524cf55af589e7db73
prerequisite-patch-id: 0e34ae632b44ad51ee60b78e2477c59b88f04aa8
prerequisite-patch-id: 6cc12f5d3f57b3152c88f07b0bfed5a5c25ba452
prerequisite-patch-id: e5738774c19fc68690ac626d505b16bc83ffe64f
prerequisite-patch-id: fda4d965c64c1cf299a94417d868f3e77e35dddd
prerequisite-patch-id: 45a7b39f33d104a8cbbc205db2ea003ca5d713a1
prerequisite-patch-id: 0b47ea7b3c5cf4c555ee07ce208b99fa8505fd8e
prerequisite-patch-id: 1fa6f3a5a49712edfc038eb8e28e95cf728a27c4
prerequisite-patch-id: c911f20a9746210e8b30154caf1c1fe92a54724c
prerequisite-patch-id: 63c745dfbdcefd66cbd9991a4ca2d87134648eb0
prerequisite-patch-id: 7e1a61b3c0d3330e088552efc63ef19955915988
prerequisite-patch-id: 9c0d06f03b6ef7689d1521fd01b7ac62f457b6b1
prerequisite-patch-id: 9e9149136bcd2bc7ccb4b53bdc7ef26a66535248
prerequisite-patch-id: 0f343b78e27bc603a4b7068ad4c9ba17e009c5cd
prerequisite-patch-id: 652590e84ac8f37cad254fbc9165a2c8d2701c5c
prerequisite-patch-id: 369637be97ec85415c7be56c11e9981019f716cd
prerequisite-patch-id: 6af72bee97fa1d951dcb86be035651e1d3ef5ad9
prerequisite-patch-id: 9b20bb7c36f34a3cff34697e121ece43b65755b4
prerequisite-patch-id: 48d7be5f19a7d0816ff92368a4891a54f381192a
prerequisite-patch-id: e88357ba3286fd2cebc83f861bfb5cf75e1bffda
prerequisite-patch-id: 652f20da5066dba06e1ae770d64f45779a043920
prerequisite-patch-id: 9d34ff1ccde498d83742619536a395a336f94e89
prerequisite-patch-id: 55c02e7e1bf507d0adc6ced3c971b4f00f13348e
prerequisite-patch-id: 14e043d4be0f54d52f788ea1fdd21a23d9b253d0
prerequisite-patch-id: 1650e28c78426cbd97c274c88dfe1431913377b7
prerequisite-patch-id: c43eaf3aad2ac41d012825b8916d57232eab7232
prerequisite-patch-id: 2a4a55cd53678430c2af9b429781e4ec218504ba
prerequisite-patch-id: 8e4398615f6259a537c7f9298f4e01dfda7593a2
prerequisite-patch-id: b3723d267e4daa1740e1293df5252987834ff7e1
prerequisite-patch-id: 65cc62cf39694ffe1ac0563d0519e06cb6e78f1a
prerequisite-patch-id: 56fcceb5d42c944a3a664377b24d05fa731167a7
prerequisite-patch-id: fb5dd8c6b522b5d20d2ad229dc087d6845591780
prerequisite-patch-id: 0e5820182e5a6e9e28f3682127b918d29163ae77
prerequisite-patch-id: 003bfa2c471907a17cf76b49c9091bed4f58f8e7
prerequisite-patch-id: 835b0169ce56786d6cee5f94406cefc71aa069ae
prerequisite-patch-id: af0835d26d01b36a9f790b0ca070fb3d10e75bcc
prerequisite-patch-id: 889402116125c2bc015b1feabb0be9bfcafe14ec
prerequisite-patch-id: 48eed66dad8aca1b97c4db23bbe89407f9bfdedc
prerequisite-patch-id: b9dd63ecc574bf20ac291de4b7bc9cc46d36b423
prerequisite-patch-id: c0eb76d4b7655cd2d5428b6082f70ca83e857035
prerequisite-patch-id: 33f5411515ad446100aaa7aa656aa693212062a3
prerequisite-patch-id: 9dc39bf1d18d0b0432e4e147dc3b8be49cdbd2ec
prerequisite-patch-id: 61375e5c1c300661b4fdc342a6833aafad4a2d34
prerequisite-patch-id: d258ad649f84f0bd733863c6e404eafc78740478
prerequisite-patch-id: c2304490c25672fae029f45b51a2fa3feba5e19a
prerequisite-patch-id: 5dba7e2868b7bcf9c7d8ad58c722ea085cf1f192
prerequisite-patch-id: 5f45f7de52744a4c918a0c30ee0bf1daa3a54ba8
prerequisite-patch-id: 94d23412c629f1d1fabe7e85c889628b8a089102
prerequisite-patch-id: 5cc5aaf9d9c9abd7d0fb0d17d5b9f30739e15b16
prerequisite-patch-id: 450d3fd2ae5d061f4d5123c620f141b6f1a953d8
prerequisite-patch-id: fff88d650c15d19c858121e6609bf9ffa17acfc7
prerequisite-patch-id: 24328f427d4397fccbc609cb0d0774d27c2f8d3d
prerequisite-patch-id: 5157356b7b54bf5a8f81f1b023c4964d2dedc347
prerequisite-patch-id: e637ac80e1bac8b61e76ac6cf884caee261d8a6a
prerequisite-patch-id: 12ce086e97061c8038c4489a4aa6fb99bae23c33
prerequisite-patch-id: 997a5a0cdc0a0cc1932279d48986fc394b2bc5b3
prerequisite-patch-id: f4940a9cdc2056757d730ada67daca46e3826017
prerequisite-patch-id: 46a35bee86e0f927cfe2ad7b8bd609beb17cc845
prerequisite-patch-id: a35a1df053b9fdc1e682d92aea745e88b36ca715
prerequisite-patch-id: 3c1011003b9de4bac7b795b39ad024c2af28a0a2
prerequisite-patch-id: eb675a41c3307438c394d9dcd8cd54c2e08f4059
prerequisite-patch-id: a97a216272895695a196bdbf875f99c5754e0f6a
prerequisite-patch-id: 01916daf641d0d55bed0630565091ecb43dc4088
prerequisite-patch-id: d2eda91b9a27df188c6a7e80408af3db73756b4c
prerequisite-patch-id: dd113253f08e6774e9c05ffc867bf3f70749a0ad
prerequisite-patch-id: 90b7a55e6d24734a909af3c0247274d480bc83a1
prerequisite-patch-id: 484d7a8e6d1a6e1ab8ca31a73fac6111261565cb
prerequisite-patch-id: 56311938d6f15bde8a503cc3772c13a517d92457
prerequisite-patch-id: 734a8a92a263c227ec40314a4baa7475d9edcb78
prerequisite-patch-id: 12eaabdc6dd9b1228fe63e9cda7058d118ad5d35
prerequisite-patch-id: 6aedc79fd7d9fe6b0f62be7a4d7060a884224c73
prerequisite-patch-id: 4200cb21291d376fec7ac572b7595dee9973415e
-- 
2.35.1.261.g8402f930ba.dirty



* [PATCH 1/4] ld, testsuite: improve CTF-availability test
  2022-03-21 18:24 [PATCH 0/4] various libctf-specific fixes Nick Alcock
@ 2022-03-21 18:24 ` Nick Alcock
  2022-03-21 18:24 ` [PATCH 2/4] include, libctf, ld: extend variable section to contain functions too Nick Alcock
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Nick Alcock @ 2022-03-21 18:24 UTC
  To: binutils

The test for -gctf support in the compiler is used to determine when to
run the ld-ctf tests and most of those in libctf.  Unfortunately,
because it uses check_compiler_available and compile_one_cc, it will
fail whenever the compiler emits anything on stderr, even if it
actually does support CTF perfectly well.

So, instead, ask the compiler to emit assembler output and grep it for
references to ".ctf": this is highly unlikely to be present if the
compiler does not support CTF.  (This will need adjusting when CTF grows
support for non-ELF platforms that don't dot-prepend their section
names, but right now the linker doesn't link CTF on any such platforms
in any case.)

With this in place we can do things like run all the libctf tests under
leak sanitizers etc., even if those spray warnings on simple CTF
compilations, rather than being blocked from doing so just when we would
most like to.

ld/
	* testsuite/lib/ld-lib.exp (check_ctf_available): Detect CTF
	even if a CTF-capable compiler emits warnings.
---
 ld/testsuite/lib/ld-lib.exp | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/ld/testsuite/lib/ld-lib.exp b/ld/testsuite/lib/ld-lib.exp
index 5c7771f7221..ec27388a72e 100644
--- a/ld/testsuite/lib/ld-lib.exp
+++ b/ld/testsuite/lib/ld-lib.exp
@@ -1628,24 +1628,36 @@ proc compile_one_cc { src output additional_flags } {
     return [run_host_cmd_yesno "$CC_FOR_TARGET" "$flags $CFLAGS_FOR_TARGET $additional_flags $src -o $output"]
 }
 
-# Returns true if the target compiler supports -gctf
+# Returns true if the target compiler supports -gctf.
 proc check_ctf_available { } {
     global ctf_available_saved
 
     if {![info exists ctf_available_saved]} {
-	if { ![check_compiler_available] } {
-	    set ctf_available_saved 0
-	} else {
+	set ctf_available_saved 0
+
+	# Don't check for compiler availability, because that FNs if the
+	# compiler is available but emits warnings.  An unavailable
+	# compiler will fail this test anyway.
+
+	if ([check_compiler_available]) {
 	    set basename "tmpdir/ctf_available[pid]"
 	    set src ${basename}.c
-	    set output ${basename}.o
+	    set output ${basename}.s
 	    set f [open $src "w"]
 	    puts $f "int main() { return 0; }"
 	    close $f
-	    set ctf_available_saved [compile_one_cc $src $output "-gctf -c"]
+	    compile_one_cc $src $output "-gctf -S -c"
 	    remote_file host delete $src
+	    if {! [remote_file host exists $output] } {
+		file delete $src
+		return 0
+	    }
+	    set status [remote_exec host fgrep ".ctf $output"]
 	    remote_file host delete $output
 	    file delete $src
+	    if { [lindex $status 0] == 0 } {
+		set ctf_available_saved 1
+	    }
 	}
     }
     return $ctf_available_saved
-- 
2.35.1.261.g8402f930ba.dirty



* [PATCH 2/4] include, libctf, ld: extend variable section to contain functions too
  2022-03-21 18:24 [PATCH 0/4] various libctf-specific fixes Nick Alcock
  2022-03-21 18:24 ` [PATCH 1/4] ld, testsuite: improve CTF-availability test Nick Alcock
@ 2022-03-21 18:24 ` Nick Alcock
  2022-03-23 13:54   ` Nick Alcock
  2022-03-21 18:24 ` [PATCH 3/4] libctf, ld: diagnose corrupted CTF header cth_strlen Nick Alcock
  2022-03-21 18:24 ` [PATCH 4/4] libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option Nick Alcock
  3 siblings, 1 reply; 7+ messages in thread
From: Nick Alcock @ 2022-03-21 18:24 UTC
  To: binutils

The CTF variable section is an optional (usually-not-present) section in
the CTF dict which contains name -> type mappings corresponding to data
symbols that are present in the linker input but not in the output
symbol table: the idea is that programs that use their own symbol-
resolution mechanisms can use this section to look up the types of
symbols they have found using their own mechanism.
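
To make the intended use concrete, here is a minimal sketch of such a
lookup.  It does not use the libctf API: the struct and function names
are invented, and it only assumes what the ctf.h comment says, namely
that the entries are sorted by name so they can be binary-searched.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of one variable-section entry.  */
struct var_entry
{
  const char *name;
  unsigned long type;		/* A CTF type ID; 0 means "none" here.  */
};

/* bsearch comparator: KEY is a name, ELEM is a struct var_entry.  */
static int
cmp_var (const void *key, const void *elem)
{
  return strcmp ((const char *) key,
		 ((const struct var_entry *) elem)->name);
}

/* Return the type ID recorded for NAME, or 0 if it is not present.
   VARS must be sorted by name in strcmp order.  */
static unsigned long
lookup_var (const struct var_entry *vars, size_t nvars, const char *name)
{
  const struct var_entry *hit
    = bsearch (name, vars, nvars, sizeof (*vars), cmp_var);

  return hit != NULL ? hit->type : 0;
}
```

A consumer that has found a name by some means of its own would pass
that name to something like lookup_var to get the corresponding type.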

Because these removed symbols (mostly static variables, functions, etc.)
all have names that are unlikely to appear in the ELF symtab and because
very few programs have their own symbol-resolution mechanisms, a special
linker flag (--ctf-variables) is needed to emit this section.

Historically, we emitted only removed data symbols into the variable
section.  This seemed to make sense at the time, but in hindsight it
really doesn't: functions are symbols too, and a C program can look them
up just like any other symbol.  So extend the variable section so that it
contains all static function symbols too (if it is emitted at all), with
types of kind CTF_K_FUNCTION.

This is a little fiddly.  We relied on compiler assistance for data
symbols: the compiler simply emits all data symbols twice, once into the
symtypetab as an indexed symbol and once into the variable section.

Rather than wait for a suitably adjusted compiler that does the same for
function symbols, we can pluck unreported function symbols out of the
symtab and add them to the variable section ourselves.  While we're at
it, we do the same with data symbols: this is redundant right now
because the compiler does it, but it costs very little time and lets the
compiler drop this kludge and save a little space in .o files.
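
The pruning that makes this duplication harmless can be sketched as
follows.  This is a simplified model, not the code in ctf-serialize.c:
it uses flat arrays rather than libctf's hashes, all names are invented,
and it leaves out the additional check that the name was reported by the
linker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name -> type ID pair.  */
struct name_type
{
  const char *name;
  unsigned long type;
};

/* Return the type recorded for NAME in TAB, or 0 if absent.  */
static unsigned long
lookup_type (const struct name_type *tab, size_t n, const char *name)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (strcmp (tab[i].name, name) == 0)
      return tab[i].type;
  return 0;
}

/* Compact VARS in place, dropping every entry whose name also appears
   with the same type among the data objects or the functions.  Returns
   the new number of entries.  */
static size_t
prune_vars (struct name_type *vars, size_t nvars,
	    const struct name_type *objs, size_t nobjs,
	    const struct name_type *funcs, size_t nfuncs)
{
  size_t in, out = 0;

  assert (vars != NULL || nvars == 0);
  for (in = 0; in < nvars; in++)
    {
      unsigned long type = lookup_type (objs, nobjs, vars[in].name);

      if (type == 0)
	type = lookup_type (funcs, nfuncs, vars[in].name);
      if (type != 0 && type == vars[in].type)
	continue;		/* Redundant with a symtypetab entry.  */
      vars[out++] = vars[in];
    }
  return out;
}
```

Only names with no matching data-object or function entry survive,
which is why the new test expects to see just the static functions in
the Variables list.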

include/
	* ctf.h: Mention the new things we can see in the variable
	section.

ld/
	* testsuite/ld-ctf/data-func-conflicted-vars.d: New test.

libctf/
	* ctf-link.c (ctf_link_deduplicating_variables): Duplicate
	symbols into the variable section too.
	* ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename
	to...
	(symtypetab_delete_nonstatics): ... this.  Check the funchash
	when pruning redundant variables.
	(ctf_symtypetab_sect_sizes): Adjust accordingly.
---
 include/ctf.h                                 |  8 +--
 .../ld-ctf/data-func-conflicted-vars.d        | 69 +++++++++++++++++++
 libctf/ctf-link.c                             | 37 +++++++++-
 libctf/ctf-serialize.c                        | 23 ++++---
 4 files changed, 121 insertions(+), 16 deletions(-)
 create mode 100644 ld/testsuite/ld-ctf/data-func-conflicted-vars.d

diff --git a/include/ctf.h b/include/ctf.h
index 6db2742d5fb..698aab3eab6 100644
--- a/include/ctf.h
+++ b/include/ctf.h
@@ -89,13 +89,13 @@ extern "C"
    entries and reorder them accordingly (dropping the indexes in the process).
 
    Variable records (as distinct from data objects) provide a modicum of support
-   for non-ELF systems, mapping a variable name to a CTF type ID.  The variable
-   names are sorted into ASCIIbetical order, permitting binary searching.  We do
-   not define how the consumer maps these variable names to addresses or
+   for non-ELF systems, mapping a variable or function name to a CTF type ID.
+   The names are sorted into ASCIIbetical order, permitting binary searching.
+   We do not define how the consumer maps these variable names to addresses or
    anything else, or indeed what these names represent: they might be names
    looked up at runtime via dlsym() or names extracted at runtime by a debugger
    or anything else the consumer likes.  Variable records with identically-
-   named entries in the data object section are removed.
+   named entries in the data object or function index section are removed.
 
    The data types section is a list of variable size records that represent each
    type, in order by their ID.  The types themselves form a directed graph,
diff --git a/ld/testsuite/ld-ctf/data-func-conflicted-vars.d b/ld/testsuite/ld-ctf/data-func-conflicted-vars.d
new file mode 100644
index 00000000000..b278dfe5d84
--- /dev/null
+++ b/ld/testsuite/ld-ctf/data-func-conflicted-vars.d
@@ -0,0 +1,69 @@
+#as:
+#source: data-func-1.c
+#source: data-func-2.c
+#objdump: --ctf
+#ld: -shared -s --ctf-variables
+#name: Conflicted data syms, partially indexed, stripped, with variables
+
+.*: +file format .*
+
+Contents of CTF section \.ctf:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+    Data object section:	.* \(0x[1-9a-f][0-9a-f]* bytes\)
+    Function info section:	.* \(0x[1-9a-f][0-9a-f]* bytes\)
+    Object index section:	.* \(0xc bytes\)
+    Variable section:	.* \(0x10 bytes\)
+    Type section:	.* \(0x118 bytes\)
+    String section:	.*
+#...
+  Data objects:
+    bar -> 0x[0-9a-f]*: \(kind 6\) struct var_3 \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+    var_1 -> 0x[0-9a-f]*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_666 -> 0x[0-9a-f]*: \(kind 3\) foo_t \* \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+
+  Function objects:
+    func_[0-9]* -> 0x[0-9a-f]*: \(kind 5\) void \*\(\*\) \(const char \*restrict, int \(\*\)\(\*\) \(const char \*\)\) \(aligned at 0x[0-9a-f]*\)
+#...
+  Variables:
+    funcs -> .*
+    other_func -> .*
+#...
+  Types:
+#...
+    .*: \(kind 6\) struct var_3 .*
+#...
+CTF archive member: .*/data-func-1\.c:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+    Parent name: \.ctf
+    Compilation unit name: .*/data-func-1\.c
+    Data object section:	.* \(0x[1-9a-f][0-9a-f]* bytes\)
+    Type section:	.* \(0xc bytes\)
+    String section:	.*
+
+  Labels:
+
+  Data objects:
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+    var_[0-9]* -> 0x80000001*: \(kind 10\) foo_t \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\) -> .*
+#...
+  Function objects:
+
+  Variables:
+
+  Types:
+    0x80000001: \(kind 10\) foo_t .* -> .* int .*
+#...
diff --git a/libctf/ctf-link.c b/libctf/ctf-link.c
index ee836054463..d92a6930dd0 100644
--- a/libctf/ctf-link.c
+++ b/libctf/ctf-link.c
@@ -807,7 +807,12 @@ ctf_link_deduplicating_close_inputs (ctf_dict_t *fp, ctf_dynhash_t *cu_names,
   return 0;
 }
 
-/* Do a deduplicating link of all variables in the inputs.  */
+/* Do a deduplicating link of all variables in the inputs.
+
+   Also, if we are not omitting the variable section, integrate all symbols from
+   the symtypetabs into the variable section too.  (Duplication with the
+   symtypetab section in the output will be eliminated at serialization time.)  */
+
 static int
 ctf_link_deduplicating_variables (ctf_dict_t *fp, ctf_dict_t **inputs,
 				  size_t ninputs, int cu_mapped)
@@ -820,6 +825,8 @@ ctf_link_deduplicating_variables (ctf_dict_t *fp, ctf_dict_t **inputs,
       ctf_id_t type;
       const char *name;
 
+      /* First the variables on the inputs.  */
+
       while ((type = ctf_variable_next (inputs[i], &it, &name)) != CTF_ERR)
 	{
 	  if (ctf_link_one_variable (fp, inputs[i], name, type, cu_mapped) < 0)
@@ -830,6 +837,34 @@ ctf_link_deduplicating_variables (ctf_dict_t *fp, ctf_dict_t **inputs,
 	}
       if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
 	return ctf_set_errno (fp, ctf_errno (inputs[i]));
+
+      /* Next the symbols.  We integrate data symbols even though the compiler
+	 is currently doing the same, to allow the compiler to stop in
+	 future.  */
+
+      while ((type = ctf_symbol_next (inputs[i], &it, &name, 0)) != CTF_ERR)
+	{
+	  if (ctf_link_one_variable (fp, inputs[i], name, type, 1) < 0)
+	    {
+	      ctf_next_destroy (it);
+	      return -1;			/* errno is set for us.  */
+	    }
+	}
+      if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
+	return ctf_set_errno (fp, ctf_errno (inputs[i]));
+
+      /* Finally the function symbols.  */
+
+      while ((type = ctf_symbol_next (inputs[i], &it, &name, 1)) != CTF_ERR)
+	{
+	  if (ctf_link_one_variable (fp, inputs[i], name, type, 1) < 0)
+	    {
+	      ctf_next_destroy (it);
+	      return -1;			/* errno is set for us.  */
+	    }
+	}
+      if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
+	return ctf_set_errno (fp, ctf_errno (inputs[i]));
     }
   return 0;
 }
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 89f1ac01aa1..cc9e59d4836 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -431,12 +431,12 @@ emit_symtypetab_index (ctf_dict_t *fp, ctf_dict_t *symfp, uint32_t *dp,
   return 0;
 }
 
-/* Delete data symbols that have been assigned names from the variable section.
-   Must be called from within ctf_serialize, because that is the only place
-   you can safely delete variables without messing up ctf_rollback.  */
+/* Delete symbols that have been assigned names from the variable section.  Must
+   be called from within ctf_serialize, because that is the only place you can
+   safely delete variables without messing up ctf_rollback.  */
 
 static int
-symtypetab_delete_nonstatic_vars (ctf_dict_t *fp, ctf_dict_t *symfp)
+symtypetab_delete_nonstatics (ctf_dict_t *fp, ctf_dict_t *symfp)
 {
   ctf_dvdef_t *dvd, *nvd;
   ctf_id_t type;
@@ -445,8 +445,10 @@ symtypetab_delete_nonstatic_vars (ctf_dict_t *fp, ctf_dict_t *symfp)
     {
       nvd = ctf_list_next (dvd);
 
-      if (((type = (ctf_id_t) (uintptr_t)
-	    ctf_dynhash_lookup (fp->ctf_objthash, dvd->dvd_name)) > 0)
+      if ((((type = (ctf_id_t) (uintptr_t)
+	     ctf_dynhash_lookup (fp->ctf_objthash, dvd->dvd_name)) > 0)
+	   || (type = (ctf_id_t) (uintptr_t)
+	       ctf_dynhash_lookup (fp->ctf_funchash, dvd->dvd_name)) > 0)
 	  && ctf_dynhash_lookup (symfp->ctf_dynsyms, dvd->dvd_name) != NULL
 	  && type == dvd->dvd_type)
 	ctf_dvd_delete (fp, dvd);
@@ -560,13 +562,12 @@ ctf_symtypetab_sect_sizes (ctf_dict_t *fp, emit_symtypetab_state_t *s,
 
   /* If we are filtering symbols out, those symbols that the linker has not
      reported have now been removed from the ctf_objthash and ctf_funchash.
-     Delete entries from the variable section that duplicate newly-added data
-     symbols.  There's no need to migrate new ones in, because the compiler
-     always emits both a variable and a data symbol simultaneously, and
-     filtering only happens at final link time.  */
+     Delete entries from the variable section that duplicate newly-added
+     symbols.  There's no need to migrate new ones in: we do that (if necessary)
+     in ctf_link_deduplicating_variables.  */
 
   if (s->filter_syms && s->symfp->ctf_dynsyms &&
-      symtypetab_delete_nonstatic_vars (fp, s->symfp) < 0)
+      symtypetab_delete_nonstatics (fp, s->symfp) < 0)
     return -1;
 
   return 0;
-- 
2.35.1.261.g8402f930ba.dirty



* [PATCH 3/4] libctf, ld: diagnose corrupted CTF header cth_strlen
  2022-03-21 18:24 [PATCH 0/4] various libctf-specific fixes Nick Alcock
  2022-03-21 18:24 ` [PATCH 1/4] ld, testsuite: improve CTF-availability test Nick Alcock
  2022-03-21 18:24 ` [PATCH 2/4] include, libctf, ld: extend variable section to contain functions too Nick Alcock
@ 2022-03-21 18:24 ` Nick Alcock
  2022-03-23 13:56   ` Nick Alcock
  2022-03-21 18:24 ` [PATCH 4/4] libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option Nick Alcock
  3 siblings, 1 reply; 7+ messages in thread
From: Nick Alcock @ 2022-03-21 18:24 UTC
  To: binutils

The last section in a CTF dict is the string table, at an offset
represented by the cth_stroff header field.  Its length is recorded in
the next field, cth_strlen, and the two added together are taken as the
size of the CTF dict.  Upon opening a dict, we check that none of the
header offsets exceed this size, and we check when uncompressing a
compressed dict that the result of the uncompression is the same length:
but CTF dicts need not be compressed, and short ones are not.
Uncompressed dicts just use the ctf_size without checking it.  This
field is thankfully almost unused: it is mostly used when reserializing
a dict, which can't be done to dicts read off disk since they're
read-only.

However, when opening an uncompressed foreign-endian dict we have to
copy it out of the mmapped region it is stored in so we can endian-
swap it, and we use ctf_size when doing that.  When the cth_strlen is
corrupt, this copy can read past the end of the section.

Fix this by checking the ctf_size in all uncompressed cases, just as we
already do in the compressed case.  Add a new test.
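
In outline, the check looks something like the sketch below.  This is
illustrative only: copy_body and its parameters are invented names, not
libctf functions, and the real code reports ECTF_CORRUPT rather than
returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copy the DICT_SIZE-byte body that follows a HDR_SIZE-byte header out
   of a section of SECT_SIZE bytes.  Returns a malloc'd copy, or NULL if
   the claimed sizes overrun the section or allocation fails.  */
static unsigned char *
copy_body (const unsigned char *sect, size_t sect_size,
	   size_t hdr_size, size_t dict_size)
{
  unsigned char *copy;

  assert (sect != NULL);

  /* Phrased as two comparisons so that hdr_size + dict_size cannot
     wrap around.  */
  if (sect_size < hdr_size || sect_size - hdr_size < dict_size)
    return NULL;

  /* Avoid a zero-byte malloc, whose result is implementation-defined.  */
  if ((copy = malloc (dict_size != 0 ? dict_size : 1)) == NULL)
    return NULL;

  memcpy (copy, sect + hdr_size, dict_size);
  return copy;
}
```

The important property is that the size taken from the header is
compared against the size of the section we were actually handed before
any byte of the body is read.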

This came to light because various corrupted-CTF raw-asm tests had an
incorrect cth_strlen: fix all of them so they produce the expected
error again.

libctf/
	PR libctf/28933
	* ctf-open.c (ctf_bufopen_internal): Always check uncompressed
	CTF dict sizes against the section size in case the cth_strlen is
	corrupt.

ld/
	PR libctf/28933
	* testsuite/ld-ctf/diag-strlen-invalid.*: New test,
	derived from diag-cttname-invalid.s.
	* testsuite/ld-ctf/diag-cttname-invalid.s: Fix incorrect cth_strlen.
	* testsuite/ld-ctf/diag-cttname-null.s: Likewise.
	* testsuite/ld-ctf/diag-cuname.s: Likewise.
	* testsuite/ld-ctf/diag-parlabel.s: Likewise.
	* testsuite/ld-ctf/diag-parname.s: Likewise.
---
 ld/testsuite/ld-ctf/diag-cttname-invalid.s    |  2 +-
 ld/testsuite/ld-ctf/diag-cttname-null.s       |  2 +-
 ld/testsuite/ld-ctf/diag-cuname.s             |  2 +-
 ld/testsuite/ld-ctf/diag-parlabel.s           |  2 +-
 ld/testsuite/ld-ctf/diag-parname.s            |  2 +-
 ld/testsuite/ld-ctf/diag-strlen-invalid.d     |  5 +++
 ...ttname-invalid.s => diag-strlen-invalid.s} |  0
 libctf/ctf-open.c                             | 45 ++++++++++++-------
 8 files changed, 39 insertions(+), 21 deletions(-)
 create mode 100644 ld/testsuite/ld-ctf/diag-strlen-invalid.d
 copy ld/testsuite/ld-ctf/{diag-cttname-invalid.s => diag-strlen-invalid.s} (100%)

diff --git a/ld/testsuite/ld-ctf/diag-cttname-invalid.s b/ld/testsuite/ld-ctf/diag-cttname-invalid.s
index dbfdd21fe27..f025254665d 100644
--- a/ld/testsuite/ld-ctf/diag-cttname-invalid.s
+++ b/ld/testsuite/ld-ctf/diag-cttname-invalid.s
@@ -15,7 +15,7 @@
 	.long	0x8
 	.long	0x10
 	.long	0x40
-	.long	0x42
+	.long	0x37
 	.long	0x1
 	.long	0x7
 	.long	0x7
diff --git a/ld/testsuite/ld-ctf/diag-cttname-null.s b/ld/testsuite/ld-ctf/diag-cttname-null.s
index ad6ce60f964..f3ba2129fef 100644
--- a/ld/testsuite/ld-ctf/diag-cttname-null.s
+++ b/ld/testsuite/ld-ctf/diag-cttname-null.s
@@ -15,7 +15,7 @@
 	.long	0x8
 	.long	0x10
 	.long	0x40
-	.long	0x42
+	.long	0x37
 	.long	0x1
 	.long	0x7
 	.long	0x7
diff --git a/ld/testsuite/ld-ctf/diag-cuname.s b/ld/testsuite/ld-ctf/diag-cuname.s
index dcdbd62aa73..95f3d72feea 100644
--- a/ld/testsuite/ld-ctf/diag-cuname.s
+++ b/ld/testsuite/ld-ctf/diag-cuname.s
@@ -15,7 +15,7 @@
 	.long	0x8
 	.long	0x10
 	.long	0x40
-	.long	0x42
+	.long	0x37
 	.long	0x1
 	.long	0x7
 	.long	0x7
diff --git a/ld/testsuite/ld-ctf/diag-parlabel.s b/ld/testsuite/ld-ctf/diag-parlabel.s
index e0ce57ca535..b31fb8181a3 100644
--- a/ld/testsuite/ld-ctf/diag-parlabel.s
+++ b/ld/testsuite/ld-ctf/diag-parlabel.s
@@ -15,7 +15,7 @@
 	.long	0x8
 	.long	0x10
 	.long	0x40
-	.long	0x42
+	.long	0x37
 	.long	0x1
 	.long	0x7
 	.long	0x7
diff --git a/ld/testsuite/ld-ctf/diag-parname.s b/ld/testsuite/ld-ctf/diag-parname.s
index da30e4a7c85..d30178de39b 100644
--- a/ld/testsuite/ld-ctf/diag-parname.s
+++ b/ld/testsuite/ld-ctf/diag-parname.s
@@ -15,7 +15,7 @@
 	.long	0x8
 	.long	0x10
 	.long	0x40
-	.long	0x42
+	.long	0x37
 	.long	0x1
 	.long	0x7
 	.long	0x7
diff --git a/ld/testsuite/ld-ctf/diag-strlen-invalid.d b/ld/testsuite/ld-ctf/diag-strlen-invalid.d
new file mode 100644
index 00000000000..8a7b69b41df
--- /dev/null
+++ b/ld/testsuite/ld-ctf/diag-strlen-invalid.d
@@ -0,0 +1,5 @@
+#as:
+#source: diag-strlen-invalid.s
+#ld: -shared
+#name: Diagnostics - String offset invalid.
+#warning: .* byte long CTF dictionary overruns .* byte long CTF section
diff --git a/ld/testsuite/ld-ctf/diag-cttname-invalid.s b/ld/testsuite/ld-ctf/diag-strlen-invalid.s
similarity index 100%
copy from ld/testsuite/ld-ctf/diag-cttname-invalid.s
copy to ld/testsuite/ld-ctf/diag-strlen-invalid.s
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index c7ca37e5249..3f8d336f895 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -1517,26 +1517,39 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 	  goto bad;
 	}
     }
-  else if (foreign_endian)
+  else
     {
-      if ((fp->ctf_base = malloc (fp->ctf_size)) == NULL)
+      if (_libctf_unlikely_ (ctfsect->cts_size < hdrsz + fp->ctf_size))
 	{
-	  err = ECTF_ZALLOC;
+	  ctf_err_warn (NULL, 0, ECTF_CORRUPT,
+			_("%lu byte long CTF dictionary overruns %lu byte long CTF section"),
+			(unsigned long) (hdrsz + fp->ctf_size),
+			(unsigned long) ctfsect->cts_size);
+	  err = ECTF_CORRUPT;
 	  goto bad;
 	}
-      fp->ctf_dynbase = fp->ctf_base;
-      memcpy (fp->ctf_base, ((unsigned char *) ctfsect->cts_data) + hdrsz,
-	      fp->ctf_size);
-      fp->ctf_buf = fp->ctf_base;
-    }
-  else
-    {
-      /* We are just using the section passed in -- but its header may be an old
-	 version.  Point ctf_buf past the old header, and never touch it
-	 again.  */
-      fp->ctf_base = (unsigned char *) ctfsect->cts_data;
-      fp->ctf_dynbase = NULL;
-      fp->ctf_buf = fp->ctf_base + hdrsz;
+
+      if (foreign_endian)
+	{
+	  if ((fp->ctf_base = malloc (fp->ctf_size)) == NULL)
+	    {
+	      err = ECTF_ZALLOC;
+	      goto bad;
+	    }
+	  fp->ctf_dynbase = fp->ctf_base;
+	  memcpy (fp->ctf_base, ((unsigned char *) ctfsect->cts_data) + hdrsz,
+		  fp->ctf_size);
+	  fp->ctf_buf = fp->ctf_base;
+	}
+      else
+	{
+	  /* We are just using the section passed in -- but its header may
+	     be an old version.  Point ctf_buf past the old header, and
+	     never touch it again.  */
+	  fp->ctf_base = (unsigned char *) ctfsect->cts_data;
+	  fp->ctf_dynbase = NULL;
+	  fp->ctf_buf = fp->ctf_base + hdrsz;
+	}
     }
 
   /* Once we have uncompressed and validated the CTF data buffer, we can
-- 
2.35.1.261.g8402f930ba.dirty



* [PATCH 4/4] libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option
  2022-03-21 18:24 [PATCH 0/4] various libctf-specific fixes Nick Alcock
                   ` (2 preceding siblings ...)
  2022-03-21 18:24 ` [PATCH 3/4] libctf, ld: diagnose corrupted CTF header cth_strlen Nick Alcock
@ 2022-03-21 18:24 ` Nick Alcock
  3 siblings, 0 replies; 7+ messages in thread
From: Nick Alcock @ 2022-03-21 18:24 UTC (permalink / raw)
  To: binutils

libctf has always handled endianness differences by detecting
foreign-endian CTF dicts on the input and endian-flipping them: dicts
are always written in native endianness.  This makes endian-awareness
very low overhead, but it means that the foreign-endian code paths
almost never get routinely tested, since "make check" usually reads in
dicts ld has just written out: only a few corrupted-CTF tests are
actually in fixed endianness, and even they only test the foreign-
endian code paths when you run make check on a big-endian machine.
(And the fix is surely not to add more .s-based tests like that, because
they are a nightmare to maintain compared to the C-code-based ones.)

To improve on this, add a new environment variable,
LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally
endian-flip at ctf_write time, so the output is always in the wrong
endianness.  This then tests the foreign-endian read paths properly
at open time.
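
As a rough illustration of the idea (a sketch only, not the libctf code):
the writer consults the environment once and, if the variable is present,
byte-swaps what it emits.  The helpers swap32, want_foreign_endian and
write_words are hypothetical names invented for this example; only the
name of the environment variable comes from this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not libctf API: byte-swap one 32-bit value.  */
static uint32_t
swap32 (uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
         | ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Return 1 if the debugging variable is present in the environment
   (with any value, even an empty one), 0 otherwise.  */
static int
want_foreign_endian (void)
{
  return getenv ("LIBCTF_WRITE_FOREIGN_ENDIAN") != NULL;
}

/* Copy N words from SRC to DST, byte-swapping each one if FLIP is
   nonzero.  A real writer would pass want_foreign_endian () as FLIP.  */
static void
write_words (uint32_t *dst, const uint32_t *src, size_t n, int flip)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = flip ? swap32 (src[i]) : src[i];
}
```

Because the check is only for presence, setting the variable to any value
in the environment of a test run is enough to turn the flipping on.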

Make this easier by restructuring the writeout code in ctf-serialize.c,
which duplicates the maybe-gzip-and-write-out code three times (once
for ctf_write_mem, with thresholding, and once each for
ctf_compress_write and ctf_write just so those can avoid thresholding
and/or compression).  Instead, have the latter two call the former
with thresholds of 0 or (size_t) -1, respectively.
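
To see why those two thresholds give the wanted behaviour, consider a
stand-in for the size test in ctf_write_mem (data smaller than the
threshold is left uncompressed).  The function names below are made up
for this illustration; only the comparison and the two threshold values
are taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the decision made in ctf_write_mem: data
   shorter than THRESHOLD is written out uncompressed.  */
static int
stays_uncompressed (size_t data_size, size_t threshold)
{
  return data_size < threshold;
}

/* What a ctf_compress_write-style wrapper asks for: no size is below a
   threshold of 0, so the data is always compressed.  */
static int
compress_write_stays_uncompressed (size_t data_size)
{
  return stays_uncompressed (data_size, 0);
}

/* What a ctf_write-style wrapper asks for: every realistic size is below
   (size_t) -1, so the data is never compressed.  */
static int
plain_write_stays_uncompressed (size_t data_size)
{
  return stays_uncompressed (data_size, (size_t) -1);
}
```

The only size that would still be compressed with the second threshold is
(size_t) -1 itself, which no real buffer can reach.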

The endian-flipping code itself gains a bit of complexity, because
one of the endian-flippers (flip_types) assumed that its input was in
foreign-endian form, so that it could only pull fields out of the
input and make sense of them once they had been flipped.  At the
cost of a few lines of duplicated initializations, teach it to
read before flipping if we're flipping to foreign-endianness instead
of away from it.
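
The ordering problem can be sketched on a toy format, assumed purely for
illustration: a sequence of records, each one 32-bit count followed by
that many 32-bit payload words.  The flipper has to know the count to
find the next record, and the count is only meaningful while it is in
native byte order.  Neither flip_records nor swap32 below is libctf code;
the real flip_types deals with kind, size and vlen fields rather than a
bare count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte-swap one 32-bit value.  */
static uint32_t
swap32 (uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
         | ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Byte-swap every word of BUF (NWORDS words long) in place, walking it
   record by record.  If TO_FOREIGN is set, BUF starts out native, so the
   count is read before the swap; otherwise BUF starts out foreign, so
   the count is read after it.  Returns the number of records flipped,
   or -1 if a count runs off the end of the buffer.  */
static int
flip_records (uint32_t *buf, size_t nwords, int to_foreign)
{
  size_t i = 0;
  int nrecs = 0;

  while (i < nwords)
    {
      uint32_t count = 0;

      if (to_foreign)
        count = buf[i];         /* Still native: read it now.  */

      buf[i] = swap32 (buf[i]);

      if (!to_foreign)
        count = buf[i];         /* Native only after the swap.  */

      if (count > nwords - i - 1)
        return -1;              /* Corrupt, or flipped the wrong way.  */

      for (uint32_t j = 1; j <= count; j++)
        buf[i + j] = swap32 (buf[i + j]);

      i += (size_t) count + 1;
      nrecs++;
    }
  return nrecs;
}
```

Calling this with the wrong direction on native data reads a byte-swapped
count and, in this sketch, fails the bounds check straight away, which is
the failure mode the extra argument is there to avoid.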

libctf/
	* ctf-impl.h (ctf_flip_header): No longer static.
	(ctf_flip): Likewise.
	* ctf-open.c (flip_header): Rename to...
	(ctf_flip_header): ... this, now it is not private to one file.
	(flip_ctf): Rename to...
	(ctf_flip): ... this too.  Add TO_FOREIGN arg.
	(flip_types): Likewise.  Use it.
	(ctf_bufopen_internal): Adjust calls.
	* ctf-serialize.c (ctf_write_mem): Add flip_endian path via
	a newly-allocated bounce buffer.
	(ctf_compress_write): Move below ctf_write_mem and reimplement
	in terms of it.
	(ctf_write): Likewise.
	(ctf_gzwrite): Note that this obscure writeout function does not
	support endian-flipping.
---
 libctf/ctf-impl.h      |   2 +
 libctf/ctf-open.c      |  57 ++++++++----
 libctf/ctf-serialize.c | 196 +++++++++++++++++++++--------------------
 3 files changed, 144 insertions(+), 111 deletions(-)

diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index f749b839ab3..6b6ec16291a 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -738,6 +738,8 @@ extern void ctf_arc_close_internal (struct ctf_archive *);
 extern const ctf_preamble_t *ctf_arc_bufpreamble (const ctf_sect_t *);
 extern void *ctf_set_open_errno (int *, int);
 extern unsigned long ctf_set_errno (ctf_dict_t *, int);
+extern void ctf_flip_header (ctf_header_t *);
+extern int ctf_flip (ctf_dict_t *, ctf_header_t *, unsigned char *, int);
 
 extern ctf_dict_t *ctf_simple_open_internal (const char *, size_t, const char *,
 					     size_t, size_t,
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index 3f8d336f895..f0e203e0a16 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -965,8 +965,8 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 
 /* Flip the endianness of the CTF header.  */
 
-static void
-flip_header (ctf_header_t *cth)
+void
+ctf_flip_header (ctf_header_t *cth)
 {
   swap_thing (cth->cth_preamble.ctp_magic);
   swap_thing (cth->cth_preamble.ctp_version);
@@ -1031,26 +1031,48 @@ flip_vars (void *start, size_t len)
    ctf_stype followed by variable data.  */
 
 static int
-flip_types (ctf_dict_t *fp, void *start, size_t len)
+flip_types (ctf_dict_t *fp, void *start, size_t len, int to_foreign)
 {
   ctf_type_t *t = start;
 
   while ((uintptr_t) t < ((uintptr_t) start) + len)
     {
+      uint32_t kind;
+      size_t size;
+      uint32_t vlen;
+      size_t vbytes;
+
+      if (to_foreign)
+	{
+	  kind = CTF_V2_INFO_KIND (t->ctt_info);
+	  size = t->ctt_size;
+	  vlen = CTF_V2_INFO_VLEN (t->ctt_info);
+	  vbytes = get_vbytes_v2 (fp, kind, size, vlen);
+	}
+
       swap_thing (t->ctt_name);
       swap_thing (t->ctt_info);
       swap_thing (t->ctt_size);
 
-      uint32_t kind = CTF_V2_INFO_KIND (t->ctt_info);
-      size_t size = t->ctt_size;
-      uint32_t vlen = CTF_V2_INFO_VLEN (t->ctt_info);
-      size_t vbytes = get_vbytes_v2 (fp, kind, size, vlen);
+      if (!to_foreign)
+	{
+	  kind = CTF_V2_INFO_KIND (t->ctt_info);
+	  size = t->ctt_size;
+	  vlen = CTF_V2_INFO_VLEN (t->ctt_info);
+	  vbytes = get_vbytes_v2 (fp, kind, size, vlen);
+	}
 
       if (_libctf_unlikely_ (size == CTF_LSIZE_SENT))
 	{
+	  if (to_foreign)
+	    size = CTF_TYPE_LSIZE (t);
+
 	  swap_thing (t->ctt_lsizehi);
 	  swap_thing (t->ctt_lsizelo);
-	  size = CTF_TYPE_LSIZE (t);
+
+	  if (!to_foreign)
+	    size = CTF_TYPE_LSIZE (t);
+
 	  t = (ctf_type_t *) ((uintptr_t) t + sizeof (ctf_type_t));
 	}
       else
@@ -1182,22 +1204,27 @@ flip_types (ctf_dict_t *fp, void *start, size_t len)
 }
 
 /* Flip the endianness of BUF, given the offsets in the (already endian-
-   converted) CTH.
+   converted) CTH.  If TO_FOREIGN is set, flip to foreign-endianness; if not,
+   flip away.
 
    All of this stuff happens before the header is fully initialized, so the
    LCTF_*() macros cannot be used yet.  Since we do not try to endian-convert v1
    data, this is no real loss.  */
 
-static int
-flip_ctf (ctf_dict_t *fp, ctf_header_t *cth, unsigned char *buf)
+int
+ctf_flip (ctf_dict_t *fp, ctf_header_t *cth, unsigned char *buf,
+	  int to_foreign)
 {
+  ctf_dprintf ("flipping endianness\n");
+
   flip_lbls (buf + cth->cth_lbloff, cth->cth_objtoff - cth->cth_lbloff);
   flip_objts (buf + cth->cth_objtoff, cth->cth_funcoff - cth->cth_objtoff);
   flip_objts (buf + cth->cth_funcoff, cth->cth_objtidxoff - cth->cth_funcoff);
   flip_objts (buf + cth->cth_objtidxoff, cth->cth_funcidxoff - cth->cth_objtidxoff);
   flip_objts (buf + cth->cth_funcidxoff, cth->cth_varoff - cth->cth_funcidxoff);
   flip_vars (buf + cth->cth_varoff, cth->cth_typeoff - cth->cth_varoff);
-  return flip_types (fp, buf + cth->cth_typeoff, cth->cth_stroff - cth->cth_typeoff);
+  return flip_types (fp, buf + cth->cth_typeoff,
+		     cth->cth_stroff - cth->cth_typeoff, to_foreign);
 }
 
 /* Set up the ctl hashes in a ctf_dict_t.  Called by both writable and
@@ -1404,7 +1431,7 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
     upgrade_header (hp);
 
   if (foreign_endian)
-    flip_header (hp);
+    ctf_flip_header (hp);
   fp->ctf_openflags = hp->cth_flags;
   fp->ctf_size = hp->cth_stroff + hp->cth_strlen;
 
@@ -1610,9 +1637,9 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
   fp->ctf_syn_ext_strtab = syn_strtab;
 
   if (foreign_endian &&
-      (err = flip_ctf (fp, hp, fp->ctf_buf)) != 0)
+      (err = ctf_flip (fp, hp, fp->ctf_buf, 0)) != 0)
     {
-      /* We can be certain that flip_ctf() will have endian-flipped everything
+      /* We can be certain that ctf_flip() will have endian-flipped everything
 	 other than the types table when we return.  In particular the header
 	 is fine, so set it, to allow freeing to use the usual code path.  */
 
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index cc9e59d4836..c6b8b495568 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -1229,7 +1229,13 @@ err:
 
 /* File writing.  */
 
-/* Write the compressed CTF data stream to the specified gzFile descriptor.  */
+/* Write the compressed CTF data stream to the specified gzFile descriptor.  The
+   whole stream is compressed, and cannot be read by CTF opening functions in
+   this library until it is decompressed.  (The functions below this one leave
+   the header uncompressed, and the CTF opening functions work on them without
+   manual decompression.)
+
+   No support for (testing-only) endian-flipping.  */
 int
 ctf_gzwrite (ctf_dict_t *fp, gzFile fd)
 {
@@ -1260,85 +1266,25 @@ ctf_gzwrite (ctf_dict_t *fp, gzFile fd)
   return 0;
 }
 
-/* Compress the specified CTF data stream and write it to the specified file
-   descriptor.  */
-int
-ctf_compress_write (ctf_dict_t *fp, int fd)
-{
-  unsigned char *buf;
-  unsigned char *bp;
-  ctf_header_t h;
-  ctf_header_t *hp = &h;
-  ssize_t header_len = sizeof (ctf_header_t);
-  ssize_t compress_len;
-  ssize_t len;
-  int rc;
-  int err = 0;
-
-  if (ctf_serialize (fp) < 0)
-    return -1;					/* errno is set for us.  */
-
-  memcpy (hp, fp->ctf_header, header_len);
-  hp->cth_flags |= CTF_F_COMPRESS;
-  compress_len = compressBound (fp->ctf_size);
-
-  if ((buf = malloc (compress_len)) == NULL)
-    {
-      ctf_err_warn (fp, 0, 0, _("ctf_compress_write: cannot allocate %li bytes"),
-		    (unsigned long) compress_len);
-      return (ctf_set_errno (fp, ECTF_ZALLOC));
-    }
-
-  if ((rc = compress (buf, (uLongf *) &compress_len,
-		      fp->ctf_buf, fp->ctf_size)) != Z_OK)
-    {
-      err = ctf_set_errno (fp, ECTF_COMPRESS);
-      ctf_err_warn (fp, 0, 0, _("zlib deflate err: %s"), zError (rc));
-      goto ret;
-    }
-
-  while (header_len > 0)
-    {
-      if ((len = write (fd, hp, header_len)) < 0)
-	{
-	  err = ctf_set_errno (fp, errno);
-	  ctf_err_warn (fp, 0, 0, _("ctf_compress_write: error writing header"));
-	  goto ret;
-	}
-      header_len -= len;
-      hp += len;
-    }
-
-  bp = buf;
-  while (compress_len > 0)
-    {
-      if ((len = write (fd, bp, compress_len)) < 0)
-	{
-	  err = ctf_set_errno (fp, errno);
-	  ctf_err_warn (fp, 0, 0, _("ctf_compress_write: error writing"));
-	  goto ret;
-	}
-      compress_len -= len;
-      bp += len;
-    }
-
-ret:
-  free (buf);
-  return err;
-}
-
 /* Optionally compress the specified CTF data stream and return it as a new
-   dynamically-allocated string.  */
+   dynamically-allocated string.  Possibly write it with reversed
+   endianness.  */
 unsigned char *
 ctf_write_mem (ctf_dict_t *fp, size_t *size, size_t threshold)
 {
   unsigned char *buf;
   unsigned char *bp;
   ctf_header_t *hp;
+  unsigned char *flipped, *src;
   ssize_t header_len = sizeof (ctf_header_t);
   ssize_t compress_len;
+  int flip_endian;
+  int uncompressed;
   int rc;
 
+  flip_endian = getenv ("LIBCTF_WRITE_FOREIGN_ENDIAN") != NULL;
+  uncompressed = (fp->ctf_size < threshold);
+
   if (ctf_serialize (fp) < 0)
     return NULL;				/* errno is set for us.  */
 
@@ -1359,17 +1305,43 @@ ctf_write_mem (ctf_dict_t *fp, size_t *size, size_t threshold)
   bp = buf + sizeof (struct ctf_header);
   *size = sizeof (struct ctf_header);
 
-  if (fp->ctf_size < threshold)
+  if (uncompressed)
+    hp->cth_flags &= ~CTF_F_COMPRESS;
+  else
+    hp->cth_flags |= CTF_F_COMPRESS;
+
+  src = fp->ctf_buf;
+  flipped = NULL;
+
+  if (flip_endian)
     {
-      hp->cth_flags &= ~CTF_F_COMPRESS;
-      memcpy (bp, fp->ctf_buf, fp->ctf_size);
+      if ((flipped = malloc (fp->ctf_size)) == NULL)
+	{
+	  ctf_set_errno (fp, ENOMEM);
+	  ctf_err_warn (fp, 0, 0, _("ctf_write_mem: cannot allocate %lu bytes"),
+			(unsigned long) fp->ctf_size);
+	  return NULL;
+	}
+      ctf_flip_header (hp);
+      memcpy (flipped, fp->ctf_buf, fp->ctf_size);
+      if (ctf_flip (fp, fp->ctf_header, flipped, 1) < 0)
+	{
+	  free (buf);
+	  free (flipped);
+	  return NULL;				/* errno is set for us.  */
+	}
+      src = flipped;
+    }
+
+  if (uncompressed)
+    {
+      memcpy (bp, src, fp->ctf_size);
       *size += fp->ctf_size;
     }
   else
     {
-      hp->cth_flags |= CTF_F_COMPRESS;
       if ((rc = compress (bp, (uLongf *) &compress_len,
-			  fp->ctf_buf, fp->ctf_size)) != Z_OK)
+			  src, fp->ctf_size)) != Z_OK)
 	{
 	  ctf_set_errno (fp, ECTF_COMPRESS);
 	  ctf_err_warn (fp, 0, 0, _("zlib deflate err: %s"), zError (rc));
@@ -1378,45 +1350,77 @@ ctf_write_mem (ctf_dict_t *fp, size_t *size, size_t threshold)
 	}
       *size += compress_len;
     }
+
+  free (flipped);
+
   return buf;
 }
 
-/* Write the uncompressed CTF data stream to the specified file descriptor.  */
+/* Compress the specified CTF data stream and write it to the specified file
+   descriptor.  */
 int
-ctf_write (ctf_dict_t *fp, int fd)
+ctf_compress_write (ctf_dict_t *fp, int fd)
 {
-  const unsigned char *buf;
-  ssize_t resid;
+  unsigned char *buf;
+  unsigned char *bp;
+  size_t tmp;
+  ssize_t buf_len;
   ssize_t len;
+  int err = 0;
 
-  if (ctf_serialize (fp) < 0)
+  if ((buf = ctf_write_mem (fp, &tmp, 0)) == NULL)
     return -1;					/* errno is set for us.  */
 
-  resid = sizeof (ctf_header_t);
-  buf = (unsigned char *) fp->ctf_header;
-  while (resid != 0)
+  buf_len = tmp;
+  bp = buf;
+
+  while (buf_len > 0)
     {
-      if ((len = write (fd, buf, resid)) <= 0)
+      if ((len = write (fd, bp, buf_len)) < 0)
 	{
-	  ctf_err_warn (fp, 0, errno, _("ctf_write: error writing header"));
-	  return (ctf_set_errno (fp, errno));
+	  err = ctf_set_errno (fp, errno);
+	  ctf_err_warn (fp, 0, 0, _("ctf_compress_write: error writing"));
+	  goto ret;
 	}
-      resid -= len;
-      buf += len;
+      buf_len -= len;
+      bp += len;
     }
 
-  resid = fp->ctf_size;
-  buf = fp->ctf_buf;
-  while (resid != 0)
+ret:
+  free (buf);
+  return err;
+}
+
+/* Write the uncompressed CTF data stream to the specified file descriptor.  */
+int
+ctf_write (ctf_dict_t *fp, int fd)
+{
+  unsigned char *buf;
+  unsigned char *bp;
+  size_t tmp;
+  ssize_t buf_len;
+  ssize_t len;
+  int err = 0;
+
+  if ((buf = ctf_write_mem (fp, &tmp, (size_t) -1)) == NULL)
+    return -1;					/* errno is set for us.  */
+
+  buf_len = tmp;
+  bp = buf;
+
+  while (buf_len > 0)
     {
-      if ((len = write (fd, buf, resid)) <= 0)
+      if ((len = write (fd, bp, buf_len)) < 0)
 	{
-	  ctf_err_warn (fp, 0, errno, _("ctf_write: error writing"));
-	  return (ctf_set_errno (fp, errno));
+	  err = ctf_set_errno (fp, errno);
+	  ctf_err_warn (fp, 0, 0, _("ctf_write: error writing"));
+	  goto ret;
 	}
-      resid -= len;
-      buf += len;
+      buf_len -= len;
+      bp += len;
     }
 
-  return 0;
+ret:
+  free (buf);
+  return err;
 }
-- 
2.35.1.261.g8402f930ba.dirty



* Re: [PATCH 2/4] include, libctf, ld: extend variable section to contain functions too
  2022-03-21 18:24 ` [PATCH 2/4] include, libctf, ld: extend variable section to contain functions too Nick Alcock
@ 2022-03-23 13:54   ` Nick Alcock
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Alcock @ 2022-03-23 13:54 UTC (permalink / raw)
  To: binutils

On 21 Mar 2022, Nick Alcock via Binutils verbalised:

> The CTF variable section is an optional (usually-not-present) section in
> the CTF dict which contains name -> type mappings corresponding to data
> symbols that are present in the linker input but not in the output
> symbol table: the idea is that programs that use their own symbol-
> resolution mechanisms can use this section to look up the types of
> symbols they have found using their own mechanism.

Tested (a *lot*, because of the changes to the writeout path) and pushed
this series.  (Hopefully pushing without review is not a problem: even
though the series touches ld/, it only touches the bits of ld/ that
relate purely to CTF testing.)

Pushed, with the following addition to libctf/NEWS:

> Changes in 2.39:
> 
> * New features
> 
> ** The CTF variable section (if generated via ld --ctf-variables) now contains
>    entries for static functions, hidden functions, and other functions with
>    no associated symbol.  The associated type is of kind CTF_K_FUNCTION.
>    (No change if --ctf-variables is not specified, which is the default.)

(LIBCTF_WRITE_FOREIGN_ENDIAN, being a debugging option, doesn't rate a
NEWS entry.)

(This push should also fix Sourceware bug 28933, and add enough testing
infrastructure to be sure it doesn't recur.)


* Re: [PATCH 3/4] libctf, ld: diagnose corrupted CTF header cth_strlen
  2022-03-21 18:24 ` [PATCH 3/4] libctf, ld: diagnose corrupted CTF header cth_strlen Nick Alcock
@ 2022-03-23 13:56   ` Nick Alcock
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Alcock @ 2022-03-23 13:56 UTC (permalink / raw)
  To: binutils

On 21 Mar 2022, Nick Alcock via Binutils outgrape:

> The last section in a CTF dict is the string table, at an offset
> represented by the cth_stroff header field.  Its length is recorded in
> the next field, cth_strlen, and the two added together are taken as the
> size of the CTF dict.  Upon opening a dict, we check that none of the
> header offsets exceed this size, and we check when uncompressing a
> compressed dict that the result of the uncompression is the same length:
> but CTF dicts need not be compressed, and short ones are not.
> Uncompressed dicts just use the ctf_size without checking it.  This
> field is thankfully almost unused: it is mostly used when reserializing
> a dict, which can't be done to dicts read off disk since they're
> read-only.

I'll backport this commit, but not any of the others, to 2.38 shortly
(got to do at least some testing on it first).

I could backport to 2.37 as well if anyone thinks this
really-rather-unlikely-to-happen overrun is worth it (you have to
transport CTF written on a machine with the opposite endianness *and* it
has to be small enough to be uncompressed, which is distinctly rare...)
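
For anyone wondering what the new check amounts to, here is a minimal
sketch under simplifying assumptions: a cut-down header holding only the
two string-table fields, already converted to native endianness.  The
names mini_header and dict_fits are hypothetical and exist only for this
example; the real test lives in ctf_bufopen_internal and compares
cts_size against hdrsz + ctf_size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified header: only the two fields that matter
   here, already in native byte order.  */
struct mini_header
{
  uint32_t str_off;             /* Offset of the string table.  */
  uint32_t str_len;             /* Length of the string table.  */
};

/* Return 1 if a dict described by H, preceded by HDRSZ bytes of header,
   fits within a section of SECT_SIZE bytes, and 0 if it would overrun
   the section.  The sum is done in 64 bits so that two large 32-bit
   fields cannot wrap around.  */
static int
dict_fits (const struct mini_header *h, size_t hdrsz, size_t sect_size)
{
  uint64_t dict_size = (uint64_t) h->str_off + h->str_len;

  if (sect_size < hdrsz)
    return 0;

  return dict_size <= (uint64_t) (sect_size - hdrsz);
}
```

A length field that was never byteswapped shows up here as an absurdly
large str_len, which makes the dict appear to overrun the section, so it
gets rejected instead of being read past the end of the buffer.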

