[PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes)

public inbox for binutils@sourceware.org

* [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes)
@ 2024-04-17 20:19 Nick Alcock
  2024-04-17 20:19 ` [PATCH 01/22] binutils, objdump: Add --ctf-parent-section Nick Alcock
                   ` (22 more replies)
  0 siblings, 23 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:19 UTC
  To: binutils; +Cc: Nicholas Vinson

A longstanding restriction of libctf is that open CTF dicts are divided
into two varieties: one that you can create, add stuff to and then write
out and throw away, and one that you can open but then never add
anything to: the dict is forever read-only.

This distinction is not entirely original sin.  Solaris libctf and its users
had remnants of code that suggested that it was intended to be possible to
read in CTF and at least modify it, but this was never properly implemented
and would at best have caused memory corruption.  Most attempts failed with
an ECTF_RDONLY error.

This was not at all helped by the design decision to split the set of
types libctf saw into 'dynamic' (added by ctf_add_*) and 'static', have
lookups work only on static types, and have ctf_update() work by
throwing the dict's static types away and reserializing them from the
dynamic types.  This stopped type lookup working until you did a
ctf_update() to reserialize the entire dict, at increasingly horrible
performance cost, and meant that libctf had to in effect handle dicts
that were mixtures of read-only and writable dicts while gaining none of
the benefits of doing that.

The performance cost and need to call ctf_update() have long been fixed,
and lookups now work on all types however added, but the restriction
that writable dicts came from ctf_create() and read-only ones came from
ctf_*open(), and that you couldn't save the latter, persisted.  Worse
yet, if you tried to save writable dicts more than once things often
went wrong (strtab corruption was commonplace), even if you did nothing
at all to them between the saves.

This series tries to clean all that up, in part so we can save dicts and
make transformations to what we save without affecting the dict itself, and
certainly without corrupting anything.

Ignoring a few commits that introduce a minor new option to objdump, fix an
unfortunate error in lookups of bitfield types by name, and fix typos and
leaks, this series is divided into two halves:

 - patches up to the reversion in the middle, which make the readonliness of
   dicts apply to *types* instead of the dict as a whole: in particular, you
   cannot add members to structs, enums, or unions that were read in from
   files.  You can add references to them, and add new types of any kind
   freely, which was more or less easy except for the symbol handling code,
   which needed a good bit of rejigging (and bugfixing) in the process.

 - the reversion and the patches beyond it discard an old internal strtab
   abstraction ("pending refs") which proved to be much more trouble than it
   was worth, and replace it with a new scheme which fixes corruption
   of the string table if serialized more than once, drops any need to scan
   existing types for references to strings (so we can just blindly copy the
   existing static type table from a ctf_open()ed dict and append to it when
   saving it again), and redoes serialization and the writeout functions so
   that while it does make a few changes to the dict being read in (the
   strtab is regenerated), the types table is not affected, and there is no
   "replace the guts of this type table with a serialized copy" nonsense
   like libctf has always had before now: we just emit everything into a new
   buffer and return it.  Old types already present when the dict was
   ctf_open()ed need not be traversed at all (though we do still have to
   traverse the symtypetabs and variables sections, because they are sorted,
   so any new entries probably appear in the middle).

   The result is noticeably simpler and avoids a lot of boilerplate where
   you had to remember to copy every field in the struct ctf_dict (and
   remember to augment this list when adding new fields, which was routinely
   forgotten, triggering different subtle bugs every time).  It also fixes a
   couple of completely broken API functions, notably ctf_gzwrite(), which
   while inconvenient and annoying to use should not completely fail to
   serialize the dict before writing it out...
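
As an illustration only, here is a toy model of the two halves above.  It
is not libctf code, and every toy_* name is made up for this sketch.  It
assumes a dict is just a read-only table of types that came from a file
plus a growable table of types added later: members can only be added to
the latter (TOY_ERDONLY stands in for an ECTF_RDONLY-style error), and
serialization copies the static table blindly, appends the new types, and
returns a fresh buffer without touching the dict.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: none of these names exist in libctf.  */

enum { TOY_OK = 0, TOY_ERDONLY = 1, TOY_EBADID = 2 };

typedef struct toy_type
{
  uint32_t kind;		/* Arbitrary kind tag.  */
  uint32_t n_members;		/* Member count, for struct-like kinds.  */
} toy_type_t;

typedef struct toy_dict
{
  const toy_type_t *static_types; /* Read in from a file: never modified.  */
  size_t n_static;
  toy_type_t *dyn_types;	  /* Added after opening: modifiable.  */
  size_t n_dyn;
} toy_dict_t;

/* Add a new type.  Always allowed, even on an "opened" dict.  Returns the
   new 1-based type ID, or 0 on allocation failure.  */
static size_t
toy_add_type (toy_dict_t *d, uint32_t kind)
{
  toy_type_t *t = realloc (d->dyn_types, (d->n_dyn + 1) * sizeof (*t));

  if (t == NULL)
    return 0;
  d->dyn_types = t;
  t[d->n_dyn].kind = kind;
  t[d->n_dyn].n_members = 0;
  d->n_dyn++;
  return d->n_static + d->n_dyn;
}

/* Add a member to type ID.  Only types added after opening may grow.  */
static int
toy_add_member (toy_dict_t *d, size_t id)
{
  if (id == 0 || id > d->n_static + d->n_dyn)
    return TOY_EBADID;
  if (id <= d->n_static)
    return TOY_ERDONLY;
  d->dyn_types[id - d->n_static - 1].n_members++;
  return TOY_OK;
}

/* Serialize into a fresh buffer: copy the static table without inspecting
   it, append the dynamic types, and leave the dict itself untouched.  The
   caller frees the result.  */
static toy_type_t *
toy_serialize (const toy_dict_t *d, size_t *n_out)
{
  size_t n = d->n_static + d->n_dyn;
  toy_type_t *buf = malloc ((n ? n : 1) * sizeof (*buf));

  if (buf == NULL)
    return NULL;
  if (d->n_static > 0)
    memcpy (buf, d->static_types, d->n_static * sizeof (*buf));
  if (d->n_dyn > 0)
    memcpy (buf + d->n_static, d->dyn_types, d->n_dyn * sizeof (*buf));
  *n_out = n;
  return buf;
}
```

Because toy_serialize() takes a const dict and writes only to its own
buffer, calling it twice in a row yields byte-identical output, which is
the property the old "replace the guts" approach could not guarantee.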

The last couple of patches, one due to Nicholas Vinson and the other very
similar to one he wrote, fix bugs that break building with recent LLD (LLD
is stricter than GNU ld with respect to version scripts these days).

The usual giant pile of tests have been run: all look happy. I'm going to
run the trybot over it shortly.

I'll apply it in a couple of days if nobody says otherwise.

Cc: Nicholas Vinson <nvinson234@gmail.com>

Nicholas Vinson (1):
  libctf: Remove undefined functions from ver. map

Nick Alcock (21):
  binutils, objdump: Add --ctf-parent-section
  libctf: don't leak the symbol name in the name->type cache
  libctf: remove static/dynamic name lookup distinction
  libctf: fix name lookup in dicts containing base-type bitfields
  libctf: support addition of types to dicts read via ctf_open()
  libctf: fix a comment
  libctf: delete LCTF_DIRTY
  libctf: fix a comment typo
  libctf: rename ctf_dict.ctf_{symtab,strtab}
  Revert "libctf: do not corrupt strings across ctf_serialize"
  libctf: replace 'pending refs' abstraction
  libctf: rethink strtab writeout
  libctf: make ctf_serialize() actually serialize
  libctf: fix tiny dumping error
  libctf: improve handling of type dumping errors
  libctf: make ctf_lookup of symbols by name work in more cases
  libctf: fix a debugging typo
  libctf: add rewriting tests
  libctf: fix leak in test
  libctf: don't pass errno into ctf_err_warn so often
  libctf: do not include undefined functions in libctf.ver

 binutils/doc/ctf.options.texi                 |  10 +
 binutils/objdump.c                            |  56 +-
 libctf/configure                              |  21 +-
 libctf/configure.ac                           |  21 +-
 libctf/ctf-archive.c                          |   9 +-
 libctf/ctf-create.c                           | 252 ++++---
 libctf/ctf-dedup.c                            |   8 +-
 libctf/ctf-dump.c                             |  10 +-
 libctf/ctf-hash.c                             | 112 +---
 libctf/ctf-impl.h                             | 116 ++--
 libctf/ctf-link.c                             |  38 +-
 libctf/ctf-lookup.c                           | 372 +++++++----
 libctf/ctf-open.c                             | 341 +++++-----
 libctf/ctf-serialize.c                        | 406 +++++-------
 libctf/ctf-string.c                           | 620 ++++++++++++------
 libctf/ctf-subr.c                             |   6 +-
 libctf/ctf-types.c                            |  46 +-
 libctf/ctf-util.c                             |  13 -
 libctf/libctf.ver                             |   5 +-
 .../libctf-lookup/add-to-opened-ctf.c         |  19 +
 .../testsuite/libctf-lookup/add-to-opened.c   | 148 +++++
 .../testsuite/libctf-lookup/add-to-opened.lk  |   3 +
 .../libctf-lookup/conflicting-type-syms.c     |   4 +
 .../libctf-regression/gzrewrite-ctf.c         |  19 +
 .../testsuite/libctf-regression/gzrewrite.c   | 165 +++++
 .../testsuite/libctf-regression/gzrewrite.lk  |   3 +
 libctf/testsuite/libctf-regression/zrewrite.c | 156 +++++
 .../testsuite/libctf-regression/zrewrite.lk   |   3 +
 .../libctf-bitfield-name-lookup.c             | 136 ++++
 .../libctf-bitfield-name-lookup.lk            |   1 +
 30 files changed, 1983 insertions(+), 1136 deletions(-)
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened-ctf.c
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened.c
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened.lk
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite-ctf.c
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite.c
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite.lk
 create mode 100644 libctf/testsuite/libctf-regression/zrewrite.c
 create mode 100644 libctf/testsuite/libctf-regression/zrewrite.lk
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk

-- 
2.44.0.273.ge0bd14271f


* [PATCH 01/22] binutils, objdump: Add --ctf-parent-section
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
@ 2024-04-17 20:19 ` Nick Alcock
  2024-04-18  2:05   ` Alan Modra
  2024-04-17 20:19 ` [PATCH 02/22] libctf: don't leak the symbol name in the name->type cache Nick Alcock
                   ` (21 subsequent siblings)
  22 siblings, 1 reply; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:19 UTC
  To: binutils

This lets you examine CTF where the parent and child dicts are in entirely
different sections, rather than in a CTF archive with members with different
names.  The linker doesn't emit ELF objects structured like this, but some
third-party linkers may; it's also useful for objcopy-constructed files
in some cases.

(This is what the objdump --ctf-parent option used to do before commit
80b56fad5c99a8c9 in 2021.  The new semantics of that option are much more
useful, but that doesn't mean the old ones are never useful at all, so let's
bring them back.)

(I was specifically driven to add this by DTrace's obscure "ctypes" and
"dtypes" options, which dump its internal, dynamically-generated dicts out
to files for debugging purposes: there are two, one the parent of the other.
Since they're in two separate files rather than a CTF archive and we have no
tools that paste files together into archives, objdump wouldn't show them --
and even pasting them together into an ELF executable with objcopy didn't
help, since objdump had no options that could be used to look in specific
sections for the parent dict.  With --ctf-parent-section, this sort of
obscure use case becomes possible again.  You'll never need it for the
output of the normal linker.)

binutils/

	* doc/ctf.options.texi: Add --ctf-parent-section=.
	* objdump.c (dump_ctf): Implement it.
	(dump_bfd): Likewise.
	(main): Likewise.
---
 binutils/doc/ctf.options.texi | 10 +++++++
 binutils/objdump.c            | 56 ++++++++++++++++++++++++++++++-----
 2 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/binutils/doc/ctf.options.texi b/binutils/doc/ctf.options.texi
index 2820946f2c0..0b04e9df426 100644
--- a/binutils/doc/ctf.options.texi
+++ b/binutils/doc/ctf.options.texi
@@ -22,3 +22,13 @@ function at link time.  When looking at CTF archives that have been
 created by a linker that uses the name changer to rename the parent
 archive member, @option{--ctf-parent} can be used to specify the name
 used for the parent.
+
+@item --ctf-parent-section=@var{section}
+
+This option lets you pick a completely different section for the CTF
+parent dictionary containing unambiguous types than for the child
+dictionaries that contain the ambiguous remainder.  The linker does
+not emit ELF objects structured like this, but some third-party linkers
+may.  It's also convenient when inspecting CTF written out as multiple raw
+files: these can be composed with objcopy, which can put them in different
+ELF sections but not in different members of a single CTF archive.
diff --git a/binutils/objdump.c b/binutils/objdump.c
index 6396174d50f..9db1b4915c9 100644
--- a/binutils/objdump.c
+++ b/binutils/objdump.c
@@ -108,6 +108,7 @@ static int dump_stab_section_info;	/* --stabs */
 static int dump_ctf_section_info;       /* --ctf */
 static char *dump_ctf_section_name;
 static char *dump_ctf_parent_name;	/* --ctf-parent */
+static char *dump_ctf_parent_section_name;	/* --ctf-parent-section */
 static int dump_sframe_section_info;	/* --sframe */
 static char *dump_sframe_section_name;
 static int do_demangle;			/* -C, --demangle */
@@ -485,6 +486,7 @@ enum option_values
 #ifdef ENABLE_LIBCTF
     OPTION_CTF,
     OPTION_CTF_PARENT,
+    OPTION_CTF_PARENT_SECTION,
 #endif
     OPTION_SFRAME,
     OPTION_VISUALIZE_JUMPS,
@@ -500,6 +502,7 @@ static struct option long_options[]=
 #ifdef ENABLE_LIBCTF
   {"ctf", optional_argument, NULL, OPTION_CTF},
   {"ctf-parent", required_argument, NULL, OPTION_CTF_PARENT},
+  {"ctf-parent-section", required_argument, NULL, OPTION_CTF_PARENT_SECTION},
 #endif
   {"debugging", no_argument, NULL, 'g'},
   {"debugging-tags", no_argument, NULL, 'e'},
@@ -4854,11 +4857,14 @@ dump_ctf_archive_member (ctf_dict_t *ctf, const char *name, ctf_dict_t *parent,
 /* Dump the CTF debugging information.  */
 
 static void
-dump_ctf (bfd *abfd, const char *sect_name, const char *parent_name)
+dump_ctf (bfd *abfd, const char *sect_name, const char *parent_name,
+	  const char *parent_sect_name)
 {
-  asection *sec;
-  ctf_archive_t *ctfa = NULL;
-  bfd_byte *ctfdata;
+  asection *sec, *psec = NULL;
+  ctf_archive_t *ctfa;
+  ctf_archive_t *ctfpa = NULL;
+  bfd_byte *ctfdata = NULL;
+  bfd_byte *ctfpdata = NULL;
   ctf_sect_t ctfsect;
   ctf_dict_t *parent;
   ctf_dict_t *fp;
@@ -4878,7 +4884,8 @@ dump_ctf (bfd *abfd, const char *sect_name, const char *parent_name)
     }
 
   /* Load the CTF file and dump it.  Preload the parent dict, since it will
-     need to be imported into every child in turn. */
+     need to be imported into every child in turn.  The parent dict may come
+     from a different section entirely.  */
 
   ctfsect = make_ctfsect (sect_name, ctfdata, bfd_section_size (sec));
   if ((ctfa = ctf_bfdopen_ctfsect (abfd, &ctfsect, &err)) == NULL)
@@ -4890,13 +4897,36 @@ dump_ctf (bfd *abfd, const char *sect_name, const char *parent_name)
       return;
     }
 
-  if ((parent = ctf_dict_open (ctfa, parent_name, &err)) == NULL)
+  if (parent_sect_name) {
+    psec = read_section (abfd, parent_sect_name, &ctfpdata);
+    if (psec == NULL) {
+      my_bfd_nonfatal (bfd_get_filename (abfd));
+      free (ctfdata);
+      return;
+    }
+
+    ctfsect = make_ctfsect (parent_sect_name, ctfpdata, bfd_section_size (psec));
+    if ((ctfpa = ctf_bfdopen_ctfsect (abfd, &ctfsect, &err)) == NULL)
+      {
+	dump_ctf_errs (NULL);
+	non_fatal (_("CTF open failure: %s"), ctf_errmsg (err));
+	my_bfd_nonfatal (bfd_get_filename (abfd));
+	free (ctfdata);
+	free (ctfpdata);
+	return;
+      }
+  }
+  else
+    ctfpa = ctfa;
+
+  if ((parent = ctf_dict_open (ctfpa, parent_name, &err)) == NULL)
     {
       dump_ctf_errs (NULL);
       non_fatal (_("CTF open failure: %s"), ctf_errmsg (err));
       my_bfd_nonfatal (bfd_get_filename (abfd));
       ctf_close (ctfa);
       free (ctfdata);
+      free (ctfpdata);
       return;
     }
 
@@ -4913,11 +4943,16 @@ dump_ctf (bfd *abfd, const char *sect_name, const char *parent_name)
   ctf_dict_close (parent);
   ctf_close (ctfa);
   free (ctfdata);
+  if (parent_sect_name) {
+    ctf_close (ctfpa);
+    free (ctfpdata);
+  }
 }
 #else
 static void
 dump_ctf (bfd *abfd ATTRIBUTE_UNUSED, const char *sect_name ATTRIBUTE_UNUSED,
-	  const char *parent_name ATTRIBUTE_UNUSED) {}
+	  const char *parent_name ATTRIBUTE_UNUSED,
+	  const char *parent_sect_name ATTRIBUTE_UNUSED) {}
 #endif
 
 static void
@@ -5733,7 +5768,8 @@ dump_bfd (bfd *abfd, bool is_mainfile)
   if (is_mainfile || process_links)
     {
       if (dump_ctf_section_info)
-	dump_ctf (abfd, dump_ctf_section_name, dump_ctf_parent_name);
+	dump_ctf (abfd, dump_ctf_section_name, dump_ctf_parent_name,
+		  dump_ctf_parent_section_name);
       if (dump_sframe_section_info)
 	dump_section_sframe (abfd, dump_sframe_section_name);
       if (dump_stab_section_info)
@@ -6243,6 +6279,9 @@ main (int argc, char **argv)
 	case OPTION_CTF_PARENT:
 	  dump_ctf_parent_name = xstrdup (optarg);
 	  break;
+	case OPTION_CTF_PARENT_SECTION:
+	  dump_ctf_parent_section_name = xstrdup (optarg);
+	  break;
 #endif
 	case OPTION_SFRAME:
 	  dump_sframe_section_info = true;
@@ -6337,6 +6376,7 @@ main (int argc, char **argv)
   free (dump_ctf_section_name);
   free (dump_ctf_parent_name);
   free ((void *) source_comment);
+  free (dump_ctf_parent_section_name);
 
   return exit_status;
 }
-- 
2.44.0.273.ge0bd14271f



* [PATCH 02/22] libctf: don't leak the symbol name in the name->type cache
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
  2024-04-17 20:19 ` [PATCH 01/22] binutils, objdump: Add --ctf-parent-section Nick Alcock
@ 2024-04-17 20:19 ` Nick Alcock
  2024-04-17 20:19 ` [PATCH 03/22] libctf: remove static/dynamic name lookup distinction Nick Alcock
                   ` (20 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:19 UTC
  To: binutils

This cache replaced a cache of symbol index->ctf_id_t. That cache was
just an array, so it could get away with just being free()d, but the
ctfi_symnamedicts cache that replaced it is a full dynhash with a
dynamically-allocated string as the key.  As such, it needs freeing with
ctf_dynhash_destroy(), not just free(), or we leak parts of the
underlying hashtab, and all the keys.
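
The difference between the two ways of freeing can be seen in a small
stand-alone sketch.  This is a toy cache with owned string keys, not
libctf's dynhash; all toy_* names and the allocation counter are
assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a container with owned string keys.  Not libctf.  */

static long toy_live_allocs;	/* Outstanding allocations, for the demo.  */

static void *
toy_alloc (size_t n)
{
  void *p = malloc (n);

  if (p != NULL)
    toy_live_allocs++;
  return p;
}

static void
toy_free (void *p)
{
  if (p != NULL)
    {
      toy_live_allocs--;
      free (p);
    }
}

typedef struct toy_cache
{
  char **keys;			/* Each key is separately allocated.  */
  size_t n_keys;
  size_t cap;
} toy_cache_t;

static toy_cache_t *
toy_cache_create (size_t cap)
{
  toy_cache_t *c = toy_alloc (sizeof (*c));

  if (c == NULL)
    return NULL;
  if ((c->keys = toy_alloc ((cap ? cap : 1) * sizeof (char *))) == NULL)
    {
      toy_free (c);
      return NULL;
    }
  c->n_keys = 0;
  c->cap = cap;
  return c;
}

/* Insert a private copy of NAME.  Returns 0 on success, -1 on failure.  */
static int
toy_cache_insert (toy_cache_t *c, const char *name)
{
  size_t len = strlen (name) + 1;
  char *copy;

  if (c->n_keys == c->cap || (copy = toy_alloc (len)) == NULL)
    return -1;
  memcpy (copy, name, len);
  c->keys[c->n_keys++] = copy;
  return 0;
}

/* Wrong for this structure: releases only the top-level object, leaking
   the key array and every key.  Fine only for a flat array.  */
static void
toy_cache_free_shallow (toy_cache_t *c)
{
  toy_free (c);
}

/* Right: release the keys, then the key array, then the container.  */
static void
toy_cache_destroy (toy_cache_t *c)
{
  size_t i;

  if (c == NULL)
    return;
  for (i = 0; i < c->n_keys; i++)
    toy_free (c->keys[i]);
  toy_free (c->keys);
  toy_free (c);
}
```

After toy_cache_free_shallow() the counter stays above zero; after
toy_cache_destroy() it returns to zero.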

libctf/ChangeLog:

	* ctf-archive.c (ctf_arc_flush_caches): Fix leak.
---
 libctf/ctf-archive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libctf/ctf-archive.c b/libctf/ctf-archive.c
index a0ea838ddc4..a88c6135e1a 100644
--- a/libctf/ctf-archive.c
+++ b/libctf/ctf-archive.c
@@ -699,7 +699,7 @@ void
 ctf_arc_flush_caches (ctf_archive_t *wrapper)
 {
   free (wrapper->ctfi_symdicts);
-  free (wrapper->ctfi_symnamedicts);
+  ctf_dynhash_destroy (wrapper->ctfi_symnamedicts);
   ctf_dynhash_destroy (wrapper->ctfi_dicts);
   wrapper->ctfi_symdicts = NULL;
   wrapper->ctfi_symnamedicts = NULL;
-- 
2.44.0.273.ge0bd14271f



* [PATCH 03/22] libctf: remove static/dynamic name lookup distinction
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
  2024-04-17 20:19 ` [PATCH 01/22] binutils, objdump: Add --ctf-parent-section Nick Alcock
  2024-04-17 20:19 ` [PATCH 02/22] libctf: don't leak the symbol name in the name->type cache Nick Alcock
@ 2024-04-17 20:19 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 04/22] libctf: fix name lookup in dicts containing base-type bitfields Nick Alcock
                   ` (19 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:19 UTC
  To: binutils

libctf internally maintains a set of hash tables for type name lookups,
one for each valid C type namespace (struct, union, enum, and everything
else).

Or, rather, it maintains *two* sets of hash tables: one, a ctf_hash *,
is meant for lookups in ctf_(buf)open()ed dicts with fixed content; the
other, a ctf_dynhash *, is meant for lookups in ctf_create()d dicts.

This distinction was somewhat valuable in the far pre-binutils past when
two different hashtable implementations were used (one expanding, the
other fixed-size), but those days are long gone: the hash table
implementations are almost identical, both wrappers around the libiberty
hashtab. The ctf_dynhash has many more capabilities than the ctf_hash
(iteration, deletion, etc.) and has no downsides other than starting
at a fixed, arbitrary small size.

That limitation is easy to lift (via a new ctf_dynhash_create_sized()),
following which we can throw away nearly all the ctf_hash
implementation, and all the code to choose between readable and writable
hashtabs; the few convenience functions that are still useful (for
insertion of name -> type mappings) can also be generalized a bit so
that the extra string verification they do is potentially available to
other string lookups as well.

(libctf still has two hashtable implementations, ctf_dynhash, above,
and ctf_dynset, which is a key-only hashtab that can avoid a great many
malloc()s, used for high-volume applications in the deduplicator.)
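
The shape of the change can be sketched without libiberty.  The following
is a toy open-addressing table, not libctf's ctf_dynhash, and every toy_*
name is hypothetical.  It shows a sized constructor with the unsized one
layered on top of it, plus name -> type helpers that store the integer
type ID directly in the value pointer and treat 0 as "no such type".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy name -> type ID map.  Hypothetical names; not libctf's ctf_dynhash.
   Keys are borrowed, not copied: they must outlive the table.  */
typedef struct toy_slot { const char *key; void *value; } toy_slot_t;
typedef struct toy_hash { toy_slot_t *slots; size_t n_slots, n_used; } toy_hash_t;

/* Create a table with room for NELEMS entries before the first resize.  */
static toy_hash_t *
toy_hash_create_sized (size_t nelems)
{
  toy_hash_t *h = malloc (sizeof (*h));

  if (h == NULL)
    return NULL;
  h->n_slots = nelems * 2 + 1;	/* Load factor stays at or below 1/2.  */
  h->n_used = 0;
  if ((h->slots = calloc (h->n_slots, sizeof (toy_slot_t))) == NULL)
    {
      free (h);
      return NULL;
    }
  return h;
}

/* The unsized constructor is the sized one with a small arbitrary default.  */
static toy_hash_t *
toy_hash_create (void)
{
  return toy_hash_create_sized (7);
}

/* Find the slot holding KEY, or the empty slot where it would be placed.  */
static toy_slot_t *
toy_hash_find (toy_hash_t *h, const char *key)
{
  size_t i = 5381;
  const char *p;

  for (p = key; *p != '\0'; p++)
    i = i * 33 + (unsigned char) *p;	/* Simple multiplicative string hash.  */
  i %= h->n_slots;
  while (h->slots[i].key != NULL && strcmp (h->slots[i].key, key) != 0)
    i = (i + 1) % h->n_slots;
  return &h->slots[i];
}

/* Roughly double the table and rehash every entry into it.  */
static int
toy_hash_grow (toy_hash_t *h)
{
  toy_hash_t bigger = { NULL, h->n_slots * 2 + 1, h->n_used };
  size_t i;

  if ((bigger.slots = calloc (bigger.n_slots, sizeof (toy_slot_t))) == NULL)
    return ENOMEM;
  for (i = 0; i < h->n_slots; i++)
    if (h->slots[i].key != NULL)
      *toy_hash_find (&bigger, h->slots[i].key) = h->slots[i];
  free (h->slots);
  *h = bigger;
  return 0;
}

/* Map NAME to TYPE.  Type 0 is invalid; empty names are silently skipped.  */
static int
toy_hash_insert_type (toy_hash_t *h, const char *name, uint32_t type)
{
  toy_slot_t *slot;

  if (type == 0)
    return EINVAL;
  if (name[0] == '\0')
    return 0;
  if ((h->n_used + 1) * 2 > h->n_slots && toy_hash_grow (h) != 0)
    return ENOMEM;
  slot = toy_hash_find (h, name);
  if (slot->key == NULL)
    {
      slot->key = name;
      h->n_used++;
    }
  slot->value = (void *) (uintptr_t) type;	/* The ID lives in the pointer.  */
  return 0;
}

/* Return the type ID for NAME, or 0 (never a valid ID) if it is absent.  */
static uint32_t
toy_hash_lookup_type (toy_hash_t *h, const char *name)
{
  toy_slot_t *slot = toy_hash_find (h, name);

  return slot->key != NULL ? (uint32_t) (uintptr_t) slot->value : 0;
}
```

A reader that already knows how many named types a dict holds could pass
that count to the sized constructor and avoid rehashing while populating
the table; a writer starting from nothing can keep using the default.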

libctf/

	* ctf-create.c (ctf_create): Eliminate ctn_writable.
	(ctf_dtd_insert): Likewise.
	(ctf_dtd_delete): Likewise.
	(ctf_rollback): Likewise.
	(ctf_name_table): Eliminate ctf_names_t.
	* ctf-hash.c (ctf_dynhash_create): Comment update.
	Reimplement in terms of...
	(ctf_dynhash_create_sized): ... this new function.
	(ctf_hash_create): Remove.
	(ctf_hash_size): Remove.
	(ctf_hash_define_type): Remove.
	(ctf_hash_destroy): Remove.
	(ctf_hash_lookup_type): Rename to...
	(ctf_dynhash_lookup_type): ... this.
	(ctf_hash_insert_type): Rename to...
	(ctf_dynhash_insert_type): ... this, moving validation to...
	* ctf-string.c (ctf_strptr_validate): ... this new function.
	* ctf-impl.h (struct ctf_names): Extirpate.
	(struct ctf_lookup.ctl_hash): Now a ctf_dynhash_t.
	(struct ctf_dict): All ctf_names_t fields are now ctf_dynhash_t.
	(ctf_name_table): Now returns a ctf_dynhash_t.
	(ctf_lookup_by_rawhash): Remove.
	(ctf_hash_create): Likewise.
	(ctf_hash_insert_type): Likewise.
	(ctf_hash_define_type): Likewise.
	(ctf_hash_lookup_type): Likewise.
	(ctf_hash_size): Likewise.
	(ctf_hash_destroy): Likewise.
	(ctf_dynhash_create_sized): New.
	(ctf_dynhash_insert_type): New.
	(ctf_dynhash_lookup_type): New.
	(ctf_strptr_validate): New.
	* ctf-lookup.c (ctf_lookup_by_name_internal): Adapt.
	* ctf-open.c (init_types): Adapt.
	(ctf_set_ctl_hashes): Adapt.
	(ctf_dict_close): Adapt.
	* ctf-serialize.c (ctf_serialize): Adapt.
	* ctf-types.c (ctf_lookup_by_rawhash): Remove.
---
 libctf/ctf-create.c    |  26 ++++----
 libctf/ctf-hash.c      | 106 +++++++++++--------------------
 libctf/ctf-impl.h      |  34 +++++-----
 libctf/ctf-lookup.c    |   5 +-
 libctf/ctf-open.c      | 137 +++++++++++++++++++----------------------
 libctf/ctf-serialize.c |   8 +--
 libctf/ctf-string.c    |  23 +++++++
 libctf/ctf-types.c     |  18 +-----
 8 files changed, 156 insertions(+), 201 deletions(-)

diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index 21fbad714a9..240f3dad9ff 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -154,10 +154,10 @@ ctf_create (int *errp)
   if ((fp = ctf_bufopen_internal (&cts, NULL, NULL, NULL, 1, errp)) == NULL)
     goto err_dv;
 
-  fp->ctf_structs.ctn_writable = structs;
-  fp->ctf_unions.ctn_writable = unions;
-  fp->ctf_enums.ctn_writable = enums;
-  fp->ctf_names.ctn_writable = names;
+  fp->ctf_structs = structs;
+  fp->ctf_unions = unions;
+  fp->ctf_enums = enums;
+  fp->ctf_names = names;
   fp->ctf_objthash = objthash;
   fp->ctf_funchash = funchash;
   fp->ctf_dthash = dthash;
@@ -203,19 +203,19 @@ ctf_update (ctf_dict_t *fp)
   return 0;
 }
 
-ctf_names_t *
+ctf_dynhash_t *
 ctf_name_table (ctf_dict_t *fp, int kind)
 {
   switch (kind)
     {
     case CTF_K_STRUCT:
-      return &fp->ctf_structs;
+      return fp->ctf_structs;
     case CTF_K_UNION:
-      return &fp->ctf_unions;
+      return fp->ctf_unions;
     case CTF_K_ENUM:
-      return &fp->ctf_enums;
+      return fp->ctf_enums;
     default:
-      return &fp->ctf_names;
+      return fp->ctf_names;
     }
 }
 
@@ -230,7 +230,7 @@ ctf_dtd_insert (ctf_dict_t *fp, ctf_dtdef_t *dtd, int flag, int kind)
   if (flag == CTF_ADD_ROOT && dtd->dtd_data.ctt_name
       && (name = ctf_strraw (fp, dtd->dtd_data.ctt_name)) != NULL)
     {
-      if (ctf_dynhash_insert (ctf_name_table (fp, kind)->ctn_writable,
+      if (ctf_dynhash_insert (ctf_name_table (fp, kind),
 			      (char *) name, (void *) (uintptr_t)
 			      dtd->dtd_type) < 0)
 	{
@@ -287,8 +287,7 @@ ctf_dtd_delete (ctf_dict_t *fp, ctf_dtdef_t *dtd)
       && (name = ctf_strraw (fp, dtd->dtd_data.ctt_name)) != NULL
       && LCTF_INFO_ISROOT (fp, dtd->dtd_data.ctt_info))
     {
-      ctf_dynhash_remove (ctf_name_table (fp, name_kind)->ctn_writable,
-			  name);
+      ctf_dynhash_remove (ctf_name_table (fp, name_kind), name);
       ctf_str_remove_ref (fp, name, &dtd->dtd_data.ctt_name);
     }
 
@@ -410,8 +409,7 @@ ctf_rollback (ctf_dict_t *fp, ctf_snapshot_id_t id)
 	  && (name = ctf_strraw (fp, dtd->dtd_data.ctt_name)) != NULL
 	  && LCTF_INFO_ISROOT (fp, dtd->dtd_data.ctt_info))
 	{
-	  ctf_dynhash_remove (ctf_name_table (fp, kind)->ctn_writable,
-			      name);
+	  ctf_dynhash_remove (ctf_name_table (fp, kind), name);
 	  ctf_str_remove_ref (fp, name, &dtd->dtd_data.ctt_name);
 	}
 
diff --git a/libctf/ctf-hash.c b/libctf/ctf-hash.c
index 1bc539e3617..f8032ae4d86 100644
--- a/libctf/ctf-hash.c
+++ b/libctf/ctf-hash.c
@@ -22,25 +22,22 @@
 #include "libiberty.h"
 #include "hashtab.h"
 
-/* We have three hashtable implementations:
-
-   - ctf_hash_* is an interface to a fixed-size hash from const char * ->
-     ctf_id_t with number of elements specified at creation time, that should
-     support addition of items but need not support removal.
+/* We have two hashtable implementations:
 
    - ctf_dynhash_* is an interface to a dynamically-expanding hash with
-     unknown size that should support addition of large numbers of items, and
-     removal as well, and is used only at type-insertion time and during
-     linking.
+     unknown size that should support addition of large numbers of items,
+     and removal as well, and is used only at type-insertion time and during
+     linking.  It can be constructed with an expected initial number of
+     elements, but need not be.
 
    - ctf_dynset_* is an interface to a dynamically-expanding hash that contains
      only keys: no values.
 
    These can be implemented by the same underlying hashmap if you wish.  */
 
-/* The helem is used for general key/value mappings in both the ctf_hash and
-   ctf_dynhash: the owner may not have space allocated for it, and will be
-   garbage (not NULL!) in that case.  */
+/* The helem is used for general key/value mappings in the ctf_dynhash: the
+   owner may not have space allocated for it, and will be garbage (not
+   NULL!) in that case.  */
 
 typedef struct ctf_helem
 {
@@ -157,8 +154,9 @@ ctf_dynhash_item_free (void *item)
 }
 
 ctf_dynhash_t *
-ctf_dynhash_create (ctf_hash_fun hash_fun, ctf_hash_eq_fun eq_fun,
-                    ctf_hash_free_fun key_free, ctf_hash_free_fun value_free)
+ctf_dynhash_create_sized (unsigned long nelems, ctf_hash_fun hash_fun,
+			  ctf_hash_eq_fun eq_fun, ctf_hash_free_fun key_free,
+			  ctf_hash_free_fun value_free)
 {
   ctf_dynhash_t *dynhash;
   htab_del del = ctf_dynhash_item_free;
@@ -173,8 +171,7 @@ ctf_dynhash_create (ctf_hash_fun hash_fun, ctf_hash_eq_fun eq_fun,
   if (key_free == NULL && value_free == NULL)
     del = free;
 
-  /* 7 is arbitrary and untested for now.  */
-  if ((dynhash->htab = htab_create_alloc (7, (htab_hash) hash_fun, eq_fun,
+  if ((dynhash->htab = htab_create_alloc (nelems, (htab_hash) hash_fun, eq_fun,
 					  del, xcalloc, free)) == NULL)
     {
       free (dynhash);
@@ -190,6 +187,15 @@ ctf_dynhash_create (ctf_hash_fun hash_fun, ctf_hash_eq_fun eq_fun,
   return dynhash;
 }
 
+ctf_dynhash_t *
+ctf_dynhash_create (ctf_hash_fun hash_fun, ctf_hash_eq_fun eq_fun,
+		    ctf_hash_free_fun key_free, ctf_hash_free_fun value_free)
+{
+  /* 7 is arbitrary and not benchmarked yet.  */
+
+  return ctf_dynhash_create_sized (7, hash_fun, eq_fun, key_free, value_free);
+}
+
 static ctf_helem_t **
 ctf_hashtab_lookup (struct htab *htab, const void *key, enum insert_option insert)
 {
@@ -767,80 +773,38 @@ ctf_dynset_next (ctf_dynset_t *hp, ctf_next_t **it, void **key)
   return ECTF_NEXT_END;
 }
 
-/* ctf_hash, used for fixed-size maps from const char * -> ctf_id_t without
-   removal.  This is a straight cast of a hashtab.  */
-
-ctf_hash_t *
-ctf_hash_create (unsigned long nelems, ctf_hash_fun hash_fun,
-		 ctf_hash_eq_fun eq_fun)
-{
-  return (ctf_hash_t *) htab_create_alloc (nelems, (htab_hash) hash_fun,
-					   eq_fun, free, xcalloc, free);
-}
-
-uint32_t
-ctf_hash_size (const ctf_hash_t *hp)
-{
-  return htab_elements ((struct htab *) hp);
-}
+/* Helper functions for insertion/removal of types.  */
 
 int
-ctf_hash_insert_type (ctf_hash_t *hp, ctf_dict_t *fp, uint32_t type,
-		      uint32_t name)
+ctf_dynhash_insert_type (ctf_dict_t *fp, ctf_dynhash_t *hp, uint32_t type,
+			 uint32_t name)
 {
-  const char *str = ctf_strraw (fp, name);
+  const char *str;
+  int err;
 
   if (type == 0)
     return EINVAL;
 
-  if (str == NULL
-      && CTF_NAME_STID (name) == CTF_STRTAB_1
-      && fp->ctf_syn_ext_strtab == NULL
-      && fp->ctf_str[CTF_NAME_STID (name)].cts_strs == NULL)
-    return ECTF_STRTAB;
-
-  if (str == NULL)
-    return ECTF_BADNAME;
+  if ((str = ctf_strptr_validate (fp, name)) == NULL)
+    return ctf_errno (fp);
 
   if (str[0] == '\0')
     return 0;		   /* Just ignore empty strings on behalf of caller.  */
 
-  if (ctf_hashtab_insert ((struct htab *) hp, (char *) str,
-			  (void *) (ptrdiff_t) type, NULL, NULL) != NULL)
+  if ((err = ctf_dynhash_insert (hp, (char *) str,
+				 (void *) (ptrdiff_t) type)) == 0)
     return 0;
-  return errno;
-}
 
-/* if the key is already in the hash, override the previous definition with
-   this new official definition. If the key is not present, then call
-   ctf_hash_insert_type and hash it in.  */
-int
-ctf_hash_define_type (ctf_hash_t *hp, ctf_dict_t *fp, uint32_t type,
-                      uint32_t name)
-{
-  /* This matches the semantics of ctf_hash_insert_type in this
-     implementation anyway.  */
-
-  return ctf_hash_insert_type (hp, fp, type, name);
+  return err;
 }
 
 ctf_id_t
-ctf_hash_lookup_type (ctf_hash_t *hp, ctf_dict_t *fp __attribute__ ((__unused__)),
-		      const char *key)
+ctf_dynhash_lookup_type (ctf_dynhash_t *hp, const char *key)
 {
-  ctf_helem_t **slot;
+  void *value;
 
-  slot = ctf_hashtab_lookup ((struct htab *) hp, key, NO_INSERT);
-
-  if (slot)
-    return (ctf_id_t) (uintptr_t) ((*slot)->value);
+  if (ctf_dynhash_lookup_kv (hp, key, NULL, &value))
+    return (ctf_id_t) (uintptr_t) value;
 
   return 0;
 }
-
-void
-ctf_hash_destroy (ctf_hash_t *hp)
-{
-  if (hp != NULL)
-    htab_delete ((struct htab *) hp);
-}
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index 0fe9c20127a..8cbb2ae8242 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -123,17 +123,11 @@ typedef struct ctf_dmodel
   size_t ctd_long;		/* Size of long in bytes.  */
 } ctf_dmodel_t;
 
-typedef struct ctf_names
-{
-  ctf_hash_t *ctn_readonly;	/* Hash table when readonly.  */
-  ctf_dynhash_t *ctn_writable;	/* Hash table when writable.  */
-} ctf_names_t;
-
 typedef struct ctf_lookup
 {
   const char *ctl_prefix;	/* String prefix for this lookup.  */
   size_t ctl_len;		/* Length of prefix string in bytes.  */
-  ctf_names_t *ctl_hash;	/* Pointer to hash table for lookup.  */
+  ctf_dynhash_t *ctl_hash;	/* Pointer to hash table for lookup.  */
 } ctf_lookup_t;
 
 typedef struct ctf_dictops
@@ -382,10 +376,10 @@ struct ctf_dict
   ctf_dynhash_t *ctf_syn_ext_strtab; /* Maps ext-strtab offsets to names.  */
   void *ctf_data_mmapped;	    /* CTF data we mmapped, to free later.  */
   size_t ctf_data_mmapped_len;	    /* Length of CTF data we mmapped.  */
-  ctf_names_t ctf_structs;	    /* Hash table of struct types.  */
-  ctf_names_t ctf_unions;	    /* Hash table of union types.  */
-  ctf_names_t ctf_enums;	    /* Hash table of enum types.  */
-  ctf_names_t ctf_names;	    /* Hash table of remaining type names.  */
+  ctf_dynhash_t *ctf_structs;	    /* Hash table of struct types.  */
+  ctf_dynhash_t *ctf_unions;	    /* Hash table of union types.  */
+  ctf_dynhash_t *ctf_enums;	    /* Hash table of enum types.  */
+  ctf_dynhash_t *ctf_names;	    /* Hash table of remaining type names.  */
   ctf_lookup_t ctf_lookups[5];	    /* Pointers to nametabs for name lookup.  */
   ctf_strs_t ctf_str[2];	    /* Array of string table base and bounds.  */
   ctf_dynhash_t *ctf_str_atoms;	    /* Hash table of ctf_str_atoms_t.  */
@@ -597,10 +591,9 @@ struct ctf_next
 #define LCTF_DIRTY	0x0004	/* CTF dict has been modified.  */
 #define LCTF_LINKING	0x0008  /* CTF link is underway: respect ctf_link_flags.  */
 
-extern ctf_names_t *ctf_name_table (ctf_dict_t *, int);
+extern ctf_dynhash_t *ctf_name_table (ctf_dict_t *, int);
 extern const ctf_type_t *ctf_lookup_by_id (ctf_dict_t **, ctf_id_t);
 extern ctf_id_t ctf_lookup_by_rawname (ctf_dict_t *, int, const char *);
-extern ctf_id_t ctf_lookup_by_rawhash (ctf_dict_t *, ctf_names_t *, const char *);
 extern void ctf_set_ctl_hashes (ctf_dict_t *);
 
 extern int ctf_symtab_skippable (ctf_link_sym_t *sym);
@@ -629,19 +622,19 @@ typedef int (*ctf_hash_iter_find_f) (void *key, void *value, void *arg);
 typedef int (*ctf_hash_sort_f) (const ctf_next_hkv_t *, const ctf_next_hkv_t *,
 				void *arg);
 
-extern ctf_hash_t *ctf_hash_create (unsigned long, ctf_hash_fun, ctf_hash_eq_fun);
-extern int ctf_hash_insert_type (ctf_hash_t *, ctf_dict_t *, uint32_t, uint32_t);
-extern int ctf_hash_define_type (ctf_hash_t *, ctf_dict_t *, uint32_t, uint32_t);
-extern ctf_id_t ctf_hash_lookup_type (ctf_hash_t *, ctf_dict_t *, const char *);
-extern uint32_t ctf_hash_size (const ctf_hash_t *);
-extern void ctf_hash_destroy (ctf_hash_t *);
-
 extern ctf_dynhash_t *ctf_dynhash_create (ctf_hash_fun, ctf_hash_eq_fun,
 					  ctf_hash_free_fun, ctf_hash_free_fun);
+extern ctf_dynhash_t *ctf_dynhash_create_sized (unsigned long, ctf_hash_fun,
+						ctf_hash_eq_fun,
+						ctf_hash_free_fun,
+						ctf_hash_free_fun);
+
 extern int ctf_dynhash_insert (ctf_dynhash_t *, void *, void *);
 extern void ctf_dynhash_remove (ctf_dynhash_t *, const void *);
 extern size_t ctf_dynhash_elements (ctf_dynhash_t *);
 extern void ctf_dynhash_empty (ctf_dynhash_t *);
+extern int ctf_dynhash_insert_type (ctf_dict_t *, ctf_dynhash_t *, uint32_t, uint32_t);
+extern ctf_id_t ctf_dynhash_lookup_type (ctf_dynhash_t *, const char *);
 extern void *ctf_dynhash_lookup (ctf_dynhash_t *, const void *);
 extern int ctf_dynhash_lookup_kv (ctf_dynhash_t *, const void *key,
 				  const void **orig_key, void **value);
@@ -720,6 +713,7 @@ extern const char *ctf_strptr (ctf_dict_t *, uint32_t);
 extern const char *ctf_strraw (ctf_dict_t *, uint32_t);
 extern const char *ctf_strraw_explicit (ctf_dict_t *, uint32_t,
 					ctf_strs_t *);
+extern const char *ctf_strptr_validate (ctf_dict_t *, uint32_t);
 extern int ctf_str_create_atoms (ctf_dict_t *);
 extern void ctf_str_free_atoms (ctf_dict_t *);
 extern uint32_t ctf_str_add (ctf_dict_t *, const char *);
diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
index 9e736a8659c..b5d2637fe01 100644
--- a/libctf/ctf-lookup.c
+++ b/libctf/ctf-lookup.c
@@ -276,8 +276,9 @@ ctf_lookup_by_name_internal (ctf_dict_t *fp, ctf_dict_t *child,
 		    return ctf_set_typed_errno (fp, ENOMEM);
 		}
 
-	      if ((type = ctf_lookup_by_rawhash (fp, lp->ctl_hash,
-						 fp->ctf_tmp_typeslice)) == 0)
+	      if ((type = (ctf_id_t) (uintptr_t)
+		   ctf_dynhash_lookup (lp->ctl_hash,
+				       fp->ctf_tmp_typeslice)) == 0)
 		goto notype;
 
 	      break;
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index cf0cb54720d..2945228ff2a 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -740,33 +740,33 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   /* Now that we've counted up the number of each type, we can allocate
      the hash tables, type translation table, and pointer table.  */
 
-  if ((fp->ctf_structs.ctn_readonly
-       = ctf_hash_create (pop[CTF_K_STRUCT], ctf_hash_string,
-			  ctf_hash_eq_string)) == NULL)
+  if ((fp->ctf_structs
+       = ctf_dynhash_create_sized (pop[CTF_K_STRUCT], ctf_hash_string,
+				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
-  if ((fp->ctf_unions.ctn_readonly
-       = ctf_hash_create (pop[CTF_K_UNION], ctf_hash_string,
-			  ctf_hash_eq_string)) == NULL)
+  if ((fp->ctf_unions
+       = ctf_dynhash_create_sized (pop[CTF_K_UNION], ctf_hash_string,
+				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
-  if ((fp->ctf_enums.ctn_readonly
-       = ctf_hash_create (pop[CTF_K_ENUM], ctf_hash_string,
-			  ctf_hash_eq_string)) == NULL)
+  if ((fp->ctf_enums
+       = ctf_dynhash_create_sized (pop[CTF_K_ENUM], ctf_hash_string,
+				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
-  if ((fp->ctf_names.ctn_readonly
-       = ctf_hash_create (pop[CTF_K_UNKNOWN] +
-			  pop[CTF_K_INTEGER] +
-			  pop[CTF_K_FLOAT] +
-			  pop[CTF_K_FUNCTION] +
-			  pop[CTF_K_TYPEDEF] +
-			  pop[CTF_K_POINTER] +
-			  pop[CTF_K_VOLATILE] +
-			  pop[CTF_K_CONST] +
-			  pop[CTF_K_RESTRICT],
-			  ctf_hash_string,
-			  ctf_hash_eq_string)) == NULL)
+  if ((fp->ctf_names
+       = ctf_dynhash_create_sized (pop[CTF_K_UNKNOWN] +
+				   pop[CTF_K_INTEGER] +
+				   pop[CTF_K_FLOAT] +
+				   pop[CTF_K_FUNCTION] +
+				   pop[CTF_K_TYPEDEF] +
+				   pop[CTF_K_POINTER] +
+				   pop[CTF_K_VOLATILE] +
+				   pop[CTF_K_CONST] +
+				   pop[CTF_K_RESTRICT],
+				   ctf_hash_string,
+				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
   fp->ctf_txlate = malloc (sizeof (uint32_t) * (fp->ctf_typemax + 1));
@@ -810,11 +810,10 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 	     root-visible version so that we can be sure to find it when
 	     checking for conflicting definitions in ctf_add_type().  */
 
-	  if (((ctf_hash_lookup_type (fp->ctf_names.ctn_readonly,
-				      fp, name)) == 0)
+	  if (((ctf_dynhash_lookup_type (fp->ctf_names, name)) == 0)
 	      || isroot)
 	    {
-	      err = ctf_hash_define_type (fp->ctf_names.ctn_readonly, fp,
+	      err = ctf_dynhash_insert_type (fp, fp->ctf_names,
 					  LCTF_INDEX_TO_TYPE (fp, id, child),
 					  tp->ctt_name);
 	      if (err != 0)
@@ -832,9 +831,9 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 	  if (!isroot)
 	    break;
 
-	  err = ctf_hash_insert_type (fp->ctf_names.ctn_readonly, fp,
-				      LCTF_INDEX_TO_TYPE (fp, id, child),
-				      tp->ctt_name);
+	  err = ctf_dynhash_insert_type (fp, fp->ctf_names,
+					 LCTF_INDEX_TO_TYPE (fp, id, child),
+					 tp->ctt_name);
 	  if (err != 0)
 	    return err;
 	  break;
@@ -846,9 +845,9 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 	  if (!isroot)
 	    break;
 
-	  err = ctf_hash_define_type (fp->ctf_structs.ctn_readonly, fp,
-				      LCTF_INDEX_TO_TYPE (fp, id, child),
-				      tp->ctt_name);
+	  err = ctf_dynhash_insert_type (fp, fp->ctf_structs,
+					 LCTF_INDEX_TO_TYPE (fp, id, child),
+					 tp->ctt_name);
 
 	  if (err != 0)
 	    return err;
@@ -862,9 +861,9 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 	  if (!isroot)
 	    break;
 
-	  err = ctf_hash_define_type (fp->ctf_unions.ctn_readonly, fp,
-				      LCTF_INDEX_TO_TYPE (fp, id, child),
-				      tp->ctt_name);
+	  err = ctf_dynhash_insert_type (fp, fp->ctf_unions,
+					 LCTF_INDEX_TO_TYPE (fp, id, child),
+					 tp->ctt_name);
 
 	  if (err != 0)
 	    return err;
@@ -874,9 +873,9 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 	  if (!isroot)
 	    break;
 
-	  err = ctf_hash_define_type (fp->ctf_enums.ctn_readonly, fp,
-				      LCTF_INDEX_TO_TYPE (fp, id, child),
-				      tp->ctt_name);
+	  err = ctf_dynhash_insert_type (fp, fp->ctf_enums,
+					 LCTF_INDEX_TO_TYPE (fp, id, child),
+					 tp->ctt_name);
 
 	  if (err != 0)
 	    return err;
@@ -886,27 +885,26 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 	  if (!isroot)
 	    break;
 
-	  err = ctf_hash_insert_type (fp->ctf_names.ctn_readonly, fp,
-				      LCTF_INDEX_TO_TYPE (fp, id, child),
-				      tp->ctt_name);
+	  err = ctf_dynhash_insert_type (fp, fp->ctf_names,
+					 LCTF_INDEX_TO_TYPE (fp, id, child),
+					 tp->ctt_name);
 	  if (err != 0)
 	    return err;
 	  break;
 
 	case CTF_K_FORWARD:
 	  {
-	    ctf_names_t *np = ctf_name_table (fp, tp->ctt_type);
+	    ctf_dynhash_t *h = ctf_name_table (fp, tp->ctt_type);
 
 	    if (!isroot)
 	      break;
 
 	    /* Only insert forward tags into the given hash if the type or tag
 	       name is not already present.  */
-	    if (ctf_hash_lookup_type (np->ctn_readonly, fp, name) == 0)
+	    if (ctf_dynhash_lookup_type (h, name) == 0)
 	      {
-		err = ctf_hash_insert_type (np->ctn_readonly, fp,
-					    LCTF_INDEX_TO_TYPE (fp, id, child),
-					    tp->ctt_name);
+		err = ctf_dynhash_insert_type (fp, h, LCTF_INDEX_TO_TYPE (fp, id, child),
+					       tp->ctt_name);
 		if (err != 0)
 		  return err;
 	      }
@@ -929,15 +927,15 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 	  if (!isroot)
 	    break;
 
-	  err = ctf_hash_insert_type (fp->ctf_names.ctn_readonly, fp,
-				      LCTF_INDEX_TO_TYPE (fp, id, child),
-				      tp->ctt_name);
+	  err = ctf_dynhash_insert_type (fp, fp->ctf_names,
+					 LCTF_INDEX_TO_TYPE (fp, id, child),
+					 tp->ctt_name);
 	  if (err != 0)
 	    return err;
 	  break;
 	default:
 	  ctf_err_warn (fp, 0, ECTF_CORRUPT,
-			_("init_types(): unhandled CTF kind: %x"), kind);
+			_("init_static_types(): unhandled CTF kind: %x"), kind);
 	  return ECTF_CORRUPT;
 	}
 
@@ -946,14 +944,14 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
     }
 
   ctf_dprintf ("%lu total types processed\n", fp->ctf_typemax);
-  ctf_dprintf ("%u enum names hashed\n",
-	       ctf_hash_size (fp->ctf_enums.ctn_readonly));
-  ctf_dprintf ("%u struct names hashed (%d long)\n",
-	       ctf_hash_size (fp->ctf_structs.ctn_readonly), nlstructs);
-  ctf_dprintf ("%u union names hashed (%d long)\n",
-	       ctf_hash_size (fp->ctf_unions.ctn_readonly), nlunions);
-  ctf_dprintf ("%u base type names hashed\n",
-	       ctf_hash_size (fp->ctf_names.ctn_readonly));
+  ctf_dprintf ("%zu enum names hashed\n",
+	       ctf_dynhash_elements (fp->ctf_enums));
+  ctf_dprintf ("%zu struct names hashed (%d long)\n",
+	       ctf_dynhash_elements (fp->ctf_structs), nlstructs);
+  ctf_dprintf ("%zu union names hashed (%d long)\n",
+	       ctf_dynhash_elements (fp->ctf_unions), nlunions);
+  ctf_dprintf ("%zu base type names hashed\n",
+	       ctf_dynhash_elements (fp->ctf_names));
 
   return 0;
 }
@@ -1235,16 +1233,16 @@ void ctf_set_ctl_hashes (ctf_dict_t *fp)
      array of type name prefixes and the corresponding ctf_hash to use.  */
   fp->ctf_lookups[0].ctl_prefix = "struct";
   fp->ctf_lookups[0].ctl_len = strlen (fp->ctf_lookups[0].ctl_prefix);
-  fp->ctf_lookups[0].ctl_hash = &fp->ctf_structs;
+  fp->ctf_lookups[0].ctl_hash = fp->ctf_structs;
   fp->ctf_lookups[1].ctl_prefix = "union";
   fp->ctf_lookups[1].ctl_len = strlen (fp->ctf_lookups[1].ctl_prefix);
-  fp->ctf_lookups[1].ctl_hash = &fp->ctf_unions;
+  fp->ctf_lookups[1].ctl_hash = fp->ctf_unions;
   fp->ctf_lookups[2].ctl_prefix = "enum";
   fp->ctf_lookups[2].ctl_len = strlen (fp->ctf_lookups[2].ctl_prefix);
-  fp->ctf_lookups[2].ctl_hash = &fp->ctf_enums;
+  fp->ctf_lookups[2].ctl_hash = fp->ctf_enums;
   fp->ctf_lookups[3].ctl_prefix = _CTF_NULLSTR;
   fp->ctf_lookups[3].ctl_len = strlen (fp->ctf_lookups[3].ctl_prefix);
-  fp->ctf_lookups[3].ctl_hash = &fp->ctf_names;
+  fp->ctf_lookups[3].ctl_hash = fp->ctf_names;
   fp->ctf_lookups[4].ctl_prefix = NULL;
   fp->ctf_lookups[4].ctl_len = 0;
   fp->ctf_lookups[4].ctl_hash = NULL;
@@ -1764,20 +1762,11 @@ ctf_dict_close (ctf_dict_t *fp)
       ctf_dtd_delete (fp, dtd);
     }
   ctf_dynhash_destroy (fp->ctf_dthash);
-  if (fp->ctf_flags & LCTF_RDWR)
-    {
-      ctf_dynhash_destroy (fp->ctf_structs.ctn_writable);
-      ctf_dynhash_destroy (fp->ctf_unions.ctn_writable);
-      ctf_dynhash_destroy (fp->ctf_enums.ctn_writable);
-      ctf_dynhash_destroy (fp->ctf_names.ctn_writable);
-    }
-  else
-    {
-      ctf_hash_destroy (fp->ctf_structs.ctn_readonly);
-      ctf_hash_destroy (fp->ctf_unions.ctn_readonly);
-      ctf_hash_destroy (fp->ctf_enums.ctn_readonly);
-      ctf_hash_destroy (fp->ctf_names.ctn_readonly);
-    }
+
+  ctf_dynhash_destroy (fp->ctf_structs);
+  ctf_dynhash_destroy (fp->ctf_unions);
+  ctf_dynhash_destroy (fp->ctf_enums);
+  ctf_dynhash_destroy (fp->ctf_names);
 
   for (dvd = ctf_list_next (&fp->ctf_dvdefs); dvd != NULL; dvd = nvd)
     {
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 11cbe75601e..511c5116140 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -1203,10 +1203,10 @@ ctf_serialize (ctf_dict_t *fp)
   memset (fp->ctf_lookups, 0, sizeof (fp->ctf_lookups));
   memset (&fp->ctf_in_flight_dynsyms, 0, sizeof (fp->ctf_in_flight_dynsyms));
   memset (&fp->ctf_dedup, 0, sizeof (fp->ctf_dedup));
-  fp->ctf_structs.ctn_writable = NULL;
-  fp->ctf_unions.ctn_writable = NULL;
-  fp->ctf_enums.ctn_writable = NULL;
-  fp->ctf_names.ctn_writable = NULL;
+  fp->ctf_structs = NULL;
+  fp->ctf_unions = NULL;
+  fp->ctf_enums = NULL;
+  fp->ctf_names = NULL;
 
   memcpy (&ofp, fp, sizeof (ctf_dict_t));
   memcpy (fp, nfp, sizeof (ctf_dict_t));
diff --git a/libctf/ctf-string.c b/libctf/ctf-string.c
index 58ebcd9d785..63dced02e2f 100644
--- a/libctf/ctf-string.c
+++ b/libctf/ctf-string.c
@@ -73,6 +73,29 @@ ctf_strptr (ctf_dict_t *fp, uint32_t name)
   return (s != NULL ? s : "(?)");
 }
 
+/* As above, but return info on what is wrong in more detail.
+   (Used for type lookups.) */
+
+const char *
+ctf_strptr_validate (ctf_dict_t *fp, uint32_t name)
+{
+  const char *str = ctf_strraw (fp, name);
+
+  if (str == NULL)
+    {
+      if (CTF_NAME_STID (name) == CTF_STRTAB_1
+	  && fp->ctf_syn_ext_strtab == NULL
+	  && fp->ctf_str[CTF_NAME_STID (name)].cts_strs == NULL) {
+	ctf_set_errno (fp, ECTF_STRTAB);
+	return NULL;
+      }
+
+      ctf_set_errno (fp, ECTF_BADNAME);
+      return NULL;
+    }
+  return str;
+}
+
 /* Remove all refs to a given atom.  */
 static void
 ctf_str_purge_atom_refs (ctf_str_atom_t *atom)
diff --git a/libctf/ctf-types.c b/libctf/ctf-types.c
index 0eaafa13619..10bb6d1596a 100644
--- a/libctf/ctf-types.c
+++ b/libctf/ctf-types.c
@@ -635,22 +635,8 @@ ctf_get_dict (ctf_dict_t *fp, ctf_id_t type)
 
 ctf_id_t ctf_lookup_by_rawname (ctf_dict_t *fp, int kind, const char *name)
 {
-  return ctf_lookup_by_rawhash (fp, ctf_name_table (fp, kind), name);
-}
-
-/* Look up a name in the given name table, in the appropriate hash given the
-   readability state of the dictionary.  The name is a raw, undecorated
-   identifier.  */
-
-ctf_id_t ctf_lookup_by_rawhash (ctf_dict_t *fp, ctf_names_t *np, const char *name)
-{
-  ctf_id_t id;
-
-  if (fp->ctf_flags & LCTF_RDWR)
-    id = (ctf_id_t) (uintptr_t) ctf_dynhash_lookup (np->ctn_writable, name);
-  else
-    id = ctf_hash_lookup_type (np->ctn_readonly, fp, name);
-  return id;
+  return (ctf_id_t) (uintptr_t)
+    ctf_dynhash_lookup (ctf_name_table (fp, kind), name);
 }
 
 /* Lookup the given type ID and return its name as a new dynamically-allocated
-- 
2.44.0.273.ge0bd14271f



* [PATCH 04/22] libctf: fix name lookup in dicts containing base-type bitfields
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (2 preceding siblings ...)
  2024-04-17 20:19 ` [PATCH 03/22] libctf: remove static/dynamic name lookup distinction Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 05/22] libctf: support addition of types to dicts read via ctf_open() Nick Alcock
                   ` (18 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

The intent of the name lookup code was for lookups to yield non-bitfield
basic types, returning a bitfield type only if no non-bitfield type with
that name exists.  Unfortunately, the code as written only does this if
the base type has a type ID higher than all bitfield types, which is most
unlikely (the opposite is almost always the case).

Adjust it so that what ends up in the name table is the highest-width
zero-offset type with a given name, if any such exist, and failing that
the first type with that name we see, no matter its offset.  (We don't
define *which* bitfield type you get, after all, so we might as well
just stuff in the first we find.)
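
As an illustration only (this is not the libctf code itself), the
replacement rule can be modelled in standalone C.  The names struct enc,
replaces_existing and pick_named_type are invented for this sketch; only
the offset and bit-width comparison mirrors the rule described above.

```c
#include <stddef.h>

/* Simplified stand-in for an integer encoding: width and offset only.  */
struct enc
{
  unsigned bits;
  unsigned offset;
};

/* Should CANDIDATE take over the name-table slot held by EXISTING?
   EXISTING is NULL if no type with this name has been seen yet.  */
int
replaces_existing (const struct enc *existing, const struct enc *candidate)
{
  if (existing == NULL)
    return 1;			/* First type seen with this name.  */

  /* Only a zero-offset type can displace an earlier one, and only if the
     earlier one has a nonzero offset or is narrower.  */
  return candidate->offset == 0
    && (existing->offset != 0 || existing->bits < candidate->bits);
}

/* Walk same-named types in type-ID order; return the index of the one
   that ends up in the name table.  */
size_t
pick_named_type (const struct enc *types, size_t n)
{
  const struct enc *existing = NULL;
  size_t chosen = 0;

  for (size_t i = 0; i < n; i++)
    if (replaces_existing (existing, &types[i]))
      {
	existing = &types[i];
	chosen = i;
      }
  return chosen;
}
```

Fed the encodings {3 bits, offset 1}, {32 bits, offset 0}, {5 bits,
offset 0} in that order, the model settles on the second entry: the
widest zero-offset type wins wherever it sits among the bitfields.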

Reported by Stephen Brennan <stephen.brennan@oracle.com>.

libctf/

	* ctf-open.c (init_types): Modify to allow some lookups during open;
	detect bitfield name reuse and prefer less bitfieldy types.
	* testsuite/libctf-writable/libctf-bitfield-name-lookup.*: New test.
---
 libctf/ctf-open.c                             |  73 ++++++----
 .../libctf-bitfield-name-lookup.c             | 136 ++++++++++++++++++
 .../libctf-bitfield-name-lookup.lk            |   1 +
 3 files changed, 186 insertions(+), 24 deletions(-)
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk

diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index 2945228ff2a..87b0f74367a 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -685,6 +685,7 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   const ctf_type_t *tp;
   uint32_t id;
   uint32_t *xp;
+  unsigned long typemax = 0;
 
   /* We determine whether the dict is a child or a parent based on the value of
      cth_parname.  */
@@ -708,7 +709,7 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   /* We make two passes through the entire type section.  In this first
      pass, we count the number of each type and the total number of types.  */
 
-  for (tp = tbuf; tp < tend; fp->ctf_typemax++)
+  for (tp = tbuf; tp < tend; typemax++)
     {
       unsigned short kind = LCTF_INFO_KIND (fp, tp->ctt_info);
       unsigned long vlen = LCTF_INFO_VLEN (fp, tp->ctt_info);
@@ -769,8 +770,8 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
-  fp->ctf_txlate = malloc (sizeof (uint32_t) * (fp->ctf_typemax + 1));
-  fp->ctf_ptrtab_len = fp->ctf_typemax + 1;
+  fp->ctf_txlate = malloc (sizeof (uint32_t) * (typemax + 1));
+  fp->ctf_ptrtab_len = typemax + 1;
   fp->ctf_ptrtab = malloc (sizeof (uint32_t) * fp->ctf_ptrtab_len);
 
   if (fp->ctf_txlate == NULL || fp->ctf_ptrtab == NULL)
@@ -779,13 +780,17 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   xp = fp->ctf_txlate;
   *xp++ = 0;			/* Type id 0 is used as a sentinel value.  */
 
-  memset (fp->ctf_txlate, 0, sizeof (uint32_t) * (fp->ctf_typemax + 1));
-  memset (fp->ctf_ptrtab, 0, sizeof (uint32_t) * (fp->ctf_typemax + 1));
+  memset (fp->ctf_txlate, 0, sizeof (uint32_t) * (typemax + 1));
+  memset (fp->ctf_ptrtab, 0, sizeof (uint32_t) * (typemax + 1));
 
   /* In the second pass through the types, we fill in each entry of the
-     type and pointer tables and add names to the appropriate hashes.  */
+     type and pointer tables and add names to the appropriate hashes.
 
-  for (id = 1, tp = tbuf; tp < tend; xp++, id++)
+     Bump ctf_typemax as we go, but keep it one higher than normal, so that
+     the type being read in is considered a valid type and it is at least
+     barely possible to run simple lookups on it.  */
+
+  for (id = 1, fp->ctf_typemax = 1, tp = tbuf; tp < tend; xp++, id++, fp->ctf_typemax++)
     {
       unsigned short kind = LCTF_INFO_KIND (fp, tp->ctt_info);
       unsigned short isroot = LCTF_INFO_ISROOT (fp, tp->ctt_info);
@@ -799,27 +804,47 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
       /* Cannot fail: shielded by call in loop above.  */
       vbytes = LCTF_VBYTES (fp, kind, size, vlen);
 
+      *xp = (uint32_t) ((uintptr_t) tp - (uintptr_t) fp->ctf_buf);
+
       switch (kind)
 	{
 	case CTF_K_UNKNOWN:
 	case CTF_K_INTEGER:
 	case CTF_K_FLOAT:
-	  /* Names are reused by bit-fields, which are differentiated by their
-	     encodings, and so typically we'd record only the first instance of
-	     a given intrinsic.  However, we replace an existing type with a
-	     root-visible version so that we can be sure to find it when
-	     checking for conflicting definitions in ctf_add_type().  */
+	  {
+	    ctf_id_t existing;
+	    ctf_encoding_t existing_en;
+	    ctf_encoding_t this_en;
 
-	  if (((ctf_dynhash_lookup_type (fp->ctf_names, name)) == 0)
-	      || isroot)
-	    {
-	      err = ctf_dynhash_insert_type (fp, fp->ctf_names,
-					  LCTF_INDEX_TO_TYPE (fp, id, child),
-					  tp->ctt_name);
-	      if (err != 0)
-		return err;
-	    }
-	  break;
+	    if (!isroot)
+	      break;
+
+	    /* Names are reused by bitfields, which are differentiated by
+	       their encodings.  So check for the type already existing, and
+	       iff the new type is a root-visible non-bitfield, replace the
+	       old one.  It's a little hard to figure out whether a type is
+	       a non-bitfield without already knowing that type's native
+	       width, but we can converge on it by replacing an existing
+	       type as long as the new type is zero-offset and has a
+	       bit-width wider than the existing one, since the native type
+	       must necessarily have a bit-width at least as wide as any
+	       bitfield based on it. */
+
+	    if (((existing = ctf_dynhash_lookup_type (fp->ctf_names, name)) == 0)
+		|| ctf_type_encoding (fp, existing, &existing_en) != 0
+		|| (ctf_type_encoding (fp, LCTF_INDEX_TO_TYPE (fp, id, child), &this_en) == 0
+		    && this_en.cte_offset == 0
+		    && (existing_en.cte_offset != 0
+			|| existing_en.cte_bits < this_en.cte_bits)))
+	      {
+		err = ctf_dynhash_insert_type (fp, fp->ctf_names,
+					       LCTF_INDEX_TO_TYPE (fp, id, child),
+					       tp->ctt_name);
+		if (err != 0)
+		  return err;
+	      }
+	    break;
+	  }
 
 	  /* These kinds have no name, so do not need interning into any
 	     hashtables.  */
@@ -938,10 +963,10 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 			_("init_static_types(): unhandled CTF kind: %x"), kind);
 	  return ECTF_CORRUPT;
 	}
-
-      *xp = (uint32_t) ((uintptr_t) tp - (uintptr_t) fp->ctf_buf);
       tp = (ctf_type_t *) ((uintptr_t) tp + increment + vbytes);
     }
+  fp->ctf_typemax--;
+  assert (fp->ctf_typemax == typemax);
 
   ctf_dprintf ("%lu total types processed\n", fp->ctf_typemax);
   ctf_dprintf ("%zu enum names hashed\n",
diff --git a/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
new file mode 100644
index 00000000000..1554ca2d626
--- /dev/null
+++ b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
@@ -0,0 +1,136 @@
+/* Verify that name lookup of basic types including old-style bitfield types
+   yields the non-bitfield.  */
+
+#include <ctf-api.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+int bitfieldery (int count, int up, int pos)
+{
+  unsigned char *ctf_written;
+  size_t size;
+  ctf_dict_t *dict;
+  const char *err = "opening";
+  int open_err;
+  ctf_encoding_t en;
+  ctf_encoding_t basic;
+  ctf_id_t type;
+  size_t i;
+
+  /* This is rendered annoying by two factors: old-style bitfields are not
+     generated by current compilers, so we need to build a suitable dict by
+     hand; and this is an open-time bug, so we need to serialize it and then
+     load it back in again.  */
+
+  if ((dict = ctf_create (&open_err)) == NULL)
+    goto open_err;
+
+  /* Populate with a pile of bitfields of increasing/decreasing size, with a
+     single basic type dropped in at position POS.  Oscillate the offset
+     between 0 and 1.  */
+
+  basic.cte_bits = count;
+  basic.cte_offset = 0;
+  basic.cte_format = CTF_INT_SIGNED;
+
+  en.cte_bits = up ? 0 : count - 1;
+  en.cte_offset = 0;
+  en.cte_format = CTF_INT_SIGNED;
+
+  for (i = 0; i < count; i++)
+    {
+      if (i == pos)
+	{
+	  err = "populating with basic type";
+	  if (ctf_add_integer (dict, CTF_ADD_ROOT, "int", &basic) < 0)
+	    goto err;
+	}
+
+      err = "populating";
+      if (ctf_add_integer (dict, CTF_ADD_ROOT, "int", &en) < 0)
+	goto err;
+
+      en.cte_bits += up ? 1 : -1;
+      if (en.cte_offset == 0)
+	en.cte_offset = 1;
+      else
+	en.cte_offset = 0;
+    }
+
+  /* Possibly populate with at-end basic type.  */
+  if (i == pos)
+    {
+      err = "populating with basic type";
+      if (ctf_add_integer (dict, CTF_ADD_ROOT, "int", &basic) < 0)
+	goto err;
+    }
+
+  err = "writing";
+  if ((ctf_written = ctf_write_mem (dict, &size, 4096)) == NULL)
+    goto err;
+  ctf_dict_close (dict);
+
+  err = "opening";
+  if ((dict = ctf_simple_open ((char *) ctf_written, size, NULL, 0,
+			       0, NULL, 0, &open_err)) == NULL)
+    goto open_err;
+
+  err = "looking up";
+  if ((type = ctf_lookup_by_name (dict, "int")) == CTF_ERR)
+    goto err;
+
+  err = "encoding check";
+  if (ctf_type_encoding (dict, type, &en) < 0)
+    goto err;
+
+  if (en.cte_bits < count || en.cte_offset != 0) {
+    fprintf (stderr, "Name lookup with count %i, pos %i, counting %s "
+	     "gave bitfield ID %lx with bits %i, offset %i\n", count, pos,
+	     up ? "up" : "down", type, en.cte_bits, en.cte_offset);
+    return 1;
+  }
+  ctf_dict_close (dict);
+  free (ctf_written);
+
+  return 0;
+
+ open_err:
+  fprintf (stdout, "Error %s: %s\n", err, ctf_errmsg (open_err));
+  return 1;
+
+ err:
+  fprintf (stdout, "Error %s: %s\n", err, ctf_errmsg (ctf_errno (dict)));
+  return 1;
+}
+
+/* Do a bunch of tests with a type of a given size: up and down, basic type
+   at and near the start and end, and in the middle.  */
+
+void mass_bitfieldery (long size)
+{
+  size *= 8;
+  bitfieldery (size, 1, 0);
+  bitfieldery (size, 0, 0);
+  bitfieldery (size, 1, 1);
+  bitfieldery (size, 0, 1);
+  bitfieldery (size, 1, size / 2);
+  bitfieldery (size, 0, size / 2);
+  bitfieldery (size, 1, size - 1);
+  bitfieldery (size, 0, size - 1);
+  bitfieldery (size, 1, size);
+  bitfieldery (size, 0, size);
+}
+
+int main (void)
+{
+  mass_bitfieldery (sizeof (char));
+  mass_bitfieldery (sizeof (short));
+  mass_bitfieldery (sizeof (int));
+  mass_bitfieldery (sizeof (long));
+  mass_bitfieldery (sizeof (uint64_t));
+
+  printf ("All done.\n");
+
+  return 0;
+}
diff --git a/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk
new file mode 100644
index 00000000000..b944f73d013
--- /dev/null
+++ b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk
@@ -0,0 +1 @@
+All done.
-- 
2.44.0.273.ge0bd14271f



* [PATCH 05/22] libctf: support addition of types to dicts read via ctf_open()
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (3 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 04/22] libctf: fix name lookup in dicts containing base-type bitfields Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 06/22] libctf: fix a comment Nick Alcock
                   ` (17 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.

But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both kinds of type has been identical for a
few years, the only difference being that types read in via ctf_open are
stored together in one big buffer, while newly-added types are stored in
per-type hashtables.

So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them.  (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
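
A minimal model of this rule, as a sketch only: everything named
model_* below is hypothetical and is not libctf's implementation.  It
assumes that a dict records how many types were present at open time
(the ChangeLog below calls this ctf_stypes) and that only types at or
below that count are treated as read-only.

```c
#include <stddef.h>

/* Hypothetical error values for this sketch; not libctf's ECTF_* codes.  */
enum model_err { MODEL_OK = 0, MODEL_RDONLY = 1, MODEL_BADID = 2 };

struct model_dict
{
  unsigned long stypes;		/* Types present when the dict was opened.  */
  unsigned long typemax;	/* Highest type index currently in the dict.  */
};

/* A type is static (read-only) iff it was already there at open time.  */
int
model_static_type (const struct model_dict *d, unsigned long idx)
{
  return idx >= 1 && idx <= d->stypes;
}

/* Adding a type always yields a new index above the static range.  */
unsigned long
model_add_type (struct model_dict *d)
{
  return ++d->typemax;
}

/* Modifying a type (adding a member, say) is refused for static types
   but allowed for anything added since the dict was opened.  */
enum model_err
model_modify_type (const struct model_dict *d, unsigned long idx)
{
  if (idx < 1 || idx > d->typemax)
    return MODEL_BADID;
  if (model_static_type (d, idx))
    return MODEL_RDONLY;
  return MODEL_OK;
}
```

In this model a dict opened with three types refuses modification of
types 1 to 3, while a type added afterwards gets index 4 and can be
modified freely.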

This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account.  Some of these irregularities were hard to define as
anything but bugs.

Notably:

 - The symbol handling was assuming that symbols only needed to be
   looked for in dynamic hashtabs or static linker-laid-out indexed/
   nonindexed layouts, but now we want to check both in case people
   added more symbols to a dict they opened.

 - The code that handles type additions wasn't checking to see if types
   with the same name existed *at all* (so you could do
   ctf_add_typedef (fp, "foo", bar) repeatedly without error).  This
   seems reasonable for types you just added, but we probably *do* want
   to ban addition of types with names that override names we already
   used in the ctf_open()ed portion, since that would probably corrupt
   existing type relationships.  (Doing things this way also avoids
   causing new errors for any existing code that was doing this sort of
   thing.)

 - ctf_lookup_variable entirely failed to work for variables just added
   by ctf_add_variable: you had to write the dict out and read it back
   in again before they appeared.

 - The symbol handling remembered what symbols you looked up but didn't
   remember their types, so you could look up an object symbol and then
   find it popping up when you asked for function symbols, which seems
   less than ideal.  Since we had to rejig things enough to be able to
   distinguish function and object symbols internally anyway (in order
   to give suitable errors if you try to add a symbol with a name that
   already existed in the ctf_open()ed dict), this bug suddenly became
   more visible and was easily fixed.
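
The name-collision rule in the second item can be sketched as follows.
This is a hedged model under one reading of that item (names used by
types in the ctf_open()ed portion are off limits, while names of types
added since then are not checked); struct model_names and
model_name_addable are invented for the example and do not exist in
libctf.

```c
#include <stddef.h>
#include <string.h>

/* Names used by types that were present when the dict was opened.  */
struct model_names
{
  const char *const *static_names;
  size_t n_static;
};

/* Return 1 if NAME may be given to a newly-added type, 0 if it would
   collide with a name already used in the opened portion.  */
int
model_name_addable (const struct model_names *t, const char *name)
{
  for (size_t i = 0; i < t->n_static; i++)
    if (strcmp (t->static_names[i], name) == 0)
      return 0;
  return 1;
}
```

A real implementation would consult the name hash tables rather than
scan an array; the linear scan just keeps the sketch self-contained.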

We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or another deserializer (you can look things up in them,
but not write them out a second time).  This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).

There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.

libctf/

	* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
	(ctf_dict.ctf_symhash_func): ... this and...
	(ctf_dict.ctf_symhash_objt): ... this.
	(ctf_dict.ctf_stypes): New, counts static types.
	(LCTF_INDEX_TO_TYPEPTR): Use it instead of LCTF_RDWR.
	(LCTF_RDWR): Deleted.
	(LCTF_DIRTY): Renumbered.
	(LCTF_LINKING): Likewise.
	(ctf_lookup_variable_here): New.
	(ctf_lookup_by_sym_or_name): Likewise.
	(ctf_symbol_next_static): Likewise.
	(ctf_add_variable_forced): Likewise.
	(ctf_add_funcobjt_sym_forced): Likewise.
	(ctf_simple_open_internal): Adjust.
	(ctf_bufopen_internal): Likewise.
	* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
	(ctf_create): Migrate a bunch of initializations into bufopen.
	Force recreation of name tables.  Do not forcibly override the
	model, let ctf_bufopen do it.
	(ctf_static_type): New.
	(ctf_update): Drop LCTF_RDWR check.
	(ctf_dynamic_type): Likewise.
	(ctf_add_function): Likewise.
	(ctf_add_type_internal): Likewise.
	(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
	(ctf_set_array): Likewise.
	(ctf_add_struct_sized): Likewise.
	(ctf_add_union_sized): Likewise.
	(ctf_add_enum): Likewise.
	(ctf_add_enumerator): Likewise (only on the target dict).
	(ctf_add_member_offset): Likewise.
	(ctf_add_generic): Drop LCTF_RDWR check.  Ban addition of types
	with colliding names.
	(ctf_add_forward): Note safety under the new rules.
	(ctf_add_variable): Split all but the existence check into...
	(ctf_add_variable_forced): ... this new function.
	(ctf_add_funcobjt_sym): Likewise...
	(ctf_add_funcobjt_sym_forced): ... for this new function.
	* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
	with any stypes.
	(ctf_link_add_strtab): Likewise.
	(ctf_link_shuffle_syms): Likewise.
	(ctf_link_intern_extern_string): Note pre-existing prohibition.
	* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
	(ctf_lookup_variable): Split out looking in a dict but not
	its parent into...
	(ctf_lookup_variable_here): ... this new function.
	(ctf_lookup_symbol_idx): Track whether looking up a function or
	object: cache them separately.
	(ctf_symbol_next): Split out looking in non-dynamic symtypetab
	entries to...
	(ctf_symbol_next_static): ... this new function.  Don't get confused
	by the simultaneous presence of static and dynamic symtypetab entries.
	(ctf_try_lookup_indexed): Don't waste time looking up symbols by
	index before there can be any idea how symbols are numbered.
	(ctf_lookup_by_sym_or_name): Distinguish between function and
	data object lookups.  Drop LCTF_RDWR.
	(ctf_lookup_by_symbol): Adjust.
	(ctf_lookup_by_symbol_name): Likewise.
	* ctf-open.c (init_types): Rename to...
	(init_static_types): ... this.  Drop LCTF_RDWR.  Populate ctf_stypes.
	(ctf_simple_open): Drop writable arg.
	(ctf_simple_open_internal): Likewise.
	(ctf_bufopen): Likewise.
	(ctf_bufopen_internal): Populate fields only used for writable dicts.
	Drop LCTF_RDWR.
	(ctf_dict_close): Cater for symhash cache split.
	* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
	* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
	* testsuite/libctf-lookup/add-to-opened*: New test.
---
 libctf/ctf-create.c                           | 184 ++++-----
 libctf/ctf-impl.h                             |  28 +-
 libctf/ctf-link.c                             |  14 +-
 libctf/ctf-lookup.c                           | 351 +++++++++++-------
 libctf/ctf-open.c                             |  66 ++--
 libctf/ctf-serialize.c                        |  51 ++-
 libctf/ctf-types.c                            |  28 +-
 .../libctf-lookup/add-to-opened-ctf.c         |  19 +
 .../testsuite/libctf-lookup/add-to-opened.c   | 147 ++++++++
 .../testsuite/libctf-lookup/add-to-opened.lk  |   3 +
 10 files changed, 621 insertions(+), 270 deletions(-)
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened-ctf.c
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened.c
 create mode 100644 libctf/testsuite/libctf-lookup/add-to-opened.lk

diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index 240f3dad9ff..7aa244e5ec7 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -47,9 +47,13 @@ ctf_grow_ptrtab (ctf_dict_t *fp)
   size_t new_ptrtab_len = fp->ctf_ptrtab_len;
 
   /* We allocate one more ptrtab entry than we need, for the initial zero,
-     plus one because the caller will probably allocate a new type.  */
+     plus one because the caller will probably allocate a new type.
 
-  if (fp->ctf_ptrtab == NULL)
+     Equally, if the ptrtab is small -- perhaps due to ctf_open of a small
+     dict -- boost it by quite a lot at first, so we don't need to keep
+     realloc()ing.  */
+
+  if (fp->ctf_ptrtab == NULL || fp->ctf_ptrtab_len < 1024)
     new_ptrtab_len = 1024;
   else if ((fp->ctf_typemax + 2) > fp->ctf_ptrtab_len)
     new_ptrtab_len = fp->ctf_ptrtab_len * 1.25;
@@ -104,29 +108,11 @@ ctf_create (int *errp)
 {
   static const ctf_header_t hdr = { .cth_preamble = { CTF_MAGIC, CTF_VERSION, 0 } };
 
-  ctf_dynhash_t *dthash;
-  ctf_dynhash_t *dvhash;
   ctf_dynhash_t *structs = NULL, *unions = NULL, *enums = NULL, *names = NULL;
-  ctf_dynhash_t *objthash = NULL, *funchash = NULL;
   ctf_sect_t cts;
   ctf_dict_t *fp;
 
   libctf_init_debug();
-  dthash = ctf_dynhash_create (ctf_hash_integer, ctf_hash_eq_integer,
-			       NULL, NULL);
-  if (dthash == NULL)
-    {
-      ctf_set_open_errno (errp, EAGAIN);
-      goto err;
-    }
-
-  dvhash = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
-			       NULL, NULL);
-  if (dvhash == NULL)
-    {
-      ctf_set_open_errno (errp, EAGAIN);
-      goto err_dt;
-    }
 
   structs = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
 				NULL, NULL);
@@ -136,14 +122,10 @@ ctf_create (int *errp)
 			      NULL, NULL);
   names = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
 			      NULL, NULL);
-  objthash = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
-				 free, NULL);
-  funchash = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
-				 free, NULL);
   if (!structs || !unions || !enums || !names)
     {
       ctf_set_open_errno (errp, EAGAIN);
-      goto err_dv;
+      goto err;
     }
 
   cts.cts_name = _CTF_SECTION;
@@ -151,24 +133,26 @@ ctf_create (int *errp)
   cts.cts_size = sizeof (hdr);
   cts.cts_entsize = 1;
 
-  if ((fp = ctf_bufopen_internal (&cts, NULL, NULL, NULL, 1, errp)) == NULL)
-    goto err_dv;
+  if ((fp = ctf_bufopen_internal (&cts, NULL, NULL, NULL, errp)) == NULL)
+    goto err;
 
+  /* These hashes will have been initialized with a starting size of zero,
+     which is surely wrong.  Use ones with slightly larger sizes.  */
+  ctf_dynhash_destroy (fp->ctf_structs);
+  ctf_dynhash_destroy (fp->ctf_unions);
+  ctf_dynhash_destroy (fp->ctf_enums);
+  ctf_dynhash_destroy (fp->ctf_names);
   fp->ctf_structs = structs;
   fp->ctf_unions = unions;
   fp->ctf_enums = enums;
   fp->ctf_names = names;
-  fp->ctf_objthash = objthash;
-  fp->ctf_funchash = funchash;
-  fp->ctf_dthash = dthash;
-  fp->ctf_dvhash = dvhash;
   fp->ctf_dtoldid = 0;
-  fp->ctf_snapshots = 1;
   fp->ctf_snapshot_lu = 0;
   fp->ctf_flags |= LCTF_DIRTY;
 
+  /* Make sure the ptrtab starts out at a reasonable size.  */
+
   ctf_set_ctl_hashes (fp);
-  ctf_setmodel (fp, CTF_MODEL_NATIVE);
   if (ctf_grow_ptrtab (fp) < 0)
     {
       ctf_set_open_errno (errp, ctf_errno (fp));
@@ -178,17 +162,11 @@ ctf_create (int *errp)
 
   return fp;
 
- err_dv:
+ err:
   ctf_dynhash_destroy (structs);
   ctf_dynhash_destroy (unions);
   ctf_dynhash_destroy (enums);
   ctf_dynhash_destroy (names);
-  ctf_dynhash_destroy (objthash);
-  ctf_dynhash_destroy (funchash);
-  ctf_dynhash_destroy (dvhash);
- err_dt:
-  ctf_dynhash_destroy (dthash);
- err:
   return NULL;
 }
 
@@ -196,9 +174,6 @@ ctf_create (int *errp)
 int
 ctf_update (ctf_dict_t *fp)
 {
-  if (!(fp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_errno (fp, ECTF_RDONLY));
-
   fp->ctf_dtoldid = fp->ctf_typemax;
   return 0;
 }
@@ -310,9 +285,6 @@ ctf_dynamic_type (const ctf_dict_t *fp, ctf_id_t id)
 {
   ctf_id_t idx;
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
-    return NULL;
-
   if ((fp->ctf_flags & LCTF_CHILD) && LCTF_TYPE_ISPARENT (fp, id))
     fp = fp->ctf_parent;
 
@@ -323,6 +295,19 @@ ctf_dynamic_type (const ctf_dict_t *fp, ctf_id_t id)
   return NULL;
 }
 
+static int
+ctf_static_type (const ctf_dict_t *fp, ctf_id_t id)
+{
+  ctf_id_t idx;
+
+  if ((fp->ctf_flags & LCTF_CHILD) && LCTF_TYPE_ISPARENT (fp, id))
+    fp = fp->ctf_parent;
+
+  idx = LCTF_TYPE_TO_INDEX(fp, id);
+
+  return ((unsigned long) idx <= fp->ctf_stypes);
+}
+
 int
 ctf_dvd_insert (ctf_dict_t *fp, ctf_dvdef_t *dvd)
 {
@@ -385,7 +370,7 @@ ctf_rollback (ctf_dict_t *fp, ctf_snapshot_id_t id)
   ctf_dtdef_t *dtd, *ntd;
   ctf_dvdef_t *dvd, *nvd;
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
+  if (id.snapshot_id < fp->ctf_stypes)
     return (ctf_set_errno (fp, ECTF_RDONLY));
 
   if (fp->ctf_snapshot_lu >= id.snapshot_id)
@@ -449,15 +434,25 @@ ctf_add_generic (ctf_dict_t *fp, uint32_t flag, const char *name, int kind,
   if (flag != CTF_ADD_NONROOT && flag != CTF_ADD_ROOT)
     return (ctf_set_typed_errno (fp, EINVAL));
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_typed_errno (fp, ECTF_RDONLY));
-
   if (LCTF_INDEX_TO_TYPE (fp, fp->ctf_typemax, 1) >= CTF_MAX_TYPE)
     return (ctf_set_typed_errno (fp, ECTF_FULL));
 
   if (LCTF_INDEX_TO_TYPE (fp, fp->ctf_typemax, 1) == (CTF_MAX_PTYPE - 1))
     return (ctf_set_typed_errno (fp, ECTF_FULL));
 
+  /* Prohibit addition of a root-visible type that is already present
+     in the non-dynamic portion. */
+
+  if (flag == CTF_ADD_ROOT && name != NULL && name[0] != '\0')
+    {
+      ctf_id_t existing;
+
+      if (((existing = ctf_dynhash_lookup_type (ctf_name_table (fp, kind),
+						name)) > 0)
+	  && ctf_static_type (fp, existing))
+	return (ctf_set_typed_errno (fp, ECTF_RDONLY));
+    }
+
   /* Make sure ptrtab always grows to be big enough for all types.  */
   if (ctf_grow_ptrtab (fp) < 0)
       return CTF_ERR;				/* errno is set for us. */
@@ -724,10 +719,9 @@ ctf_set_array (ctf_dict_t *fp, ctf_id_t type, const ctf_arinfo_t *arp)
   if ((fp->ctf_flags & LCTF_CHILD) && LCTF_TYPE_ISPARENT (fp, type))
     fp = fp->ctf_parent;
 
-  if (!(ofp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_errno (ofp, ECTF_RDONLY));
-
-  if (!(fp->ctf_flags & LCTF_RDWR))
+  /* You can only call ctf_set_array on a type you have added, not a
+     type that was read in via ctf_open().  */
+  if (type < fp->ctf_stypes)
     return (ctf_set_errno (ofp, ECTF_RDONLY));
 
   if (dtd == NULL
@@ -755,9 +749,6 @@ ctf_add_function (ctf_dict_t *fp, uint32_t flag,
   size_t initial_vlen;
   size_t i;
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_typed_errno (fp, ECTF_RDONLY));
-
   if (ctc == NULL || (ctc->ctc_flags & ~CTF_FUNC_VARARG) != 0
       || (ctc->ctc_argc != 0 && argv == NULL))
     return (ctf_set_typed_errno (fp, EINVAL));
@@ -813,6 +804,10 @@ ctf_add_struct_sized (ctf_dict_t *fp, uint32_t flag, const char *name,
   if (name != NULL)
     type = ctf_lookup_by_rawname (fp, CTF_K_STRUCT, name);
 
+  /* Prohibit promotion if this type was ctf_open()ed.  */
+  if (type > 0 && type < fp->ctf_stypes)
+    return (ctf_set_errno (fp, ECTF_RDONLY));
+
   if (type != 0 && ctf_type_kind (fp, type) == CTF_K_FORWARD)
     dtd = ctf_dtd_lookup (fp, type);
   else if ((type = ctf_add_generic (fp, flag, name, CTF_K_STRUCT,
@@ -853,11 +848,15 @@ ctf_add_union_sized (ctf_dict_t *fp, uint32_t flag, const char *name,
   if (name != NULL)
     type = ctf_lookup_by_rawname (fp, CTF_K_UNION, name);
 
+  /* Prohibit promotion if this type was ctf_open()ed.  */
+  if (type > 0 && type < fp->ctf_stypes)
+    return (ctf_set_errno (fp, ECTF_RDONLY));
+
   if (type != 0 && ctf_type_kind (fp, type) == CTF_K_FORWARD)
     dtd = ctf_dtd_lookup (fp, type);
   else if ((type = ctf_add_generic (fp, flag, name, CTF_K_UNION,
 				    initial_vlen, &dtd)) == CTF_ERR)
-    return CTF_ERR;		/* errno is set for us */
+    return CTF_ERR;		/* errno is set for us.  */
 
   /* Forwards won't have any vlen yet.  */
   if (dtd->dtd_vlen_alloc == 0)
@@ -892,6 +891,10 @@ ctf_add_enum (ctf_dict_t *fp, uint32_t flag, const char *name)
   if (name != NULL)
     type = ctf_lookup_by_rawname (fp, CTF_K_ENUM, name);
 
+  /* Prohibit promotion if this type was ctf_open()ed.  */
+  if (type > 0 && type < fp->ctf_stypes)
+    return (ctf_set_errno (fp, ECTF_RDONLY));
+
   if (type != 0 && ctf_type_kind (fp, type) == CTF_K_FORWARD)
     dtd = ctf_dtd_lookup (fp, type);
   else if ((type = ctf_add_generic (fp, flag, name, CTF_K_ENUM,
@@ -953,8 +956,9 @@ ctf_add_forward (ctf_dict_t *fp, uint32_t flag, const char *name,
   if (name == NULL || name[0] == '\0')
     return (ctf_set_typed_errno (fp, ECTF_NONAME));
 
-  /* If the type is already defined or exists as a forward tag, just
-     return the ctf_id_t of the existing definition.  */
+  /* If the type is already defined or exists as a forward tag, just return
+     the ctf_id_t of the existing definition.  Since this changes nothing,
+     it's safe to do even on the read-only portion of the dict.  */
 
   type = ctf_lookup_by_rawname (fp, kind, name);
 
@@ -1066,10 +1070,7 @@ ctf_add_enumerator (ctf_dict_t *fp, ctf_id_t enid, const char *name,
   if ((fp->ctf_flags & LCTF_CHILD) && LCTF_TYPE_ISPARENT (fp, enid))
     fp = fp->ctf_parent;
 
-  if (!(ofp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_errno (ofp, ECTF_RDONLY));
-
-  if (!(fp->ctf_flags & LCTF_RDWR))
+  if (enid < fp->ctf_stypes)
     return (ctf_set_errno (ofp, ECTF_RDONLY));
 
   if (dtd == NULL)
@@ -1142,10 +1143,7 @@ ctf_add_member_offset (ctf_dict_t *fp, ctf_id_t souid, const char *name,
       fp = fp->ctf_parent;
     }
 
-  if (!(ofp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_errno (ofp, ECTF_RDONLY));
-
-  if (!(fp->ctf_flags & LCTF_RDWR))
+  if (souid < fp->ctf_stypes)
     return (ctf_set_errno (ofp, ECTF_RDONLY));
 
   if (dtd == NULL)
@@ -1332,18 +1330,15 @@ ctf_add_member (ctf_dict_t *fp, ctf_id_t souid, const char *name,
   return ctf_add_member_offset (fp, souid, name, type, (unsigned long) - 1);
 }
 
+/* Add a variable regardless of whether or not it is already present.
+
+   Internal use only.  */
 int
-ctf_add_variable (ctf_dict_t *fp, const char *name, ctf_id_t ref)
+ctf_add_variable_forced (ctf_dict_t *fp, const char *name, ctf_id_t ref)
 {
   ctf_dvdef_t *dvd;
   ctf_dict_t *tmp = fp;
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_errno (fp, ECTF_RDONLY));
-
-  if (ctf_dvd_lookup (fp, name) != NULL)
-    return (ctf_set_errno (fp, ECTF_DUPLICATE));
-
   if (ctf_lookup_by_id (&tmp, ref) == NULL)
     return -1;			/* errno is set for us.  */
 
@@ -1375,21 +1370,30 @@ ctf_add_variable (ctf_dict_t *fp, const char *name, ctf_id_t ref)
 }
 
 int
-ctf_add_funcobjt_sym (ctf_dict_t *fp, int is_function, const char *name, ctf_id_t id)
+ctf_add_variable (ctf_dict_t *fp, const char *name, ctf_id_t ref)
+{
+  if (ctf_lookup_variable_here (fp, name) != CTF_ERR)
+    return (ctf_set_errno (fp, ECTF_DUPLICATE));
+
+  if (ctf_errno (fp) != ECTF_NOTYPEDAT)
+    return -1;				/* errno is set for us.  */
+
+  return ctf_add_variable_forced (fp, name, ref);
+}
+
+/* Add a function or object symbol regardless of whether or not it is already
+   present (already existing symbols are silently overwritten).
+
+   Internal use only.  */
+int
+ctf_add_funcobjt_sym_forced (ctf_dict_t *fp, int is_function, const char *name, ctf_id_t id)
 {
   ctf_dict_t *tmp = fp;
   char *dupname;
   ctf_dynhash_t *h = is_function ? fp->ctf_funchash : fp->ctf_objthash;
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_errno (fp, ECTF_RDONLY));
-
-  if (ctf_dynhash_lookup (fp->ctf_objthash, name) != NULL ||
-      ctf_dynhash_lookup (fp->ctf_funchash, name) != NULL)
-    return (ctf_set_errno (fp, ECTF_DUPLICATE));
-
   if (ctf_lookup_by_id (&tmp, id) == NULL)
-    return -1;                                  /* errno is set for us.  */
+    return -1;				/* errno is set for us.  */
 
   if (is_function && ctf_type_kind (fp, id) != CTF_K_FUNCTION)
     return (ctf_set_errno (fp, ECTF_NOTFUNC));
@@ -1405,6 +1409,15 @@ ctf_add_funcobjt_sym (ctf_dict_t *fp, int is_function, const char *name, ctf_id_
   return 0;
 }
 
+int
+ctf_add_funcobjt_sym (ctf_dict_t *fp, int is_function, const char *name, ctf_id_t id)
+{
+  if (ctf_lookup_by_sym_or_name (fp, 0, name, 0, is_function) != CTF_ERR)
+    return (ctf_set_errno (fp, ECTF_DUPLICATE));
+
+  return ctf_add_funcobjt_sym_forced (fp, is_function, name, id);
+}
+
 int
 ctf_add_objt_sym (ctf_dict_t *fp, const char *name, ctf_id_t id)
 {
@@ -1606,9 +1619,6 @@ ctf_add_type_internal (ctf_dict_t *dst_fp, ctf_dict_t *src_fp, ctf_id_t src_type
 
   ctf_id_t orig_src_type = src_type;
 
-  if (!(dst_fp->ctf_flags & LCTF_RDWR))
-    return (ctf_set_typed_errno (dst_fp, ECTF_RDONLY));
-
   if ((src_tp = ctf_lookup_by_id (&src_fp, src_type)) == NULL)
     return (ctf_set_typed_errno (dst_fp, ctf_errno (src_fp)));
 
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index 8cbb2ae8242..f4fa3234681 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -369,7 +369,8 @@ struct ctf_dict
   ctf_sect_t ctf_symtab;	    /* Symbol table from object file.  */
   ctf_sect_t ctf_strtab;	    /* String table from object file.  */
   int ctf_symsect_little_endian;    /* Endianness of the ctf_symtab.  */
-  ctf_dynhash_t *ctf_symhash;       /* (partial) hash, symsect name -> idx. */
+  ctf_dynhash_t *ctf_symhash_func;  /* (partial) hash, symsect name -> idx. */
+  ctf_dynhash_t *ctf_symhash_objt;  /* ditto, for object symbols.  */
   size_t ctf_symhash_latest;	    /* Amount of symsect scanned so far.  */
   ctf_dynhash_t *ctf_prov_strtab;   /* Maps provisional-strtab offsets
 				       to names.  */
@@ -406,8 +407,8 @@ struct ctf_dict
   uint32_t *ctf_funcidx_sxlate;	  /* Offsets into funcinfo for a given funcidx.  */
   uint32_t *ctf_objtidx_sxlate;	  /* Likewise, for ctf_objtidx.  */
   size_t ctf_nobjtidx;		  /* Number of objtidx entries.  */
-  ctf_dynhash_t *ctf_objthash;	  /* name -> type ID.  */
-  ctf_dynhash_t *ctf_funchash;	  /* name -> CTF_K_FUNCTION type ID.  */
+  ctf_dynhash_t *ctf_objthash;	  /* Dynamic: name -> type ID.  */
+  ctf_dynhash_t *ctf_funchash;	  /* Dynamic: name -> CTF_K_FUNCTION type ID.  */
 
   /* The next three are linker-derived state found in ctf_link targets only.  */
 
@@ -418,6 +419,7 @@ struct ctf_dict
   struct ctf_varent *ctf_vars;	  /* Sorted variable->type mapping.  */
   unsigned long ctf_nvars;	  /* Number of variables in ctf_vars.  */
   unsigned long ctf_typemax;	  /* Maximum valid type ID number.  */
+  unsigned long ctf_stypes;	  /* Number of static (non-dynamic) types.  */
   const ctf_dmodel_t *ctf_dmodel; /* Data model pointer (see above).  */
   const char *ctf_cuname;	  /* Compilation unit name (if any).  */
   char *ctf_dyncuname;		  /* Dynamically allocated name of CU.  */
@@ -575,7 +577,7 @@ struct ctf_next
 					   (id))
 
 #define LCTF_INDEX_TO_TYPEPTR(fp, i) \
-    ((fp->ctf_flags & LCTF_RDWR) ?					\
+  ((i > fp->ctf_stypes) ?							\
      &(ctf_dtd_lookup (fp, LCTF_INDEX_TO_TYPE				\
 		       (fp, i, fp->ctf_flags & LCTF_CHILD))->dtd_data) : \
      (ctf_type_t *)((uintptr_t)(fp)->ctf_buf + (fp)->ctf_txlate[(i)]))
@@ -587,14 +589,19 @@ struct ctf_next
   ((fp)->ctf_dictops->ctfo_get_vbytes(fp, kind, size, vlen))
 
 #define LCTF_CHILD	0x0001	/* CTF dict is a child.  */
-#define LCTF_RDWR	0x0002	/* CTF dict is writable.  */
-#define LCTF_DIRTY	0x0004	/* CTF dict has been modified.  */
-#define LCTF_LINKING	0x0008  /* CTF link is underway: respect ctf_link_flags.  */
+#define LCTF_DIRTY	0x0002	/* CTF dict has been modified.  */
+#define LCTF_LINKING	0x0004  /* CTF link is underway: respect ctf_link_flags.  */
 
 extern ctf_dynhash_t *ctf_name_table (ctf_dict_t *, int);
 extern const ctf_type_t *ctf_lookup_by_id (ctf_dict_t **, ctf_id_t);
+extern ctf_id_t ctf_lookup_variable_here (ctf_dict_t *fp, const char *name);
+extern ctf_id_t ctf_lookup_by_sym_or_name (ctf_dict_t *, unsigned long symidx,
+					   const char *symname, int try_parent,
+					   int is_function);
 extern ctf_id_t ctf_lookup_by_rawname (ctf_dict_t *, int, const char *);
 extern void ctf_set_ctl_hashes (ctf_dict_t *);
+extern ctf_id_t ctf_symbol_next_static (ctf_dict_t *, ctf_next_t **,
+					const char **, int);
 
 extern int ctf_symtab_skippable (ctf_link_sym_t *sym);
 extern int ctf_add_funcobjt_sym (ctf_dict_t *, int is_function,
@@ -690,6 +697,9 @@ extern ctf_id_t ctf_add_encoded (ctf_dict_t *, uint32_t, const char *,
 				 const ctf_encoding_t *, uint32_t kind);
 extern ctf_id_t ctf_add_reftype (ctf_dict_t *, uint32_t, ctf_id_t,
 				 uint32_t kind);
+extern int ctf_add_variable_forced (ctf_dict_t *, const char *, ctf_id_t);
+extern int ctf_add_funcobjt_sym_forced (ctf_dict_t *, int is_function,
+					const char *, ctf_id_t);
 
 extern int ctf_dedup_atoms_init (ctf_dict_t *);
 extern int ctf_dedup (ctf_dict_t *, ctf_dict_t **, uint32_t ninputs,
@@ -741,10 +751,10 @@ extern int ctf_flip (ctf_dict_t *, ctf_header_t *, unsigned char *, int);
 extern ctf_dict_t *ctf_simple_open_internal (const char *, size_t, const char *,
 					     size_t, size_t,
 					     const char *, size_t,
-					     ctf_dynhash_t *, int, int *);
+					     ctf_dynhash_t *, int *);
 extern ctf_dict_t *ctf_bufopen_internal (const ctf_sect_t *, const ctf_sect_t *,
 					 const ctf_sect_t *, ctf_dynhash_t *,
-					 int, int *);
+					 int *);
 extern int ctf_import_unref (ctf_dict_t *fp, ctf_dict_t *pfp);
 extern int ctf_serialize (ctf_dict_t *);
 
diff --git a/libctf/ctf-link.c b/libctf/ctf-link.c
index 360bc1a0e63..9d2d29416d3 100644
--- a/libctf/ctf-link.c
+++ b/libctf/ctf-link.c
@@ -1576,6 +1576,8 @@ ctf_link_intern_extern_string (void *key _libctf_unused_, void *value,
 /* Repeatedly call ADD_STRING to acquire strings from the external string table,
    adding them to the atoms table for this CU and all subsidiary CUs.
 
+   Must be called on a dict that has not yet been serialized.
+
    If ctf_link is also called, it must be called first if you want the new CTF
    files ctf_link can create to get their strings dedupped against the ELF
    strtab properly.  */
@@ -1587,6 +1589,9 @@ ctf_link_add_strtab (ctf_dict_t *fp, ctf_link_strtab_string_f *add_string,
   uint32_t offset;
   int err = 0;
 
+  if (fp->ctf_stypes > 0)
+    return ctf_set_errno (fp, ECTF_RDONLY);
+
   while ((str = add_string (&offset, arg)) != NULL)
     {
       ctf_link_out_string_cb_arg_t iter_arg = { str, offset, 0 };
@@ -1610,7 +1615,8 @@ ctf_link_add_strtab (ctf_dict_t *fp, ctf_link_strtab_string_f *add_string,
 /* Inform the ctf-link machinery of a new symbol in the target symbol table
    (which must be some symtab that is not usually stripped, and which
    is in agreement with ctf_bfdopen_ctfsect).  May be called either before or
-   after ctf_link_add_strtab.  */
+   after ctf_link_add_strtab.  As with that function, must be called on a dict which
+   has not yet been serialized.  */
 int
 ctf_link_add_linker_symbol (ctf_dict_t *fp, ctf_link_sym_t *sym)
 {
@@ -1625,6 +1631,9 @@ ctf_link_add_linker_symbol (ctf_dict_t *fp, ctf_link_sym_t *sym)
   if (ctf_errno (fp) == ENOMEM)
     return -ENOMEM;				/* errno is set for us.  */
 
+  if (fp->ctf_stypes > 0)
+    return ctf_set_errno (fp, ECTF_RDONLY);
+
   if (ctf_symtab_skippable (sym))
     return 0;
 
@@ -1660,6 +1669,9 @@ ctf_link_shuffle_syms (ctf_dict_t *fp)
   int err = ENOMEM;
   void *name_, *sym_;
 
+  if (fp->ctf_stypes > 0)
+    return ctf_set_errno (fp, ECTF_RDONLY);
+
   if (!fp->ctf_dynsyms)
     {
       fp->ctf_dynsyms = ctf_dynhash_create (ctf_hash_string,
diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
index b5d2637fe01..1fcbebee2d1 100644
--- a/libctf/ctf-lookup.c
+++ b/libctf/ctf-lookup.c
@@ -329,7 +329,7 @@ ctf_lookup_by_name (ctf_dict_t *fp, const char *name)
 const ctf_type_t *
 ctf_lookup_by_id (ctf_dict_t **fpp, ctf_id_t type)
 {
-  ctf_dict_t *fp = *fpp;	/* Caller passes in starting CTF dict.  */
+  ctf_dict_t *fp = *fpp;
   ctf_id_t idx;
 
   if ((fp = ctf_get_dict (fp, type)) == NULL)
@@ -338,27 +338,10 @@ ctf_lookup_by_id (ctf_dict_t **fpp, ctf_id_t type)
       return NULL;
     }
 
-  /* If this dict is writable, check for a dynamic type.  */
-
-  if (fp->ctf_flags & LCTF_RDWR)
-    {
-      ctf_dtdef_t *dtd;
-
-      if ((dtd = ctf_dynamic_type (fp, type)) != NULL)
-	{
-	  *fpp = fp;
-	  return &dtd->dtd_data;
-	}
-      (void) ctf_set_errno (*fpp, ECTF_BADID);
-      return NULL;
-    }
-
-  /* Check for a type in the static portion.  */
-
   idx = LCTF_TYPE_TO_INDEX (fp, type);
   if (idx > 0 && (unsigned long) idx <= fp->ctf_typemax)
     {
-      *fpp = fp;		/* Function returns ending CTF dict.  */
+      *fpp = fp;		/* Possibly the parent CTF dict.  */
       return (LCTF_INDEX_TO_TYPEPTR (fp, idx));
     }
 
@@ -384,36 +367,52 @@ ctf_lookup_var (const void *key_, const void *lookup_)
   return (strcmp (key->clik_name, ctf_strptr (key->clik_fp, lookup->ctv_name)));
 }
 
-/* Given a variable name, return the type of the variable with that name.  */
+/* Given a variable name, return the type of the variable with that name.
+   Look only in this dict, not in the parent. */
 
 ctf_id_t
-ctf_lookup_variable (ctf_dict_t *fp, const char *name)
+ctf_lookup_variable_here (ctf_dict_t *fp, const char *name)
 {
+  ctf_dvdef_t *dvd = ctf_dvd_lookup (fp, name);
   ctf_varent_t *ent;
   ctf_lookup_idx_key_t key = { fp, name, NULL };
 
+  if (dvd != NULL)
+    return dvd->dvd_type;
+
   /* This array is sorted, so we can bsearch for it.  */
 
   ent = bsearch (&key, fp->ctf_vars, fp->ctf_nvars, sizeof (ctf_varent_t),
 		 ctf_lookup_var);
 
   if (ent == NULL)
-    {
-      if (fp->ctf_parent != NULL)
-        {
-          ctf_id_t ptype;
-
-          if ((ptype = ctf_lookup_variable (fp->ctf_parent, name)) != CTF_ERR)
-            return ptype;
-          return (ctf_set_typed_errno (fp, ctf_errno (fp->ctf_parent)));
-        }
-
       return (ctf_set_typed_errno (fp, ECTF_NOTYPEDAT));
-    }
 
   return ent->ctv_type;
 }
 
+/* As above, but look in the parent too.  */
+
+ctf_id_t
+ctf_lookup_variable (ctf_dict_t *fp, const char *name)
+{
+  ctf_id_t type;
+
+  if ((type = ctf_lookup_variable_here (fp, name)) == CTF_ERR)
+    {
+      if (ctf_errno (fp) == ECTF_NOTYPEDAT && fp->ctf_parent != NULL)
+	{
+	  if ((type = ctf_lookup_variable_here (fp->ctf_parent, name)) != CTF_ERR)
+	    return type;
+	  return (ctf_set_typed_errno (fp, ctf_errno (fp->ctf_parent)));
+	}
+
+      return -1;				/* errno is set for us.  */
+    }
+
+  return type;
+}
+
 typedef struct ctf_symidx_sort_arg_cb
 {
   ctf_dict_t *fp;
@@ -535,9 +534,11 @@ ctf_lookup_symbol_name (ctf_dict_t *fp, unsigned long symidx)
 }
 
 /* Given a symbol name, return the index of that symbol, or -1 on error or if
-   not found.  */
+   not found.  If is_function is >= 0, return only function or data object
+   symbols, respectively.  */
 static unsigned long
-ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname)
+ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname, int try_parent,
+		       int is_function)
 {
   const ctf_sect_t *sp = &fp->ctf_symtab;
   ctf_link_sym_t sym;
@@ -551,7 +552,9 @@ ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname)
 
       ctf_link_sym_t *symp;
 
-      if ((symp = ctf_dynhash_lookup (fp->ctf_dynsyms, symname)) == NULL)
+      if (((symp = ctf_dynhash_lookup (fp->ctf_dynsyms, symname)) == NULL)
+	  || (symp->st_type != STT_OBJECT && is_function == 0)
+	  || (symp->st_type != STT_FUNC && is_function == 1))
 	goto try_parent;
 
       return symp->st_symidx;
@@ -562,22 +565,33 @@ ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname)
     goto try_parent;
 
   /* First, try a hash lookup to see if we have already spotted this symbol
-     during a past iteration: create the hash first if need be.  The lifespan
-     of the strings is equal to the lifespan of the cts_data, so we don't
-     need to strdup them.  If this dict was opened as part of an archive,
-     and this archive has designed a crossdict_cache to cache results that
+     during a past iteration: create the hash first if need be.  The
+     lifespan of the strings is equal to the lifespan of the cts_data, so we
+     don't need to strdup them.  If this dict was opened as part of an
+     archive, and this archive has a crossdict_cache to cache results that
      are the same across all dicts in an archive, use it.  */
 
   if (fp->ctf_archive && fp->ctf_archive->ctfi_crossdict_cache)
     cache = fp->ctf_archive->ctfi_crossdict_cache;
 
-  if (!cache->ctf_symhash)
-    if ((cache->ctf_symhash = ctf_dynhash_create (ctf_hash_string,
-						  ctf_hash_eq_string,
-						  NULL, NULL)) == NULL)
+  if (!cache->ctf_symhash_func)
+    if ((cache->ctf_symhash_func = ctf_dynhash_create (ctf_hash_string,
+						       ctf_hash_eq_string,
+						       NULL, NULL)) == NULL)
       goto oom;
 
-  if (ctf_dynhash_lookup_kv (cache->ctf_symhash, symname, NULL, &known_idx))
+  if (!cache->ctf_symhash_objt)
+    if ((cache->ctf_symhash_objt = ctf_dynhash_create (ctf_hash_string,
+						       ctf_hash_eq_string,
+						       NULL, NULL)) == NULL)
+      goto oom;
+
+  if (is_function != 0 &&
+      ctf_dynhash_lookup_kv (cache->ctf_symhash_func, symname, NULL, &known_idx))
+    return (unsigned long) (uintptr_t) known_idx;
+
+  if (is_function != 1 &&
+      ctf_dynhash_lookup_kv (cache->ctf_symhash_objt, symname, NULL, &known_idx))
     return (unsigned long) (uintptr_t) known_idx;
 
   /* Hash lookup unsuccessful: linear search, populating the hashtab for later
@@ -586,21 +600,16 @@ ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname)
   for (; cache->ctf_symhash_latest < sp->cts_size / sp->cts_entsize;
        cache->ctf_symhash_latest++)
     {
+      ctf_dynhash_t *h;
+
       switch (sp->cts_entsize)
 	{
 	case sizeof (Elf64_Sym):
 	  {
 	    Elf64_Sym *symp = (Elf64_Sym *) sp->cts_data;
+
 	    ctf_elf64_to_link_sym (fp, &sym, &symp[cache->ctf_symhash_latest],
 				   cache->ctf_symhash_latest);
-	    if (!ctf_dynhash_lookup_kv (cache->ctf_symhash, sym.st_name,
-					NULL, NULL))
-	      if (ctf_dynhash_cinsert (cache->ctf_symhash, sym.st_name,
-				       (const void *) (uintptr_t)
-				       cache->ctf_symhash_latest) < 0)
-		goto oom;
-	    if (strcmp (sym.st_name, symname) == 0)
-	      return cache->ctf_symhash_latest++;
 	  }
 	  break;
 	case sizeof (Elf32_Sym):
@@ -608,20 +617,28 @@ ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname)
 	    Elf32_Sym *symp = (Elf32_Sym *) sp->cts_data;
 	    ctf_elf32_to_link_sym (fp, &sym, &symp[cache->ctf_symhash_latest],
 				   cache->ctf_symhash_latest);
-	    if (!ctf_dynhash_lookup_kv (cache->ctf_symhash, sym.st_name,
-					NULL, NULL))
-	      if (ctf_dynhash_cinsert (cache->ctf_symhash, sym.st_name,
-				       (const void *) (uintptr_t)
-				       cache->ctf_symhash_latest) < 0)
-		goto oom;
-	    if (strcmp (sym.st_name, symname) == 0)
-	      return cache->ctf_symhash_latest++;
+	    break;
 	  }
-	  break;
 	default:
 	  ctf_set_errno (fp, ECTF_SYMTAB);
 	  return (unsigned long) -1;
 	}
+
+      if (sym.st_type == STT_FUNC)
+	h = cache->ctf_symhash_func;
+      else if (sym.st_type == STT_OBJECT)
+	h = cache->ctf_symhash_objt;
+      else
+	continue;					/* Not of interest.  */
+
+      if (!ctf_dynhash_lookup_kv (h, sym.st_name,
+				  NULL, NULL))
+	if (ctf_dynhash_cinsert (h, sym.st_name,
+				 (const void *) (uintptr_t)
+				 cache->ctf_symhash_latest) < 0)
+	  goto oom;
+      if (strcmp (sym.st_name, symname) == 0)
+	return cache->ctf_symhash_latest++;
     }
 
   /* Searched everything, still not found.  */
@@ -629,11 +646,12 @@ ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname)
   return (unsigned long) -1;
 
  try_parent:
-  if (fp->ctf_parent)
+  if (fp->ctf_parent && try_parent)
     {
       unsigned long psym;
 
-      if ((psym = ctf_lookup_symbol_idx (fp->ctf_parent, symname))
+      if ((psym = ctf_lookup_symbol_idx (fp->ctf_parent, symname, try_parent,
+					 is_function))
           != (unsigned long) -1)
         return psym;
 
@@ -653,12 +671,17 @@ oom:
 
 }
 
-/* Iterate over all symbols with types: if FUNC, function symbols, otherwise,
-   data symbols.  The name argument is not optional.  The return order is
-   arbitrary, though is likely to be in symbol index or name order.  You can
-   change the value of 'functions' in the middle of iteration over non-dynamic
-   dicts, but doing so on dynamic dicts will fail.  (This is probably not very
-   useful, but there is no reason to prohibit it.)  */
+ctf_id_t
+ctf_symbol_next_static (ctf_dict_t *fp, ctf_next_t **it, const char **name,
+			int functions);
+
+/* Iterate over all symbols with types: if FUNC, function symbols,
+   otherwise, data symbols.  The name argument is not optional.  The return
+   order is arbitrary, though is likely to be in symbol index or name order.
+   Changing the value of 'functions' in the middle of iteration has
+   unpredictable effects (probably skipping symbols, etc) and is not
+   recommended.  Adding symbols while iteration is underway may also lead
+   to other symbols being skipped.  */
 
 ctf_id_t
 ctf_symbol_next (ctf_dict_t *fp, ctf_next_t **it, const char **name,
@@ -685,24 +708,24 @@ ctf_symbol_next (ctf_dict_t *fp, ctf_next_t **it, const char **name,
   if (fp != i->cu.ctn_fp)
     return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFP));
 
-  /* We intentionally use raw access, not ctf_lookup_by_symbol, to avoid
+  /* Check the dynamic set of names first, to allow previously-written names
+     to be replaced with dynamic ones (there is still no way to remove them,
+     though).
+
+     We intentionally use raw access, not ctf_lookup_by_symbol, to avoid
      incurring additional sorting cost for unsorted symtypetabs coming from the
      compiler, to allow ctf_symbol_next to work in the absence of a symtab, and
      finally because it's easier to work out what the name of each symbol is if
      we do that.  */
 
-  if (fp->ctf_flags & LCTF_RDWR)
+  ctf_dynhash_t *dynh = functions ? fp->ctf_funchash : fp->ctf_objthash;
+  void *dyn_name = NULL, *dyn_value = NULL;
+  size_t dyn_els = dynh ? ctf_dynhash_elements (dynh) : 0;
+
+  if (i->ctn_n < dyn_els)
     {
-      ctf_dynhash_t *dynh = functions ? fp->ctf_funchash : fp->ctf_objthash;
-      void *dyn_name = NULL, *dyn_value = NULL;
-
-      if (!dynh)
-	{
-	  ctf_next_destroy (i);
-	  return (ctf_set_typed_errno (fp, ECTF_NEXT_END));
-	}
-
       err = ctf_dynhash_next (dynh, &i->ctn_next, &dyn_name, &dyn_value);
+
       /* This covers errors and also end-of-iteration.  */
       if (err != 0)
 	{
@@ -713,9 +736,50 @@ ctf_symbol_next (ctf_dict_t *fp, ctf_next_t **it, const char **name,
 
       *name = dyn_name;
       sym = (ctf_id_t) (uintptr_t) dyn_value;
+      i->ctn_n++;
+
+      return sym;
     }
-  else if ((!functions && fp->ctf_objtidx_names) ||
-	   (functions && fp->ctf_funcidx_names))
+
+  return ctf_symbol_next_static (fp, it, name, functions);
+}
+
+/* ctf_symbol_next, but only for static symbols.  Mostly an internal
+   implementation detail of ctf_symbol_next, but also used to simplify
+   serialization.  */
+ctf_id_t
+ctf_symbol_next_static (ctf_dict_t *fp, ctf_next_t **it, const char **name,
+			int functions)
+{
+  ctf_id_t sym = CTF_ERR;
+  ctf_next_t *i = *it;
+  ctf_dynhash_t *dynh = functions ? fp->ctf_funchash : fp->ctf_objthash;
+  size_t dyn_els = dynh ? ctf_dynhash_elements (dynh) : 0;
+
+  /* Only relevant for direct internal-to-library calls, not via
+     ctf_symbol_next (but important then).  */
+
+  if (!i)
+    {
+      if ((i = ctf_next_create ()) == NULL)
+	return ctf_set_typed_errno (fp, ENOMEM);
+
+      i->cu.ctn_fp = fp;
+      i->ctn_iter_fun = (void (*) (void)) ctf_symbol_next;
+      i->ctn_n = dyn_els;
+      *it = i;
+    }
+
+  if ((void (*) (void)) ctf_symbol_next != i->ctn_iter_fun)
+    return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFUN));
+
+  if (fp != i->cu.ctn_fp)
+    return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFP));
+
+  /* TODO-v4: Indexed after non-indexed portions?  */
+
+  if ((!functions && fp->ctf_objtidx_names) ||
+      (functions && fp->ctf_funcidx_names))
     {
       ctf_header_t *hp = fp->ctf_header;
       uint32_t *idx = functions ? fp->ctf_funcidx_names : fp->ctf_objtidx_names;
@@ -735,48 +799,51 @@ ctf_symbol_next (ctf_dict_t *fp, ctf_next_t **it, const char **name,
 
       do
 	{
-	  if (i->ctn_n >= len)
+	  if (i->ctn_n - dyn_els >= len)
 	    goto end;
 
-	  *name = ctf_strptr (fp, idx[i->ctn_n]);
-	  sym = tab[i->ctn_n++];
+	  *name = ctf_strptr (fp, idx[i->ctn_n - dyn_els]);
+	  sym = tab[i->ctn_n - dyn_els];
+	  i->ctn_n++;
 	}
       while (sym == -1u || sym == 0);
     }
   else
     {
-      /* Skip over pads in ctf_xslate, padding for typeless symbols in the
+      /* Skip over pads in ctf_sxlate, padding for typeless symbols in the
 	 symtypetab itself, and symbols in the wrong table.  */
-      for (; i->ctn_n < fp->ctf_nsyms; i->ctn_n++)
+      for (; i->ctn_n - dyn_els < fp->ctf_nsyms; i->ctn_n++)
 	{
 	  ctf_header_t *hp = fp->ctf_header;
+	  size_t n = i->ctn_n - dyn_els;
 
-	  if (fp->ctf_sxlate[i->ctn_n] == -1u)
+	  if (fp->ctf_sxlate[n] == -1u)
 	    continue;
 
-	  sym = *(uint32_t *) ((uintptr_t) fp->ctf_buf + fp->ctf_sxlate[i->ctn_n]);
+	  sym = *(uint32_t *) ((uintptr_t) fp->ctf_buf + fp->ctf_sxlate[n]);
 
 	  if (sym == 0)
 	    continue;
 
 	  if (functions)
 	    {
-	      if (fp->ctf_sxlate[i->ctn_n] >= hp->cth_funcoff
-		  && fp->ctf_sxlate[i->ctn_n] < hp->cth_objtidxoff)
+	      if (fp->ctf_sxlate[n] >= hp->cth_funcoff
+		  && fp->ctf_sxlate[n] < hp->cth_objtidxoff)
 		break;
 	    }
 	  else
 	    {
-	      if (fp->ctf_sxlate[i->ctn_n] >= hp->cth_objtoff
-		  && fp->ctf_sxlate[i->ctn_n] < hp->cth_funcoff)
+	      if (fp->ctf_sxlate[n] >= hp->cth_objtoff
+		  && fp->ctf_sxlate[n] < hp->cth_funcoff)
 		break;
 	    }
 	}
 
-      if (i->ctn_n >= fp->ctf_nsyms)
+      if (i->ctn_n - dyn_els >= fp->ctf_nsyms)
 	goto end;
 
-      *name = ctf_lookup_symbol_name (fp, i->ctn_n++);
+      *name = ctf_lookup_symbol_name (fp, i->ctn_n - dyn_els);
+      i->ctn_n++;
     }
 
   return sym;
@@ -815,6 +882,13 @@ ctf_try_lookup_indexed (ctf_dict_t *fp, unsigned long symidx,
   if (symname == NULL)
     symname = ctf_lookup_symbol_name (fp, symidx);
 
+  /* Dynamic dict with no static portion: just return.  */
+  if (!hp)
+    {
+      ctf_dprintf ("%s not found in idx: dict is dynamic\n", symname);
+      return 0;
+    }
+
   ctf_dprintf ("Looking up type of object with symtab idx %lx or name %s in "
 	       "indexed symtypetab\n", symidx, symname);
 
@@ -887,17 +961,27 @@ ctf_try_lookup_indexed (ctf_dict_t *fp, unsigned long symidx,
    function or data object described by the corresponding entry in the symbol
    table.  We can only return symbols in read-only dicts and in dicts for which
    ctf_link_shuffle_syms has been called to assign symbol indexes to symbol
-   names.  */
+   names.
 
-static ctf_id_t
+   If try_parent is false, do not check the parent dict too.
+
+   If is_function is > -1, only look for data objects or functions in
+   particular.  */
+
+ctf_id_t
 ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
-			   const char *symname)
+			   const char *symname, int try_parent,
+			   int is_function)
 {
   const ctf_sect_t *sp = &fp->ctf_symtab;
   ctf_id_t type = 0;
   int err = 0;
 
-  /* Shuffled dynsymidx present?  Use that.  */
+  /* Shuffled dynsymidx present?  Use that.  For now, the dynsymidx and
+     shuffled-symbol lookup only support dynamically-added symbols, because
+     this interface is meant for use by linkers, and linkers are only going
+     to report symbols against newly-created, freshly-ctf_link'ed dicts: so
+     there will be no static component in any case.  */
   if (fp->ctf_dynsymidx)
     {
       const ctf_link_sym_t *sym;
@@ -909,10 +993,6 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
 	ctf_dprintf ("Looking up type of object with symtab idx %lx in "
 		     "writable dict symtypetab\n", symidx);
 
-      /* The dict must be dynamic.  */
-      if (!ctf_assert (fp, fp->ctf_flags & LCTF_RDWR))
-	return CTF_ERR;
-
       /* No name? Need to look it up.  */
       if (!symname)
 	{
@@ -922,7 +1002,9 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
 
 	  sym = fp->ctf_dynsymidx[symidx];
 	  err = ECTF_NOTYPEDAT;
-	  if (!sym || (sym->st_shndx != STT_OBJECT && sym->st_shndx != STT_FUNC))
+	  if (!sym || (sym->st_type != STT_OBJECT && sym->st_type != STT_FUNC)
+	      || (sym->st_type != STT_OBJECT && is_function == 0)
+	      || (sym->st_type != STT_FUNC && is_function == 1))
 	    goto try_parent;
 
 	  if (!ctf_assert (fp, !sym->st_nameidx_set))
@@ -931,49 +1013,55 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
      }
 
       if (fp->ctf_objthash == NULL
-	  || ((type = (ctf_id_t) (uintptr_t)
-	       ctf_dynhash_lookup (fp->ctf_objthash, symname)) == 0))
+	  || is_function == 1
+	  || (type = (ctf_id_t) (uintptr_t)
+	      ctf_dynhash_lookup (fp->ctf_objthash, symname)) == 0)
 	{
 	  if (fp->ctf_funchash == NULL
-	      || ((type = (ctf_id_t) (uintptr_t)
-		   ctf_dynhash_lookup (fp->ctf_funchash, symname)) == 0))
+	      || is_function == 0
+	      || (type = (ctf_id_t) (uintptr_t)
+		  ctf_dynhash_lookup (fp->ctf_funchash, symname)) == 0)
 	    goto try_parent;
 	}
 
       return type;
     }
 
-  /* Lookup by name in a dynamic dict: just do it directly.  */
-  if (symname && fp->ctf_flags & LCTF_RDWR)
+  /* Dict not shuffled: look for a dynamic sym first, and look it up
+     directly.  */
+  if (symname)
     {
-      if (fp->ctf_objthash == NULL
-	  || ((type = (ctf_id_t) (uintptr_t)
-	       ctf_dynhash_lookup (fp->ctf_objthash, symname)) == 0))
-	{
-	  if (fp->ctf_funchash == NULL
-	      || ((type = (ctf_id_t) (uintptr_t)
-		   ctf_dynhash_lookup (fp->ctf_funchash, symname)) == 0))
-	    goto try_parent;
-	}
-      return type;
+      if (fp->ctf_objthash != NULL
+	  && is_function != 1
+	  && ((type = (ctf_id_t) (uintptr_t)
+	       ctf_dynhash_lookup (fp->ctf_objthash, symname)) != 0))
+	return type;
+
+      if (fp->ctf_funchash != NULL
+	  && is_function != 0
+	  && ((type = (ctf_id_t) (uintptr_t)
+	       ctf_dynhash_lookup (fp->ctf_funchash, symname)) != 0))
+	return type;
     }
 
   err = ECTF_NOSYMTAB;
   if (sp->cts_data == NULL)
     goto try_parent;
 
-  /* This covers both out-of-range lookups and a dynamic dict which hasn't been
-     shuffled yet.  */
+  /* This covers both out-of-range lookups by index and a dynamic dict which
+     hasn't been shuffled yet.  */
   err = EINVAL;
   if (symname == NULL && symidx >= fp->ctf_nsyms)
     goto try_parent;
 
-  if (fp->ctf_objtidx_names)
+  /* Try an indexed lookup.  */
+
+  if (fp->ctf_objtidx_names && is_function != 1)
     {
       if ((type = ctf_try_lookup_indexed (fp, symidx, symname, 0)) == CTF_ERR)
 	return CTF_ERR;				/* errno is set for us.  */
     }
-  if (type == 0 && fp->ctf_funcidx_names)
+  if (type == 0 && fp->ctf_funcidx_names && is_function != 0)
     {
       if ((type = ctf_try_lookup_indexed (fp, symidx, symname, 1)) == CTF_ERR)
 	return CTF_ERR;				/* errno is set for us.  */
@@ -981,6 +1069,7 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
   if (type != 0)
     return type;
 
+  /* Indexed but no symbol found -> not present, try the parent.  */
   err = ECTF_NOTYPEDAT;
   if (fp->ctf_objtidx_names && fp->ctf_funcidx_names)
     goto try_parent;
@@ -990,7 +1079,8 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
   ctf_dprintf ("Looking up object type %lx in 1:1 dict symtypetab\n", symidx);
 
   if (symname != NULL)
-    if ((symidx = ctf_lookup_symbol_idx (fp, symname)) == (unsigned long) -1)
+    if ((symidx = ctf_lookup_symbol_idx (fp, symname, try_parent, is_function))
+	== (unsigned long) -1)
       goto try_parent;
 
   if (fp->ctf_sxlate[symidx] == -1u)
@@ -1002,11 +1092,16 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
     goto try_parent;
 
   return type;
+
  try_parent:
+  if (!try_parent)
+    return ctf_set_errno (fp, err);
+
   if (fp->ctf_parent)
     {
       ctf_id_t ret = ctf_lookup_by_sym_or_name (fp->ctf_parent, symidx,
-						symname);
+						symname, try_parent,
+						is_function);
       if (ret == CTF_ERR)
 	ctf_set_errno (fp, ctf_errno (fp->ctf_parent));
       return ret;
@@ -1020,7 +1115,7 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
 ctf_id_t
 ctf_lookup_by_symbol (ctf_dict_t *fp, unsigned long symidx)
 {
-  return ctf_lookup_by_sym_or_name (fp, symidx, NULL);
+  return ctf_lookup_by_sym_or_name (fp, symidx, NULL, 1, -1);
 }
 
 /* Given a symbol name, return the type of the function or data object described
@@ -1028,7 +1123,7 @@ ctf_lookup_by_symbol (ctf_dict_t *fp, unsigned long symidx)
 ctf_id_t
 ctf_lookup_by_symbol_name (ctf_dict_t *fp, const char *symname)
 {
-  return ctf_lookup_by_sym_or_name (fp, 0, symname);
+  return ctf_lookup_by_sym_or_name (fp, 0, symname, 1, -1);
 }
 
 /* Given a symbol table index, return the info for the function described
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index 87b0f74367a..f80bf5476a7 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -670,13 +670,15 @@ upgrade_types (ctf_dict_t *fp, ctf_header_t *cth)
   return 0;
 }
 
-/* Initialize the type ID translation table with the byte offset of each type,
+/* Populate statically-defined types (those loaded from a saved buffer).
+
+   Initialize the type ID translation table with the byte offset of each type,
    and initialize the hash tables of each named type.  Upgrade the type table to
    the latest supported representation in the process, if needed, and if this
    recension of libctf supports upgrading.  */
 
 static int
-init_types (ctf_dict_t *fp, ctf_header_t *cth)
+init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
 {
   const ctf_type_t *tbuf;
   const ctf_type_t *tend;
@@ -694,8 +696,6 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   int nlstructs = 0, nlunions = 0;
   int err;
 
-  assert (!(fp->ctf_flags & LCTF_RDWR));
-
   if (_libctf_unlikely_ (fp->ctf_version == CTF_VERSION_1))
     {
       int err;
@@ -770,9 +770,16 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
+  /* The ptrtab and txlate can be appropriately sized for precisely this set
+     of types: the txlate because it is only used to look up static types,
+     so dynamic types added later will never go through it, and the ptrtab
+     because later-added types will call grow_ptrtab() automatically, as
+     needed.  */
+
   fp->ctf_txlate = malloc (sizeof (uint32_t) * (typemax + 1));
   fp->ctf_ptrtab_len = typemax + 1;
   fp->ctf_ptrtab = malloc (sizeof (uint32_t) * fp->ctf_ptrtab_len);
+  fp->ctf_stypes = typemax;
 
   if (fp->ctf_txlate == NULL || fp->ctf_ptrtab == NULL)
     return ENOMEM;		/* Memory allocation failed.  */
@@ -1283,7 +1290,7 @@ ctf_dict_t *ctf_simple_open (const char *ctfsect, size_t ctfsect_size,
 {
   return ctf_simple_open_internal (ctfsect, ctfsect_size, symsect, symsect_size,
 				   symsect_entsize, strsect, strsect_size, NULL,
-				   0, errp);
+				   errp);
 }
 
 /* Open a CTF file, mocking up a suitable ctf_sect and overriding the external
@@ -1293,8 +1300,7 @@ ctf_dict_t *ctf_simple_open_internal (const char *ctfsect, size_t ctfsect_size,
 				      const char *symsect, size_t symsect_size,
 				      size_t symsect_entsize,
 				      const char *strsect, size_t strsect_size,
-				      ctf_dynhash_t *syn_strtab, int writable,
-				      int *errp)
+				      ctf_dynhash_t *syn_strtab, int *errp)
 {
   ctf_sect_t skeleton;
 
@@ -1332,7 +1338,7 @@ ctf_dict_t *ctf_simple_open_internal (const char *ctfsect, size_t ctfsect_size,
     }
 
   return ctf_bufopen_internal (ctfsectp, symsectp, strsectp, syn_strtab,
-			       writable, errp);
+			       errp);
 }
 
 /* Decode the specified CTF buffer and optional symbol table, and create a new
@@ -1344,7 +1350,7 @@ ctf_dict_t *
 ctf_bufopen (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 	     const ctf_sect_t *strsect, int *errp)
 {
-  return ctf_bufopen_internal (ctfsect, symsect, strsect, NULL, 0, errp);
+  return ctf_bufopen_internal (ctfsect, symsect, strsect, NULL, errp);
 }
 
 /* Like ctf_bufopen, but overriding the external strtab with a synthetic one.  */
@@ -1352,7 +1358,7 @@ ctf_bufopen (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 ctf_dict_t *
 ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 		      const ctf_sect_t *strsect, ctf_dynhash_t *syn_strtab,
-		      int writable, int *errp)
+		      int *errp)
 {
   const ctf_preamble_t *pp;
   size_t hdrsz = sizeof (ctf_header_t);
@@ -1441,9 +1447,6 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 
   memset (fp, 0, sizeof (ctf_dict_t));
 
-  if (writable)
-    fp->ctf_flags |= LCTF_RDWR;
-
   if ((fp->ctf_header = malloc (sizeof (struct ctf_header))) == NULL)
     {
       free (fp);
@@ -1526,7 +1529,7 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
      section's buffer pointer into ctf_buf, below.  */
 
   /* Note: if this is a v1 buffer, it will be reallocated and expanded by
-     init_types().  */
+     init_static_types().  */
 
   if (hp->cth_flags & CTF_F_COMPRESS)
     {
@@ -1607,7 +1610,7 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
      proceed with initializing the ctf_dict_t we allocated above.
 
      Nothing that depends on buf or base should be set directly in this function
-     before the init_types() call, because it may be reallocated during
+     before the init_static_types() call, because it may be reallocated during
      transparent upgrade if this recension of libctf is so configured: see
      ctf_set_base().  */
 
@@ -1660,6 +1663,26 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
     }
   fp->ctf_syn_ext_strtab = syn_strtab;
 
+  /* Dynamic state, for dynamic addition to this dict after loading.  */
+
+  fp->ctf_dthash = ctf_dynhash_create (ctf_hash_integer, ctf_hash_eq_integer,
+				       NULL, NULL);
+  fp->ctf_dvhash = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
+				       NULL, NULL);
+  fp->ctf_snapshots = 1;
+
+  fp->ctf_objthash = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
+					   free, NULL);
+  fp->ctf_funchash = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
+					 free, NULL);
+
+  if (!fp->ctf_dthash || !fp->ctf_dvhash || !fp->ctf_snapshots ||
+      !fp->ctf_objthash || !fp->ctf_funchash)
+    {
+      err = ENOMEM;
+      goto bad;
+    }
+
   if (foreign_endian &&
       (err = ctf_flip (fp, hp, fp->ctf_buf, 0)) != 0)
     {
@@ -1673,15 +1696,7 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 
   ctf_set_base (fp, hp, fp->ctf_base);
 
-  /* No need to do anything else for dynamic dicts: they do not support symbol
-     lookups, and the type table is maintained in the dthashes.  */
-  if (fp->ctf_flags & LCTF_RDWR)
-    {
-      fp->ctf_refcnt = 1;
-      return fp;
-    }
-
-  if ((err = init_types (fp, hp)) != 0)
+  if ((err = init_static_types (fp, hp)) != 0)
     goto bad;
 
   /* Allocate and initialize the symtab translation table, pointed to by
@@ -1800,7 +1815,8 @@ ctf_dict_close (ctf_dict_t *fp)
     }
   ctf_dynhash_destroy (fp->ctf_dvhash);
 
-  ctf_dynhash_destroy (fp->ctf_symhash);
+  ctf_dynhash_destroy (fp->ctf_symhash_func);
+  ctf_dynhash_destroy (fp->ctf_symhash_objt);
   free (fp->ctf_funcidx_sxlate);
   free (fp->ctf_objtidx_sxlate);
   ctf_dynhash_destroy (fp->ctf_objthash);
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 511c5116140..7092264f446 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -945,7 +945,12 @@ ctf_sort_var (const void *one_, const void *two_, void *arg_)
    code simple: ctf_simple_open_internal() will return a new ctf_dict_t, but we
    want to keep the fp constant for the caller, so after
    ctf_simple_open_internal() returns, we use memcpy to swap the interior of the
-   old and new ctf_dict_t's, and then free the old.  */
+   old and new ctf_dict_t's, and then free the old.
+
+   We do not currently support serializing a dict that has already been
+   serialized in the past: but all the tables support it except for the types
+   table.  */
+
 int
 ctf_serialize (ctf_dict_t *fp)
 {
@@ -956,6 +961,7 @@ ctf_serialize (ctf_dict_t *fp)
   ctf_strs_writable_t strtab;
   int err;
   int num_missed_str_refs;
+  int sym_functions = 0;
 
   unsigned char *t;
   unsigned long i;
@@ -967,7 +973,11 @@ ctf_serialize (ctf_dict_t *fp)
   emit_symtypetab_state_t symstate;
   memset (&symstate, 0, sizeof (emit_symtypetab_state_t));
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
+  /* This isn't a very nice error code, but it's close enough: it's what you
+     get if you try to modify a type loaded out of a serialized dict, so
+     it makes at least a little sense that it's what you get if you try to
+     reserialize the dict again.  */
+  if (fp->ctf_stypes > 0)
     return (ctf_set_errno (fp, ECTF_RDONLY));
 
   /* Update required?  */
@@ -999,10 +1009,44 @@ ctf_serialize (ctf_dict_t *fp)
      of the dynsym and dynstr these days.  */
   hdr.cth_flags = (CTF_F_NEWFUNCINFO | CTF_F_DYNSTR);
 
+  /* Propagate all symbols in the symtypetabs into the dynamic state, so that
+     we can put them back in the right order.  Symbols already in the dynamic
+     state are left as they are.  */
+  do
+    {
+      ctf_next_t *it = NULL;
+      const char *sym_name;
+      ctf_id_t sym;
+
+      while ((sym = ctf_symbol_next_static (fp, &it, &sym_name,
+					    sym_functions)) != CTF_ERR)
+	if ((ctf_add_funcobjt_sym_forced (fp, sym_functions, sym_name, sym)) < 0)
+	  if (ctf_errno (fp) != ECTF_DUPLICATE)
+	    return -1;				/* errno is set for us.  */
+
+      if (ctf_errno (fp) != ECTF_NEXT_END)
+	return -1;				/* errno is set for us.  */
+    } while (sym_functions++ < 1);
+
+  /* Figure out how big the symtypetabs are now.  */
+
   if (ctf_symtypetab_sect_sizes (fp, &symstate, &hdr, &objt_size, &func_size,
 				 &objtidx_size, &funcidx_size) < 0)
     return -1;					/* errno is set for us.  */
 
+  /* Propagate all vars into the dynamic state, so we can put them back later.
+     Variables already in the dynamic state, likely due to repeated
+     serialization, are left unchanged.  */
+
+  for (i = 0; i < fp->ctf_nvars; i++)
+    {
+      const char *name = ctf_strptr (fp, fp->ctf_vars[i].ctv_name);
+
+      if (name != NULL && !ctf_dvd_lookup (fp, name))
+	if (ctf_add_variable_forced (fp, name, fp->ctf_vars[i].ctv_type) < 0)
+	  return -1;				/* errno is set for us.  */
+    }
+
   for (nvars = 0, dvd = ctf_list_next (&fp->ctf_dvdefs);
        dvd != NULL; dvd = ctf_list_next (dvd), nvars++);
 
@@ -1101,7 +1145,7 @@ ctf_serialize (ctf_dict_t *fp)
 
   if ((nfp = ctf_simple_open_internal ((char *) buf, buf_size, NULL, 0,
 				       0, NULL, 0, fp->ctf_syn_ext_strtab,
-				       1, &err)) == NULL)
+				       &err)) == NULL)
     {
       free (buf);
       return (ctf_set_errno (fp, err));
@@ -1131,6 +1175,7 @@ ctf_serialize (ctf_dict_t *fp)
   nfp->ctf_ptrtab = fp->ctf_ptrtab;
   nfp->ctf_pptrtab = fp->ctf_pptrtab;
   nfp->ctf_typemax = fp->ctf_typemax;
+  nfp->ctf_stypes = fp->ctf_stypes;
   nfp->ctf_dynsymidx = fp->ctf_dynsymidx;
   nfp->ctf_dynsymmax = fp->ctf_dynsymmax;
   nfp->ctf_ptrtab_len = fp->ctf_ptrtab_len;
diff --git a/libctf/ctf-types.c b/libctf/ctf-types.c
index 10bb6d1596a..ff12d51941d 100644
--- a/libctf/ctf-types.c
+++ b/libctf/ctf-types.c
@@ -492,6 +492,7 @@ ctf_id_t
 ctf_variable_next (ctf_dict_t *fp, ctf_next_t **it, const char **name)
 {
   ctf_next_t *i = *it;
+  ctf_id_t id;
 
   if ((fp->ctf_flags & LCTF_CHILD) && (fp->ctf_parent == NULL))
     return (ctf_set_typed_errno (fp, ECTF_NOPARENT));
@@ -503,8 +504,7 @@ ctf_variable_next (ctf_dict_t *fp, ctf_next_t **it, const char **name)
 
       i->cu.ctn_fp = fp;
       i->ctn_iter_fun = (void (*) (void)) ctf_variable_next;
-      if (fp->ctf_flags & LCTF_RDWR)
-	i->u.ctn_dvd = ctf_list_next (&fp->ctf_dvdefs);
+      i->u.ctn_dvd = ctf_list_next (&fp->ctf_dvdefs);
       *it = i;
     }
 
@@ -514,27 +514,21 @@ ctf_variable_next (ctf_dict_t *fp, ctf_next_t **it, const char **name)
   if (fp != i->cu.ctn_fp)
     return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFP));
 
-  if (!(fp->ctf_flags & LCTF_RDWR))
+  if (i->ctn_n < fp->ctf_nvars)
     {
-      if (i->ctn_n >= fp->ctf_nvars)
-	goto end_iter;
-
       *name = ctf_strptr (fp, fp->ctf_vars[i->ctn_n].ctv_name);
       return fp->ctf_vars[i->ctn_n++].ctv_type;
-    }
-  else
-    {
-      ctf_id_t id;
 
-      if (i->u.ctn_dvd == NULL)
-	goto end_iter;
-
-      *name = i->u.ctn_dvd->dvd_name;
-      id = i->u.ctn_dvd->dvd_type;
-      i->u.ctn_dvd = ctf_list_next (i->u.ctn_dvd);
-      return id;
     }
 
+  if (i->u.ctn_dvd == NULL)
+    goto end_iter;
+
+  *name = i->u.ctn_dvd->dvd_name;
+  id = i->u.ctn_dvd->dvd_type;
+  i->u.ctn_dvd = ctf_list_next (i->u.ctn_dvd);
+  return id;
+
  end_iter:
   ctf_next_destroy (i);
   *it = NULL;
diff --git a/libctf/testsuite/libctf-lookup/add-to-opened-ctf.c b/libctf/testsuite/libctf-lookup/add-to-opened-ctf.c
new file mode 100644
index 00000000000..b5d483ea1cb
--- /dev/null
+++ b/libctf/testsuite/libctf-lookup/add-to-opened-ctf.c
@@ -0,0 +1,19 @@
+int an_int;
+char *a_char_ptr;
+typedef int (*a_typedef) (int main);
+struct struct_forward;
+enum enum_forward;
+union union_forward;
+typedef int an_array[50];
+struct a_struct { int foo; };
+union a_union { int bar; };
+enum an_enum { FOO };
+
+a_typedef a;
+struct struct_forward *x;
+union union_forward *y;
+enum enum_forward *z;
+struct a_struct *xx;
+union a_union *yy;
+enum an_enum *zz;
+an_array ar;
diff --git a/libctf/testsuite/libctf-lookup/add-to-opened.c b/libctf/testsuite/libctf-lookup/add-to-opened.c
new file mode 100644
index 00000000000..dc2e1f55b99
--- /dev/null
+++ b/libctf/testsuite/libctf-lookup/add-to-opened.c
@@ -0,0 +1,147 @@
+/* Make sure you can add to ctf_open()ed CTF dicts, and that you
+   cannot make changes to existing types.  */
+
+#include <ctf-api.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+int
+main (int argc, char *argv[])
+{
+  ctf_dict_t *fp;
+  ctf_archive_t *ctf;
+  ctf_id_t type, ptrtype;
+  ctf_arinfo_t ar = {0, 0, 0};
+  ctf_encoding_t en = { CTF_INT_SIGNED, 0, sizeof (int) };
+  unsigned char *ctf_written;
+  size_t size;
+  int err;
+
+  if (argc != 2)
+    {
+      fprintf (stderr, "Syntax: %s PROGRAM\n", argv[0]);
+      exit(1);
+    }
+
+  if ((ctf = ctf_open (argv[1], NULL, &err)) == NULL)
+    goto open_err;
+  if ((fp = ctf_dict_open (ctf, NULL, &err)) == NULL)
+    goto open_err;
+
+  /* Check that various modifications to already-written types
+     are prohibited.  */
+
+  if (ctf_add_integer (fp, CTF_ADD_ROOT, "int", &en) == 0)
+    fprintf (stderr, "allowed to add integer existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add integer in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_typedef (fp, CTF_ADD_ROOT, "a_typedef", 0) == 0)
+    fprintf (stderr, "allowed to add typedef existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add typedef in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_struct (fp, CTF_ADD_ROOT, "a_struct") == 0)
+    fprintf (stderr, "allowed to add struct existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add struct in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_union (fp, CTF_ADD_ROOT, "a_union") == 0)
+    fprintf (stderr, "allowed to add union existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add union in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_enum (fp, CTF_ADD_ROOT, "an_enum") == 0)
+    fprintf (stderr, "allowed to add enum existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add enum in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_struct (fp, CTF_ADD_ROOT, "struct_forward") == 0)
+    fprintf (stderr, "allowed to promote struct forward existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to promote struct forward in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_union (fp, CTF_ADD_ROOT, "union_forward") == 0)
+    fprintf (stderr, "allowed to promote union forward existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to promote union forward in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_enum (fp, CTF_ADD_ROOT, "enum_forward") == 0)
+    fprintf (stderr, "allowed to promote enum forward existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to promote enum forward in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((type = ctf_lookup_by_name (fp, "struct a_struct")) == CTF_ERR)
+    fprintf (stderr, "Lookup of struct a_struct failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_member (fp, type, "wombat", 0) == 0)
+    fprintf (stderr, "allowed to add member to struct existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add member to struct in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((type = ctf_lookup_by_name (fp, "union a_union")) == CTF_ERR)
+    fprintf (stderr, "Lookup of union a_union failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_member (fp, type, "wombat", 0) == 0)
+    fprintf (stderr, "allowed to add member to union existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add member to union in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((type = ctf_lookup_by_name (fp, "enum an_enum")) == CTF_ERR)
+    fprintf (stderr, "Lookup of enum an_enum failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_add_enumerator (fp, type, "wombat", 0) == 0)
+    fprintf (stderr, "allowed to add enumerator to enum existing in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to add enumerator to enum in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((type = ctf_lookup_by_name (fp, "an_array")) == CTF_ERR)
+    fprintf (stderr, "Lookup of an_array failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((type = ctf_type_reference (fp, type)) == CTF_ERR)
+    fprintf (stderr, "Lookup of type reffed by an_array failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_set_array (fp, type, &ar) == 0)
+    fprintf (stderr, "allowed to set array in readonly portion\n");
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s attempting to set array in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((ctf_written = ctf_write_mem (fp, &size, 4096)) != NULL)
+    fprintf (stderr, "Writeout unexpectedly succeeded: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_errno (fp) != ECTF_RDONLY)
+    fprintf (stderr, "unexpected error %s trying to write out previously serialized dict\n", ctf_errmsg (ctf_errno (fp)));
+
+  /* Finally, make sure we can add new types, and look them up again.  */
+
+  if ((type = ctf_lookup_by_name (fp, "struct a_struct")) == CTF_ERR)
+    fprintf (stderr, "Lookup of struct a_struct failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((ptrtype = ctf_add_pointer (fp, CTF_ADD_ROOT, type)) == CTF_ERR)
+    fprintf (stderr, "Cannot add pointer to ctf_opened dict: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_type_reference (fp, ptrtype) == CTF_ERR)
+    fprintf (stderr, "Lookup of pointer preserved across writeout failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if (ctf_type_reference (fp, ptrtype) != type)
+    fprintf (stderr, "Look up of newly-added type in serialized dict yields ID %lx, expected %lx\n", ctf_type_reference (fp, ptrtype), type);
+
+  printf ("All done.\n");
+  return 0;
+ 
+ open_err:
+  fprintf (stderr, "%s: cannot open: %s\n", argv[0], ctf_errmsg (err));
+  return 1;
+}    
diff --git a/libctf/testsuite/libctf-lookup/add-to-opened.lk b/libctf/testsuite/libctf-lookup/add-to-opened.lk
new file mode 100644
index 00000000000..af842597363
--- /dev/null
+++ b/libctf/testsuite/libctf-lookup/add-to-opened.lk
@@ -0,0 +1,3 @@
+# source: add-to-opened-ctf.c
+# lookup: add-to-opened.c
+All done.
-- 
2.44.0.273.ge0bd14271f



* [PATCH 06/22] libctf: fix a comment
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (4 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 05/22] libctf: support addition of types to dicts read via ctf_open() Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 07/22] libctf: delete LCTF_DIRTY Nick Alcock
                   ` (16 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

A mistaken "not" in a comment in ctf_err_warn made it seem like the
restriction to explicitly-passed error codes applied to errors rather than
to warnings.

libctf/

	* ctf-subr.c (ctf_err_warn): Fix comment.
---
 libctf/ctf-subr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libctf/ctf-subr.c b/libctf/ctf-subr.c
index a21048d0032..ecc68848d31 100644
--- a/libctf/ctf-subr.c
+++ b/libctf/ctf-subr.c
@@ -225,7 +225,7 @@ ctf_err_warn (ctf_dict_t *fp, int is_warning, int err,
     }
   va_end (alist);
 
-  /* Include the error code only if there is one; if this is not a warning,
+  /* Include the error code only if there is one; if this is a warning,
      only use the error code if it was explicitly passed and is nonzero.
      (Warnings may not have a meaningful error code, since the warning may not
      lead to unwinding up to the user.)  */
-- 
2.44.0.273.ge0bd14271f



* [PATCH 07/22] libctf: delete LCTF_DIRTY
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (5 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 06/22] libctf: fix a comment Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 08/22] libctf: fix a comment typo Nick Alcock
                   ` (15 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

This flag was meant as an optimization to avoid reserializing dicts
unnecessarily.  It was critically necessary back when serialization was
done by ctf_update() and you had to call that every time you wanted any
new modifications to the type table to be usable by other types, but
that has been unnecessary for years now, and serialization is only done
once when writing out, which one would naturally assume would always
serialize the dict.  Worse, it never really worked: it only tracked
newly-added types, not things like added symbols which might equally
well require reserialization, and it gets in the way of an upcoming
change.  Delete entirely.
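
To make the failure mode concrete, here is a minimal sketch of a
dirty-flag optimization that tracks only one kind of modification.  It is
a toy model, not libctf code: every name in it (toy_dict, TOY_DIRTY,
toy_add_type, toy_add_symbol, toy_serialize) is hypothetical and exists
only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of a dirty flag; none of this is libctf API.  */
#define TOY_DIRTY 0x1

struct toy_dict
{
  unsigned flags;
  size_t ntypes;		/* Types added so far.  */
  size_t nsyms;			/* Symbols added so far.  */
  size_t written_types;		/* Counts seen by the last writeout.  */
  size_t written_syms;
};

/* Adding a type marks the dict dirty...  */
static void
toy_add_type (struct toy_dict *d)
{
  assert (d != NULL);
  d->ntypes++;
  d->flags |= TOY_DIRTY;
}

/* ... but adding a symbol does not, so the flag under-reports changes.  */
static void
toy_add_symbol (struct toy_dict *d)
{
  assert (d != NULL);
  d->nsyms++;
}

/* Serialize only if the flag says so.  Returns 1 if work was done,
   0 if the writeout was skipped as "unnecessary".  */
static int
toy_serialize (struct toy_dict *d)
{
  assert (d != NULL);
  if (!(d->flags & TOY_DIRTY))
    return 0;

  d->written_types = d->ntypes;
  d->written_syms = d->nsyms;
  d->flags &= ~TOY_DIRTY;
  return 1;
}
```

In this model, calling toy_add_symbol() after a successful toy_serialize()
and then serializing again skips the writeout, leaving written_syms stale.
Dropping the flag and always serializing avoids that class of bug at the
cost of the (no longer needed) optimization.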

libctf/

	* ctf-create.c (ctf_create): Drop LCTF_DIRTY.
	(ctf_discard): Likewise.
	(ctf_rollback): Likewise.
	(ctf_add_generic): Likewise.
	(ctf_set_array): Likewise.
	(ctf_add_enumerator): Likewise.
	(ctf_add_member_offset): Likewise.
	(ctf_add_variable_forced): Likewise.
	* ctf-link.c (ctf_link_intern_extern_string): Likewise.
	(ctf_link_add_strtab): Likewise.
	* ctf-serialize.c (ctf_serialize): Likewise.
	* ctf-impl.h (LCTF_DIRTY): Likewise.
	(LCTF_LINKING): Renumber.
---
 libctf/ctf-create.c    | 15 ---------------
 libctf/ctf-impl.h      |  3 +--
 libctf/ctf-link.c      |  2 --
 libctf/ctf-serialize.c |  5 -----
 4 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index 7aa244e5ec7..23bbf92ff1a 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -148,7 +148,6 @@ ctf_create (int *errp)
   fp->ctf_names = names;
   fp->ctf_dtoldid = 0;
   fp->ctf_snapshot_lu = 0;
-  fp->ctf_flags |= LCTF_DIRTY;
 
   /* Make sure the ptrtab starts out at a reasonable size.  */
 
@@ -347,10 +346,6 @@ ctf_discard (ctf_dict_t *fp)
     { fp->ctf_dtoldid,
       fp->ctf_snapshot_lu + 1 };
 
-  /* Update required?  */
-  if (!(fp->ctf_flags & LCTF_DIRTY))
-    return 0;
-
   return (ctf_rollback (fp, last_update));
 }
 
@@ -415,9 +410,6 @@ ctf_rollback (ctf_dict_t *fp, ctf_snapshot_id_t id)
   fp->ctf_typemax = id.dtd_id;
   fp->ctf_snapshots = id.snapshot_id;
 
-  if (fp->ctf_snapshots == fp->ctf_snapshot_lu)
-    fp->ctf_flags &= ~LCTF_DIRTY;
-
   return 0;
 }
 
@@ -482,8 +474,6 @@ ctf_add_generic (ctf_dict_t *fp, uint32_t flag, const char *name, int kind,
   if (ctf_dtd_insert (fp, dtd, flag, kind) < 0)
     goto err;					/* errno is set for us.  */
 
-  fp->ctf_flags |= LCTF_DIRTY;
-
   *rp = dtd;
   return type;
 
@@ -729,7 +719,6 @@ ctf_set_array (ctf_dict_t *fp, ctf_id_t type, const ctf_arinfo_t *arp)
     return (ctf_set_errno (ofp, ECTF_BADID));
 
   vlen = (ctf_array_t *) dtd->dtd_vlen;
-  fp->ctf_flags |= LCTF_DIRTY;
   vlen->cta_contents = (uint32_t) arp->ctr_contents;
   vlen->cta_index = (uint32_t) arp->ctr_index;
   vlen->cta_nelems = arp->ctr_nelems;
@@ -1113,8 +1102,6 @@ ctf_add_enumerator (ctf_dict_t *fp, ctf_id_t enid, const char *name,
 
   dtd->dtd_data.ctt_info = CTF_TYPE_INFO (kind, root, vlen + 1);
 
-  fp->ctf_flags |= LCTF_DIRTY;
-
   return 0;
 }
 
@@ -1296,7 +1283,6 @@ ctf_add_member_offset (ctf_dict_t *fp, ctf_id_t souid, const char *name,
   dtd->dtd_data.ctt_lsizelo = CTF_SIZE_TO_LSIZE_LO (ssize);
   dtd->dtd_data.ctt_info = CTF_TYPE_INFO (kind, root, vlen + 1);
 
-  fp->ctf_flags |= LCTF_DIRTY;
   return 0;
 }
 
@@ -1365,7 +1351,6 @@ ctf_add_variable_forced (ctf_dict_t *fp, const char *name, ctf_id_t ref)
       return -1;			/* errno is set for us.  */
     }
 
-  fp->ctf_flags |= LCTF_DIRTY;
   return 0;
 }
 
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index f4fa3234681..dc57d6f64c7 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -589,8 +589,7 @@ struct ctf_next
   ((fp)->ctf_dictops->ctfo_get_vbytes(fp, kind, size, vlen))
 
 #define LCTF_CHILD	0x0001	/* CTF dict is a child.  */
-#define LCTF_DIRTY	0x0002	/* CTF dict has been modified.  */
-#define LCTF_LINKING	0x0004  /* CTF link is underway: respect ctf_link_flags.  */
+#define LCTF_LINKING	0x0002  /* CTF link is underway: respect ctf_link_flags.  */
 
 extern ctf_dynhash_t *ctf_name_table (ctf_dict_t *, int);
 extern const ctf_type_t *ctf_lookup_by_id (ctf_dict_t **, ctf_id_t);
diff --git a/libctf/ctf-link.c b/libctf/ctf-link.c
index 9d2d29416d3..44d4e496f6a 100644
--- a/libctf/ctf-link.c
+++ b/libctf/ctf-link.c
@@ -1568,7 +1568,6 @@ ctf_link_intern_extern_string (void *key _libctf_unused_, void *value,
   ctf_dict_t *fp = (ctf_dict_t *) value;
   ctf_link_out_string_cb_arg_t *arg = (ctf_link_out_string_cb_arg_t *) arg_;
 
-  fp->ctf_flags |= LCTF_DIRTY;
   if (!ctf_str_add_external (fp, arg->str, arg->offset))
     arg->err = ENOMEM;
 }
@@ -1596,7 +1595,6 @@ ctf_link_add_strtab (ctf_dict_t *fp, ctf_link_strtab_string_f *add_string,
     {
       ctf_link_out_string_cb_arg_t iter_arg = { str, offset, 0 };
 
-      fp->ctf_flags |= LCTF_DIRTY;
       if (!ctf_str_add_external (fp, str, offset))
 	err = ENOMEM;
 
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 7092264f446..9dd7fbda285 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -980,10 +980,6 @@ ctf_serialize (ctf_dict_t *fp)
   if (fp->ctf_stypes > 0)
     return (ctf_set_errno (fp, ECTF_RDONLY));
 
-  /* Update required?  */
-  if (!(fp->ctf_flags & LCTF_DIRTY))
-    return 0;
-
   /* The strtab refs table must be empty at this stage.  Any refs already added
      will be corrupted by any modifications, including reserialization, after
      strtab finalization is complete.  Only this function, and functions it
@@ -1156,7 +1152,6 @@ ctf_serialize (ctf_dict_t *fp)
   nfp->ctf_parent = fp->ctf_parent;
   nfp->ctf_parent_unreffed = fp->ctf_parent_unreffed;
   nfp->ctf_refcnt = fp->ctf_refcnt;
-  nfp->ctf_flags |= fp->ctf_flags & ~LCTF_DIRTY;
   if (nfp->ctf_dynbase == NULL)
     nfp->ctf_dynbase = buf;		/* Make sure buf is freed on close.  */
   nfp->ctf_dthash = fp->ctf_dthash;
-- 
2.44.0.273.ge0bd14271f



* [PATCH 08/22] libctf: fix a comment typo
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (6 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 07/22] libctf: delete LCTF_DIRTY Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 09/22] libctf: rename ctf_dict.ctf_{symtab,strtab} Nick Alcock
                   ` (14 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

ctf_update has been called ctf_serialize for years now.

libctf/

	* ctf-impl.h: Fix comment typo.
---
 libctf/ctf-impl.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index dc57d6f64c7..b7123317c98 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -199,13 +199,13 @@ typedef struct ctf_err_warning
 } ctf_err_warning_t;
 
 /* Atoms associate strings with a list of the CTF items that reference that
-   string, so that ctf_update() can instantiate all the strings using the
+   string, so that ctf_serialize() can instantiate all the strings using the
    ctf_str_atoms and then reassociate them with the real string later.
 
    Strings can be interned into ctf_str_atom without having refs associated
    with them, for values that are returned to callers, etc.  Items are only
-   removed from this table on ctf_close(), but on every ctf_update(), all the
-   csa_refs in all entries are purged.  */
+   removed from this table on ctf_close(), but on every ctf_serialize(), all
+   the csa_refs in all entries are purged.  */
 
 typedef struct ctf_str_atom
 {
-- 
2.44.0.273.ge0bd14271f



* [PATCH 09/22] libctf: rename ctf_dict.ctf_{symtab,strtab}
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (7 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 08/22] libctf: fix a comment typo Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 10/22] Revert "libctf: do not corrupt strings across ctf_serialize" Nick Alcock
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

These two fields are constantly confusing because CTF dicts contain both
a symtypetab and strtab, but these fields are not that: they are the
symtab and strtab from the ELF file.  We have enough string tables now
(internal, external, synthetic external, dynamic) that we need to at
least name them better than this to avoid getting totally confused.
Rename them to ctf_ext_symtab and ctf_ext_strtab.

libctf/

	* ctf-dump.c (ctf_dump_objts): Rename ctf_symtab -> ctf_ext_symtab.
	* ctf-impl.h (struct ctf_dict.ctf_symtab): Rename to...
	(struct ctf_dict.ctf_ext_symtab): ... this.
	(struct ctf_dict.ctf_strtab): Rename to...
	(struct ctf_dict.ctf_ext_strtab): ... this.
	* ctf-lookup.c (ctf_lookup_symbol_name): Adapt.
	(ctf_lookup_symbol_idx): Adapt.
	(ctf_lookup_by_sym_or_name): Adapt.
	* ctf-open.c (ctf_bufopen_internal): Adapt.
	(ctf_dict_close): Adapt.
	(ctf_getsymsect): Adapt.
	(ctf_getstrsect): Adapt.
	(ctf_symsect_endianness): Adapt.
---
 libctf/ctf-dump.c   |  2 +-
 libctf/ctf-impl.h   |  6 +++---
 libctf/ctf-lookup.c |  6 +++---
 libctf/ctf-open.c   | 36 ++++++++++++++++++------------------
 4 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/libctf/ctf-dump.c b/libctf/ctf-dump.c
index 11179a61386..474c4e00cea 100644
--- a/libctf/ctf-dump.c
+++ b/libctf/ctf-dump.c
@@ -441,7 +441,7 @@ ctf_dump_objts (ctf_dict_t *fp, ctf_dump_state_t *state, int functions)
   if ((functions && fp->ctf_funcidx_names)
       || (!functions && fp->ctf_objtidx_names))
     str = str_append (str, _("Section is indexed.\n"));
-  else if (fp->ctf_symtab.cts_data == NULL)
+  else if (fp->ctf_ext_symtab.cts_data == NULL)
     str = str_append (str, _("No symbol table.\n"));
 
   while ((id = ctf_symbol_next (fp, &i, &name, functions)) != CTF_ERR)
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index b7123317c98..8ce489a55ba 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -366,9 +366,9 @@ struct ctf_dict
   struct ctf_header *ctf_header;    /* The header from this CTF dict.  */
   unsigned char ctf_openflags;	    /* Flags the dict had when opened.  */
   ctf_sect_t ctf_data;		    /* CTF data from object file.  */
-  ctf_sect_t ctf_symtab;	    /* Symbol table from object file.  */
-  ctf_sect_t ctf_strtab;	    /* String table from object file.  */
-  int ctf_symsect_little_endian;    /* Endianness of the ctf_symtab.  */
+  ctf_sect_t ctf_ext_symtab;	    /* Symbol table from object file.  */
+  ctf_sect_t ctf_ext_strtab;	    /* String table from object file.  */
+  int ctf_symsect_little_endian;    /* Endianness of the ctf_ext_symtab.  */
   ctf_dynhash_t *ctf_symhash_func;  /* (partial) hash, symsect name -> idx. */
   ctf_dynhash_t *ctf_symhash_objt;  /* ditto, for object symbols.  */
   size_t ctf_symhash_latest;	    /* Amount of symsect scanned so far.  */
diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
index 1fcbebee2d1..aa251bafb89 100644
--- a/libctf/ctf-lookup.c
+++ b/libctf/ctf-lookup.c
@@ -469,7 +469,7 @@ ctf_symidx_sort (ctf_dict_t *fp, uint32_t *idx, size_t *nidx,
 static const char *
 ctf_lookup_symbol_name (ctf_dict_t *fp, unsigned long symidx)
 {
-  const ctf_sect_t *sp = &fp->ctf_symtab;
+  const ctf_sect_t *sp = &fp->ctf_ext_symtab;
   ctf_link_sym_t sym;
   int err;
 
@@ -540,7 +540,7 @@ static unsigned long
 ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname, int try_parent,
 		       int is_function)
 {
-  const ctf_sect_t *sp = &fp->ctf_symtab;
+  const ctf_sect_t *sp = &fp->ctf_ext_symtab;
   ctf_link_sym_t sym;
   void *known_idx;
   int err;
@@ -973,7 +973,7 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
 			   const char *symname, int try_parent,
 			   int is_function)
 {
-  const ctf_sect_t *sp = &fp->ctf_symtab;
+  const ctf_sect_t *sp = &fp->ctf_ext_symtab;
   ctf_id_t type = 0;
   int err = 0;
 
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index f80bf5476a7..22475465fa8 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -1626,8 +1626,8 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 
   if (symsect != NULL)
     {
-      memcpy (&fp->ctf_symtab, symsect, sizeof (ctf_sect_t));
-      memcpy (&fp->ctf_strtab, strsect, sizeof (ctf_sect_t));
+      memcpy (&fp->ctf_ext_symtab, symsect, sizeof (ctf_sect_t));
+      memcpy (&fp->ctf_ext_strtab, strsect, sizeof (ctf_sect_t));
     }
 
   if (fp->ctf_data.cts_name != NULL)
@@ -1636,14 +1636,14 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 	err = ENOMEM;
 	goto bad;
       }
-  if (fp->ctf_symtab.cts_name != NULL)
-    if ((fp->ctf_symtab.cts_name = strdup (fp->ctf_symtab.cts_name)) == NULL)
+  if (fp->ctf_ext_symtab.cts_name != NULL)
+    if ((fp->ctf_ext_symtab.cts_name = strdup (fp->ctf_ext_symtab.cts_name)) == NULL)
       {
 	err = ENOMEM;
 	goto bad;
       }
-  if (fp->ctf_strtab.cts_name != NULL)
-    if ((fp->ctf_strtab.cts_name = strdup (fp->ctf_strtab.cts_name)) == NULL)
+  if (fp->ctf_ext_strtab.cts_name != NULL)
+    if ((fp->ctf_ext_strtab.cts_name = strdup (fp->ctf_ext_strtab.cts_name)) == NULL)
       {
 	err = ENOMEM;
 	goto bad;
@@ -1651,10 +1651,10 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 
   if (fp->ctf_data.cts_name == NULL)
     fp->ctf_data.cts_name = _CTF_NULLSTR;
-  if (fp->ctf_symtab.cts_name == NULL)
-    fp->ctf_symtab.cts_name = _CTF_NULLSTR;
-  if (fp->ctf_strtab.cts_name == NULL)
-    fp->ctf_strtab.cts_name = _CTF_NULLSTR;
+  if (fp->ctf_ext_symtab.cts_name == NULL)
+    fp->ctf_ext_symtab.cts_name = _CTF_NULLSTR;
+  if (fp->ctf_ext_strtab.cts_name == NULL)
+    fp->ctf_ext_strtab.cts_name = _CTF_NULLSTR;
 
   if (strsect != NULL)
     {
@@ -1836,11 +1836,11 @@ ctf_dict_close (ctf_dict_t *fp)
   if (fp->ctf_data.cts_name != _CTF_NULLSTR)
     free ((char *) fp->ctf_data.cts_name);
 
-  if (fp->ctf_symtab.cts_name != _CTF_NULLSTR)
-    free ((char *) fp->ctf_symtab.cts_name);
+  if (fp->ctf_ext_symtab.cts_name != _CTF_NULLSTR)
+    free ((char *) fp->ctf_ext_symtab.cts_name);
 
-  if (fp->ctf_strtab.cts_name != _CTF_NULLSTR)
-    free ((char *) fp->ctf_strtab.cts_name);
+  if (fp->ctf_ext_strtab.cts_name != _CTF_NULLSTR)
+    free ((char *) fp->ctf_ext_strtab.cts_name);
   else if (fp->ctf_data_mmapped)
     ctf_munmap (fp->ctf_data_mmapped, fp->ctf_data_mmapped_len);
 
@@ -1909,13 +1909,13 @@ ctf_getdatasect (const ctf_dict_t *fp)
 ctf_sect_t
 ctf_getsymsect (const ctf_dict_t *fp)
 {
-  return fp->ctf_symtab;
+  return fp->ctf_ext_symtab;
 }
 
 ctf_sect_t
 ctf_getstrsect (const ctf_dict_t *fp)
 {
-  return fp->ctf_strtab;
+  return fp->ctf_ext_strtab;
 }
 
 /* Set the endianness of the symbol table attached to FP.  */
@@ -1930,8 +1930,8 @@ ctf_symsect_endianness (ctf_dict_t *fp, int little_endian)
      our idea of the endianness has changed.  */
 
   if (old_endianness != fp->ctf_symsect_little_endian
-      && fp->ctf_sxlate != NULL && fp->ctf_symtab.cts_data != NULL)
-    assert (init_symtab (fp, fp->ctf_header, &fp->ctf_symtab) == 0);
+      && fp->ctf_sxlate != NULL && fp->ctf_ext_symtab.cts_data != NULL)
+    assert (init_symtab (fp, fp->ctf_header, &fp->ctf_ext_symtab) == 0);
 }
 
 /* Return the CTF handle for the parent CTF dict, if one exists.  Otherwise
-- 
2.44.0.273.ge0bd14271f



* [PATCH 10/22] Revert "libctf: do not corrupt strings across ctf_serialize"
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (8 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 09/22] libctf: rename ctf_dict.ctf_{symtab,strtab} Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 11/22] libctf: replace 'pending refs' abstraction Nick Alcock
                   ` (12 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

This reverts commit 986e9e3aa03f854bedacef7fac38fe8f009a416c.

(We do not revert the testcase -- it remains valid -- but we are
taking a different, less complex and more robust approach.)

This also deletes the pending refs abstraction without (yet)
replacing it, so some tests will fail for a commit or two.
---
 libctf/ctf-create.c    | 27 ++-------------
 libctf/ctf-hash.c      |  6 ----
 libctf/ctf-impl.h      |  6 +---
 libctf/ctf-serialize.c | 24 +------------
 libctf/ctf-string.c    | 76 +++++-------------------------------------
 5 files changed, 14 insertions(+), 125 deletions(-)

diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index 23bbf92ff1a..9d86b961132 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -464,8 +464,7 @@ ctf_add_generic (ctf_dict_t *fp, uint32_t flag, const char *name, int kind,
   type = ++fp->ctf_typemax;
   type = LCTF_INDEX_TO_TYPE (fp, type, (fp->ctf_flags & LCTF_CHILD));
 
-  dtd->dtd_data.ctt_name = ctf_str_add_pending (fp, name,
-						&dtd->dtd_data.ctt_name);
+  dtd->dtd_data.ctt_name = ctf_str_add_ref (fp, name, &dtd->dtd_data.ctt_name);
   dtd->dtd_type = type;
 
   if (dtd->dtd_data.ctt_name == 0 && name != NULL && name[0] != '\0')
@@ -1080,21 +1079,11 @@ ctf_add_enumerator (ctf_dict_t *fp, ctf_id_t enid, const char *name,
     return -1;					/* errno is set for us.  */
   en = (ctf_enum_t *) dtd->dtd_vlen;
 
-  if (dtd->dtd_vlen != old_vlen)
-    {
-      ptrdiff_t move = (signed char *) dtd->dtd_vlen - (signed char *) old_vlen;
-
-      /* Remove pending refs in the old vlen region and reapply them.  */
-
-      for (i = 0; i < vlen; i++)
-	ctf_str_move_pending (fp, &en[i].cte_name, move);
-    }
-
   for (i = 0; i < vlen; i++)
     if (strcmp (ctf_strptr (fp, en[i].cte_name), name) == 0)
       return (ctf_set_errno (ofp, ECTF_DUPLICATE));
 
-  en[i].cte_name = ctf_str_add_pending (fp, name, &en[i].cte_name);
+  en[i].cte_name = ctf_str_add_ref (fp, name, &en[i].cte_name);
   en[i].cte_value = value;
 
   if (en[i].cte_name == 0 && name != NULL && name[0] != '\0')
@@ -1154,16 +1143,6 @@ ctf_add_member_offset (ctf_dict_t *fp, ctf_id_t souid, const char *name,
     return (ctf_set_errno (ofp, ctf_errno (fp)));
   memb = (ctf_lmember_t *) dtd->dtd_vlen;
 
-  if (dtd->dtd_vlen != old_vlen)
-    {
-      ptrdiff_t move = (signed char *) dtd->dtd_vlen - (signed char *) old_vlen;
-
-      /* Remove pending refs in the old vlen region and reapply them.  */
-
-      for (i = 0; i < vlen; i++)
-	ctf_str_move_pending (fp, &memb[i].ctlm_name, move);
-    }
-
   if (name != NULL)
     {
       for (i = 0; i < vlen; i++)
@@ -1193,7 +1172,7 @@ ctf_add_member_offset (ctf_dict_t *fp, ctf_id_t souid, const char *name,
 	return -1;		/* errno is set for us.  */
     }
 
-  memb[vlen].ctlm_name = ctf_str_add_pending (fp, name, &memb[vlen].ctlm_name);
+  memb[vlen].ctlm_name = ctf_str_add_ref (fp, name, &memb[vlen].ctlm_name);
   memb[vlen].ctlm_type = type;
   if (memb[vlen].ctlm_name == 0 && name != NULL && name[0] != '\0')
     return -1;			/* errno is set for us.  */
diff --git a/libctf/ctf-hash.c b/libctf/ctf-hash.c
index f8032ae4d86..77b8478479e 100644
--- a/libctf/ctf-hash.c
+++ b/libctf/ctf-hash.c
@@ -669,12 +669,6 @@ ctf_dynset_lookup (ctf_dynset_t *hp, const void *key)
   return NULL;
 }
 
-size_t
-ctf_dynset_elements (ctf_dynset_t *hp)
-{
-  return htab_elements ((struct htab *) hp);
-}
-
 /* TRUE/FALSE return.  */
 int
 ctf_dynset_exists (ctf_dynset_t *hp, const void *key, const void **orig_key)
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index 8ce489a55ba..c16ef185fdc 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -383,8 +383,7 @@ struct ctf_dict
   ctf_dynhash_t *ctf_names;	    /* Hash table of remaining type names.  */
   ctf_lookup_t ctf_lookups[5];	    /* Pointers to nametabs for name lookup.  */
   ctf_strs_t ctf_str[2];	    /* Array of string table base and bounds.  */
-  ctf_dynhash_t *ctf_str_atoms;	    /* Hash table of ctf_str_atoms_t.  */
-  ctf_dynset_t *ctf_str_pending_ref; /* Locations awaiting ref addition.  */
+  ctf_dynhash_t *ctf_str_atoms;	  /* Hash table of ctf_str_atoms_t.  */
   uint64_t ctf_str_num_refs;	  /* Number of refs to cts_str_atoms.  */
   uint32_t ctf_str_prov_offset;	  /* Latest provisional offset assigned so far.  */
   unsigned char *ctf_base;	  /* CTF file pointer.  */
@@ -664,7 +663,6 @@ extern int ctf_dynset_insert (ctf_dynset_t *, void *);
 extern void ctf_dynset_remove (ctf_dynset_t *, const void *);
 extern void ctf_dynset_destroy (ctf_dynset_t *);
 extern void *ctf_dynset_lookup (ctf_dynset_t *, const void *);
-extern size_t ctf_dynset_elements (ctf_dynset_t *);
 extern int ctf_dynset_exists (ctf_dynset_t *, const void *key,
 			      const void **orig_key);
 extern int ctf_dynset_next (ctf_dynset_t *, ctf_next_t **, void **key);
@@ -727,8 +725,6 @@ extern int ctf_str_create_atoms (ctf_dict_t *);
 extern void ctf_str_free_atoms (ctf_dict_t *);
 extern uint32_t ctf_str_add (ctf_dict_t *, const char *);
 extern uint32_t ctf_str_add_ref (ctf_dict_t *, const char *, uint32_t *ref);
-extern uint32_t ctf_str_add_pending (ctf_dict_t *, const char *, uint32_t *);
-extern int ctf_str_move_pending (ctf_dict_t *, uint32_t *, ptrdiff_t);
 extern int ctf_str_add_external (ctf_dict_t *, const char *, uint32_t offset);
 extern void ctf_str_remove_ref (ctf_dict_t *, const char *, uint32_t *ref);
 extern void ctf_str_rollback (ctf_dict_t *, ctf_snapshot_id_t);
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 9dd7fbda285..2afc7be7c48 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -822,10 +822,7 @@ ctf_emit_type_sect (ctf_dict_t *fp, unsigned char **tptr)
       copied = (ctf_stype_t *) t;  /* name is at the start: constant offset.  */
       if (copied->ctt_name
 	  && (name = ctf_strraw (fp, copied->ctt_name)) != NULL)
-	{
-	  ctf_str_add_ref (fp, name, &copied->ctt_name);
-	  ctf_str_add_ref (fp, name, &dtd->dtd_data.ctt_name);
-	}
+        ctf_str_add_ref (fp, name, &copied->ctt_name);
       copied->ctt_size = type_ctt_size;
       t += len;
 
@@ -960,7 +957,6 @@ ctf_serialize (ctf_dict_t *fp)
   ctf_varent_t *dvarents;
   ctf_strs_writable_t strtab;
   int err;
-  int num_missed_str_refs;
   int sym_functions = 0;
 
   unsigned char *t;
@@ -980,16 +976,6 @@ ctf_serialize (ctf_dict_t *fp)
   if (fp->ctf_stypes > 0)
     return (ctf_set_errno (fp, ECTF_RDONLY));
 
-  /* The strtab refs table must be empty at this stage.  Any refs already added
-     will be corrupted by any modifications, including reserialization, after
-     strtab finalization is complete.  Only this function, and functions it
-     calls, may add refs, and all memory locations (including in the dtds)
-     containing strtab offsets must be traversed as part of serialization, and
-     refs added.  */
-
-  if (!ctf_assert (fp, fp->ctf_str_num_refs == 0))
-    return -1;					/* errno is set for us.  */
-
   /* Fill in an initial CTF header.  We will leave the label, object,
      and function sections empty and only output a header, type section,
      and string table.  The type section begins at a 4-byte aligned
@@ -1103,12 +1089,6 @@ ctf_serialize (ctf_dict_t *fp)
 
   assert (t == (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_stroff);
 
-  /* Every string added outside serialization by ctf_str_add_pending should
-     now have been added by ctf_add_ref.  */
-  num_missed_str_refs = ctf_dynset_elements (fp->ctf_str_pending_ref);
-  if (!ctf_assert (fp, num_missed_str_refs == 0))
-    goto err;					/* errno is set for us.  */
-
   /* Construct the final string table and fill out all the string refs with the
      final offsets.  Then purge the refs list, because we're about to move this
      strtab onto the end of the buf, invalidating all the offsets.  */
@@ -1211,10 +1191,8 @@ ctf_serialize (ctf_dict_t *fp)
   ctf_str_free_atoms (nfp);
   nfp->ctf_str_atoms = fp->ctf_str_atoms;
   nfp->ctf_prov_strtab = fp->ctf_prov_strtab;
-  nfp->ctf_str_pending_ref = fp->ctf_str_pending_ref;
   fp->ctf_str_atoms = NULL;
   fp->ctf_prov_strtab = NULL;
-  fp->ctf_str_pending_ref = NULL;
   memset (&fp->ctf_dtdefs, 0, sizeof (ctf_list_t));
   memset (&fp->ctf_errs_warnings, 0, sizeof (ctf_list_t));
   fp->ctf_add_processing = NULL;
diff --git a/libctf/ctf-string.c b/libctf/ctf-string.c
index 63dced02e2f..af16e862f4b 100644
--- a/libctf/ctf-string.c
+++ b/libctf/ctf-string.c
@@ -127,7 +127,7 @@ ctf_str_create_atoms (ctf_dict_t *fp)
 {
   fp->ctf_str_atoms = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
 					  free, ctf_str_free_atom);
-  if (!fp->ctf_str_atoms)
+  if (fp->ctf_str_atoms == NULL)
     return -ENOMEM;
 
   if (!fp->ctf_prov_strtab)
@@ -137,13 +137,6 @@ ctf_str_create_atoms (ctf_dict_t *fp)
   if (!fp->ctf_prov_strtab)
     goto oom_prov_strtab;
 
-  if (!fp->ctf_str_pending_ref)
-    fp->ctf_str_pending_ref = ctf_dynset_create (htab_hash_pointer,
-						 htab_eq_pointer,
-						 NULL);
-  if (!fp->ctf_str_pending_ref)
-    goto oom_str_pending_ref;
-
   errno = 0;
   ctf_str_add (fp, "");
   if (errno == ENOMEM)
@@ -154,9 +147,6 @@ ctf_str_create_atoms (ctf_dict_t *fp)
  oom_str_add:
   ctf_dynhash_destroy (fp->ctf_prov_strtab);
   fp->ctf_prov_strtab = NULL;
- oom_str_pending_ref:
-  ctf_dynset_destroy (fp->ctf_str_pending_ref);
-  fp->ctf_str_pending_ref = NULL;
  oom_prov_strtab:
   ctf_dynhash_destroy (fp->ctf_str_atoms);
   fp->ctf_str_atoms = NULL;
@@ -169,13 +159,8 @@ ctf_str_free_atoms (ctf_dict_t *fp)
 {
   ctf_dynhash_destroy (fp->ctf_prov_strtab);
   ctf_dynhash_destroy (fp->ctf_str_atoms);
-  ctf_dynset_destroy (fp->ctf_str_pending_ref);
 }
 
-#define CTF_STR_ADD_REF 0x1
-#define CTF_STR_MAKE_PROVISIONAL 0x2
-#define CTF_STR_PENDING_REF 0x4
-
 /* Add a string to the atoms table, copying the passed-in string.  Return the
    atom added. Return NULL only when out of memory (and do not touch the
    passed-in string in that case).  Possibly augment the ref list with the
@@ -183,7 +168,7 @@ ctf_str_free_atoms (ctf_dict_t *fp)
    provisional strtab.   */
 static ctf_str_atom_t *
 ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
-			  int flags, uint32_t *ref)
+			  int add_ref, int make_provisional, uint32_t *ref)
 {
   char *newstr = NULL;
   ctf_str_atom_t *atom = NULL;
@@ -191,7 +176,7 @@ ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
 
   atom = ctf_dynhash_lookup (fp->ctf_str_atoms, str);
 
-  if (flags & CTF_STR_ADD_REF)
+  if (add_ref)
     {
       if ((aref = malloc (sizeof (struct ctf_str_atom_ref))) == NULL) {
 	ctf_set_errno (fp, ENOMEM);
@@ -202,9 +187,8 @@ ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
 
   if (atom)
     {
-      if (flags & CTF_STR_ADD_REF)
+      if (add_ref)
 	{
-	  ctf_dynset_remove (fp->ctf_str_pending_ref, (void *) ref);
 	  ctf_list_append (&atom->csa_refs, aref);
 	  fp->ctf_str_num_refs++;
 	}
@@ -224,7 +208,7 @@ ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
   atom->csa_str = newstr;
   atom->csa_snapshot_id = fp->ctf_snapshots;
 
-  if (flags & CTF_STR_MAKE_PROVISIONAL)
+  if (make_provisional)
     {
       atom->csa_offset = fp->ctf_str_prov_offset;
 
@@ -235,14 +219,8 @@ ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
       fp->ctf_str_prov_offset += strlen (atom->csa_str) + 1;
     }
 
-  if (flags & CTF_STR_PENDING_REF)
+  if (add_ref)
     {
-      if (ctf_dynset_insert (fp->ctf_str_pending_ref, (void *) ref) < 0)
-	goto oom;
-    }
-  else if (flags & CTF_STR_ADD_REF)
-    {
-      ctf_dynset_remove (fp->ctf_str_pending_ref, (void *) ref);
       ctf_list_append (&atom->csa_refs, aref);
       fp->ctf_str_num_refs++;
     }
@@ -271,7 +249,7 @@ ctf_str_add (ctf_dict_t *fp, const char *str)
   if (!str)
     str = "";
 
-  atom = ctf_str_add_ref_internal (fp, str, CTF_STR_MAKE_PROVISIONAL, 0);
+  atom = ctf_str_add_ref_internal (fp, str, FALSE, TRUE, 0);
   if (!atom)
     return 0;
 
@@ -289,47 +267,13 @@ ctf_str_add_ref (ctf_dict_t *fp, const char *str, uint32_t *ref)
   if (!str)
     str = "";
 
-  atom = ctf_str_add_ref_internal (fp, str, CTF_STR_ADD_REF
-				   | CTF_STR_MAKE_PROVISIONAL, ref);
+  atom = ctf_str_add_ref_internal (fp, str, TRUE, TRUE, ref);
   if (!atom)
     return 0;
 
   return atom->csa_offset;
 }
 
-/* Like ctf_str_add_ref(), but notes that this memory location must be added as
-   a ref by a later serialization phase, rather than adding it itself.  */
-uint32_t
-ctf_str_add_pending (ctf_dict_t *fp, const char *str, uint32_t *ref)
-{
-  ctf_str_atom_t *atom;
-
-  if (!str)
-    str = "";
-
-  atom = ctf_str_add_ref_internal (fp, str, CTF_STR_PENDING_REF
-				   | CTF_STR_MAKE_PROVISIONAL, ref);
-  if (!atom)
-    return 0;
-
-  return atom->csa_offset;
-}
-
-/* Note that a pending ref now located at NEW_REF has moved by BYTES bytes.  */
-int
-ctf_str_move_pending (ctf_dict_t *fp, uint32_t *new_ref, ptrdiff_t bytes)
-{
-  if (bytes == 0)
-    return 0;
-
-  if (ctf_dynset_insert (fp->ctf_str_pending_ref, (void *) new_ref) < 0)
-    return (ctf_set_errno (fp, ENOMEM));
-
-  ctf_dynset_remove (fp->ctf_str_pending_ref,
-		     (void *) ((signed char *) new_ref - bytes));
-  return 0;
-}
-
 /* Add an external strtab reference at OFFSET.  Returns zero if the addition
    failed, nonzero otherwise.  */
 int
@@ -340,7 +284,7 @@ ctf_str_add_external (ctf_dict_t *fp, const char *str, uint32_t offset)
   if (!str)
     str = "";
 
-  atom = ctf_str_add_ref_internal (fp, str, 0, 0);
+  atom = ctf_str_add_ref_internal (fp, str, FALSE, FALSE, 0);
   if (!atom)
     return 0;
 
@@ -390,8 +334,6 @@ ctf_str_remove_ref (ctf_dict_t *fp, const char *str, uint32_t *ref)
 	  free (aref);
 	}
     }
-
-  ctf_dynset_remove (fp->ctf_str_pending_ref, (void *) ref);
 }
 
 /* A ctf_dynhash_iter_remove() callback that removes atoms later than a given
-- 
2.44.0.273.ge0bd14271f


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 11/22] libctf: replace 'pending refs' abstraction
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (9 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 10/22] Revert "libctf: do not corrupt strings across ctf_serialize" Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 12/22] libctf: rethink strtab writeout Nick Alcock
                   ` (11 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

A few years ago we introduced a 'pending refs' abstraction to fix one
problem: serializing a dict and then changing it would tend to corrupt the
dict, because the strtab sort we do on strtab writeout (to improve
compression efficiency) would modify the offset of any strings that sorted
lexicographically earlier in the strtab.  So we added a new restriction that
all strings are added only at serialization time, and maintained a set of
'pending' refs that were added earlier, whose offsets we could update (like
other refs) at writeout time.
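
To make the underlying problem concrete, here is a minimal sketch (not
libctf code; offset_after_full_sort is a hypothetical helper) that computes
the offset a string gets when the whole strtab is sorted and laid out as
NUL-terminated strings.  Adding one string and sorting again can shift the
offsets of strings that were already written out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator over an array of C strings.  */
static int
full_sort_cmp (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Sort STRS in place and return the offset TARGET would have in a strtab
   made by concatenating the sorted strings, each with its trailing NUL.
   Returns (uint32_t) -1 if TARGET is not present.  Hypothetical helper,
   for illustration only.  */
static uint32_t
offset_after_full_sort (const char **strs, size_t n, const char *target)
{
  uint32_t off = 0;
  size_t i;

  assert (strs != NULL && target != NULL);

  qsort (strs, n, sizeof (const char *), full_sort_cmp);
  for (i = 0; i < n; i++)
    {
      if (strcmp (strs[i], target) == 0)
	return off;
      off += (uint32_t) (strlen (strs[i]) + 1);
    }
  return (uint32_t) -1;
}
```

With the strings "", "int" and "long", "int" lands at offset 1.  Once "char"
is added and everything is sorted again, "int" lands at offset 6, so any
offset stored before the second sort is stale.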

This was in hindsight seriously problematic for maintenance (because
serialization has to traverse all strings in all datatypes in the entire
dict), and has become impossible to sustain now that we can read in existing
dicts, modify them, and reserialize them again.  We really don't want to
have to dig through the entire dict we just read in just in order to dig out
all its strtab offsets, then *change* it, just for the sake of a sort that
adds a frankly trivial amount of compression efficiency.

Sorting *is* still worthwhile -- but it sacrifices very little to only sort
newly-added portions of the strtab, reusing older portions as necessary.
As a first stage in this, discard the whole "pending refs" abstraction and
replace it with "movable" refs, which are exactly like all other refs
(addresses containing the strtab offset of some string, which are updated
with the final strtab offset on serialization) except that we track them in
a reverse dict so that we can move the refs around (which we do whenever we
realloc() a buffer containing a bunch of structure members or something when
we add members to the structure).
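
A rough sketch of the idea, under loudly stated assumptions: ref_table_t,
member_t, ref_table_move and grow_members are hypothetical names, and a flat
array stands in for the reverse hash used in the patch.  When the buffer
holding tracked refs is reallocated, every tracked address inside the old
region is rebased by the distance between the old and new buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_REFS 64		/* Hypothetical fixed capacity.  */

/* Hypothetical stand-in for a structure member with a name offset.  */
typedef struct member
{
  uint32_t name;		/* Strtab offset of the member name.  */
  uint32_t type;
} member_t;

/* Hypothetical tracker: addresses that must receive a strtab offset.  */
typedef struct ref_table
{
  uint32_t *refs[MAX_REFS];
  size_t nrefs;
} ref_table_t;

/* Track REF.  Returns 0 on success, -1 if the table is full.  */
static int
ref_table_add (ref_table_t *t, uint32_t *ref)
{
  if (t->nrefs == MAX_REFS)
    return -1;
  t->refs[t->nrefs++] = ref;
  return 0;
}

/* Rebase every tracked ref inside [SRC, SRC + LEN) to the same relative
   position in DEST.  Returns the number of refs moved.  */
static size_t
ref_table_move (ref_table_t *t, void *src, size_t len, void *dest)
{
  uintptr_t s = (uintptr_t) src;
  uintptr_t d = (uintptr_t) dest;
  size_t i, moved = 0;

  if (src == dest)
    return 0;

  for (i = 0; i < t->nrefs; i++)
    {
      uintptr_t p = (uintptr_t) t->refs[i];

      if (p >= s && p - s < len)
	{
	  t->refs[i] = (uint32_t *) (d + (p - s));
	  moved++;
	}
    }
  return moved;
}

/* Grow a member array from OLD_N to NEW_N entries, moving refs before the
   old buffer is freed.  On allocation failure OLD is left untouched.  */
static member_t *
grow_members (ref_table_t *t, member_t *old, size_t old_n, size_t new_n)
{
  member_t *grown = calloc (new_n, sizeof (member_t));

  if (grown == NULL)
    return NULL;
  if (old_n > 0)
    memcpy (grown, old, old_n * sizeof (member_t));
  ref_table_move (t, old, old_n * sizeof (member_t), grown);
  free (old);
  return grown;
}

/* Store VALUE through tracked ref I, as serialization would.  */
static void
ref_table_set (ref_table_t *t, size_t i, uint32_t value)
{
  assert (i < t->nrefs);
  *t->refs[i] = value;
}
```

The sketch scans every tracked ref on each move, which is the linear search
the patch avoids: ctf_str_move_refs instead probes a hash keyed on the ref
address for each address in the moved region.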

libctf/

	* ctf-create.c (ctf_add_enumerator): Call ctf_str_move_refs; add
        a movable ref.
	(ctf_add_member_offset): Likewise.
	* ctf-util.c (ctf_realloc): Delete.
	* ctf-serialize.c (ctf_serialize): No longer use it.  Adjust to
	new fields.
	* ctf-string.c (ctf_str_purge_atom_refs): Purge movable refs.
	(ctf_str_free_atom): Free freeable atoms' strings.
	(ctf_str_create_atoms): Create the movable refs dynhash if needed.
	(ctf_str_free_atoms): Destroy it.
	(CTF_STR_MOVABLE): Switch (back) from ints to flags (see previous
	reversion).  Add new flag.
	(aref_create):  New, populate movable refs if need be.
	(ctf_str_add_ref_internal): Switch back to flags, update refs
	directly for nonprovisional strings (with already-known fixed offsets);
	create refs via aref_create.  Allocate strings only if not within an
	mmapped strtab.
	(ctf_str_add_movable_ref): New.
	(ctf_str_add): Adjust to CTF_STR_* reintroduction.
	(ctf_str_add_external): Likewise.
	(ctf_str_move_refs): New, move refs via ctf_str_movable_refs
	backpointer.
	(ctf_str_purge_refs): Drop ctf_str_num_refs.
	(ctf_str_update_refs): Fix indentation.
	* ctf-impl.h (struct ctf_str_atom_movable): New.
	(struct ctf_dict.ctf_str_num_refs): Drop.
	(struct ctf_dict.ctf_str_movable_refs): New.
	(ctf_str_add_movable_ref): Declare.
	(ctf_str_move_refs): Likewise.
	(ctf_realloc): Drop.
---
 libctf/ctf-create.c    |  12 ++-
 libctf/ctf-impl.h      |  21 +++-
 libctf/ctf-serialize.c |  10 +-
 libctf/ctf-string.c    | 233 +++++++++++++++++++++++++++++++++--------
 libctf/ctf-util.c      |  13 ---
 5 files changed, 225 insertions(+), 64 deletions(-)

diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index 9d86b961132..e0558d28233 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -1079,11 +1079,15 @@ ctf_add_enumerator (ctf_dict_t *fp, ctf_id_t enid, const char *name,
     return -1;					/* errno is set for us.  */
   en = (ctf_enum_t *) dtd->dtd_vlen;
 
+  /* Remove refs in the old vlen region and reapply them.  */
+
+  ctf_str_move_refs (fp, old_vlen, sizeof (ctf_enum_t) * vlen, dtd->dtd_vlen);
+
   for (i = 0; i < vlen; i++)
     if (strcmp (ctf_strptr (fp, en[i].cte_name), name) == 0)
       return (ctf_set_errno (ofp, ECTF_DUPLICATE));
 
-  en[i].cte_name = ctf_str_add_ref (fp, name, &en[i].cte_name);
+  en[i].cte_name = ctf_str_add_movable_ref (fp, name, &en[i].cte_name);
   en[i].cte_value = value;
 
   if (en[i].cte_name == 0 && name != NULL && name[0] != '\0')
@@ -1143,6 +1147,10 @@ ctf_add_member_offset (ctf_dict_t *fp, ctf_id_t souid, const char *name,
     return (ctf_set_errno (ofp, ctf_errno (fp)));
   memb = (ctf_lmember_t *) dtd->dtd_vlen;
 
+  /* Remove pending refs in the old vlen region and reapply them.  */
+
+  ctf_str_move_refs (fp, old_vlen, sizeof (ctf_lmember_t) * vlen, dtd->dtd_vlen);
+
   if (name != NULL)
     {
       for (i = 0; i < vlen; i++)
@@ -1172,7 +1180,7 @@ ctf_add_member_offset (ctf_dict_t *fp, ctf_id_t souid, const char *name,
 	return -1;		/* errno is set for us.  */
     }
 
-  memb[vlen].ctlm_name = ctf_str_add_ref (fp, name, &memb[vlen].ctlm_name);
+  memb[vlen].ctlm_name = ctf_str_add_movable_ref (fp, name, &memb[vlen].ctlm_name);
   memb[vlen].ctlm_type = type;
   if (memb[vlen].ctlm_name == 0 && name != NULL && name[0] != '\0')
     return -1;			/* errno is set for us.  */
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index c16ef185fdc..f4611316f50 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -207,13 +207,17 @@ typedef struct ctf_err_warning
    removed from this table on ctf_close(), but on every ctf_serialize(), all
    the csa_refs in all entries are purged.  */
 
+#define CTF_STR_ATOM_FREEABLE	0x1
+#define CTF_STR_ATOM_MOVABLE	0x2
+
 typedef struct ctf_str_atom
 {
-  const char *csa_str;		/* Backpointer to string (hash key).  */
+  char *csa_str;		/* Pointer to string (also used as hash key).  */
   ctf_list_t csa_refs;		/* This string's refs.  */
   uint32_t csa_offset;		/* Strtab offset, if any.  */
   uint32_t csa_external_offset;	/* External strtab offset, if any.  */
   unsigned long csa_snapshot_id; /* Snapshot ID at time of creation.  */
+  int csa_flags;		 /* CTF_STR_ATOM_* flags. */
 } ctf_str_atom_t;
 
 /* The refs of a single string in the atoms table.  */
@@ -224,6 +228,15 @@ typedef struct ctf_str_atom_ref
   uint32_t *caf_ref;		/* A single ref to this string.  */
 } ctf_str_atom_ref_t;
 
+  /* Like a ctf_str_atom_ref_t, but specific to movable refs.  */
+
+typedef struct ctf_str_atom_ref_movable
+{
+  ctf_list_t caf_list;		/* List forward/back pointers.  */
+  uint32_t *caf_ref;		/* A single ref to this string.  */
+  ctf_dynhash_t *caf_movable_refs; /* Backpointer to ctf_str_movable_refs for this dict. */  
+} ctf_str_atom_ref_movable_t;
+
 /* A single linker-provided symbol, during symbol addition, possibly before we
    have been given external strtab refs.  */
 typedef struct ctf_in_flight_dynsym
@@ -384,7 +397,7 @@ struct ctf_dict
   ctf_lookup_t ctf_lookups[5];	    /* Pointers to nametabs for name lookup.  */
   ctf_strs_t ctf_str[2];	    /* Array of string table base and bounds.  */
   ctf_dynhash_t *ctf_str_atoms;	  /* Hash table of ctf_str_atoms_t.  */
-  uint64_t ctf_str_num_refs;	  /* Number of refs to cts_str_atoms.  */
+  ctf_dynhash_t *ctf_str_movable_refs; /* Hash table of void * -> ctf_str_atom_ref_t.  */
   uint32_t ctf_str_prov_offset;	  /* Latest provisional offset assigned so far.  */
   unsigned char *ctf_base;	  /* CTF file pointer.  */
   unsigned char *ctf_dynbase;	  /* Freeable CTF file pointer. */
@@ -725,6 +738,9 @@ extern int ctf_str_create_atoms (ctf_dict_t *);
 extern void ctf_str_free_atoms (ctf_dict_t *);
 extern uint32_t ctf_str_add (ctf_dict_t *, const char *);
 extern uint32_t ctf_str_add_ref (ctf_dict_t *, const char *, uint32_t *ref);
+extern uint32_t ctf_str_add_movable_ref (ctf_dict_t *, const char *,
+					 uint32_t *ref);
+extern int ctf_str_move_refs (ctf_dict_t *fp, void *src, size_t len, void *dest);
 extern int ctf_str_add_external (ctf_dict_t *, const char *, uint32_t offset);
 extern void ctf_str_remove_ref (ctf_dict_t *, const char *, uint32_t *ref);
 extern void ctf_str_rollback (ctf_dict_t *, ctf_snapshot_id_t);
@@ -758,7 +774,6 @@ extern void *ctf_mmap (size_t length, size_t offset, int fd);
 extern void ctf_munmap (void *, size_t);
 extern ssize_t ctf_pread (int fd, void *buf, ssize_t count, off_t offset);
 
-extern void *ctf_realloc (ctf_dict_t *, void *, size_t);
 extern char *ctf_str_append (char *, const char *);
 extern char *ctf_str_append_noerr (char *, const char *);
 
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 2afc7be7c48..6355d4225eb 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -1104,11 +1104,9 @@ ctf_serialize (ctf_dict_t *fp)
   ctf_qsort_r (dvarents, nvars, sizeof (ctf_varent_t), ctf_sort_var,
 	       &sort_var_arg);
 
-  if ((newbuf = ctf_realloc (fp, buf, buf_size + strtab.cts_len)) == NULL)
-    {
-      free (strtab.cts_strs);
-      goto oom;
-    }
+  if ((newbuf = realloc (buf, buf_size + strtab.cts_len)) == NULL)
+    goto oom;
+
   buf = newbuf;
   memcpy (buf + buf_size, strtab.cts_strs, strtab.cts_len);
   hdrp = (ctf_header_t *) buf;
@@ -1191,8 +1189,10 @@ ctf_serialize (ctf_dict_t *fp)
   ctf_str_free_atoms (nfp);
   nfp->ctf_str_atoms = fp->ctf_str_atoms;
   nfp->ctf_prov_strtab = fp->ctf_prov_strtab;
+  nfp->ctf_str_movable_refs = fp->ctf_str_movable_refs;
   fp->ctf_str_atoms = NULL;
   fp->ctf_prov_strtab = NULL;
+  fp->ctf_str_movable_refs = NULL;
   memset (&fp->ctf_dtdefs, 0, sizeof (ctf_list_t));
   memset (&fp->ctf_errs_warnings, 0, sizeof (ctf_list_t));
   fp->ctf_add_processing = NULL;
diff --git a/libctf/ctf-string.c b/libctf/ctf-string.c
index af16e862f4b..f25cd3abdeb 100644
--- a/libctf/ctf-string.c
+++ b/libctf/ctf-string.c
@@ -17,6 +17,7 @@
    along with this program; see the file COPYING.  If not see
    <http://www.gnu.org/licenses/>.  */
 
+#include <assert.h>
 #include <ctf-impl.h>
 #include <string.h>
 #include <assert.h>
@@ -106,17 +107,28 @@ ctf_str_purge_atom_refs (ctf_str_atom_t *atom)
     {
       next = ctf_list_next (ref);
       ctf_list_delete (&atom->csa_refs, ref);
+      if (atom->csa_flags & CTF_STR_ATOM_MOVABLE)
+	{
+	  ctf_str_atom_ref_movable_t *movref;
+	  movref = (ctf_str_atom_ref_movable_t *) ref;
+	  ctf_dynhash_remove (movref->caf_movable_refs, ref);
+	}
+
       free (ref);
     }
 }
 
-/* Free an atom (only called on ctf_close().)  */
+/* Free an atom.  */
 static void
 ctf_str_free_atom (void *a)
 {
   ctf_str_atom_t *atom = a;
 
   ctf_str_purge_atom_refs (atom);
+
+  if (atom->csa_flags & CTF_STR_ATOM_FREEABLE)
+    free (atom->csa_str);
+
   free (atom);
 }
 
@@ -137,6 +149,12 @@ ctf_str_create_atoms (ctf_dict_t *fp)
   if (!fp->ctf_prov_strtab)
     goto oom_prov_strtab;
 
+  fp->ctf_str_movable_refs = ctf_dynhash_create (ctf_hash_integer,
+						 ctf_hash_eq_integer,
+						 NULL, NULL);
+  if (!fp->ctf_str_movable_refs)
+    goto oom_movable_refs;
+
   errno = 0;
   ctf_str_add (fp, "");
   if (errno == ENOMEM)
@@ -145,6 +163,9 @@ ctf_str_create_atoms (ctf_dict_t *fp)
   return 0;
 
  oom_str_add:
+  ctf_dynhash_destroy (fp->ctf_str_movable_refs);
+  fp->ctf_str_movable_refs = NULL;
+ oom_movable_refs:
   ctf_dynhash_destroy (fp->ctf_prov_strtab);
   fp->ctf_prov_strtab = NULL;
  oom_prov_strtab:
@@ -153,62 +174,140 @@ ctf_str_create_atoms (ctf_dict_t *fp)
   return -ENOMEM;
 }
 
-/* Destroy the atoms table.  */
+/* Destroy the atoms table and associated refs.  */
 void
 ctf_str_free_atoms (ctf_dict_t *fp)
 {
   ctf_dynhash_destroy (fp->ctf_prov_strtab);
   ctf_dynhash_destroy (fp->ctf_str_atoms);
+  ctf_dynhash_destroy (fp->ctf_str_movable_refs);
 }
 
-/* Add a string to the atoms table, copying the passed-in string.  Return the
-   atom added. Return NULL only when out of memory (and do not touch the
-   passed-in string in that case).  Possibly augment the ref list with the
-   passed-in ref.  Possibly add a provisional entry for this string to the
-   provisional strtab.   */
+#define CTF_STR_ADD_REF 0x1
+#define CTF_STR_PROVISIONAL 0x2
+#define CTF_STR_MOVABLE 0x4
+
+/* Allocate a ref and bind it into a ref list.  */
+
+static ctf_str_atom_ref_t *
+aref_create (ctf_dict_t *fp, ctf_str_atom_t *atom, uint32_t *ref, int flags)
+{
+  ctf_str_atom_ref_t *aref;
+  size_t s = sizeof (struct ctf_str_atom_ref);
+
+  if (flags & CTF_STR_MOVABLE)
+    s = sizeof (struct ctf_str_atom_ref_movable);
+
+  aref = malloc (s);
+
+  if (!aref)
+    return NULL;
+
+  aref->caf_ref = ref;
+
+  /* Movable refs get a backpointer to them in ctf_str_movable_refs, and a
+     pointer to ctf_str_movable_refs itself in the ref, for use when freeing
+     refs: they can be moved later in batches via a call to
+     ctf_str_move_refs.  */
+
+  if (flags & CTF_STR_MOVABLE)
+    {
+      ctf_str_atom_ref_movable_t *movref = (ctf_str_atom_ref_movable_t *) aref;
+
+      movref->caf_movable_refs = fp->ctf_str_movable_refs;
+
+      if (ctf_dynhash_insert (fp->ctf_str_movable_refs, ref, aref) < 0)
+	{
+	  free (aref);
+	  return NULL;
+	}
+    }
+
+  ctf_list_append (&atom->csa_refs, aref);
+
+  return aref;
+}
+
+/* Add a string to the atoms table, copying the passed-in string if
+   necessary.  Return the atom added. Return NULL only when out of memory
+   (and do not touch the passed-in string in that case).
+
+   Possibly add a provisional entry for this string to the provisional
+   strtab.  If the string is in the provisional strtab, update its ref list
+   with the passed-in ref, causing the ref to be updated when the strtab is
+   written out.  */
+
 static ctf_str_atom_t *
 ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
-			  int add_ref, int make_provisional, uint32_t *ref)
+			  int flags, uint32_t *ref)
 {
   char *newstr = NULL;
   ctf_str_atom_t *atom = NULL;
-  ctf_str_atom_ref_t *aref = NULL;
+  int added = 0;
 
   atom = ctf_dynhash_lookup (fp->ctf_str_atoms, str);
 
-  if (add_ref)
-    {
-      if ((aref = malloc (sizeof (struct ctf_str_atom_ref))) == NULL) {
-	ctf_set_errno (fp, ENOMEM);
-	return NULL;
-      }
-      aref->caf_ref = ref;
-    }
+  /* Existing atoms get refs added only if they are provisional:
+     non-provisional strings already have a fixed strtab offset, and just
+     get their ref updated immediately, since its value cannot change.  */
 
   if (atom)
     {
-      if (add_ref)
+      if (!ctf_dynhash_lookup (fp->ctf_prov_strtab, (void *) (uintptr_t)
+			       atom->csa_offset))
 	{
-	  ctf_list_append (&atom->csa_refs, aref);
-	  fp->ctf_str_num_refs++;
+	  if (flags & CTF_STR_ADD_REF)
+	    {
+	      if (atom->csa_external_offset)
+		*ref = atom->csa_external_offset;
+	      else
+		*ref = atom->csa_offset;
+	    }
+	  return atom;
 	}
+
+      if (flags & CTF_STR_ADD_REF)
+	{
+	  if (!aref_create (fp, atom, ref, flags))
+	    {
+	      ctf_set_errno (fp, ENOMEM);
+	      return NULL;
+	    }
+	}
+
       return atom;
     }
 
+  /* New atom.  */
+
   if ((atom = malloc (sizeof (struct ctf_str_atom))) == NULL)
     goto oom;
   memset (atom, 0, sizeof (struct ctf_str_atom));
 
-  if ((newstr = strdup (str)) == NULL)
-    goto oom;
+  /* Don't allocate new strings if this string is within an mmapped
+     strtab.  */
 
-  if (ctf_dynhash_insert (fp->ctf_str_atoms, newstr, atom) < 0)
-    goto oom;
+  if ((unsigned char *) str < (unsigned char *) fp->ctf_data_mmapped
+      || (unsigned char *) str > (unsigned char *) fp->ctf_data_mmapped + fp->ctf_data_mmapped_len)
+    {
+      if ((newstr = strdup (str)) == NULL)
+	goto oom;
+      atom->csa_flags |= CTF_STR_ATOM_FREEABLE;
+      atom->csa_str = newstr;
+    }
+  else
+    atom->csa_str = (char *) str;
+
+  if (ctf_dynhash_insert (fp->ctf_str_atoms, atom->csa_str, atom) < 0)
+    goto oom;
+  added = 1;
 
-  atom->csa_str = newstr;
   atom->csa_snapshot_id = fp->ctf_snapshots;
 
-  if (make_provisional)
+  /* New atoms marked provisional go into the provisional strtab, and get a
+     ref added.  */
+
+  if (flags & CTF_STR_PROVISIONAL)
     {
       atom->csa_offset = fp->ctf_str_prov_offset;
 
@@ -217,20 +316,20 @@ ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
 	goto oom;
 
       fp->ctf_str_prov_offset += strlen (atom->csa_str) + 1;
+
+      if (flags & CTF_STR_ADD_REF)
+      {
+	if (!aref_create (fp, atom, ref, flags))
+	  goto oom;
+      }
     }
 
-  if (add_ref)
-    {
-      ctf_list_append (&atom->csa_refs, aref);
-      fp->ctf_str_num_refs++;
-    }
   return atom;
 
  oom:
-  if (newstr)
-    ctf_dynhash_remove (fp->ctf_str_atoms, newstr);
+  if (added)
+    ctf_dynhash_remove (fp->ctf_str_atoms, atom->csa_str);
   free (atom);
-  free (aref);
   free (newstr);
   ctf_set_errno (fp, ENOMEM);
   return NULL;
@@ -249,7 +348,7 @@ ctf_str_add (ctf_dict_t *fp, const char *str)
   if (!str)
     str = "";
 
-  atom = ctf_str_add_ref_internal (fp, str, FALSE, TRUE, 0);
+  atom = ctf_str_add_ref_internal (fp, str, CTF_STR_PROVISIONAL, 0);
   if (!atom)
     return 0;
 
@@ -267,7 +366,26 @@ ctf_str_add_ref (ctf_dict_t *fp, const char *str, uint32_t *ref)
   if (!str)
     str = "";
 
-  atom = ctf_str_add_ref_internal (fp, str, TRUE, TRUE, ref);
+  atom = ctf_str_add_ref_internal (fp, str, CTF_STR_ADD_REF
+				   | CTF_STR_PROVISIONAL, ref);
+  if (!atom)
+    return 0;
+
+  return atom->csa_offset;
+}
+
+/* Like ctf_str_add_ref(), but note that the ref may be moved later on.  */
+uint32_t
+ctf_str_add_movable_ref (ctf_dict_t *fp, const char *str, uint32_t *ref)
+{
+  ctf_str_atom_t *atom;
+
+  if (!str)
+    str = "";
+
+  atom = ctf_str_add_ref_internal (fp, str, CTF_STR_ADD_REF
+				   | CTF_STR_PROVISIONAL
+				   | CTF_STR_MOVABLE, ref);
   if (!atom)
     return 0;
 
@@ -284,7 +402,7 @@ ctf_str_add_external (ctf_dict_t *fp, const char *str, uint32_t offset)
   if (!str)
     str = "";
 
-  atom = ctf_str_add_ref_internal (fp, str, FALSE, FALSE, 0);
+  atom = ctf_str_add_ref_internal (fp, str, 0, 0);
   if (!atom)
     return 0;
 
@@ -314,6 +432,41 @@ ctf_str_add_external (ctf_dict_t *fp, const char *str, uint32_t offset)
   return 1;
 }
 
+/* Note that refs have moved from (SRC, LEN) to DEST.  We use the movable
+   refs backpointer for this, because it is done an amortized-constant
+   number of times during structure member and enumerand addition, and if we
+   did a linear search this would turn such addition into an O(n^2)
+   operation.  Even this is not linear, but it's better than that.  */
+int
+ctf_str_move_refs (ctf_dict_t *fp, void *src, size_t len, void *dest)
+{
+  uintptr_t p;
+
+  if (src == dest)
+    return 0;
+
+  for (p = (uintptr_t) src; p - (uintptr_t) src < len; p++)
+    {
+      ctf_str_atom_ref_t *ref;
+
+      if ((ref = ctf_dynhash_lookup (fp->ctf_str_movable_refs,
+				     (ctf_str_atom_ref_t *) p)) != NULL)
+	{
+	  int out_of_memory;
+
+	  ref->caf_ref = (uint32_t *) (((uintptr_t) ref->caf_ref +
+					(uintptr_t) dest - (uintptr_t) src));
+	  ctf_dynhash_remove (fp->ctf_str_movable_refs,
+			      (ctf_str_atom_ref_t *) p);
+	  out_of_memory = ctf_dynhash_insert (fp->ctf_str_movable_refs,
+					      ref->caf_ref, ref);
+	  assert (out_of_memory == 0);
+	}
+    }
+
+  return 0;
+}
+
 /* Remove a single ref.  */
 void
 ctf_str_remove_ref (ctf_dict_t *fp, const char *str, uint32_t *ref)
@@ -370,9 +523,7 @@ ctf_str_purge_one_atom_refs (void *key _libctf_unused_, void *value,
 void
 ctf_str_purge_refs (ctf_dict_t *fp)
 {
-  if (fp->ctf_str_num_refs > 0)
-    ctf_dynhash_iter (fp->ctf_str_atoms, ctf_str_purge_one_atom_refs, NULL);
-  fp->ctf_str_num_refs = 0;
+  ctf_dynhash_iter (fp->ctf_str_atoms, ctf_str_purge_one_atom_refs, NULL);
 }
 
 /* Update a list of refs to the specified value. */
@@ -383,7 +534,7 @@ ctf_str_update_refs (ctf_str_atom_t *refs, uint32_t value)
 
   for (ref = ctf_list_next (&refs->csa_refs); ref != NULL;
        ref = ctf_list_next (ref))
-      *(ref->caf_ref) = value;
+    *(ref->caf_ref) = value;
 }
 
 /* State shared across the strtab write process.  */
diff --git a/libctf/ctf-util.c b/libctf/ctf-util.c
index d47c10c99f0..3ea6de9e86f 100644
--- a/libctf/ctf-util.c
+++ b/libctf/ctf-util.c
@@ -231,19 +231,6 @@ ctf_str_append_noerr (char *s, const char *append)
   return new_s;
 }
 
-/* A realloc() that fails noisily if called with any ctf_str_num_users.  */
-void *
-ctf_realloc (ctf_dict_t *fp, void *ptr, size_t size)
-{
-  if (fp->ctf_str_num_refs > 0)
-    {
-      ctf_dprintf ("%p: attempt to realloc() string table with %lu active refs\n",
-		   (void *) fp, (unsigned long) fp->ctf_str_num_refs);
-      return NULL;
-    }
-  return realloc (ptr, size);
-}
-
 /* Store the specified error code into errp if it is non-NULL, and then
    return NULL for the benefit of the caller.  */
 
-- 
2.44.0.273.ge0bd14271f


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 12/22] libctf: rethink strtab writeout
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (10 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 11/22] libctf: replace 'pending refs' abstraction Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 13/22] libctf: make ctf_serialize() actually serialize Nick Alcock
                   ` (10 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

This commit finally adjusts strtab writeout so that repeated writeouts, or
writeouts of a dict that was read in earlier, only sort the portion of the
strtab that was newly added.

There are three intertwined changes here:

 - pull the contents of strtabs from newly ctf_bufopened dicts into the
   atoms table, so that future additions will reuse the existing offset etc
   rather than adding new identical strings
 - allow the internal ctf_bufopen done by serialization to contribute its
   existing atoms table, so that existing atoms can be used for the
   remainder of the open process (like name table construction): this atoms
   table currently gets thrown away in the mass reassignment done later in
   ctf_serialize in any case, but it needs to be there during the open.
 - rewrite ctf_str_write_strtab so that a) it uses iterators rather than
   ctf_*_iter, reducing pointless structures which serve no other purpose
   than to implement ordinary variable scope, but more clunkily, and b)
   retains the existing strtab on the front of the new one, with its sort
   retained, rather than resorting, so all existing already-written strtab
   offsets remain valid across the call.
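
The last point can be sketched as follows.  This is an illustration only,
assuming a flat byte buffer for the old strtab and a plain array of new
strings; strtab_append_sorted is a hypothetical name and is not the
interface of ctf_str_write_strtab.  The old bytes are copied verbatim to the
front, and only the new strings are sorted and appended after them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator over an array of C strings.  */
static int
append_sort_cmp (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Build a new strtab: the OLD_LEN bytes at OLD are kept unchanged at the
   front, then the N strings in NEWSTRS are sorted (in place) and appended.
   OFFSETS[i] receives the offset of NEWSTRS[i] as ordered after the sort,
   and *NEW_LEN the total length.  Returns a malloc()ed buffer, or NULL on
   allocation failure.  Hypothetical helper, for illustration only.  */
static char *
strtab_append_sorted (const char *old, size_t old_len,
		      const char **newstrs, size_t n,
		      size_t *offsets, size_t *new_len)
{
  size_t i, total = old_len, pos = old_len;
  char *buf;

  assert (offsets != NULL && new_len != NULL);

  qsort (newstrs, n, sizeof (const char *), append_sort_cmp);
  for (i = 0; i < n; i++)
    total += strlen (newstrs[i]) + 1;

  if ((buf = malloc (total ? total : 1)) == NULL)
    return NULL;

  /* Existing strings keep their order, and therefore their offsets.  */
  if (old_len > 0)
    memcpy (buf, old, old_len);

  for (i = 0; i < n; i++)
    {
      size_t len = strlen (newstrs[i]) + 1;

      memcpy (buf + pos, newstrs[i], len);
      offsets[i] = pos;
      pos += len;
    }

  *new_len = total;
  return buf;
}
```

Given an old strtab holding "", "int" and "char" (in that order) and the new
strings "zebra" and "apple", the old strings stay at offsets 0, 1 and 5,
while "apple" and "zebra" are placed after them in sorted order.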

This last change finally permits repeated serializations, and
reserializations of ctf_open()ed dicts, to work, but for now we keep the
code that prevents that because serialization is about to change again in a
way that will make it more obvious that doing such things is safe, and we
can take it out then.

(There are also some smaller changes like moving the purge of the refs table
into ctf_str_write_strtab(), since that's where the changes happen that
invalidate it, rather than doing it in ctf_serialize().  We also prohibit
something that has never worked, opening a dict and then reporting symbols
to it via ctf_link_add_strtab() et al: you must do that to newly-created
dicts which have had stuff ctf_link()ed into them.  This is very unlikely
ever to be a problem in practice: linkers just don't do that sort of thing.)

libctf/

	* ctf-create.c (ctf_create): Add (temporary) atoms arg.
	* ctf-impl.h (struct ctf_dict.ctf_dynstrtab): New.
	(ctf_str_create_atoms): Adjust.
	(ctf_str_write_strtab): Likewise.
	(ctf_simple_open_internal): Likewise.
	* ctf-open.c (ctf_simple_open_internal): Add atoms arg.
	(ctf_bufopen): Likewise.
	(ctf_bufopen_internal): Initialize just enough of an
	atoms table: pre-init from the atoms arg if supplied.
	(ctf_simple_open): Adjust.
	* ctf-serialize.c (ctf_serialize): Constify the strtab.
	Move ref list purging into ctf_str_write_strtab.
	Initialize the new dict with the old dict's atoms table.
	Accept the new strtab from ctf_str_write_strtab.
	Adjust for addition of ctf_dynstrtab.
	* ctf-string.c (ctf_strraw_explicit): Improve comments.
	(ctf_str_create_atoms): Prepopulate from an existing atoms table,
	or alternatively pull in all strings from the strtab and turn
	them into atoms.
	(ctf_str_free_atoms): Free the dynstrtab and its strtab.
	(struct ctf_strtab_write_state): Remove.
	(ctf_str_count_strtab): Fold this...
	(ctf_str_populate_sorttab): ... and this...
	(ctf_str_write_strtab): ... into this.  Prepend existing strings
	to the strtab rather than resorting them (and wrecking their
	offsets).  Keep the dynstrtab updated.  Update refs for all
	atoms with refs, whether or not they are strings newly added
	to the strtab.
---
 libctf/ctf-create.c    |   2 +-
 libctf/ctf-impl.h      |   9 +-
 libctf/ctf-open.c      |  20 ++-
 libctf/ctf-serialize.c |  24 +--
 libctf/ctf-string.c    | 391 +++++++++++++++++++++++++++--------------
 5 files changed, 294 insertions(+), 152 deletions(-)

diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index e0558d28233..78fb0305c20 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -133,7 +133,7 @@ ctf_create (int *errp)
   cts.cts_size = sizeof (hdr);
   cts.cts_entsize = 1;
 
-  if ((fp = ctf_bufopen_internal (&cts, NULL, NULL, NULL, errp)) == NULL)
+  if ((fp = ctf_bufopen_internal (&cts, NULL, NULL, NULL, NULL, errp)) == NULL)
     goto err;
 
   /* These hashes will have been initialized with a starting size of zero,
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index f4611316f50..3eef232bea0 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -396,6 +396,7 @@ struct ctf_dict
   ctf_dynhash_t *ctf_names;	    /* Hash table of remaining type names.  */
   ctf_lookup_t ctf_lookups[5];	    /* Pointers to nametabs for name lookup.  */
   ctf_strs_t ctf_str[2];	    /* Array of string table base and bounds.  */
+  ctf_strs_writable_t *ctf_dynstrtab; /* Dynamically allocated string table, if any. */
   ctf_dynhash_t *ctf_str_atoms;	  /* Hash table of ctf_str_atoms_t.  */
   ctf_dynhash_t *ctf_str_movable_refs; /* Hash table of void * -> ctf_str_atom_ref_t.  */
   uint32_t ctf_str_prov_offset;	  /* Latest provisional offset assigned so far.  */
@@ -734,7 +735,7 @@ extern const char *ctf_strraw (ctf_dict_t *, uint32_t);
 extern const char *ctf_strraw_explicit (ctf_dict_t *, uint32_t,
 					ctf_strs_t *);
 extern const char *ctf_strptr_validate (ctf_dict_t *, uint32_t);
-extern int ctf_str_create_atoms (ctf_dict_t *);
+extern int ctf_str_create_atoms (ctf_dict_t *, ctf_dynhash_t *atoms);
 extern void ctf_str_free_atoms (ctf_dict_t *);
 extern uint32_t ctf_str_add (ctf_dict_t *, const char *);
 extern uint32_t ctf_str_add_ref (ctf_dict_t *, const char *, uint32_t *ref);
@@ -745,7 +746,7 @@ extern int ctf_str_add_external (ctf_dict_t *, const char *, uint32_t offset);
 extern void ctf_str_remove_ref (ctf_dict_t *, const char *, uint32_t *ref);
 extern void ctf_str_rollback (ctf_dict_t *, ctf_snapshot_id_t);
 extern void ctf_str_purge_refs (ctf_dict_t *);
-extern ctf_strs_writable_t ctf_str_write_strtab (ctf_dict_t *);
+extern const ctf_strs_writable_t *ctf_str_write_strtab (ctf_dict_t *);
 
 extern struct ctf_archive_internal *
 ctf_new_archive_internal (int is_archive, int unmap_on_close,
@@ -762,10 +763,10 @@ extern int ctf_flip (ctf_dict_t *, ctf_header_t *, unsigned char *, int);
 extern ctf_dict_t *ctf_simple_open_internal (const char *, size_t, const char *,
 					     size_t, size_t,
 					     const char *, size_t,
-					     ctf_dynhash_t *, int *);
+					     ctf_dynhash_t *, ctf_dynhash_t *, int *);
 extern ctf_dict_t *ctf_bufopen_internal (const ctf_sect_t *, const ctf_sect_t *,
 					 const ctf_sect_t *, ctf_dynhash_t *,
-					 int *);
+					 ctf_dynhash_t *, int *);
 extern int ctf_import_unref (ctf_dict_t *fp, ctf_dict_t *pfp);
 extern int ctf_serialize (ctf_dict_t *);
 
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index 22475465fa8..6d7a276f2cd 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -1290,7 +1290,7 @@ ctf_dict_t *ctf_simple_open (const char *ctfsect, size_t ctfsect_size,
 {
   return ctf_simple_open_internal (ctfsect, ctfsect_size, symsect, symsect_size,
 				   symsect_entsize, strsect, strsect_size, NULL,
-				   errp);
+				   NULL, errp);
 }
 
 /* Open a CTF file, mocking up a suitable ctf_sect and overriding the external
@@ -1300,7 +1300,8 @@ ctf_dict_t *ctf_simple_open_internal (const char *ctfsect, size_t ctfsect_size,
 				      const char *symsect, size_t symsect_size,
 				      size_t symsect_entsize,
 				      const char *strsect, size_t strsect_size,
-				      ctf_dynhash_t *syn_strtab, int *errp)
+				      ctf_dynhash_t *syn_strtab,
+				      ctf_dynhash_t *atoms, int *errp)
 {
   ctf_sect_t skeleton;
 
@@ -1338,7 +1339,7 @@ ctf_dict_t *ctf_simple_open_internal (const char *ctfsect, size_t ctfsect_size,
     }
 
   return ctf_bufopen_internal (ctfsectp, symsectp, strsectp, syn_strtab,
-			       errp);
+			       atoms, errp);
 }
 
 /* Decode the specified CTF buffer and optional symbol table, and create a new
@@ -1350,7 +1351,7 @@ ctf_dict_t *
 ctf_bufopen (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 	     const ctf_sect_t *strsect, int *errp)
 {
-  return ctf_bufopen_internal (ctfsect, symsect, strsect, NULL, errp);
+  return ctf_bufopen_internal (ctfsect, symsect, strsect, NULL, NULL, errp);
 }
 
 /* Like ctf_bufopen, but overriding the external strtab with a synthetic one.  */
@@ -1358,7 +1359,7 @@ ctf_bufopen (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 ctf_dict_t *
 ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 		      const ctf_sect_t *strsect, ctf_dynhash_t *syn_strtab,
-		      int *errp)
+		      ctf_dynhash_t *atoms, int *errp)
 {
   const ctf_preamble_t *pp;
   size_t hdrsz = sizeof (ctf_header_t);
@@ -1615,7 +1616,14 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
      ctf_set_base().  */
 
   ctf_set_version (fp, hp, hp->cth_version);
-  if (ctf_str_create_atoms (fp) < 0)
+
+  /* Temporary assignment, just enough to be able to initialize
+     the atoms table.  */
+
+  fp->ctf_str[CTF_STRTAB_0].cts_strs = (const char *) fp->ctf_buf
+    + hp->cth_stroff;
+  fp->ctf_str[CTF_STRTAB_0].cts_len = hp->cth_strlen;
+  if (ctf_str_create_atoms (fp, atoms) < 0)
     {
       err = ENOMEM;
       goto bad;
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 6355d4225eb..82e5b7d705b 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -955,7 +955,7 @@ ctf_serialize (ctf_dict_t *fp)
   ctf_header_t hdr, *hdrp;
   ctf_dvdef_t *dvd;
   ctf_varent_t *dvarents;
-  ctf_strs_writable_t strtab;
+  const ctf_strs_writable_t *strtab;
   int err;
   int sym_functions = 0;
 
@@ -1090,36 +1090,34 @@ ctf_serialize (ctf_dict_t *fp)
   assert (t == (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_stroff);
 
   /* Construct the final string table and fill out all the string refs with the
-     final offsets.  Then purge the refs list, because we're about to move this
-     strtab onto the end of the buf, invalidating all the offsets.  */
-  strtab = ctf_str_write_strtab (fp);
-  ctf_str_purge_refs (fp);
+     final offsets.  */
 
-  if (strtab.cts_strs == NULL)
+  strtab = ctf_str_write_strtab (fp);
+
+  if (strtab == NULL)
     goto oom;
 
   /* Now the string table is constructed, we can sort the buffer of
      ctf_varent_t's.  */
-  ctf_sort_var_arg_cb_t sort_var_arg = { fp, (ctf_strs_t *) &strtab };
+  ctf_sort_var_arg_cb_t sort_var_arg = { fp, (ctf_strs_t *) strtab };
   ctf_qsort_r (dvarents, nvars, sizeof (ctf_varent_t), ctf_sort_var,
 	       &sort_var_arg);
 
-  if ((newbuf = realloc (buf, buf_size + strtab.cts_len)) == NULL)
+  if ((newbuf = realloc (buf, buf_size + strtab->cts_len)) == NULL)
     goto oom;
 
   buf = newbuf;
-  memcpy (buf + buf_size, strtab.cts_strs, strtab.cts_len);
+  memcpy (buf + buf_size, strtab->cts_strs, strtab->cts_len);
   hdrp = (ctf_header_t *) buf;
-  hdrp->cth_strlen = strtab.cts_len;
+  hdrp->cth_strlen = strtab->cts_len;
   buf_size += hdrp->cth_strlen;
-  free (strtab.cts_strs);
 
   /* Finally, we are ready to ctf_simple_open() the new dict.  If this is
      successful, we then switch nfp and fp and free the old dict.  */
 
   if ((nfp = ctf_simple_open_internal ((char *) buf, buf_size, NULL, 0,
 				       0, NULL, 0, fp->ctf_syn_ext_strtab,
-				       &err)) == NULL)
+				       fp->ctf_str_atoms, &err)) == NULL)
     {
       free (buf);
       return (ctf_set_errno (fp, err));
@@ -1189,9 +1187,11 @@ ctf_serialize (ctf_dict_t *fp)
   ctf_str_free_atoms (nfp);
   nfp->ctf_str_atoms = fp->ctf_str_atoms;
   nfp->ctf_prov_strtab = fp->ctf_prov_strtab;
+  nfp->ctf_dynstrtab = fp->ctf_dynstrtab;
   nfp->ctf_str_movable_refs = fp->ctf_str_movable_refs;
   fp->ctf_str_atoms = NULL;
   fp->ctf_prov_strtab = NULL;
+  fp->ctf_dynstrtab = NULL;
   fp->ctf_str_movable_refs = NULL;
   memset (&fp->ctf_dtdefs, 0, sizeof (ctf_list_t));
   memset (&fp->ctf_errs_warnings, 0, sizeof (ctf_list_t));
diff --git a/libctf/ctf-string.c b/libctf/ctf-string.c
index f25cd3abdeb..ccf36498eb9 100644
--- a/libctf/ctf-string.c
+++ b/libctf/ctf-string.c
@@ -20,10 +20,14 @@
 #include <assert.h>
 #include <ctf-impl.h>
 #include <string.h>
-#include <assert.h>
 
-/* Convert an encoded CTF string name into a pointer to a C string, using an
-  explicit internal strtab rather than the fp-based one.  */
+static ctf_str_atom_t *
+ctf_str_add_ref_internal (ctf_dict_t *fp, const char *str,
+			  int flags, uint32_t *ref);
+
+/* Convert an encoded CTF string name into a pointer to a C string, possibly
+  using an explicit internal provisional strtab rather than the fp-based
+  one.  */
 const char *
 ctf_strraw_explicit (ctf_dict_t *fp, uint32_t name, ctf_strs_t *strtab)
 {
@@ -32,18 +36,20 @@ ctf_strraw_explicit (ctf_dict_t *fp, uint32_t name, ctf_strs_t *strtab)
   if ((CTF_NAME_STID (name) == CTF_STRTAB_0) && (strtab != NULL))
     ctsp = strtab;
 
-  /* If this name is in the external strtab, and there is a synthetic strtab,
-     use it in preference.  */
+  /* If this name is in the external strtab, and there is a synthetic
+     strtab, use it in preference.  (This is used to add the set of strings
+     -- symbol names, etc -- the linker knows about before the strtab is
+     written out.)  */
 
   if (CTF_NAME_STID (name) == CTF_STRTAB_1
       && fp->ctf_syn_ext_strtab != NULL)
     return ctf_dynhash_lookup (fp->ctf_syn_ext_strtab,
 			       (void *) (uintptr_t) name);
 
-  /* If the name is in the internal strtab, and the offset is beyond the end of
-     the ctsp->cts_len but below the ctf_str_prov_offset, this is a provisional
-     string added by ctf_str_add*() but not yet built into a real strtab: get
-     the value out of the ctf_prov_strtab.  */
+  /* If the name is in the internal strtab, and the name offset is beyond
+     the end of the ctsp->cts_len but below the ctf_str_prov_offset, this is
+     a provisional string added by ctf_str_add*() but not yet built into a
+     real strtab: get the value out of the ctf_prov_strtab.  */
 
   if (CTF_NAME_STID (name) == CTF_STRTAB_0
       && name >= ctsp->cts_len && name < fp->ctf_str_prov_offset)
@@ -133,13 +139,25 @@ ctf_str_free_atom (void *a)
 }
 
 /* Create the atoms table.  There is always at least one atom in it, the null
-   string.  */
+   string: but also pull in atoms from the internal strtab.  (We rely on
+   calls to ctf_str_add_external to populate external strtab entries, since
+   these are often not quite the same as what appears in any external
+   strtab, and the external strtab is often huge and best not aggressively
+   pulled in.)
+
+   Alternatively, if passed, populate atoms from the passed-in table, but do
+   not propagate their flags or refs: they are all non-freeable and
+   non-movable.  (This is used when serializing a dict: this entire atoms
+   table will be thrown away shortly, so it is important that we not create
+   any new strings.)  */
 int
-ctf_str_create_atoms (ctf_dict_t *fp)
+ctf_str_create_atoms (ctf_dict_t *fp, ctf_dynhash_t *atoms)
 {
+  size_t i;
+
   fp->ctf_str_atoms = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
-					  free, ctf_str_free_atom);
-  if (fp->ctf_str_atoms == NULL)
+					  NULL, ctf_str_free_atom);
+  if (!fp->ctf_str_atoms)
     return -ENOMEM;
 
   if (!fp->ctf_prov_strtab)
@@ -160,6 +178,63 @@ ctf_str_create_atoms (ctf_dict_t *fp)
   if (errno == ENOMEM)
     goto oom_str_add;
 
+  /* Serializing.  We have existing strings in an existing atoms table with
+     possibly-live pointers to them which must be used unchanged.  Import
+     them into this atoms table.  */
+
+  if (atoms)
+    {
+      ctf_next_t *it = NULL;
+      void *k, *v;
+      int err;
+
+      while ((err = ctf_dynhash_next (atoms, &it, &k, &v)) == 0)
+	{
+	  ctf_str_atom_t *existing = v;
+	  ctf_str_atom_t *atom;
+
+	  if (existing->csa_str[0] == 0)
+	    continue;
+
+	  if ((atom = malloc (sizeof (struct ctf_str_atom))) == NULL)
+	    goto oom_str_add;
+	  memcpy (atom, existing, sizeof (struct ctf_str_atom));
+	  memset (&atom->csa_refs, 0, sizeof(ctf_list_t));
+	  atom->csa_flags = 0;
+
+	  if (ctf_dynhash_insert (fp->ctf_str_atoms, atom->csa_str, atom) < 0)
+	    {
+	      free (atom);
+	      goto oom_str_add;
+	    }
+	}
+    }
+  else
+    {
+      /* Not serializing.  Pull in all the strings in the strtab as new
+	 atoms.  The provisional strtab must be empty at this point, so
+	 there is no need to populate atoms from it as well.  Types in this
+	 subset are frozen and readonly, so the refs list and movable refs
+	 list need not be populated.  */
+
+      for (i = 0; i < fp->ctf_str[CTF_STRTAB_0].cts_len;
+	   i += strlen (&fp->ctf_str[CTF_STRTAB_0].cts_strs[i]) + 1)
+	{
+	  ctf_str_atom_t *atom;
+
+	  if (fp->ctf_str[CTF_STRTAB_0].cts_strs[i] == 0)
+	    continue;
+
+	  atom = ctf_str_add_ref_internal (fp, &fp->ctf_str[CTF_STRTAB_0].cts_strs[i],
+					   0, 0);
+
+	  if (!atom)
+	    goto oom_str_add;
+
+	  atom->csa_offset = i;
+	}
+    }
+
   return 0;
 
  oom_str_add:
@@ -181,6 +256,11 @@ ctf_str_free_atoms (ctf_dict_t *fp)
   ctf_dynhash_destroy (fp->ctf_prov_strtab);
   ctf_dynhash_destroy (fp->ctf_str_atoms);
   ctf_dynhash_destroy (fp->ctf_str_movable_refs);
+  if (fp->ctf_dynstrtab)
+    {
+      free (fp->ctf_dynstrtab->cts_strs);
+      free (fp->ctf_dynstrtab);
+    }
 }
 
 #define CTF_STR_ADD_REF 0x1
@@ -537,69 +617,6 @@ ctf_str_update_refs (ctf_str_atom_t *refs, uint32_t value)
     *(ref->caf_ref) = value;
 }
 
-/* State shared across the strtab write process.  */
-typedef struct ctf_strtab_write_state
-{
-  /* Strtab we are writing, and the number of strings in it.  */
-  ctf_strs_writable_t *strtab;
-  size_t strtab_count;
-
-  /* Pointers to (existing) atoms in the atoms table, for qsorting.  */
-  ctf_str_atom_t **sorttab;
-
-  /* Loop counter for sorttab population.  */
-  size_t i;
-
-  /* The null-string atom (skipped during population).  */
-  ctf_str_atom_t *nullstr;
-} ctf_strtab_write_state_t;
-
-/* Count the number of entries in the strtab, and its length.  */
-static void
-ctf_str_count_strtab (void *key _libctf_unused_, void *value,
-	      void *arg)
-{
-  ctf_str_atom_t *atom = (ctf_str_atom_t *) value;
-  ctf_strtab_write_state_t *s = (ctf_strtab_write_state_t *) arg;
-
-  /* We only factor in the length of items that have no offset and have refs:
-     other items are in the external strtab, or will simply not be written out
-     at all.  They still contribute to the total count, though, because we still
-     have to sort them.  We add in the null string's length explicitly, outside
-     this function, since it is explicitly written out even if it has no refs at
-     all.  */
-
-  if (s->nullstr == atom)
-    {
-      s->strtab_count++;
-      return;
-    }
-
-  if (!ctf_list_empty_p (&atom->csa_refs))
-    {
-      if (!atom->csa_external_offset)
-	s->strtab->cts_len += strlen (atom->csa_str) + 1;
-      s->strtab_count++;
-    }
-}
-
-/* Populate the sorttab with pointers to the strtab atoms.  */
-static void
-ctf_str_populate_sorttab (void *key _libctf_unused_, void *value,
-		  void *arg)
-{
-  ctf_str_atom_t *atom = (ctf_str_atom_t *) value;
-  ctf_strtab_write_state_t *s = (ctf_strtab_write_state_t *) arg;
-
-  /* Skip the null string.  */
-  if (s->nullstr == atom)
-    return;
-
-  /* Skip atoms with no refs.  */
-  if (!ctf_list_empty_p (&atom->csa_refs))
-    s->sorttab[s->i++] = atom;
-}
-
 /* Sort the strtab.  */
 static int
 ctf_str_sort_strtab (const void *a, const void *b)
@@ -611,79 +628,182 @@ ctf_str_sort_strtab (const void *a, const void *b)
 }
 
 /* Write out and return a strtab containing all strings with recorded refs,
-   adjusting the refs to refer to the corresponding string.  The returned strtab
-   may be NULL on error.  Also populate the synthetic strtab with mappings from
-   external strtab offsets to names, so we can look them up with ctf_strptr().
-   Only external strtab offsets with references are added.  */
-ctf_strs_writable_t
+   adjusting the refs to refer to the corresponding string.  The returned
+   strtab is already assigned to strtab 0 in this dict, is owned by this
+   dict, and may be NULL on error.  Also populate the synthetic strtab with
+   mappings from external strtab offsets to names, so we can look them up
+   with ctf_strptr().  Only external strtab offsets with references are
+   added.
+
+   As a side effect, replaces the strtab of the current dict with the newly-
+   generated strtab.  This is an exception to the general rule that
+   serialization does not change the dict passed in, because the alternative
+   is to copy the entire atoms table on every reserialization just to avoid
+   modifying the original, which is excessively costly for minimal gain.
+
+   We use the lazy man's approach and double memory costs by always storing
+   atoms as individually allocated entities whenever they come from anywhere
+   but a freshly-opened, mmapped dict, even though after serialization there
+   is another copy in the strtab; this ensures that ctf_strptr()-returned
+   pointers to them remain valid for the lifetime of the dict.
+
+   This is all rendered more complex because if a dict is ctf_open()ed it
+   will have a bunch of strings in its strtab already, and their strtab
+   offsets can never change (without piles of complexity to rescan the
+   entire dict just to get all the offsets to all of them into the atoms
+   table).  Entries below the existing strtab limit are just copied into the
+   new dict: entries above it are new, and are sorted first, then
+   appended to it.  The sorting is purely a compression-efficiency
+   improvement, and we get nearly as good an improvement from sorting big
+   chunks like this as we would from sorting the whole thing.  */
+
+const ctf_strs_writable_t *
 ctf_str_write_strtab (ctf_dict_t *fp)
 {
-  ctf_strs_writable_t strtab;
-  ctf_str_atom_t *nullstr;
+  ctf_strs_writable_t *strtab;
+  size_t strtab_count = 0;
   uint32_t cur_stroff = 0;
-  ctf_strtab_write_state_t s;
   ctf_str_atom_t **sorttab;
+  ctf_next_t *it = NULL;
   size_t i;
+  void *v;
+  int err;
+  int new_strtab = 0;
   int any_external = 0;
 
-  memset (&strtab, 0, sizeof (struct ctf_strs_writable));
-  memset (&s, 0, sizeof (struct ctf_strtab_write_state));
-  s.strtab = &strtab;
+  strtab = calloc (1, sizeof (ctf_strs_writable_t));
+  if (!strtab)
+    return NULL;
 
-  nullstr = ctf_dynhash_lookup (fp->ctf_str_atoms, "");
-  if (!nullstr)
+  /* The strtab contains the existing string table at its start: figure out
+     how many new strings we need to add.  We only need to add new strings
+     that have no external offset, that have refs, and that are found in the
+     provisional strtab.  If the existing strtab is empty we also need to
+     add the null string at its start.  */
+
+  strtab->cts_len = fp->ctf_str[CTF_STRTAB_0].cts_len;
+
+  if (strtab->cts_len == 0)
     {
-      ctf_err_warn (fp, 0, ECTF_INTERNAL, _("null string not found in strtab"));
-      strtab.cts_strs = NULL;
-      return strtab;
+      new_strtab = 1;
+      strtab->cts_len++; 			/* For the \0.  */
     }
 
-  s.nullstr = nullstr;
-  ctf_dynhash_iter (fp->ctf_str_atoms, ctf_str_count_strtab, &s);
-  strtab.cts_len++;				/* For the null string.  */
+  /* Count new entries in the strtab: i.e. entries in the provisional
+     strtab.  Ignore any entry for \0, entries which ended up in the
+     external strtab, and unreferenced entries.  */
 
-  ctf_dprintf ("%lu bytes of strings in strtab.\n",
-	       (unsigned long) strtab.cts_len);
+  while ((err = ctf_dynhash_next (fp->ctf_prov_strtab, &it, NULL, &v)) == 0)
+    {
+      const char *str = (const char *) v;
+      ctf_str_atom_t *atom;
 
-  /* Sort the strtab.  Force the null string to be first.  */
-  sorttab = calloc (s.strtab_count, sizeof (ctf_str_atom_t *));
+      atom = ctf_dynhash_lookup (fp->ctf_str_atoms, str);
+      if (!ctf_assert (fp, atom))
+	goto err_strtab;
+
+      if (atom->csa_str[0] == 0 || ctf_list_empty_p (&atom->csa_refs) ||
+	  atom->csa_external_offset)
+	continue;
+
+      strtab->cts_len += strlen (atom->csa_str) + 1;
+      strtab_count++;
+    }
+  if (err != ECTF_NEXT_END)
+    {
+      ctf_dprintf ("ctf_str_write_strtab: error counting strtab entries: %s\n",
+		   ctf_errmsg (err));
+      goto err_strtab;
+    }
+
+  ctf_dprintf ("%lu bytes of strings in strtab: %lu pre-existing.\n",
+	       (unsigned long) strtab->cts_len,
+	       (unsigned long) fp->ctf_str[CTF_STRTAB_0].cts_len);
+
+  /* Sort the new part of the strtab.  */
+
+  sorttab = calloc (strtab_count, sizeof (ctf_str_atom_t *));
   if (!sorttab)
-    goto oom;
+    {
+      ctf_set_errno (fp, ENOMEM);
+      goto err_strtab;
+    }
 
-  sorttab[0] = nullstr;
-  s.i = 1;
-  s.sorttab = sorttab;
-  ctf_dynhash_iter (fp->ctf_str_atoms, ctf_str_populate_sorttab, &s);
+  i = 0;
+  while ((err = ctf_dynhash_next (fp->ctf_prov_strtab, &it, NULL, &v)) == 0)
+    {
+      ctf_str_atom_t *atom;
 
-  qsort (&sorttab[1], s.strtab_count - 1, sizeof (ctf_str_atom_t *),
+      atom = ctf_dynhash_lookup (fp->ctf_str_atoms, v);
+      if (!ctf_assert (fp, atom))
+	goto err_sorttab;
+
+      if (atom->csa_str[0] == 0 || ctf_list_empty_p (&atom->csa_refs) ||
+	  atom->csa_external_offset)
+	continue;
+
+      sorttab[i++] = atom;
+    }
+
+  qsort (sorttab, strtab_count, sizeof (ctf_str_atom_t *),
 	 ctf_str_sort_strtab);
 
-  if ((strtab.cts_strs = malloc (strtab.cts_len)) == NULL)
-    goto oom_sorttab;
+  if ((strtab->cts_strs = malloc (strtab->cts_len)) == NULL)
+    goto err_sorttab;
 
-  /* Update all refs: also update the strtab appropriately.  */
-  for (i = 0; i < s.strtab_count; i++)
+  cur_stroff = fp->ctf_str[CTF_STRTAB_0].cts_len;
+
+  if (new_strtab)
     {
-      if (sorttab[i]->csa_external_offset)
-	{
-	  /* External strtab entry.  */
+      strtab->cts_strs[0] = 0;
+      cur_stroff++;
+    }
+  else
+    memcpy (strtab->cts_strs, fp->ctf_str[CTF_STRTAB_0].cts_strs,
+	    fp->ctf_str[CTF_STRTAB_0].cts_len);
 
-	  any_external = 1;
-	  ctf_str_update_refs (sorttab[i], sorttab[i]->csa_external_offset);
-	  sorttab[i]->csa_offset = sorttab[i]->csa_external_offset;
-	}
-      else
-	{
-	  /* Internal strtab entry with refs: actually add to the string
-	     table.  */
+  /* Work over the sorttab, add its strings to the strtab, and remember
+     where they are in the csa_offset for the appropriate atom.  No ref
+     updating is done at this point, because refs might well relate to
+     already-existing strings, or external strings, which do not need adding
+     to the strtab and may not be in the sorttab.  */
 
-	  ctf_str_update_refs (sorttab[i], cur_stroff);
-	  sorttab[i]->csa_offset = cur_stroff;
-	  strcpy (&strtab.cts_strs[cur_stroff], sorttab[i]->csa_str);
-	  cur_stroff += strlen (sorttab[i]->csa_str) + 1;
-	}
+  for (i = 0; i < strtab_count; i++)
+    {
+      sorttab[i]->csa_offset = cur_stroff;
+      strcpy (&strtab->cts_strs[cur_stroff], sorttab[i]->csa_str);
+      cur_stroff += strlen (sorttab[i]->csa_str) + 1;
     }
   free (sorttab);
+  sorttab = NULL;
+
+  /* Update all refs, then purge them as no longer necessary: also update
+     the strtab appropriately.  */
+
+  while ((err = ctf_dynhash_next (fp->ctf_str_atoms, &it, NULL, &v)) == 0)
+    {
+      ctf_str_atom_t *atom = (ctf_str_atom_t *) v;
+      uint32_t offset;
+
+      if (ctf_list_empty_p (&atom->csa_refs))
+	continue;
+
+      if (atom->csa_external_offset)
+	{
+	  any_external = 1;
+	  offset = atom->csa_external_offset;
+	}
+      else
+	offset = atom->csa_offset;
+      ctf_str_update_refs (atom, offset);
+    }
+  if (err != ECTF_NEXT_END)
+    {
+      ctf_dprintf ("ctf_str_write_strtab: error iterating over atoms while updating refs: %s\n",
+		   ctf_errmsg (err));
+      goto err_strtab;
+    }
+  ctf_str_purge_refs (fp);
 
   if (!any_external)
     {
@@ -691,16 +811,29 @@ ctf_str_write_strtab (ctf_dict_t *fp)
       fp->ctf_syn_ext_strtab = NULL;
     }
 
+  /* Replace the old strtab with the new one in this dict.  */
+
+  if (fp->ctf_dynstrtab)
+    {
+      free (fp->ctf_dynstrtab->cts_strs);
+      free (fp->ctf_dynstrtab);
+    }
+
+  fp->ctf_dynstrtab = strtab;
+  fp->ctf_str[CTF_STRTAB_0].cts_strs = strtab->cts_strs;
+  fp->ctf_str[CTF_STRTAB_0].cts_len = strtab->cts_len;
+
   /* All the provisional strtab entries are now real strtab entries, and
      ctf_strptr() will find them there.  The provisional offset now starts right
      beyond the new end of the strtab.  */
 
   ctf_dynhash_empty (fp->ctf_prov_strtab);
-  fp->ctf_str_prov_offset = strtab.cts_len + 1;
+  fp->ctf_str_prov_offset = strtab->cts_len + 1;
   return strtab;
 
- oom_sorttab:
+ err_sorttab:
   free (sorttab);
- oom:
-  return strtab;
+ err_strtab:
+  free (strtab);
+  return NULL;
 }
-- 
2.44.0.273.ge0bd14271f
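
The strtab layout described in the ctf_str_write_strtab() comment in the
patch above (existing strtab bytes copied unchanged so their offsets never
move, new strings sorted and then appended) can be sketched in isolation.
This is a minimal illustration only, not libctf code: struct toy_atom and
toy_write_strtab() are hypothetical names invented here, and the sketch
ignores refs, external offsets, and the provisional strtab entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an atom: a string plus its assigned offset.  */
struct toy_atom
{
  const char *str;
  size_t offset;
};

/* Order atoms by string content, as a qsort() comparison function.  */
static int
toy_atom_cmp (const void *a, const void *b)
{
  const struct toy_atom *one = a;
  const struct toy_atom *two = b;

  return strcmp (one->str, two->str);
}

/* Build a new strtab from OLD (OLD_LEN bytes, possibly zero) plus N_NEW new
   atoms.  Existing bytes are copied unchanged, so old offsets stay valid;
   new strings are sorted, then appended, and each atom's offset is
   recorded.  Returns a malloc()ed buffer and stores its length in *NEW_LEN,
   or returns NULL on allocation failure.  */
static char *
toy_write_strtab (const char *old, size_t old_len,
		  struct toy_atom *atoms, size_t n_new, size_t *new_len)
{
  size_t len = old_len;
  size_t cur, i;
  char *buf;

  assert (new_len != NULL);

  if (old_len == 0)
    len++;				/* For the leading null string.  */

  for (i = 0; i < n_new; i++)
    len += strlen (atoms[i].str) + 1;

  if ((buf = malloc (len)) == NULL)
    return NULL;

  if (old_len == 0)
    {
      buf[0] = 0;
      cur = 1;
    }
  else
    {
      memcpy (buf, old, old_len);
      cur = old_len;
    }

  /* Sorting only the new chunk is purely a compression aid.  */
  qsort (atoms, n_new, sizeof (struct toy_atom), toy_atom_cmp);

  for (i = 0; i < n_new; i++)
    {
      atoms[i].offset = cur;
      strcpy (buf + cur, atoms[i].str);
      cur += strlen (atoms[i].str) + 1;
    }

  *new_len = len;
  return buf;
}
```

Given an existing 10-byte strtab holding "", "int" and "char", and the new
strings "zeta" and "alpha", this sketch leaves "int" at offset 1 and "char"
at offset 5, and places "alpha" at offset 10 and "zeta" at offset 16.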



* [PATCH 13/22] libctf: make ctf_serialize() actually serialize
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by serializing the dict and reading
it back in).  Back in the days when you could only do type lookups on
static types, this meant you could see all the types you added recently, at
the small, small cost of making it impossible to change those older types
ever again and inducing an amortized O(n^2) cost if you actually wanted to
add references to types you added at arbitrary times to later types.

It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.

Some time ago this was all changed so that you could look up dynamic types
just as easily as static types.  ctf_update() was cut down so that only its
visible side-effect of affecting ctf_discard() remained; the old
ctf_update() implementation was renamed to ctf_serialize(), made internal to
libctf, and called from the various functions that wrote files out.

... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t.  This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly.  This simplifies most of its callers
significantly.

(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you had just ctf_write_mem()ed
yourself, purely for its invisible side-effect of serializing the dict!)
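
The shape of the fix described above (serialization returns a buffer and
leaves the dict alone, and every writer serializes for itself) can be
sketched with a toy model.  This is a hedged illustration only: struct
toy_dict, toy_serialize() and toy_write_mem() are hypothetical names made up
for this sketch and do not correspond to the real libctf structures or
signatures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy dict: just an array of 32-bit "types".  */
struct toy_dict
{
  uint32_t *types;
  size_t ntypes;
};

/* Return a malloc()ed serialized copy of DICT (a count followed by the
   types) and store its size in *SIZE.  DICT itself is not modified.
   Returns NULL on allocation failure.  */
static unsigned char *
toy_serialize (const struct toy_dict *dict, size_t *size)
{
  uint32_t count = (uint32_t) dict->ntypes;
  size_t len = sizeof (count) + dict->ntypes * sizeof (uint32_t);
  unsigned char *buf;

  assert (size != NULL);

  if ((buf = malloc (len)) == NULL)
    return NULL;

  memcpy (buf, &count, sizeof (count));
  if (dict->ntypes > 0)
    memcpy (buf + sizeof (count), dict->types,
	    dict->ntypes * sizeof (uint32_t));

  *size = len;
  return buf;
}

/* A writer: it serializes for itself and consumes the buffer directly, so
   it never depends on some other writer having been called first.  Returns
   the number of bytes written to OUT, or 0 on failure.  */
static size_t
toy_write_mem (const struct toy_dict *dict, unsigned char *out,
	       size_t out_size)
{
  size_t size = 0;
  unsigned char *buf = toy_serialize (dict, &size);

  if (buf == NULL || size > out_size)
    {
      free (buf);
      return 0;
    }

  memcpy (out, buf, size);
  free (buf);
  return size;
}
```

Because toy_serialize() takes a const dict and only hands back a buffer, a
second writer (a compressing one, say) could call it in exactly the same way
with no ordering requirement between writers, which is the property the old
swap-the-guts design lacked.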

This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality.  It also
lets us lift the restriction on reserializing a dict that was ctf_open()ed
rather than being ctf_create()d: it's now perfectly OK to open a dict,
modify it (except for adding members to existing structs, unions, or enums,
which fails with -ECTF_RDONLY), and write it out again, just as one would
expect.

libctf/

	* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
	(ctf_type_sect_size): Add static type sizes too.
	(ctf_serialize): Return the new dict rather than updating the
	existing dict.  No longer fail for dicts with static types;
	copy them onto the start of the new types table.
	(ctf_gzwrite): Actually serialize before gzwriting.
	(ctf_write_mem): Improve forced (test-mode) endian-flipping:
	flip dicts even if they are too small to be compressed.
	Improve confusing variable naming.
	* ctf-archive.c (arc_write_one_ctf): Don't bother to call
	ctf_serialize: both the functions we call do so.
	* ctf-string.c (ctf_str_create_atoms): Drop serializing case
	(atoms arg).
	* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
	(ctf_simple_open_internal): Delete.
	(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
	longer bother with syn_ext_strtab or forced atoms table,
	serialization no longer needs them.
	* ctf-create.c (ctf_create): Call ctf_bufopen directly.
	* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
	(ctf_simple_open_internal): Delete.
	(ctf_bufopen_internal): Likewise.
	(ctf_serialize): Adjust.
	* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
	this is supposed to work.
---
 libctf/ctf-archive.c                          |   7 +-
 libctf/ctf-create.c                           |   2 +-
 libctf/ctf-impl.h                             |  19 +-
 libctf/ctf-open.c                             |  39 +-
 libctf/ctf-serialize.c                        | 334 ++++++------------
 libctf/ctf-string.c                           |  73 +---
 .../testsuite/libctf-lookup/add-to-opened.c   |  11 +-
 7 files changed, 137 insertions(+), 348 deletions(-)

diff --git a/libctf/ctf-archive.c b/libctf/ctf-archive.c
index a88c6135e1a..3f12c0da85d 100644
--- a/libctf/ctf-archive.c
+++ b/libctf/ctf-archive.c
@@ -253,12 +253,12 @@ ctf_arc_write (const char *file, ctf_dict_t **ctf_dicts, size_t ctf_dict_cnt,
   return err;
 }
 
-/* Write one CTF file out.  Return the file position of the written file (or
+/* Write one CTF dict out.  Return the file position of the written file (or
    rather, of the file-size uint64_t that precedes it): negative return is a
    negative errno or ctf_errno value.  On error, the file position may no longer
    be at the end of the file.  */
 static off_t
-arc_write_one_ctf (ctf_dict_t * f, int fd, size_t threshold)
+arc_write_one_ctf (ctf_dict_t *f, int fd, size_t threshold)
 {
   off_t off, end_off;
   uint64_t ctfsz = 0;
@@ -266,9 +266,6 @@ arc_write_one_ctf (ctf_dict_t * f, int fd, size_t threshold)
   size_t ctfsz_len;
   int (*writefn) (ctf_dict_t * fp, int fd);
 
-  if (ctf_serialize (f) < 0)
-    return f->ctf_errno * -1;
-
   if ((off = lseek (fd, 0, SEEK_CUR)) < 0)
     return errno * -1;
 
diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index 78fb0305c20..ee79e49794d 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -133,7 +133,7 @@ ctf_create (int *errp)
   cts.cts_size = sizeof (hdr);
   cts.cts_entsize = 1;
 
-  if ((fp = ctf_bufopen_internal (&cts, NULL, NULL, NULL, NULL, errp)) == NULL)
+  if ((fp = ctf_bufopen (&cts, NULL, NULL, errp)) == NULL)
     goto err;
 
   /* These hashes will have been initialized with a starting size of zero,
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index 3eef232bea0..03e1a66416a 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -364,14 +364,7 @@ typedef struct ctf_dedup
    clients, who see it only as an opaque pointer.  Modifications can therefore
    be made freely to this structure without regard to client versioning.  The
    ctf_dict_t typedef appears in <ctf-api.h> and declares a forward tag.
-   (A ctf_file_t typedef also appears there, for historical reasons.)
-
-   NOTE: ctf_serialize requires that everything inside of ctf_dict either be an
-   immediate value, a pointer to dynamically allocated data *outside* of the
-   ctf_dict itself, a pointer to statically allocated data, or specially handled
-   in ctf_serialize.  If you add a pointer to ctf_dict that points to something
-   within the ctf_dict itself, you must make corresponding changes to
-   ctf_serialize.  */
+   (A ctf_file_t typedef also appears there, for historical reasons.)  */
 
 struct ctf_dict
 {
@@ -735,7 +728,7 @@ extern const char *ctf_strraw (ctf_dict_t *, uint32_t);
 extern const char *ctf_strraw_explicit (ctf_dict_t *, uint32_t,
 					ctf_strs_t *);
 extern const char *ctf_strptr_validate (ctf_dict_t *, uint32_t);
-extern int ctf_str_create_atoms (ctf_dict_t *, ctf_dynhash_t *atoms);
+extern int ctf_str_create_atoms (ctf_dict_t *);
 extern void ctf_str_free_atoms (ctf_dict_t *);
 extern uint32_t ctf_str_add (ctf_dict_t *, const char *);
 extern uint32_t ctf_str_add_ref (ctf_dict_t *, const char *, uint32_t *ref);
@@ -760,15 +753,7 @@ extern void *ctf_set_open_errno (int *, int);
 extern void ctf_flip_header (ctf_header_t *);
 extern int ctf_flip (ctf_dict_t *, ctf_header_t *, unsigned char *, int);
 
-extern ctf_dict_t *ctf_simple_open_internal (const char *, size_t, const char *,
-					     size_t, size_t,
-					     const char *, size_t,
-					     ctf_dynhash_t *, ctf_dynhash_t *, int *);
-extern ctf_dict_t *ctf_bufopen_internal (const ctf_sect_t *, const ctf_sect_t *,
-					 const ctf_sect_t *, ctf_dynhash_t *,
-					 ctf_dynhash_t *, int *);
 extern int ctf_import_unref (ctf_dict_t *fp, ctf_dict_t *pfp);
-extern int ctf_serialize (ctf_dict_t *);
 
 _libctf_malloc_
 extern void *ctf_mmap (size_t length, size_t offset, int fd);
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index 6d7a276f2cd..9cbf07626cc 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -1233,9 +1233,8 @@ flip_types (ctf_dict_t *fp, void *start, size_t len, int to_foreign)
   return 0;
 }
 
-/* Flip the endianness of BUF, given the offsets in the (already endian-
-   converted) CTH.  If TO_FOREIGN is set, flip to foreign-endianness; if not,
-   flip away.
+/* Flip the endianness of BUF, given the offsets in the (native-endianness) CTH.
+   If TO_FOREIGN is set, flip to foreign-endianness; if not, flip away.
 
    All of this stuff happens before the header is fully initialized, so the
    LCTF_*() macros cannot be used yet.  Since we do not try to endian-convert v1
@@ -1287,21 +1286,6 @@ ctf_dict_t *ctf_simple_open (const char *ctfsect, size_t ctfsect_size,
 			     size_t symsect_entsize,
 			     const char *strsect, size_t strsect_size,
 			     int *errp)
-{
-  return ctf_simple_open_internal (ctfsect, ctfsect_size, symsect, symsect_size,
-				   symsect_entsize, strsect, strsect_size, NULL,
-				   NULL, errp);
-}
-
-/* Open a CTF file, mocking up a suitable ctf_sect and overriding the external
-   strtab with a synthetic one.  */
-
-ctf_dict_t *ctf_simple_open_internal (const char *ctfsect, size_t ctfsect_size,
-				      const char *symsect, size_t symsect_size,
-				      size_t symsect_entsize,
-				      const char *strsect, size_t strsect_size,
-				      ctf_dynhash_t *syn_strtab,
-				      ctf_dynhash_t *atoms, int *errp)
 {
   ctf_sect_t skeleton;
 
@@ -1338,8 +1322,7 @@ ctf_dict_t *ctf_simple_open_internal (const char *ctfsect, size_t ctfsect_size,
       strsectp = &str_sect;
     }
 
-  return ctf_bufopen_internal (ctfsectp, symsectp, strsectp, syn_strtab,
-			       atoms, errp);
+  return ctf_bufopen (ctfsectp, symsectp, strsectp, errp);
 }
 
 /* Decode the specified CTF buffer and optional symbol table, and create a new
@@ -1350,16 +1333,6 @@ ctf_dict_t *ctf_simple_open_internal (const char *ctfsect, size_t ctfsect_size,
 ctf_dict_t *
 ctf_bufopen (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 	     const ctf_sect_t *strsect, int *errp)
-{
-  return ctf_bufopen_internal (ctfsect, symsect, strsect, NULL, NULL, errp);
-}
-
-/* Like ctf_bufopen, but overriding the external strtab with a synthetic one.  */
-
-ctf_dict_t *
-ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
-		      const ctf_sect_t *strsect, ctf_dynhash_t *syn_strtab,
-		      ctf_dynhash_t *atoms, int *errp)
 {
   const ctf_preamble_t *pp;
   size_t hdrsz = sizeof (ctf_header_t);
@@ -1370,8 +1343,7 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
 
   libctf_init_debug();
 
-  if ((ctfsect == NULL) || ((symsect != NULL) &&
-			    ((strsect == NULL) && syn_strtab == NULL)))
+  if ((ctfsect == NULL) || ((symsect != NULL) && (strsect == NULL)))
     return (ctf_set_open_errno (errp, EINVAL));
 
   if (symsect != NULL && symsect->cts_entsize != sizeof (Elf32_Sym) &&
@@ -1623,7 +1595,7 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
   fp->ctf_str[CTF_STRTAB_0].cts_strs = (const char *) fp->ctf_buf
     + hp->cth_stroff;
   fp->ctf_str[CTF_STRTAB_0].cts_len = hp->cth_strlen;
-  if (ctf_str_create_atoms (fp, atoms) < 0)
+  if (ctf_str_create_atoms (fp) < 0)
     {
       err = ENOMEM;
       goto bad;
@@ -1669,7 +1641,6 @@ ctf_bufopen_internal (const ctf_sect_t *ctfsect, const ctf_sect_t *symsect,
       fp->ctf_str[CTF_STRTAB_1].cts_strs = strsect->cts_data;
       fp->ctf_str[CTF_STRTAB_1].cts_len = strsect->cts_size;
     }
-  fp->ctf_syn_ext_strtab = syn_strtab;
 
   /* Dynamic state, for dynamic addition to this dict after loading.  */
 
diff --git a/libctf/ctf-serialize.c b/libctf/ctf-serialize.c
index 82e5b7d705b..2d1f014db28 100644
--- a/libctf/ctf-serialize.c
+++ b/libctf/ctf-serialize.c
@@ -470,9 +470,9 @@ ctf_symtypetab_sect_sizes (ctf_dict_t *fp, emit_symtypetab_state_t *s,
      filter out reported symbols from the variable section, and filter out all
      other symbols from the symtypetab sections.  (If we are not linking, the
      symbols are sorted; if we are linking, don't bother sorting if we are not
-     filtering out reported symbols: this is almost certaily an ld -r and only
+     filtering out reported symbols: this is almost certainly an ld -r and only
      the linker is likely to consume these symtypetabs again.  The linker
-     doesn't care what order the symtypetab entries is in, since it only
+     doesn't care what order the symtypetab entries are in, since it only
      iterates over symbols and does not use the ctf_lookup_by_symbol* API.)  */
 
   s->sort_syms = 1;
@@ -718,8 +718,8 @@ symerr:
 
 /* Type section.  */
 
-/* Iterate through the dynamic type definition list and compute the
-   size of the CTF type section.  */
+/* Iterate through the static types and the dynamic type definition list and
+   compute the size of the CTF type section.  */
 
 static size_t
 ctf_type_sect_size (ctf_dict_t *fp)
@@ -778,7 +778,7 @@ ctf_type_sect_size (ctf_dict_t *fp)
 	}
     }
 
-  return type_size;
+  return type_size + fp->ctf_header->cth_stroff - fp->ctf_header->cth_typeoff;
 }
 
 /* Take a final lap through the dynamic type definition list and copy the
@@ -933,30 +933,22 @@ ctf_sort_var (const void *one_, const void *two_, void *arg_)
 
 /* Overall serialization.  */
 
-/* If the specified CTF dict is writable and has been modified, reload this dict
-   with the updated type definitions, ready for serialization.  In order to make
-   this code and the rest of libctf as simple as possible, we perform updates by
-   taking the dynamic type definitions and creating an in-memory CTF dict
-   containing the definitions, and then call ctf_simple_open_internal() on it.
-   We perform one extra trick here for the benefit of callers and to keep our
-   code simple: ctf_simple_open_internal() will return a new ctf_dict_t, but we
-   want to keep the fp constant for the caller, so after
-   ctf_simple_open_internal() returns, we use memcpy to swap the interior of the
-   old and new ctf_dict_t's, and then free the old.
+/* Emit a new CTF dict which is a serialized copy of this one: also reify
+   the string table and update all offsets in the current dict suitably.
+   (This simplifies ctf-string.c a little, at the cost of storing a second
+   copy of the strtab if this dict was originally read in via ctf_open.)
 
-   We do not currently support serializing a dict that has already been
-   serialized in the past: but all the tables support it except for the types
-   table.  */
+   Other aspects of the existing dict are unchanged, although some
+   static entries may be duplicated in the dynamic state (which should
+   have no effect on visible operation).  */
 
-int
-ctf_serialize (ctf_dict_t *fp)
+static unsigned char *
+ctf_serialize (ctf_dict_t *fp, size_t *bufsiz)
 {
-  ctf_dict_t ofp, *nfp;
   ctf_header_t hdr, *hdrp;
   ctf_dvdef_t *dvd;
   ctf_varent_t *dvarents;
   const ctf_strs_writable_t *strtab;
-  int err;
   int sym_functions = 0;
 
   unsigned char *t;
@@ -969,13 +961,6 @@ ctf_serialize (ctf_dict_t *fp)
   emit_symtypetab_state_t symstate;
   memset (&symstate, 0, sizeof (emit_symtypetab_state_t));
 
-  /* This isn't a very nice error code, but it's close enough: it's what you
-     get if you try to modify a type loaded out of a serialized dict, so
-     it makes at least a little sense that it's what you get if you try to
-     reserialize the dict again.  */
-  if (fp->ctf_stypes > 0)
-    return (ctf_set_errno (fp, ECTF_RDONLY));
-
   /* Fill in an initial CTF header.  We will leave the label, object,
      and function sections empty and only output a header, type section,
      and string table.  The type section begins at a 4-byte aligned
@@ -993,7 +978,7 @@ ctf_serialize (ctf_dict_t *fp)
 
   /* Propagate all symbols in the symtypetabs into the dynamic state, so that
      we can put them back in the right order.  Symbols already in the dynamic
-     state are left as they are.  */
+     state, likely due to repeated serialization, are left unchanged.  */
   do
     {
       ctf_next_t *it = NULL;
@@ -1004,17 +989,17 @@ ctf_serialize (ctf_dict_t *fp)
 					    sym_functions)) != CTF_ERR)
 	if ((ctf_add_funcobjt_sym_forced (fp, sym_functions, sym_name, sym)) < 0)
 	  if (ctf_errno (fp) != ECTF_DUPLICATE)
-	    return -1;				/* errno is set for us.  */
+	    return NULL;			/* errno is set for us.  */
 
       if (ctf_errno (fp) != ECTF_NEXT_END)
-	return -1;				/* errno is set for us.  */
+	return NULL;				/* errno is set for us.  */
     } while (sym_functions++ < 1);
 
   /* Figure out how big the symtypetabs are now.  */
 
   if (ctf_symtypetab_sect_sizes (fp, &symstate, &hdr, &objt_size, &func_size,
 				 &objtidx_size, &funcidx_size) < 0)
-    return -1;					/* errno is set for us.  */
+    return NULL;				/* errno is set for us.  */
 
   /* Propagate all vars into the dynamic state, so we can put them back later.
      Variables already in the dynamic state, likely due to repeated
@@ -1026,7 +1011,7 @@ ctf_serialize (ctf_dict_t *fp)
 
       if (name != NULL && !ctf_dvd_lookup (fp, name))
 	if (ctf_add_variable_forced (fp, name, fp->ctf_vars[i].ctv_type) < 0)
-	  return -1;				/* errno is set for us.  */
+	  return NULL;				/* errno is set for us.  */
     }
 
   for (nvars = 0, dvd = ctf_list_next (&fp->ctf_dvdefs);
@@ -1050,7 +1035,10 @@ ctf_serialize (ctf_dict_t *fp)
   buf_size = sizeof (ctf_header_t) + hdr.cth_stroff + hdr.cth_strlen;
 
   if ((buf = malloc (buf_size)) == NULL)
-    return (ctf_set_errno (fp, EAGAIN));
+    {
+      ctf_set_errno (fp, EAGAIN);
+      return NULL;
+    }
 
   memcpy (buf, &hdr, sizeof (ctf_header_t));
   t = (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_objtoff;
@@ -1085,6 +1073,11 @@ ctf_serialize (ctf_dict_t *fp)
 
   assert (t == (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_typeoff);
 
+  /* Copy in existing static types, then emit new dynamic types.  */
+
+  memcpy (t, fp->ctf_buf + fp->ctf_header->cth_typeoff,
+	  fp->ctf_header->cth_stroff - fp->ctf_header->cth_typeoff);
+  t += fp->ctf_header->cth_stroff - fp->ctf_header->cth_typeoff;
   ctf_emit_type_sect (fp, &t);
 
   assert (t == (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_stroff);
@@ -1111,136 +1104,15 @@ ctf_serialize (ctf_dict_t *fp)
   hdrp = (ctf_header_t *) buf;
   hdrp->cth_strlen = strtab->cts_len;
   buf_size += hdrp->cth_strlen;
+  *bufsiz = buf_size;
 
-  /* Finally, we are ready to ctf_simple_open() the new dict.  If this is
-     successful, we then switch nfp and fp and free the old dict.  */
-
-  if ((nfp = ctf_simple_open_internal ((char *) buf, buf_size, NULL, 0,
-				       0, NULL, 0, fp->ctf_syn_ext_strtab,
-				       fp->ctf_str_atoms, &err)) == NULL)
-    {
-      free (buf);
-      return (ctf_set_errno (fp, err));
-    }
-
-  (void) ctf_setmodel (nfp, ctf_getmodel (fp));
-
-  nfp->ctf_parent = fp->ctf_parent;
-  nfp->ctf_parent_unreffed = fp->ctf_parent_unreffed;
-  nfp->ctf_refcnt = fp->ctf_refcnt;
-  if (nfp->ctf_dynbase == NULL)
-    nfp->ctf_dynbase = buf;		/* Make sure buf is freed on close.  */
-  nfp->ctf_dthash = fp->ctf_dthash;
-  nfp->ctf_dtdefs = fp->ctf_dtdefs;
-  nfp->ctf_dvhash = fp->ctf_dvhash;
-  nfp->ctf_dvdefs = fp->ctf_dvdefs;
-  nfp->ctf_dtoldid = fp->ctf_dtoldid;
-  nfp->ctf_add_processing = fp->ctf_add_processing;
-  nfp->ctf_snapshots = fp->ctf_snapshots + 1;
-  nfp->ctf_specific = fp->ctf_specific;
-  nfp->ctf_nfuncidx = fp->ctf_nfuncidx;
-  nfp->ctf_nobjtidx = fp->ctf_nobjtidx;
-  nfp->ctf_objthash = fp->ctf_objthash;
-  nfp->ctf_funchash = fp->ctf_funchash;
-  nfp->ctf_dynsyms = fp->ctf_dynsyms;
-  nfp->ctf_ptrtab = fp->ctf_ptrtab;
-  nfp->ctf_pptrtab = fp->ctf_pptrtab;
-  nfp->ctf_typemax = fp->ctf_typemax;
-  nfp->ctf_stypes = fp->ctf_stypes;
-  nfp->ctf_dynsymidx = fp->ctf_dynsymidx;
-  nfp->ctf_dynsymmax = fp->ctf_dynsymmax;
-  nfp->ctf_ptrtab_len = fp->ctf_ptrtab_len;
-  nfp->ctf_pptrtab_len = fp->ctf_pptrtab_len;
-  nfp->ctf_link_inputs = fp->ctf_link_inputs;
-  nfp->ctf_link_outputs = fp->ctf_link_outputs;
-  nfp->ctf_errs_warnings = fp->ctf_errs_warnings;
-  nfp->ctf_funcidx_names = fp->ctf_funcidx_names;
-  nfp->ctf_objtidx_names = fp->ctf_objtidx_names;
-  nfp->ctf_funcidx_sxlate = fp->ctf_funcidx_sxlate;
-  nfp->ctf_objtidx_sxlate = fp->ctf_objtidx_sxlate;
-  nfp->ctf_str_prov_offset = fp->ctf_str_prov_offset;
-  nfp->ctf_syn_ext_strtab = fp->ctf_syn_ext_strtab;
-  nfp->ctf_pptrtab_typemax = fp->ctf_pptrtab_typemax;
-  nfp->ctf_in_flight_dynsyms = fp->ctf_in_flight_dynsyms;
-  nfp->ctf_link_in_cu_mapping = fp->ctf_link_in_cu_mapping;
-  nfp->ctf_link_out_cu_mapping = fp->ctf_link_out_cu_mapping;
-  nfp->ctf_link_type_mapping = fp->ctf_link_type_mapping;
-  nfp->ctf_link_memb_name_changer = fp->ctf_link_memb_name_changer;
-  nfp->ctf_link_memb_name_changer_arg = fp->ctf_link_memb_name_changer_arg;
-  nfp->ctf_link_variable_filter = fp->ctf_link_variable_filter;
-  nfp->ctf_link_variable_filter_arg = fp->ctf_link_variable_filter_arg;
-  nfp->ctf_symsect_little_endian = fp->ctf_symsect_little_endian;
-  nfp->ctf_link_flags = fp->ctf_link_flags;
-  nfp->ctf_dedup_atoms = fp->ctf_dedup_atoms;
-  nfp->ctf_dedup_atoms_alloc = fp->ctf_dedup_atoms_alloc;
-  memcpy (&nfp->ctf_dedup, &fp->ctf_dedup, sizeof (fp->ctf_dedup));
-
-  nfp->ctf_snapshot_lu = fp->ctf_snapshots;
-
-  memcpy (&nfp->ctf_lookups, fp->ctf_lookups, sizeof (fp->ctf_lookups));
-  nfp->ctf_structs = fp->ctf_structs;
-  nfp->ctf_unions = fp->ctf_unions;
-  nfp->ctf_enums = fp->ctf_enums;
-  nfp->ctf_names = fp->ctf_names;
-
-  fp->ctf_dthash = NULL;
-  ctf_str_free_atoms (nfp);
-  nfp->ctf_str_atoms = fp->ctf_str_atoms;
-  nfp->ctf_prov_strtab = fp->ctf_prov_strtab;
-  nfp->ctf_dynstrtab = fp->ctf_dynstrtab;
-  nfp->ctf_str_movable_refs = fp->ctf_str_movable_refs;
-  fp->ctf_str_atoms = NULL;
-  fp->ctf_prov_strtab = NULL;
-  fp->ctf_dynstrtab = NULL;
-  fp->ctf_str_movable_refs = NULL;
-  memset (&fp->ctf_dtdefs, 0, sizeof (ctf_list_t));
-  memset (&fp->ctf_errs_warnings, 0, sizeof (ctf_list_t));
-  fp->ctf_add_processing = NULL;
-  fp->ctf_ptrtab = NULL;
-  fp->ctf_pptrtab = NULL;
-  fp->ctf_funcidx_names = NULL;
-  fp->ctf_objtidx_names = NULL;
-  fp->ctf_funcidx_sxlate = NULL;
-  fp->ctf_objtidx_sxlate = NULL;
-  fp->ctf_objthash = NULL;
-  fp->ctf_funchash = NULL;
-  fp->ctf_dynsyms = NULL;
-  fp->ctf_dynsymidx = NULL;
-  fp->ctf_link_inputs = NULL;
-  fp->ctf_link_outputs = NULL;
-  fp->ctf_syn_ext_strtab = NULL;
-  fp->ctf_link_in_cu_mapping = NULL;
-  fp->ctf_link_out_cu_mapping = NULL;
-  fp->ctf_link_type_mapping = NULL;
-  fp->ctf_dedup_atoms = NULL;
-  fp->ctf_dedup_atoms_alloc = NULL;
-  fp->ctf_parent_unreffed = 1;
-
-  fp->ctf_dvhash = NULL;
-  memset (&fp->ctf_dvdefs, 0, sizeof (ctf_list_t));
-  memset (fp->ctf_lookups, 0, sizeof (fp->ctf_lookups));
-  memset (&fp->ctf_in_flight_dynsyms, 0, sizeof (fp->ctf_in_flight_dynsyms));
-  memset (&fp->ctf_dedup, 0, sizeof (fp->ctf_dedup));
-  fp->ctf_structs = NULL;
-  fp->ctf_unions = NULL;
-  fp->ctf_enums = NULL;
-  fp->ctf_names = NULL;
-
-  memcpy (&ofp, fp, sizeof (ctf_dict_t));
-  memcpy (fp, nfp, sizeof (ctf_dict_t));
-  memcpy (nfp, &ofp, sizeof (ctf_dict_t));
-
-  nfp->ctf_refcnt = 1;				/* Force nfp to be freed.  */
-  ctf_dict_close (nfp);
-
-  return 0;
+  return buf;
 
 oom:
-  free (buf);
-  return (ctf_set_errno (fp, EAGAIN));
+  ctf_set_errno (fp, EAGAIN);
 err:
   free (buf);
-  return -1;					/* errno is set for us.  */
+  return NULL;					/* errno is set for us.  */
 }
 
 /* File writing.  */
@@ -1255,30 +1127,27 @@ err:
 int
 ctf_gzwrite (ctf_dict_t *fp, gzFile fd)
 {
-  const unsigned char *buf;
-  ssize_t resid;
-  ssize_t len;
+  unsigned char *buf;
+  unsigned char *p;
+  size_t bufsiz;
+  size_t len, written = 0;
 
-  resid = sizeof (ctf_header_t);
-  buf = (unsigned char *) fp->ctf_header;
-  while (resid != 0)
+  if ((buf = ctf_serialize (fp, &bufsiz)) == NULL)
+    return -1;					/* errno is set for us.  */
+
+  p = buf;
+  while (written < bufsiz)
     {
-      if ((len = gzwrite (fd, buf, resid)) <= 0)
-	return (ctf_set_errno (fp, errno));
-      resid -= len;
-      buf += len;
-    }
-
-  resid = fp->ctf_size;
-  buf = fp->ctf_buf;
-  while (resid != 0)
-    {
-      if ((len = gzwrite (fd, buf, resid)) <= 0)
-	return (ctf_set_errno (fp, errno));
-      resid -= len;
-      buf += len;
+      if ((len = gzwrite (fd, p, bufsiz - written)) <= 0)
+	{
+	  free (buf);
+	  return (ctf_set_errno (fp, errno));
+	}
+      written += len;
+      p += len;
     }
 
+  free (buf);
   return 0;
 }
 
@@ -1288,88 +1157,95 @@ ctf_gzwrite (ctf_dict_t *fp, gzFile fd)
 unsigned char *
 ctf_write_mem (ctf_dict_t *fp, size_t *size, size_t threshold)
 {
-  unsigned char *buf;
+  unsigned char *rawbuf;
+  unsigned char *buf = NULL;
   unsigned char *bp;
-  ctf_header_t *hp;
-  unsigned char *flipped, *src;
-  ssize_t header_len = sizeof (ctf_header_t);
-  ssize_t compress_len;
+  ctf_header_t *rawhp, *hp;
+  unsigned char *src;
+  size_t rawbufsiz;
+  size_t alloc_len = 0;
+  int uncompressed = 0;
   int flip_endian;
-  int uncompressed;
   int rc;
 
   flip_endian = getenv ("LIBCTF_WRITE_FOREIGN_ENDIAN") != NULL;
-  uncompressed = (fp->ctf_size < threshold);
 
-  if (ctf_serialize (fp) < 0)
+  if ((rawbuf = ctf_serialize (fp, &rawbufsiz)) == NULL)
     return NULL;				/* errno is set for us.  */
 
-  compress_len = compressBound (fp->ctf_size);
-  if (fp->ctf_size < threshold)
-    compress_len = fp->ctf_size;
-  if ((buf = malloc (compress_len
-		     + sizeof (struct ctf_header))) == NULL)
+  if (!ctf_assert (fp, rawbufsiz >= sizeof (ctf_header_t)))
+    goto err;
+
+  if (rawbufsiz >= threshold)
+    alloc_len = compressBound (rawbufsiz - sizeof (ctf_header_t))
+      + sizeof (ctf_header_t);
+
+  /* Trivial operation if the buffer is incompressible or too small to bother
+     compressing, and we're not doing a forced write-time flip.  */
+
+  if (rawbufsiz < threshold || rawbufsiz < alloc_len)
+    {
+      alloc_len = rawbufsiz;
+      uncompressed = 1;
+    }
+
+  if (!flip_endian && uncompressed)
+    {
+      *size = rawbufsiz;
+      return rawbuf;
+    }
+
+  if ((buf = malloc (alloc_len)) == NULL)
     {
       ctf_set_errno (fp, ENOMEM);
       ctf_err_warn (fp, 0, 0, _("ctf_write_mem: cannot allocate %li bytes"),
-		    (unsigned long) (compress_len + sizeof (struct ctf_header)));
-      return NULL;
+		    (unsigned long) (alloc_len));
+      goto err;
     }
 
+  rawhp = (ctf_header_t *) rawbuf;
   hp = (ctf_header_t *) buf;
-  memcpy (hp, fp->ctf_header, header_len);
-  bp = buf + sizeof (struct ctf_header);
-  *size = sizeof (struct ctf_header);
+  memcpy (hp, rawbuf, sizeof (ctf_header_t));
+  bp = buf + sizeof (ctf_header_t);
+  *size = sizeof (ctf_header_t);
 
-  if (uncompressed)
-    hp->cth_flags &= ~CTF_F_COMPRESS;
-  else
+  if (!uncompressed)
     hp->cth_flags |= CTF_F_COMPRESS;
 
-  src = fp->ctf_buf;
-  flipped = NULL;
+  src = rawbuf + sizeof (ctf_header_t);
 
   if (flip_endian)
     {
-      if ((flipped = malloc (fp->ctf_size)) == NULL)
-	{
-	  ctf_set_errno (fp, ENOMEM);
-	  ctf_err_warn (fp, 0, 0, _("ctf_write_mem: cannot allocate %li bytes"),
-			(unsigned long) (fp->ctf_size + sizeof (struct ctf_header)));
-	  return NULL;
-	}
       ctf_flip_header (hp);
-      memcpy (flipped, fp->ctf_buf, fp->ctf_size);
-      if (ctf_flip (fp, fp->ctf_header, flipped, 1) < 0)
-	{
-	  free (buf);
-	  free (flipped);
-	  return NULL;				/* errno is set for us.  */
-	}
-      src = flipped;
+      if (ctf_flip (fp, rawhp, src, 1) < 0)
+	goto err;				/* errno is set for us.  */
     }
 
-  if (uncompressed)
-    {
-      memcpy (bp, src, fp->ctf_size);
-      *size += fp->ctf_size;
-    }
-  else
+  if (!uncompressed)
     {
+      size_t compress_len = alloc_len - sizeof (ctf_header_t);
+
       if ((rc = compress (bp, (uLongf *) &compress_len,
-			  src, fp->ctf_size)) != Z_OK)
+			  src, rawbufsiz - sizeof (ctf_header_t))) != Z_OK)
 	{
 	  ctf_set_errno (fp, ECTF_COMPRESS);
 	  ctf_err_warn (fp, 0, 0, _("zlib deflate err: %s"), zError (rc));
-	  free (buf);
-	  return NULL;
+	  goto err;
 	}
       *size += compress_len;
     }
+  else
+    {
+      memcpy (bp, src, rawbufsiz - sizeof (ctf_header_t));
+      *size += rawbufsiz - sizeof (ctf_header_t);
+    }
 
-  free (flipped);
-
+  free (rawbuf);
   return buf;
+err:
+  free (buf);
+  free (rawbuf);
+  return NULL;
 }
 
 /* Compress the specified CTF data stream and write it to the specified file
diff --git a/libctf/ctf-string.c b/libctf/ctf-string.c
index ccf36498eb9..465b2c88ac8 100644
--- a/libctf/ctf-string.c
+++ b/libctf/ctf-string.c
@@ -143,15 +143,9 @@ ctf_str_free_atom (void *a)
    calls to ctf_str_add_external to populate external strtab entries, since
    these are often not quite the same as what appears in any external
    strtab, and the external strtab is often huge and best not aggressively
-   pulled in.)
-
-   Alternatively, if passed, populate atoms from the passed-in table, but do
-   not propagate their flags or refs: they are all non-freeable and
-   non-movable.  (This is used when serializing a dict: this entire atoms
-   table will be thrown away shortly, so it is important that we not create
-   any new strings.)  */
+   pulled in.)  */
 int
-ctf_str_create_atoms (ctf_dict_t *fp, ctf_dynhash_t *atoms)
+ctf_str_create_atoms (ctf_dict_t *fp)
 {
   size_t i;
 
@@ -178,61 +172,26 @@ ctf_str_create_atoms (ctf_dict_t *fp, ctf_dynhash_t *atoms)
   if (errno == ENOMEM)
     goto oom_str_add;
 
-  /* Serializing.  We have existing strings in an existing atoms table with
-     possibly-live pointers to them which must be used unchanged.  Import
-     them into this atoms table.  */
+  /* Pull in all the strings in the strtab as new atoms.  The provisional
+     strtab must be empty at this point, so there is no need to populate
+     atoms from it as well.  Types in this subset are frozen and readonly,
+     so the refs list and movable refs list need not be populated.  */
 
-  if (atoms)
+  for (i = 0; i < fp->ctf_str[CTF_STRTAB_0].cts_len;
+       i += strlen (&fp->ctf_str[CTF_STRTAB_0].cts_strs[i]) + 1)
     {
-      ctf_next_t *it = NULL;
-      void *k, *v;
-      int err;
+      ctf_str_atom_t *atom;
 
-      while ((err = ctf_dynhash_next (atoms, &it, &k, &v)) == 0)
-	{
-	  ctf_str_atom_t *existing = v;
-	  ctf_str_atom_t *atom;
+      if (fp->ctf_str[CTF_STRTAB_0].cts_strs[i] == 0)
+	continue;
 
-	  if (existing->csa_str[0] == 0)
-	    continue;
+      atom = ctf_str_add_ref_internal (fp, &fp->ctf_str[CTF_STRTAB_0].cts_strs[i],
+				       0, 0);
 
-	  if ((atom = malloc (sizeof (struct ctf_str_atom))) == NULL)
-	    goto oom_str_add;
-	  memcpy (atom, existing, sizeof (struct ctf_str_atom));
-	  memset (&atom->csa_refs, 0, sizeof(ctf_list_t));
-	  atom->csa_flags = 0;
+      if (!atom)
+	goto oom_str_add;
 
-	  if (ctf_dynhash_insert (fp->ctf_str_atoms, atom->csa_str, atom) < 0)
-	    {
-	      free (atom);
-	      goto oom_str_add;
-	    }
-	}
-    }
-  else
-    {
-      /* Not serializing.  Pull in all the strings in the strtab as new
-	 atoms.  The provisional strtab must be empty at this point, so
-	 there is no need to populate atoms from it as well.  Types in this
-	 subset are frozen and readonly, so the refs list and movable refs
-	 list need not be populated.  */
-
-      for (i = 0; i < fp->ctf_str[CTF_STRTAB_0].cts_len;
-	   i += strlen (&fp->ctf_str[CTF_STRTAB_0].cts_strs[i]) + 1)
-	{
-	  ctf_str_atom_t *atom;
-
-	  if (fp->ctf_str[CTF_STRTAB_0].cts_strs[i] == 0)
-	    continue;
-
-	  atom = ctf_str_add_ref_internal (fp, &fp->ctf_str[CTF_STRTAB_0].cts_strs[i],
-					   0, 0);
-
-	  if (!atom)
-	    goto oom_str_add;
-
-	  atom->csa_offset = i;
-	}
+      atom->csa_offset = i;
     }
 
   return 0;
diff --git a/libctf/testsuite/libctf-lookup/add-to-opened.c b/libctf/testsuite/libctf-lookup/add-to-opened.c
index dc2e1f55b99..96629afe1aa 100644
--- a/libctf/testsuite/libctf-lookup/add-to-opened.c
+++ b/libctf/testsuite/libctf-lookup/add-to-opened.c
@@ -118,11 +118,9 @@ main (int argc, char *argv[])
   if (ctf_errno (fp) != ECTF_RDONLY)
     fprintf (stderr, "unexpected error %s attempting to set array in readonly portion\n", ctf_errmsg (ctf_errno (fp)));
 
-  if ((ctf_written = ctf_write_mem (fp, &size, 4096)) != NULL)
-    fprintf (stderr, "Writeout unexpectedly succeeded: %s\n", ctf_errmsg (ctf_errno (fp)));
-
-  if (ctf_errno (fp) != ECTF_RDONLY)
-    fprintf (stderr, "unexpected error %s trying to write out previously serialized dict\n", ctf_errmsg (ctf_errno (fp)));
+  if ((ctf_written = ctf_write_mem (fp, &size, 4096)) == NULL)
+    fprintf (stderr, "Re-writeout unexpectedly failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+  free (ctf_written);
 
   /* Finally, make sure we can add new types, and look them up again.  */
 
@@ -138,6 +136,9 @@ main (int argc, char *argv[])
   if (ctf_type_reference (fp, ptrtype) != type)
     fprintf (stderr, "Look up of newly-added type in serialized dict yields ID %lx, expected %lx\n", ctf_type_reference (fp, ptrtype), type);
 
+  ctf_dict_close (fp);
+  ctf_close (ctf);
+
   printf ("All done.\n");
   return 0;
  
-- 
2.44.0.273.ge0bd14271f



* [PATCH 14/22] libctf: fix tiny dumping error
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (12 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 13/22] libctf: make ctf_serialize() actually serialize Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 15/22] libctf: improve handling of type dumping errors Nick Alcock
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

Without this, you might get things like this in the output:

    Flags: 0xa (CTF_F_NEWFUNCINFO, , CTF_F_DYNSTR)

Note the spurious comma.
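
One way to make this class of bug harder to reintroduce is to emit the
separator only once something has already been printed. The sketch below is
not what this patch does (the patch keeps the chained conditionals in
ctf_dump_header and only corrects their tests); it is a table-driven
alternative. The DEMO_* macros, the demo_flags table and demo_format_flags()
are hypothetical names invented for illustration, and the bit values are
assumed only so that 0xa reproduces the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative flag bits only: stand-ins chosen for this sketch, not
   copied from ctf.h.  */
#define DEMO_F_COMPRESS    0x1u
#define DEMO_F_NEWFUNCINFO 0x2u
#define DEMO_F_IDXSORTED   0x4u
#define DEMO_F_DYNSTR      0x8u

static const struct
{
  unsigned int bit;
  const char *name;
} demo_flags[] =
{
  { DEMO_F_COMPRESS, "CTF_F_COMPRESS" },
  { DEMO_F_NEWFUNCINFO, "CTF_F_NEWFUNCINFO" },
  { DEMO_F_IDXSORTED, "CTF_F_IDXSORTED" },
  { DEMO_F_DYNSTR, "CTF_F_DYNSTR" },
};

/* Write a comma-separated list of the names of the bits set in FLAGS into
   BUF (of size LEN) and return BUF.  A separator is emitted only when at
   least one name has already been written, so an unset flag can never
   leave an empty slot between two commas.  */
static char *
demo_format_flags (unsigned int flags, char *buf, size_t len)
{
  size_t used = 0;
  size_t i;

  assert (buf != NULL && len > 0);
  buf[0] = '\0';

  for (i = 0; i < sizeof (demo_flags) / sizeof (demo_flags[0]); i++)
    {
      int n;

      if (!(flags & demo_flags[i].bit))
	continue;

      n = snprintf (buf + used, len - used, "%s%s",
		    used > 0 ? ", " : "", demo_flags[i].name);

      /* On error or truncation, stop; BUF is still NUL-terminated.  */
      if (n < 0 || (size_t) n >= len - used)
	break;

      used += (size_t) n;
    }

  return buf;
}
```

Calling demo_format_flags (0xa, buf, sizeof (buf)) fills buf with the two
names separated by a single ", ", with no empty slot in between.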

libctf/
	* ctf-dump.c (ctf_dump_header): Fix comma emission.
---
 libctf/ctf-dump.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/libctf/ctf-dump.c b/libctf/ctf-dump.c
index 474c4e00cea..2213d09dc29 100644
--- a/libctf/ctf-dump.c
+++ b/libctf/ctf-dump.c
@@ -333,13 +333,12 @@ ctf_dump_header (ctf_dict_t *fp, ctf_dump_state_t *state)
 		    ? ", " : "",
 		    fp->ctf_openflags & CTF_F_NEWFUNCINFO
 		    ? "CTF_F_NEWFUNCINFO" : "",
-		    (fp->ctf_openflags & (CTF_F_COMPRESS | CTF_F_NEWFUNCINFO))
+		    (fp->ctf_openflags & (CTF_F_NEWFUNCINFO))
 		    && (fp->ctf_openflags & ~(CTF_F_COMPRESS | CTF_F_NEWFUNCINFO))
 		    ? ", " : "",
 		    fp->ctf_openflags & CTF_F_IDXSORTED
 		    ? "CTF_F_IDXSORTED" : "",
-		    fp->ctf_openflags & (CTF_F_COMPRESS | CTF_F_NEWFUNCINFO
-					 | CTF_F_IDXSORTED)
+		    fp->ctf_openflags & (CTF_F_IDXSORTED)
 		    && (fp->ctf_openflags & ~(CTF_F_COMPRESS | CTF_F_NEWFUNCINFO
 					      | CTF_F_IDXSORTED))
 		    ? ", " : "",
-- 
2.44.0.273.ge0bd14271f



* [PATCH 15/22] libctf: improve handling of type dumping errors
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (13 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 14/22] libctf: fix tiny dumping error Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 16/22] libctf: make ctf_lookup of symbols by name work in more cases Nick Alcock
                   ` (7 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

When dumping a type fails with an error, we want to emit a warning noting
this: a warning because it's not fatal and we can continue.  But warnings
don't automatically print out the ctf_errno (because not all cases causing
warnings set the errno at all), so we must do it at warning-emission time or
lose track of what's gone wrong.
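
The idea can be sketched without libctf at all: a warning formatter that is
handed the error code by its caller and appends the matching message only
when a code is present. demo_format_warning() below is a hypothetical
helper written for this illustration; it is not ctf_err_warn() and makes no
claim about how that function formats its output.

```c
#include <assert.h>
#include <errno.h>	/* Supplies codes such as ENOMEM for callers.  */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libctf: format a warning into OUT (of
   size LEN).  ERR is an errno-style code captured by the caller at the
   point of failure; 0 means no error code is available.  Returns what
   snprintf returns.  */
static int
demo_format_warning (char *out, size_t len, int err, const char *msg)
{
  assert (out != NULL && msg != NULL);

  /* The caller passes the code in at emission time, so it cannot be
     overwritten by later operations before the warning is printed.  */
  if (err != 0)
    return snprintf (out, len, "warning: %s: %s", msg, strerror (err));

  return snprintf (out, len, "warning: %s", msg);
}
```

Passing 0 yields the bare message, which matches the case described above
where a warning is raised without any errno having been set.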

libctf/

	* ctf-dump.c (ctf_dump_format_type): Dump the underlying error on
	type dump failure.
---
 libctf/ctf-dump.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libctf/ctf-dump.c b/libctf/ctf-dump.c
index 2213d09dc29..80a3b265297 100644
--- a/libctf/ctf-dump.c
+++ b/libctf/ctf-dump.c
@@ -239,7 +239,8 @@ ctf_dump_format_type (ctf_dict_t *fp, ctf_id_t id, int flag)
  oom:
   ctf_set_errno (fp, errno);
  err:
-  ctf_err_warn (fp, 1, 0, _("cannot format name dumping type 0x%lx"), id);
+  ctf_err_warn (fp, 1, ctf_errno (fp), _("cannot format name dumping type 0x%lx"),
+		id);
   free (buf);
   free (str);
   free (bit);
-- 
2.44.0.273.ge0bd14271f



* [PATCH 16/22] libctf: make ctf_lookup of symbols by name work in more cases
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (14 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 15/22] libctf: improve handling of type dumping errors Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 17/22] libctf: fix a debugging typo Nick Alcock
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

We don't need a symbol table when looking up a symbol by name if that
type of symbol has an indexed symtypetab: in that case the name comes
from the symtypetab index, not from the symbol table.

This lets you do symbol lookups in unlinked object files and unlinked
dicts written out via libctf's writeout functions.
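
The condition added by the hunk in this patch can be restated as a small
standalone predicate, which makes the cases easier to enumerate. The struct
and function names below are hypothetical and exist only for this sketch;
the fields stand in for sp->cts_data, symname, is_function and the two
ctf_*idx_names pointers used in ctf_lookup_by_sym_or_name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the inputs to the check; not a libctf type.  */
struct demo_lookup
{
  int have_symtab;		/* Stands in for sp->cts_data != NULL.  */
  const char *symname;		/* NULL when looking up by symbol index.  */
  int is_function;		/* Function lookup rather than data object.  */
  int have_funcidx_names;	/* Function index names are available.  */
  int have_objtidx_names;	/* Object index names are available.  */
};

/* Return nonzero if the lookup cannot be satisfied here and must fall back
   to the parent: there is no symtab, no name was given, and the relevant
   kind of symbol has no index names either.  */
static int
demo_must_try_parent (const struct demo_lookup *l)
{
  int have_index;

  assert (l != NULL);

  have_index = l->is_function ? l->have_funcidx_names
			      : l->have_objtidx_names;

  return !l->have_symtab && l->symname == NULL && !have_index;
}
```

With this model, a lookup by name in a dict with no symtab no longer bails
out at this check, which is the situation in unlinked object files.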

libctf/

	* ctf-lookup.c (ctf_lookup_by_sym_or_name): Allow lookups
	by index even when there is no symtab.
---
 libctf/ctf-lookup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
index aa251bafb89..f37dd7e45ce 100644
--- a/libctf/ctf-lookup.c
+++ b/libctf/ctf-lookup.c
@@ -1045,7 +1045,9 @@ ctf_lookup_by_sym_or_name (ctf_dict_t *fp, unsigned long symidx,
     }
 
   err = ECTF_NOSYMTAB;
-  if (sp->cts_data == NULL)
+  if (sp->cts_data == NULL && symname == NULL &&
+      ((is_function && !fp->ctf_funcidx_names) ||
+       (!is_function && !fp->ctf_objtidx_names)))
     goto try_parent;
 
   /* This covers both out-of-range lookups by index and a dynamic dict which
-- 
2.44.0.273.ge0bd14271f



* [PATCH 17/22] libctf: fix a debugging typo
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (15 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 16/22] libctf: make ctf_lookup of symbols by name work in more cases Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 18/22] libctf: add rewriting tests Nick Alcock
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

libctf/

	* ctf-lookup.c (ctf_symidx_sort): Fix a debugging typo.
---
 libctf/ctf-lookup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
index f37dd7e45ce..1b1ebedc4b7 100644
--- a/libctf/ctf-lookup.c
+++ b/libctf/ctf-lookup.c
@@ -455,7 +455,7 @@ ctf_symidx_sort (ctf_dict_t *fp, uint32_t *idx, size_t *nidx,
   if (!(fp->ctf_header->cth_flags & CTF_F_IDXSORTED))
     {
       ctf_symidx_sort_arg_cb_t arg = { fp, idx };
-      ctf_dprintf ("Index section unsorted: sorting.");
+      ctf_dprintf ("Index section unsorted: sorting.\n");
       ctf_qsort_r (sorted, *nidx, sizeof (uint32_t), sort_symidx_by_name, &arg);
       fp->ctf_header->cth_flags |= CTF_F_IDXSORTED;
     }
-- 
2.44.0.273.ge0bd14271f



* [PATCH 18/22] libctf: add rewriting tests
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (16 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 17/22] libctf: fix a debugging typo Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 19/22] libctf: fix leak in test Nick Alcock
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

Now there's a chance of it actually working, we can add more tests for
the long-broken dict read-and-rewrite cases.  This is the first ever
test for the (rarely-used, unpleasant, and until recently completely
broken) ctf_gzwrite function.

libctf/

	* testsuite/libctf-regression/gzrewrite*: New test.
	* testsuite/libctf-regression/zrewrite*: Likewise.
---
 .../libctf-regression/gzrewrite-ctf.c         |  19 ++
 .../testsuite/libctf-regression/gzrewrite.c   | 165 ++++++++++++++++++
 .../testsuite/libctf-regression/gzrewrite.lk  |   3 +
 libctf/testsuite/libctf-regression/zrewrite.c | 156 +++++++++++++++++
 .../testsuite/libctf-regression/zrewrite.lk   |   3 +
 5 files changed, 346 insertions(+)
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite-ctf.c
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite.c
 create mode 100644 libctf/testsuite/libctf-regression/gzrewrite.lk
 create mode 100644 libctf/testsuite/libctf-regression/zrewrite.c
 create mode 100644 libctf/testsuite/libctf-regression/zrewrite.lk

diff --git a/libctf/testsuite/libctf-regression/gzrewrite-ctf.c b/libctf/testsuite/libctf-regression/gzrewrite-ctf.c
new file mode 100644
index 00000000000..b5d483ea1cb
--- /dev/null
+++ b/libctf/testsuite/libctf-regression/gzrewrite-ctf.c
@@ -0,0 +1,19 @@
+int an_int;
+char *a_char_ptr;
+typedef int (*a_typedef) (int main);
+struct struct_forward;
+enum enum_forward;
+union union_forward;
+typedef int an_array[50];
+struct a_struct { int foo; };
+union a_union { int bar; };
+enum an_enum { FOO };
+
+a_typedef a;
+struct struct_forward *x;
+union union_forward *y;
+enum enum_forward *z;
+struct a_struct *xx;
+union a_union *yy;
+enum an_enum *zz;
+an_array ar;
diff --git a/libctf/testsuite/libctf-regression/gzrewrite.c b/libctf/testsuite/libctf-regression/gzrewrite.c
new file mode 100644
index 00000000000..8e279ca3fac
--- /dev/null
+++ b/libctf/testsuite/libctf-regression/gzrewrite.c
@@ -0,0 +1,165 @@
+/* Make sure that you can modify a dict and then ctf_gzwrite() it,
+   and that the written output changes after modification.  */
+
+#include <ctf-api.h>
+#include <errno.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <zlib.h>
+
+char *read_gz(const char *path, size_t *len)
+{
+  char *in = NULL;
+  char buf[4096];
+  gzFile foo;
+  int ret;
+
+  if ((foo = gzopen (path, "rb")) == NULL)
+    return NULL;
+
+  *len = 0;
+  while ((ret = gzread (foo, buf, 4096)) > 0)
+    {
+      if ((in = realloc (in, *len + ret)) == NULL)
+	{
+	  fprintf (stderr, "Out of memory\n");
+	  exit (1);
+	}
+
+      memcpy (&in[*len], buf, ret);
+      *len += ret;
+    }
+  if (ret < 0)
+    {
+      int errnum;
+      const char *err;
+      err = gzerror (foo, &errnum);
+      if (errnum != Z_ERRNO)
+	fprintf (stderr, "error reading %s: %s\n", path, err);
+      else
+	fprintf (stderr, "error reading %s: %s\n", path, strerror(errno));
+      exit (1);
+    }
+  gzclose (foo);
+  return in;
+}
+
+int
+main (int argc, char *argv[])
+{
+  ctf_dict_t *fp, *fp_b;
+  ctf_archive_t *ctf;
+  gzFile foo;
+  char *a, *b;
+  size_t a_len, b_len;
+  ctf_id_t type, ptrtype;
+  int err;
+
+  if (argc != 2)
+    {
+      fprintf (stderr, "Syntax: %s PROGRAM\n", argv[0]);
+      exit(1);
+    }
+
+  if ((ctf = ctf_open (argv[1], NULL, &err)) == NULL)
+    goto open_err;
+  if ((fp = ctf_dict_open (ctf, NULL, &err)) == NULL)
+    goto open_err;
+
+  if ((foo = gzopen ("tmpdir/one.gz", "wb")) == NULL)
+    goto write_gzerr;
+  if (ctf_gzwrite (fp, foo) < 0)
+    goto write_err;
+  gzclose (foo);
+
+  if ((foo = gzopen ("tmpdir/two.gz", "wb")) == NULL)
+    goto write_gzerr;
+  if (ctf_gzwrite (fp, foo) < 0)
+    goto write_err;
+  gzclose (foo);
+
+  if ((a = read_gz ("tmpdir/one.gz", &a_len)) == NULL)
+    goto read_err;
+
+  if ((b = read_gz ("tmpdir/two.gz", &b_len)) == NULL)
+    goto read_err;
+
+  if (a_len != b_len || memcmp (a, b, a_len) != 0)
+    {
+      fprintf (stderr, "consecutive gzwrites are different: lengths %zu and %zu\n", a_len, b_len);
+      return 1;
+    }
+
+  free (b);
+
+  /* Add some new types to the dict and write it out, then read it back in and
+     make sure they're still there, and that at least some of the
+     originally-present data objects are still there too.  */
+
+  if ((type = ctf_lookup_by_name (fp, "struct a_struct")) == CTF_ERR)
+    fprintf (stderr, "Lookup of struct a_struct failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((ptrtype = ctf_add_pointer (fp, CTF_ADD_ROOT, type)) == CTF_ERR)
+    fprintf (stderr, "Cannot add pointer to ctf_opened dict: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  unlink ("tmpdir/two.gz");
+  if ((foo = gzopen ("tmpdir/two.gz", "wb")) == NULL)
+    goto write_gzerr;
+  if (ctf_gzwrite (fp, foo) < 0)
+    goto write_err;
+  gzclose (foo);
+
+  if ((b = read_gz ("tmpdir/two.gz", &b_len)) == NULL)
+    goto read_err;
+
+  if (memcmp (a, b, b_len) == 0)
+    {
+      fprintf (stderr, "gzwrites after adding types does not change the dict\n");
+      return 1;
+    }
+
+  free (a);
+  if ((fp_b = ctf_simple_open (b, b_len, NULL, 0, 0, NULL, 0, &err)) == NULL)
+    goto open_err;
+
+  if (ctf_type_reference (fp_b, ptrtype) == CTF_ERR)
+    fprintf (stderr, "Lookup of pointer preserved across writeout failed: %s\n", ctf_errmsg (ctf_errno (fp_b)));
+
+  if (ctf_type_reference (fp_b, ptrtype) != type)
+    fprintf (stderr, "Look up of newly-added type in serialized dict yields ID %lx, expected %lx\n", ctf_type_reference (fp_b, ptrtype), type);
+
+  if (ctf_lookup_by_symbol_name (fp_b, "an_int") == CTF_ERR)
+    fprintf (stderr, "Lookup of symbol an_int failed: %s\n", ctf_errmsg (ctf_errno (fp_b)));
+
+  free (b);
+  ctf_dict_close (fp);
+  ctf_dict_close (fp_b);
+  ctf_close (ctf);
+
+  printf ("All done.\n");
+  return 0;
+ 
+ open_err:
+  fprintf (stderr, "%s: cannot open: %s\n", argv[0], ctf_errmsg (err));
+  return 1;
+ write_err: 
+  fprintf (stderr, "%s: cannot write: %s\n", argv[0], ctf_errmsg (ctf_errno (fp)));
+  return 1;
+ write_gzerr:
+  {
+    int errnum;
+    const char *err;
+
+    err = gzerror (foo, &errnum);
+    if (errnum != Z_ERRNO)
+      fprintf (stderr, "error gzwriting: %s\n", err);
+    else
+      fprintf (stderr, "error gzwriting: %s\n", strerror(errno));
+    return 1;
+  }
+ read_err: 
+  fprintf (stderr, "%s: cannot read\n", argv[0]);
+  return 1;
+}
diff --git a/libctf/testsuite/libctf-regression/gzrewrite.lk b/libctf/testsuite/libctf-regression/gzrewrite.lk
new file mode 100644
index 00000000000..2d0de3dc464
--- /dev/null
+++ b/libctf/testsuite/libctf-regression/gzrewrite.lk
@@ -0,0 +1,3 @@
+# source: gzrewrite-ctf.c
+# lookup: gzrewrite.c
+All done.
diff --git a/libctf/testsuite/libctf-regression/zrewrite.c b/libctf/testsuite/libctf-regression/zrewrite.c
new file mode 100644
index 00000000000..4d5d15e7985
--- /dev/null
+++ b/libctf/testsuite/libctf-regression/zrewrite.c
@@ -0,0 +1,156 @@
+/* Make sure that you can modify then ctf_compress_write() a dict
+   and it changes after modification.  */
+
+#include <ctf-api.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+char *read_file(const char *path, size_t *len)
+{
+  char *in = NULL;
+  char buf[4096];
+  int foo;
+  ssize_t ret;
+
+  if ((foo = open (path, O_RDONLY)) < 0)
+    {
+      fprintf (stderr, "error opening %s: %s\n", path, strerror(errno));
+      exit (1);
+    }
+
+  *len = 0;
+  while ((ret = read (foo, buf, 4096)) > 0)
+    {
+      if ((in = realloc (in, *len + ret)) == NULL)
+	{
+	  fprintf (stderr, "Out of memory\n");
+	  exit (1);
+	}
+
+      memcpy (&in[*len], buf, ret);
+      *len += ret;
+    }
+
+  if (ret < 0)
+    {
+      fprintf (stderr, "error reading %s: %s\n", path, strerror(errno));
+      exit (1);
+    }
+  close (foo);
+  return in;
+}
+
+int
+main (int argc, char *argv[])
+{
+  ctf_dict_t *fp, *fp_b;
+  ctf_archive_t *ctf, *ctf_b;
+  int foo;
+  char *a, *b;
+  size_t a_len, b_len;
+  ctf_id_t type, ptrtype;
+  int err;
+
+  if (argc != 2)
+    {
+      fprintf (stderr, "Syntax: %s PROGRAM\n", argv[0]);
+      exit(1);
+    }
+
+  if ((ctf = ctf_open (argv[1], NULL, &err)) == NULL)
+    goto open_err;
+  if ((fp = ctf_dict_open (ctf, NULL, &err)) == NULL)
+    goto open_err;
+
+  if ((foo = open ("tmpdir/one", O_CREAT | O_TRUNC | O_WRONLY, 0666)) < 0)
+    goto write_stderr;
+  if (ctf_compress_write (fp, foo) < 0)
+    goto write_err;
+  close (foo);
+
+  if ((foo = open ("tmpdir/two", O_CREAT | O_TRUNC | O_WRONLY, 0666)) < 0)
+    goto write_stderr;
+  if (ctf_compress_write (fp, foo) < 0)
+    goto write_err;
+  close (foo);
+
+  a = read_file ("tmpdir/one", &a_len);
+  b = read_file ("tmpdir/two", &b_len);
+
+  if (a_len != b_len || memcmp (a, b, a_len) != 0)
+    {
+      fprintf (stderr, "consecutive compress_writes are different: lengths %zu and %zu\n", a_len, b_len);
+      return 1;
+    }
+
+  free (b);
+
+  /* Add some new types to the dict and write it out, then read it back in and
+     make sure they're still there, and that at least some of the
+     originally-present data objects are still there too.  */
+
+  if ((type = ctf_lookup_by_name (fp, "struct a_struct")) == CTF_ERR)
+    fprintf (stderr, "Lookup of struct a_struct failed: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  if ((ptrtype = ctf_add_pointer (fp, CTF_ADD_ROOT, type)) == CTF_ERR)
+    fprintf (stderr, "Cannot add pointer to ctf_opened dict: %s\n", ctf_errmsg (ctf_errno (fp)));
+
+  unlink ("tmpdir/two");
+
+  if ((foo = open ("tmpdir/two", O_CREAT | O_TRUNC | O_WRONLY, 0666)) < 0)
+    goto write_stderr;
+  if (ctf_compress_write (fp, foo) < 0)
+    goto write_err;
+  close (foo);
+
+  b = read_file ("tmpdir/two", &b_len);
+
+  if (memcmp (a, b, b_len) == 0)
+    {
+      fprintf (stderr, "compress_writes after adding types does not change the dict\n");
+      return 1;
+    }
+
+  free (a);
+  free (b);
+
+  if ((ctf_b = ctf_open ("tmpdir/two", NULL, &err)) == NULL)
+    goto open_err;
+  if ((fp_b = ctf_dict_open (ctf_b, NULL, &err)) == NULL)
+    goto open_err;
+
+  if (ctf_type_reference (fp_b, ptrtype) == CTF_ERR)
+    fprintf (stderr, "Lookup of pointer preserved across writeout failed: %s\n", ctf_errmsg (ctf_errno (fp_b)));
+
+  if (ctf_type_reference (fp_b, ptrtype) != type)
+    fprintf (stderr, "Look up of newly-added type in serialized dict yields ID %lx, expected %lx\n", ctf_type_reference (fp_b, ptrtype), type);
+
+  if (ctf_lookup_by_symbol_name (fp_b, "an_int") == CTF_ERR)
+    fprintf (stderr, "Lookup of symbol an_int failed: %s\n", ctf_errmsg (ctf_errno (fp_b)));
+
+  ctf_dict_close (fp);
+  ctf_close (ctf);
+
+  ctf_dict_close (fp_b);
+  ctf_close (ctf_b);
+
+  printf ("All done.\n");
+  return 0;
+ 
+ open_err:
+  fprintf (stderr, "%s: cannot open: %s\n", argv[0], ctf_errmsg (err));
+  return 1;
+ write_err: 
+  fprintf (stderr, "%s: cannot write: %s\n", argv[0], ctf_errmsg (ctf_errno (fp)));
+  return 1;
+ write_stderr:
+  fprintf (stderr, "%s: cannot open for writing: %s\n", argv[0], strerror (errno));
+  return 1;
+ read_err: 
+  fprintf (stderr, "%s: cannot read\n", argv[0]);
+  return 1;
+}
diff --git a/libctf/testsuite/libctf-regression/zrewrite.lk b/libctf/testsuite/libctf-regression/zrewrite.lk
new file mode 100644
index 00000000000..a0a53d91a04
--- /dev/null
+++ b/libctf/testsuite/libctf-regression/zrewrite.lk
@@ -0,0 +1,3 @@
+# source: gzrewrite-ctf.c
+# lookup: zrewrite.c
+All done.
-- 
2.44.0.273.ge0bd14271f



* [PATCH 19/22] libctf: fix leak in test
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (17 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 18/22] libctf: add rewriting tests Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 20/22] libctf: don't pass errno into ctf_err_warn so often Nick Alcock
                   ` (3 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

This purely serves to make it easier to interpret valgrind output.
No functional effect.

libctf/
	* testsuite/libctf-lookup/conflicting-type-syms.c: Free everything.
---
 libctf/testsuite/libctf-lookup/conflicting-type-syms.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/libctf/testsuite/libctf-lookup/conflicting-type-syms.c b/libctf/testsuite/libctf-lookup/conflicting-type-syms.c
index 239b8775964..e328fdf7728 100644
--- a/libctf/testsuite/libctf-lookup/conflicting-type-syms.c
+++ b/libctf/testsuite/libctf-lookup/conflicting-type-syms.c
@@ -27,18 +27,22 @@ main (int argc, char *argv[])
   if ((a_fp = ctf_arc_lookup_symbol_name (ctf, "a", &a, &err)) == NULL)
     goto sym_err;
   printf ("Type of a is %s\n", foo = ctf_type_aname (a_fp, a));
+  free (foo);
 
   if ((b_fp = ctf_arc_lookup_symbol_name (ctf, "b", &b, &err)) == NULL)
     goto sym_err;
   printf ("Type of b is %s\n", foo = ctf_type_aname (b_fp, b));
+  free (foo);
 
   if ((ignore1_fp = ctf_arc_lookup_symbol_name (ctf, "ignore1", &ignore1, &err)) == NULL)
     goto sym_err;
   printf ("Type of ignore1 is %s\n", foo = ctf_type_aname (ignore1_fp, ignore1));
+  free (foo);
 
   if ((ignore2_fp = ctf_arc_lookup_symbol_name (ctf, "ignore2", &ignore2, &err)) == NULL)
     goto sym_err;
   printf ("Type of ignore2 is %s\n", foo = ctf_type_aname (ignore2_fp, ignore1));
+  free (foo);
 
   /* Try a call in just-get-the-dict mode and make sure it doesn't fail.  */
   if ((tmp_fp = ctf_arc_lookup_symbol_name (ctf, "ignore2", NULL, &err)) == NULL)
-- 
2.44.0.273.ge0bd14271f



* [PATCH 20/22] libctf: don't pass errno into ctf_err_warn so often
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (18 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 19/22] libctf: fix leak in test Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 21/22] libctf: Remove undefined functions from ver. map Nick Alcock
                   ` (2 subsequent siblings)
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils

The libctf-internal warning function ctf_err_warn() can be passed a libctf
errno as a parameter, and will add its textual errmsg form to the passed-in
error message. But if there is an error on the fp already, and this is
specifically an error and not a warning, ctf_err_warn() will print the error
out regardless: there's no need to pass in anything but 0.

There are still a lot of places where we do

ctf_err_warn (fp, 0, EFOO, ...);
return ctf_set_errno (fp, EFOO);

I've left all of those alone, because fixing it makes the code a bit longer:
but fixing the cases where no return is involved and the error has just been
set on the fp itself costs nothing and reduces redundancy a bit.
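
To illustrate the ordering this patch relies on, here is a minimal sketch.
It is a model only: struct mock_dict, mock_set_errno(), mock_err_warn() and
mock_report_oom() are hypothetical stand-ins written for this example, not
the libctf implementations, and they assume no more than the behaviour
described above (an error, as opposed to a warning, picks up whatever error
is already set on the fp when 0 is passed in).

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a dict: NOT the real ctf_dict_t.  */
struct mock_dict
{
  int err;		/* Last error set on this dict; 0 if none.  */
  char last_msg[256];	/* Most recent message, kept for inspection.  */
};

/* Model of ctf_set_errno(): record ERR on FP, return -1.  */
int
mock_set_errno (struct mock_dict *fp, int err)
{
  fp->err = err;
  return -1;
}

/* Model of ctf_err_warn().  If ERR is nonzero, append its text.  If ERR
   is 0 and this is an error rather than a warning, fall back to the
   error already set on FP.  */
void
mock_err_warn (struct mock_dict *fp, int is_warning, int err,
	       const char *format, ...)
{
  char text[128];
  va_list ap;

  va_start (ap, format);
  vsnprintf (text, sizeof (text), format, ap);
  va_end (ap);

  if (err == 0 && !is_warning)
    err = fp->err;

  if (err != 0)
    snprintf (fp->last_msg, sizeof (fp->last_msg), "%s: %s", text,
	      strerror (err));
  else
    snprintf (fp->last_msg, sizeof (fp->last_msg), "%s", text);
}

/* The ordering used after this patch: set the error on the fp first,
   then warn with an errno argument of 0.  */
void
mock_report_oom (struct mock_dict *fp, const char *what)
{
  mock_set_errno (fp, ENOMEM);
  mock_err_warn (fp, 0, 0, "out of memory allocating %s", what);
}
```

In this model, calling mock_err_warn() with ENOMEM passed explicitly and
calling mock_report_oom() record the same message, which is why passing the
errno a second time is redundant once it has been set on the fp.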

libctf/

	* ctf-dedup.c (ctf_dedup_walk_output_mapping): Drop the errno arg.
	(ctf_dedup_emit): Likewise.
	(ctf_dedup_type_mapping): Likewise.
	* ctf-link.c (ctf_create_per_cu): Likewise.
	(ctf_link_deduplicating_close_inputs): Likewise.
	(ctf_link_deduplicating_one_symtypetab): Likewise.
	(ctf_link_deduplicating_per_cu): Likewise.
	* ctf-lookup.c (ctf_lookup_symbol_idx): Likewise.
	* ctf-subr.c (ctf_assert_fail_internal): Likewise.
---
 libctf/ctf-dedup.c  |  8 ++++----
 libctf/ctf-link.c   | 22 +++++++++++-----------
 libctf/ctf-lookup.c |  4 ++--
 libctf/ctf-subr.c   |  4 ++--
 4 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/libctf/ctf-dedup.c b/libctf/ctf-dedup.c
index dc7a1cf79e2..c7db6ab4965 100644
--- a/libctf/ctf-dedup.c
+++ b/libctf/ctf-dedup.c
@@ -2398,8 +2398,8 @@ ctf_dedup_walk_output_mapping (ctf_dict_t *output, ctf_dict_t **inputs,
     }
   if (err != ECTF_NEXT_END)
     {
-      ctf_err_warn (output, 0, err, _("cannot recurse over output mapping"));
       ctf_set_errno (output, err);
+      ctf_err_warn (output, 0, 0, _("cannot recurse over output mapping"));
       goto err;
     }
   ctf_dynset_destroy (already_visited);
@@ -3092,9 +3092,9 @@ ctf_dedup_emit (ctf_dict_t *output, ctf_dict_t **inputs, uint32_t ninputs,
 
   if ((outputs = calloc (num_outputs, sizeof (ctf_dict_t *))) == NULL)
     {
-      ctf_err_warn (output, 0, ENOMEM,
-		    _("out of memory allocating link outputs array"));
       ctf_set_errno (output, ENOMEM);
+      ctf_err_warn (output, 0, 0,
+		    _("out of memory allocating link outputs array"));
       return NULL;
     }
   *noutputs = num_outputs;
@@ -3146,7 +3146,7 @@ ctf_dedup_type_mapping (ctf_dict_t *fp, ctf_dict_t *src_fp, ctf_id_t src_type)
   else
     {
       ctf_set_errno (fp, ECTF_INTERNAL);
-      ctf_err_warn (fp, 0, ECTF_INTERNAL,
+      ctf_err_warn (fp, 0, 0,
 		    _("dict %p passed to ctf_dedup_type_mapping is not a "
 		      "deduplicated output"), (void *) fp);
       return CTF_ERR;
diff --git a/libctf/ctf-link.c b/libctf/ctf-link.c
index 44d4e496f6a..801b6ee599d 100644
--- a/libctf/ctf-link.c
+++ b/libctf/ctf-link.c
@@ -330,9 +330,9 @@ ctf_create_per_cu (ctf_dict_t *fp, ctf_dict_t *input, const char *cu_name)
 
       if ((cu_fp = ctf_create (&err)) == NULL)
 	{
-	  ctf_err_warn (fp, 0, err, _("cannot create per-CU CTF archive for "
-				      "input CU %s"), cu_name);
 	  ctf_set_errno (fp, err);
+	  ctf_err_warn (fp, 0, 0, _("cannot create per-CU CTF archive for "
+				    "input CU %s"), cu_name);
 	  return NULL;
 	}
 
@@ -886,9 +886,9 @@ ctf_link_deduplicating_close_inputs (ctf_dict_t *fp, ctf_dynhash_t *cu_names,
 	}
       if (err != ECTF_NEXT_END)
 	{
-	  ctf_err_warn (fp, 0, err, _("iteration error in deduplicating link "
-				      "input freeing"));
 	  ctf_set_errno (fp, err);
+	  ctf_err_warn (fp, 0, 0, _("iteration error in deduplicating link "
+				    "input freeing"));
 	}
     }
   else
@@ -1087,8 +1087,8 @@ ctf_link_deduplicating_one_symtypetab (ctf_dict_t *fp, ctf_dict_t *input,
   if (ctf_errno (input) != ECTF_NEXT_END)
     {
       ctf_set_errno (fp, ctf_errno (input));
-      ctf_err_warn (fp, 0, ctf_errno (input),
-		    functions ? _("iterating over function symbols") :
+      ctf_err_warn (fp, 0, 0, functions ?
+		    _("iterating over function symbols") :
 		    _("iterating over data symbols"));
       return -1;
     }
@@ -1156,9 +1156,9 @@ ctf_link_deduplicating_per_cu (ctf_dict_t *fp)
 
       if (labs ((long int) ninputs) > 0xfffffffe)
 	{
-	  ctf_err_warn (fp, 0, EFBIG, _("too many inputs in deduplicating "
-					"link: %li"), (long int) ninputs);
 	  ctf_set_errno (fp, EFBIG);
+	  ctf_err_warn (fp, 0, 0, _("too many inputs in deduplicating "
+				    "link: %li"), (long int) ninputs);
 	  goto err_open_inputs;
 	}
 
@@ -1180,10 +1180,10 @@ ctf_link_deduplicating_per_cu (ctf_dict_t *fp)
 						  &ai, NULL, 0, &err);
 	  if (!only_input->clin_fp)
 	    {
-	      ctf_err_warn (fp, 0, err, _("cannot open archive %s in "
-					  "CU-mapped CTF link"),
-			    only_input->clin_filename);
 	      ctf_set_errno (fp, err);
+	      ctf_err_warn (fp, 0, 0, _("cannot open archive %s in "
+					"CU-mapped CTF link"),
+			    only_input->clin_filename);
 	      goto err_open_inputs;
 	    }
 	  ctf_next_destroy (ai);
diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
index 1b1ebedc4b7..e4d18bec112 100644
--- a/libctf/ctf-lookup.c
+++ b/libctf/ctf-lookup.c
@@ -665,8 +665,8 @@ ctf_lookup_symbol_idx (ctf_dict_t *fp, const char *symname, int try_parent,
     }
 oom:
   ctf_set_errno (fp, ENOMEM);
-  ctf_err_warn (fp, 0, ENOMEM, _("cannot allocate memory for symbol "
-				 "lookup hashtab"));
+  ctf_err_warn (fp, 0, 0, _("cannot allocate memory for symbol "
+			    "lookup hashtab"));
   return (unsigned long) -1;
 
 }
diff --git a/libctf/ctf-subr.c b/libctf/ctf-subr.c
index ecc68848d31..deb9e0ba5c4 100644
--- a/libctf/ctf-subr.c
+++ b/libctf/ctf-subr.c
@@ -340,7 +340,7 @@ void
 ctf_assert_fail_internal (ctf_dict_t *fp, const char *file, size_t line,
 			  const char *exprstr)
 {
-  ctf_err_warn (fp, 0, ECTF_INTERNAL, _("%s: %lu: libctf assertion failed: %s"),
-		file, (long unsigned int) line, exprstr);
   ctf_set_errno (fp, ECTF_INTERNAL);
+  ctf_err_warn (fp, 0, 0, _("%s: %lu: libctf assertion failed: %s"),
+		file, (long unsigned int) line, exprstr);
 }
-- 
2.44.0.273.ge0bd14271f



* [PATCH 21/22] libctf: Remove undefined functions from ver. map
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (19 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 20/22] libctf: don't pass errno into ctf_err_warn so often Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-17 20:20 ` [PATCH 22/22] libctf: do not include undefined functions in libctf.ver Nick Alcock
  2024-04-19 15:51 ` [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils; +Cc: Nicholas Vinson

From: Nicholas Vinson <nvinson234@gmail.com>

Starting with ld.lld-17, ld.lld is invoked with the option
--no-undefined-version enabled by default. Furthermore, the functions
ctf_label_set() and ctf_label_get() are not defined. Their inclusion in
libctf/libctf.ver causes ld.lld-17 to fail, emitting the following error
messages:

ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_set' failed: symbol not defined
ld.lld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_label_get' failed: symbol not defined

This patch fixes the issue by removing the symbol names from
libctf/libctf.ver.

[nca: fused in later commit that marked ctf_arc_open as libctf
      only as well.  Added ChangeLog entry.]

Signed-off-by: Nicholas Vinson <nvinson234@gmail.com>

libctf/
	* libctf.ver: drop nonexistent label functions: mark
	ctf_arc_open as libctf-only.
---
 libctf/libctf.ver | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/libctf/libctf.ver b/libctf/libctf.ver
index c59847d012b..6e7345be66b 100644
--- a/libctf/libctf.ver
+++ b/libctf/libctf.ver
@@ -80,9 +80,6 @@ LIBCTF_1.0 {
 	ctf_enum_name;
 	ctf_enum_value;
 
-	ctf_label_set;
-	ctf_label_get;
-
 	ctf_label_topmost;
 	ctf_label_info;
 
@@ -139,7 +136,6 @@ LIBCTF_1.0 {
 
 	ctf_arc_write;
 	ctf_arc_write_fd;
-	ctf_arc_open;
 	ctf_arc_bufopen;
 	ctf_arc_close;
 	ctf_arc_open_by_name;
@@ -167,6 +163,7 @@ LIBCTF_1.0 {
 
 	ctf_fdopen;                             /* libctf only.  */
 	ctf_open;                               /* libctf only.  */
+	ctf_arc_open;                           /* libctf only.  */
 	ctf_bfdopen;                            /* libctf only.  */
 	ctf_bfdopen_ctfsect;                    /* libctf only.  */
     local:
-- 
2.44.0.273.ge0bd14271f



* [PATCH 22/22] libctf: do not include undefined functions in libctf.ver
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (20 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 21/22] libctf: Remove undefined functions from ver. map Nick Alcock
@ 2024-04-17 20:20 ` Nick Alcock
  2024-04-19 15:51 ` [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-17 20:20 UTC (permalink / raw)
  To: binutils; +Cc: Nicholas Vinson

libctf's version script is applied to two libraries: libctf.so,
and libctf-nobfd.so.  The latter library is a subset of the former
which does not link to libbfd and does not include a few public
entry points that use it (found in libctf-open-bfd.c).  This means
that some of the symbols in this version script only exist in one
of the libraries it's applied to.

A number of linkers dislike this: before now, only Solaris's linker
caused serious problems, introducing NOTYPE-typed symbols when such
things were found, but now LLD has started to complain as well:

ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_arc_open' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_fdopen' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_open' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_bfdopen' failed: symbol not defined
ld: error: version script assignment of 'LIBCTF_1.0' to symbol 'ctf_bfdopen_ctfsect' failed: symbol not defined

Rather than adding more and more whack-a-mole fixes for every
linker we encounter that does this, simply exclude such symbols
unconditionally, using the same trick we used to use for Solaris.
(Well, unconditionally if we can use version scripts with this
linker at all, which is not always the case.)
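
For readers who want to see the filtering step spelled out, the following is
a rough C equivalent of the "grep -v 'libctf only'" pass that configure runs
over libctf.ver.  It is only a sketch: the function name drop_marked_lines()
and its buffer-based interface are invented for this example, and the real
build does this in shell, not in C.

```c
#include <stddef.h>
#include <string.h>

/* Copy SCRIPT into OUT (capacity OUT_SIZE), dropping every line that
   contains MARKER.  Returns the number of lines dropped, or -1 if OUT
   is too small to hold the result and its terminating NUL.  */
int
drop_marked_lines (const char *script, const char *marker,
		   char *out, size_t out_size)
{
  size_t marker_len = strlen (marker);
  size_t used = 0;
  int dropped = 0;

  if (out_size == 0)
    return -1;

  while (*script != '\0')
    {
      const char *eol = strchr (script, '\n');
      size_t len = eol ? (size_t) (eol - script) + 1 : strlen (script);
      int marked = 0;
      size_t i;

      /* Search for MARKER within this line only.  */
      for (i = 0; marker_len > 0 && i + marker_len <= len; i++)
	if (memcmp (script + i, marker, marker_len) == 0)
	  {
	    marked = 1;
	    break;
	  }

      if (marked)
	dropped++;
      else
	{
	  /* Keep one byte free for the terminating NUL.  */
	  if (used + len >= out_size)
	    return -1;
	  memcpy (out + used, script, len);
	  used += len;
	}

      script += len;
    }

  out[used] = '\0';
  return dropped;
}
```

Fed a version script in which ctf_open and ctf_arc_open carry the
"libctf only" comment, this keeps every other line unchanged, which is the
property libctf-nobfd.ver needs: no symbol is named that the library does
not define.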

Thanks to Nicholas Vinson for the original report and a fix very
similar to this one (but not quite identical).

Cc: Nicholas Vinson <nvinson234@gmail.com>

libctf/

	* configure.ac: Always exclude libctf symbols from
	libctf-nobfd's version script.
	* configure: Regenerated.
---
 libctf/configure    | 21 ++++++++++++++++-----
 libctf/configure.ac | 21 ++++++++++++++++-----
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/libctf/configure b/libctf/configure
index 778c141571e..1faadefa068 100755
--- a/libctf/configure
+++ b/libctf/configure
@@ -16952,7 +16952,10 @@ fi
 
 
 # Use a version script, if possible, or an -export-symbols-regex otherwise.
+# First figure out the version script flag: then massage the script, if
+# needed.
 decommented_version_script=
+no_version_script=
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for linker versioning flags" >&5
 $as_echo_n "checking for linker versioning flags... " >&6; }
 if ${ac_cv_libctf_version_script+:} false; then :
@@ -16969,7 +16972,7 @@ int ctf_foo (void) { return 0; }
 				    int main (void) { return ctf_foo(); }
 _ACEOF
 if ac_fn_c_try_link "$LINENO"; then :
-  ac_cv_libctf_version_script="-Wl,--version-script='$srcdir/libctf.ver'"
+  ac_cv_libctf_version_script="-Wl,--version-script"
 fi
 rm -f core conftest.err conftest.$ac_objext \
     conftest$ac_exeext conftest.$ac_ext
@@ -16994,22 +16997,30 @@ rm -f core conftest.err conftest.$ac_objext \
 
    if test -z "$ac_cv_libctf_version_script"; then
      ac_cv_libctf_version_script='-export-symbols-regex ctf_.*'
+     no_version_script=t
    fi
    rm -f conftest.ver
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_libctf_version_script" >&5
 $as_echo "$ac_cv_libctf_version_script" >&6; }
+
+# Ensure that no symbols exist in the version script for libctf-nobfd.so
+# that do not exist in the shared library itself, since some linkers (Solaris)
+# add such symbols with type NOTYPE, and others (LLVM) complain loudly
+# and fail to link.
+grep -v 'libctf only' $srcdir/libctf.ver > libctf-nobfd.ver
+
 if test -n "$decommented_version_script"; then
    # Solaris's version scripts use shell-style comments rather than the C-style
    # used by GNU ld.  Use cpp to strip the comments out.  (cpp exists under this
    # name on all platforms that support ld -z gnu-version-script.)
-   # Also ensure that no symbols exist in the version script for libctf-nobfd.so
-   # that do not exist in the shared library itself, since some linkers add such
-   # symbols with type NOTYPE.
    /lib/cpp < $srcdir/libctf.ver > libctf-decommented.ver
-   grep -v 'libctf only' $srcdir/libctf.ver | /lib/cpp > libctf-nobfd-decommented.ver
+   /lib/cpp < $srcdir/libctf-nobfd.ver > libctf-nobfd-decommented.ver
    VERSION_FLAGS="$ac_cv_libctf_version_script='libctf-decommented.ver'"
    VERSION_FLAGS_NOBFD="$ac_cv_libctf_version_script='libctf-nobfd-decommented.ver'"
+elif test -z "$no_version_script"; then
+   VERSION_FLAGS="$ac_cv_libctf_version_script='$srcdir/libctf.ver'"
+   VERSION_FLAGS_NOBFD="$ac_cv_libctf_version_script='libctf-nobfd.ver'"
 else
    VERSION_FLAGS="$ac_cv_libctf_version_script"
    VERSION_FLAGS_NOBFD="$ac_cv_libctf_version_script"
diff --git a/libctf/configure.ac b/libctf/configure.ac
index f327d48f249..ced1aeb7ccf 100644
--- a/libctf/configure.ac
+++ b/libctf/configure.ac
@@ -249,7 +249,10 @@ fi
 AC_SUBST(HAVE_TCL_TRY)
 
 # Use a version script, if possible, or an -export-symbols-regex otherwise.
+# First figure out the version script flag: then massage the script, if
+# needed.
 decommented_version_script=
+no_version_script=
 AC_CACHE_CHECK([for linker versioning flags], [ac_cv_libctf_version_script],
   [echo 'FOO { global: mai*; local: ctf_fo*; };' > conftest.ver
    old_LDFLAGS="$LDFLAGS"
@@ -258,7 +261,7 @@ AC_CACHE_CHECK([for linker versioning flags], [ac_cv_libctf_version_script],
    CFLAGS="$CFLAGS -fPIC"
    AC_LINK_IFELSE([AC_LANG_SOURCE([[int ctf_foo (void) { return 0; }
 				    int main (void) { return ctf_foo(); }]])],
-		  [ac_cv_libctf_version_script="-Wl,--version-script='$srcdir/libctf.ver'"],
+		  [ac_cv_libctf_version_script="-Wl,--version-script"],
 		  [])
    LDFLAGS="$old_LDFLAGS"
 
@@ -275,19 +278,27 @@ AC_CACHE_CHECK([for linker versioning flags], [ac_cv_libctf_version_script],
 
    if test -z "$ac_cv_libctf_version_script"; then
      ac_cv_libctf_version_script='-export-symbols-regex ctf_.*'
+     no_version_script=t
    fi
    rm -f conftest.ver])
+
+# Ensure that no symbols exist in the version script for libctf-nobfd.so
+# that do not exist in the shared library itself, since some linkers (Solaris)
+# add such symbols with type NOTYPE, and others (LLVM) complain loudly
+# and fail to link.
+grep -v 'libctf only' $srcdir/libctf.ver > libctf-nobfd.ver
+
 if test -n "$decommented_version_script"; then
    # Solaris's version scripts use shell-style comments rather than the C-style
    # used by GNU ld.  Use cpp to strip the comments out.  (cpp exists under this
    # name on all platforms that support ld -z gnu-version-script.)
-   # Also ensure that no symbols exist in the version script for libctf-nobfd.so
-   # that do not exist in the shared library itself, since some linkers add such
-   # symbols with type NOTYPE.
    /lib/cpp < $srcdir/libctf.ver > libctf-decommented.ver
-   grep -v 'libctf only' $srcdir/libctf.ver | /lib/cpp > libctf-nobfd-decommented.ver
+   /lib/cpp < $srcdir/libctf-nobfd.ver > libctf-nobfd-decommented.ver
    VERSION_FLAGS="$ac_cv_libctf_version_script='libctf-decommented.ver'"
    VERSION_FLAGS_NOBFD="$ac_cv_libctf_version_script='libctf-nobfd-decommented.ver'"
+elif test -z "$no_version_script"; then
+   VERSION_FLAGS="$ac_cv_libctf_version_script='$srcdir/libctf.ver'"
+   VERSION_FLAGS_NOBFD="$ac_cv_libctf_version_script='libctf-nobfd.ver'"
 else
    VERSION_FLAGS="$ac_cv_libctf_version_script"
    VERSION_FLAGS_NOBFD="$ac_cv_libctf_version_script"
-- 
2.44.0.273.ge0bd14271f



* Re: [PATCH 01/22] binutils, objdump: Add --ctf-parent-section
  2024-04-17 20:19 ` [PATCH 01/22] binutils, objdump: Add --ctf-parent-section Nick Alcock
@ 2024-04-18  2:05   ` Alan Modra
  2024-04-18 13:06     ` Nick Alcock
  0 siblings, 1 reply; 26+ messages in thread
From: Alan Modra @ 2024-04-18  2:05 UTC (permalink / raw)
  To: Nick Alcock; +Cc: binutils

On Wed, Apr 17, 2024 at 09:19:57PM +0100, Nick Alcock wrote:
> @@ -4890,13 +4897,36 @@ dump_ctf (bfd *abfd, const char *sect_name, const char *parent_name)
>        return;
>      }
>  
> -  if ((parent = ctf_dict_open (ctfa, parent_name, &err)) == NULL)
> +  if (parent_sect_name) {
> +    psec = read_section (abfd, parent_sect_name, &ctfpdata);
> +    if (sec == NULL) {
> +      my_bfd_nonfatal (bfd_get_filename (abfd));
> +      free (ctfdata);
> +      return;
> +    }

Formatting, here and elsewhere in this patch.  Open braces go on a
line by themselves.

  if (parent_sect_name)
    {
      psec = read_section (abfd, parent_sect_name, &ctfpdata);
      if (sec == NULL)
	{
	  my_bfd_nonfatal (bfd_get_filename (abfd));
	  free (ctfdata);
	  return;
	}
    }

Patch is OK with these all fixed.

-- 
Alan Modra
Australia Development Lab, IBM


* Re: [PATCH 01/22] binutils, objdump: Add --ctf-parent-section
  2024-04-18  2:05   ` Alan Modra
@ 2024-04-18 13:06     ` Nick Alcock
  0 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-18 13:06 UTC (permalink / raw)
  To: Alan Modra; +Cc: binutils

On 18 Apr 2024, Alan Modra said:

> On Wed, Apr 17, 2024 at 09:19:57PM +0100, Nick Alcock wrote:
>> @@ -4890,13 +4897,36 @@ dump_ctf (bfd *abfd, const char *sect_name, const char *parent_name)
>>        return;
>>      }
>>  
>> -  if ((parent = ctf_dict_open (ctfa, parent_name, &err)) == NULL)
>> +  if (parent_sect_name) {
>> +    psec = read_section (abfd, parent_sect_name, &ctfpdata);
>> +    if (sec == NULL) {
>> +      my_bfd_nonfatal (bfd_get_filename (abfd));
>> +      free (ctfdata);
>> +      return;
>> +    }
>
> Formatting, here and elsewhere in this patch.  Open braces go on a
> line by themselves.

AUGH. Apologies, will audit the whole series for this: I see several
others just from a quick grep (and a few leading space/tab problems too,
despite my having hooks trying to spot them). Doing simultaneous
development in OTBS and GNU-style codebases means I make these sorts of
mistakes *all the time* :/ and cc-mode, even with electric newlines,
doesn't always fix them for me.

> Patch is OK with these all fixed.

Thanks!

-- 
NULL && (void)


* Re: [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes)
  2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
                   ` (21 preceding siblings ...)
  2024-04-17 20:20 ` [PATCH 22/22] libctf: do not include undefined functions in libctf.ver Nick Alcock
@ 2024-04-19 15:51 ` Nick Alcock
  22 siblings, 0 replies; 26+ messages in thread
From: Nick Alcock @ 2024-04-19 15:51 UTC (permalink / raw)
  To: binutils; +Cc: Nicholas Vinson

On 17 Apr 2024, Nick Alcock stated:

> I'll apply it in a couple of days if nobody says otherwise.
>
> Cc: Nicholas Vinson <nvinson234@gmail.com>
>
> Nicholas Vinson (1):
>   libctf: Remove undefined functions from ver. map
>
> Nick Alcock (21):
>   binutils, objdump: Add --ctf-parent-section
>   libctf: don't leak the symbol name in the name->type cache
>   libctf: remove static/dynamic name lookup distinction
>   libctf: fix name lookup in dicts containing base-type bitfields
>   libctf: support addition of types to dicts read via ctf_open()
>   libctf: fix a comment
>   libctf: delete LCTF_DIRTY
>   libctf: fix a comment typo
>   libctf: rename ctf_dict.ctf_{symtab,strtab}
>   Revert "libctf: do not corrupt strings across ctf_serialize"
>   libctf: replace 'pending refs' abstraction
>   libctf: rethink strtab writeout
>   libctf: make ctf_serialize() actually serialize
>   libctf: fix tiny dumping error
>   libctf: improve handling of type dumping errors
>   libctf: make ctf_lookup of symbols by name work in more cases
>   libctf: fix a debugging typo
>   libctf: add rewriting tests
>   libctf: fix leak in test
>   libctf: don't pass errno into ctf_err_warn so often
>   libctf: do not include undefined functions in libctf.ver

This is pushed now, exactly as here except for a couple of tiny GNU
style fixes (the one pointed out by Alan, and a few similar ones in a
couple of other commits).


end of thread, other threads:[~2024-04-19 15:51 UTC | newest]

Thread overview: 26+ messages
2024-04-17 20:19 [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
2024-04-17 20:19 ` [PATCH 01/22] binutils, objdump: Add --ctf-parent-section Nick Alcock
2024-04-18  2:05   ` Alan Modra
2024-04-18 13:06     ` Nick Alcock
2024-04-17 20:19 ` [PATCH 02/22] libctf: don't leak the symbol name in the name->type cache Nick Alcock
2024-04-17 20:19 ` [PATCH 03/22] libctf: remove static/dynamic name lookup distinction Nick Alcock
2024-04-17 20:20 ` [PATCH 04/22] libctf: fix name lookup in dicts containing base-type bitfields Nick Alcock
2024-04-17 20:20 ` [PATCH 05/22] libctf: support addition of types to dicts read via ctf_open() Nick Alcock
2024-04-17 20:20 ` [PATCH 06/22] libctf: fix a comment Nick Alcock
2024-04-17 20:20 ` [PATCH 07/22] libctf: delete LCTF_DIRTY Nick Alcock
2024-04-17 20:20 ` [PATCH 08/22] libctf: fix a comment typo Nick Alcock
2024-04-17 20:20 ` [PATCH 09/22] libctf: rename ctf_dict.ctf_{symtab,strtab} Nick Alcock
2024-04-17 20:20 ` [PATCH 10/22] Revert "libctf: do not corrupt strings across ctf_serialize" Nick Alcock
2024-04-17 20:20 ` [PATCH 11/22] libctf: replace 'pending refs' abstraction Nick Alcock
2024-04-17 20:20 ` [PATCH 12/22] libctf: rethink strtab writeout Nick Alcock
2024-04-17 20:20 ` [PATCH 13/22] libctf: make ctf_serialize() actually serialize Nick Alcock
2024-04-17 20:20 ` [PATCH 14/22] libctf: fix tiny dumping error Nick Alcock
2024-04-17 20:20 ` [PATCH 15/22] libctf: improve handling of type dumping errors Nick Alcock
2024-04-17 20:20 ` [PATCH 16/22] libctf: make ctf_lookup of symbols by name work in more cases Nick Alcock
2024-04-17 20:20 ` [PATCH 17/22] libctf: fix a debugging typo Nick Alcock
2024-04-17 20:20 ` [PATCH 18/22] libctf: add rewriting tests Nick Alcock
2024-04-17 20:20 ` [PATCH 19/22] libctf: fix leak in test Nick Alcock
2024-04-17 20:20 ` [PATCH 20/22] libctf: don't pass errno into ctf_err_warn so often Nick Alcock
2024-04-17 20:20 ` [PATCH 21/22] libctf: Remove undefined functions from ver. map Nick Alcock
2024-04-17 20:20 ` [PATCH 22/22] libctf: do not include undefined functions in libctf.ver Nick Alcock
2024-04-19 15:51 ` [PATCH libctf 00/22] more modifiable CTF dicts (and a few bugfixes) Nick Alcock
