public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH libctf 00/11] enumerator query API, plus some bugfixes
@ 2024-06-13 18:54 Nick Alcock
  2024-06-13 18:54 ` [PATCH 01/11] libctf: strtab corruption when strings are added to ctf_open()ed dicts Nick Alcock
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

This series is largely the addition of new query APIs for enumerators
(proposed on this list a few weeks ago, then radically rethought by
others into a much better design), plus a bunch of fixes for bugs
found in the process.  We also improve the deduplicator to consider
enums to be conflicting types if they are distinct but have enumeration
constants in common, and adapt ctf_add_enumerator() to prohibit the
addition of such enumerators (which are invalid C anyway, so anything
trying to add them is probably buggy).

The most series of the bugs fixed at the same time is a strtab
corruption bug observed if you add enums or structure members to a dict
you opened with ctf_open(): such dicts were read-only until only a few
weeks ago, so this bug has probably been caught before it could do any
damage!

(There are also new tests for this stuff: in fact the new test for
the enumerator API trips most of these bugs itself.)

I'll push tomorrow if nobody has any comments.

Nick Alcock (11):
  libctf: strtab corruption when strings are added to ctf_open()ed dicts
  include: fix libctf ECTF_NOENUMNAM error message
  libctf: doc: fix ctf_stype_t typedef string in spec
  libctf: dedup: enums with overlapping enumerators are conflicting
  libctf: don't leak enums if ctf_add_type fails
  libctf: fix dict leak on archive-wide symbol lookup error path
  libctf: suppress spurious failure of malloc-counting tests under
    valgrind
  libctf: prohibit addition of enums with overlapping enumerator
    constants
  libctf: make the ctf_next ctn_fp non-const
  include: libctf: comment improvements
  libctf, include: new functions for looking up enumerators

 include/ctf-api.h                             |  59 +++++-
 ld/testsuite/ld-ctf/enum-3.c                  |   3 +
 ld/testsuite/ld-ctf/enum-4.c                  |   3 +
 ld/testsuite/ld-ctf/overlapping-enums-2.d     |  36 ++++
 ld/testsuite/ld-ctf/overlapping-enums.d       |  35 ++++
 libctf/config.h.in                            |   3 +
 libctf/configure                              |   2 +-
 libctf/configure.ac                           |   2 +-
 libctf/ctf-archive.c                          | 108 +++++++++++
 libctf/ctf-create.c                           |  48 ++++-
 libctf/ctf-dedup.c                            |  42 ++++-
 libctf/ctf-hash.c                             |   6 +
 libctf/ctf-impl.h                             |  28 +--
 libctf/ctf-lookup.c                           | 145 +++++++++++++++
 libctf/ctf-open.c                             | 134 ++++++++++++--
 libctf/ctf-string.c                           |   2 +
 libctf/ctf-util.c                             |  29 ++-
 libctf/doc/ctf-spec.texi                      |   2 +-
 libctf/libctf.ver                             |   7 +
 libctf/testsuite/lib/ctf-lib.exp              |   5 +
 libctf/testsuite/libctf-lookup/enum-ctf-2.c   |   6 +
 .../libctf-lookup/enumerator-iteration.c      | 168 ++++++++++++++++++
 .../libctf-lookup/enumerator-iteration.lk     |  17 ++
 .../libctf-regression/open-error-free.c       |  13 ++
 24 files changed, 849 insertions(+), 54 deletions(-)
 create mode 100644 ld/testsuite/ld-ctf/enum-3.c
 create mode 100644 ld/testsuite/ld-ctf/enum-4.c
 create mode 100644 ld/testsuite/ld-ctf/overlapping-enums-2.d
 create mode 100644 ld/testsuite/ld-ctf/overlapping-enums.d
 create mode 100644 libctf/testsuite/libctf-lookup/enum-ctf-2.c
 create mode 100644 libctf/testsuite/libctf-lookup/enumerator-iteration.c
 create mode 100644 libctf/testsuite/libctf-lookup/enumerator-iteration.lk

-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 01/11] libctf: strtab corruption when strings are added to ctf_open()ed dicts
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 02/11] include: fix libctf ECTF_NOENUMNAM error message Nick Alcock
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

ctf_str_add_ref and ctf_str_add_movable_ref take a string and are supposed
to return a strtab offset.  These offsets are "provisional": the ref
mechanism records the address of the location in which the ref is stored and
modifies it when the strtab is finally written out.  Provisional refs in new
dicts start at 0 and go up via strlen() as new refs are added: this is fine,
because the strtab is empty and none of these values will overlap any
existing string offsets (since there are none).  Unfortunately, when a dict
is ctf_open()ed, we fail to set the initial provisional strtab offset to a
higher value than any existing string offset: it starts at zero again!
It's a shame that we already *have* strings at those offsets...

This is all fixed up once the string is reserialized, but if you look up
newly-added strings before serialization, you get corrupted partial string
results from the existing ctf_open()ed dict.

Observed (and thus regtested) by an upcoming test (in this patch series).

Exposed by the recently-introduced series that permits modification of
ctf_open()ed dicts, which has not been released anywhere.  Before that,
any attempt to do such things would fail with ECTF_RDONLY.

libctf/
	* ctf-string.c (ctf_str_create_atoms): Initialize
        ctf_str_prov_offset.
---
 libctf/ctf-string.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libctf/ctf-string.c b/libctf/ctf-string.c
index 0c5bd58cdaf..2d96e45617c 100644
--- a/libctf/ctf-string.c
+++ b/libctf/ctf-string.c
@@ -195,6 +195,8 @@ ctf_str_create_atoms (ctf_dict_t *fp)
       atom->csa_offset = i;
     }
 
+  fp->ctf_str_prov_offset = fp->ctf_str[CTF_STRTAB_0].cts_len + 1;
+
   return 0;
 
  oom_str_add:
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 02/11] include: fix libctf ECTF_NOENUMNAM error message
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
  2024-06-13 18:54 ` [PATCH 01/11] libctf: strtab corruption when strings are added to ctf_open()ed dicts Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 03/11] libctf: doc: fix ctf_stype_t typedef string in spec Nick Alcock
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

ECTF_NOENUMNAM is emitted when enumerator constant names don't exist.
Call them that, not 'enum elements'.

include/
	* ctf-api.h (ECTF_NOENUMNAM): fix error message.
---
 include/ctf-api.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/ctf-api.h b/include/ctf-api.h
index 392964a4ac9..85734afcac2 100644
--- a/include/ctf-api.h
+++ b/include/ctf-api.h
@@ -219,7 +219,7 @@ typedef struct ctf_snapshot_id
   _CTF_ITEM (ECTF_NOLABEL, "No label found corresponding to name.") \
   _CTF_ITEM (ECTF_NOLABELDATA, "File does not contain any labels.") \
   _CTF_ITEM (ECTF_NOTSUP, "Feature not supported.") \
-  _CTF_ITEM (ECTF_NOENUMNAM, "Enum element name not found.") \
+  _CTF_ITEM (ECTF_NOENUMNAM, "Enumerator name not found.") \
   _CTF_ITEM (ECTF_NOMEMBNAM, "Member name not found.") \
   _CTF_ITEM (ECTF_RDONLY, "CTF container is read-only.") \
   _CTF_ITEM (ECTF_DTFULL, "CTF type is full (no more members allowed).") \
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 03/11] libctf: doc: fix ctf_stype_t typedef string in spec
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
  2024-06-13 18:54 ` [PATCH 01/11] libctf: strtab corruption when strings are added to ctf_open()ed dicts Nick Alcock
  2024-06-13 18:54 ` [PATCH 02/11] include: fix libctf ECTF_NOENUMNAM error message Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 04/11] libctf: dedup: enums with overlapping enumerators are conflicting Nick Alcock
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

---
 libctf/doc/ctf-spec.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libctf/doc/ctf-spec.texi b/libctf/doc/ctf-spec.texi
index 37f84e74a54..78d8310d67e 100644
--- a/libctf/doc/ctf-spec.texi
+++ b/libctf/doc/ctf-spec.texi
@@ -582,7 +582,7 @@ typedef struct ctf_stype
     uint32_t ctt_size;
     uint32_t ctt_type;
   };
-} ctf_type_t;
+} ctf_stype_t;
 @end verbatim
 
 If @code{ctt_size} is the #define @code{CTF_LSIZE_SENT}, 0xffffffff, this type
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 04/11] libctf: dedup: enums with overlapping enumerators are conflicting
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (2 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 03/11] libctf: doc: fix ctf_stype_t typedef string in spec Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 05/11] libctf: don't leak enums if ctf_add_type fails Nick Alcock
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

The CTF deduplicator was not considering enumerators inside enum types to be
things that caused type conflicts, so if the following two TUs were linked
together, you would end up with the following in the resulting dict:

1.c:
enum foo { A, B };

2.c:
enum bar { A, B };

linked:

enum foo { A, B };
enum bar { A, B };

This does work -- but it's not something that's valid C, and the general
point of the shared dict is that it is something that you could potentially
get from any valid C TU.

So consider such types to be conflicting, but obviously don't consider
actually identical enums to be conflicting, even though they too have (all)
their identifiers in common.  This involves surprisingly little code. The
deduplicator detects conflicting types by counting types in a hash table of
hash tables:

decorated identifier -> (type hash -> count)

where the COUNT is the number of times a given hash has been observed: any
name with more than one hash associated with it is considered conflicting
(the count is used to identify the most common such name for promotion to
the shared dict).

Before now, those identifiers were all the identifiers of types (possibly
decorated with their namespace on the front for enumerator identifiers), but
we can equally well put *enumeration constant names* in there, undecorated
like the identifiers of types in the global namespace, with the type hash
being the hash of each enum containing that enumerator.  The existing
conflicting-type-detection code will then accurately identify distinct enums
with enumeration constants in common.  The enum that contains the most
commonly-appearing enumerators will be promoted to the shared dict.

libctf/
	* ctf-impl.h (ctf_dedup_t) <cd_name_counts>: Extend comment.
	* ctf-dedup.c (ctf_dedup_count_name): New, split out of...
	(ctf_dedup_populate_mappings): ... here.  Call it for all
	* enumeration constants in an enum as well as types.

ld/
	* testsuite/ld-ctf/enum-3.c: New test CTF.
	* testsuite/ld-ctf/enum-4.c: Likewise.
	* testsuite/ld-ctf/overlapping-enums.d: New test.
	* testsuite/ld-ctf/overlapping-enums-2.d: Likewise.
---
 ld/testsuite/ld-ctf/enum-3.c              |  3 ++
 ld/testsuite/ld-ctf/enum-4.c              |  3 ++
 ld/testsuite/ld-ctf/overlapping-enums-2.d | 36 +++++++++++++++++++
 ld/testsuite/ld-ctf/overlapping-enums.d   | 35 +++++++++++++++++++
 libctf/ctf-dedup.c                        | 42 +++++++++++++++++++----
 libctf/ctf-impl.h                         |  3 +-
 6 files changed, 115 insertions(+), 7 deletions(-)
 create mode 100644 ld/testsuite/ld-ctf/enum-3.c
 create mode 100644 ld/testsuite/ld-ctf/enum-4.c
 create mode 100644 ld/testsuite/ld-ctf/overlapping-enums-2.d
 create mode 100644 ld/testsuite/ld-ctf/overlapping-enums.d

diff --git a/ld/testsuite/ld-ctf/enum-3.c b/ld/testsuite/ld-ctf/enum-3.c
new file mode 100644
index 00000000000..c365aaff564
--- /dev/null
+++ b/ld/testsuite/ld-ctf/enum-3.c
@@ -0,0 +1,3 @@
+enum first_day_of_the_week {Sunday = 0};
+
+static enum first_day_of_the_week day __attribute__((used));
diff --git a/ld/testsuite/ld-ctf/enum-4.c b/ld/testsuite/ld-ctf/enum-4.c
new file mode 100644
index 00000000000..00634244107
--- /dev/null
+++ b/ld/testsuite/ld-ctf/enum-4.c
@@ -0,0 +1,3 @@
+enum intersecting_days_of_the_week {Montag = 1, Tuesday = 2};
+
+static enum intersecting_days_of_the_week day __attribute__((used));
diff --git a/ld/testsuite/ld-ctf/overlapping-enums-2.d b/ld/testsuite/ld-ctf/overlapping-enums-2.d
new file mode 100644
index 00000000000..1adfd86b89b
--- /dev/null
+++ b/ld/testsuite/ld-ctf/overlapping-enums-2.d
@@ -0,0 +1,36 @@
+#as:
+#source: enum.c
+#source: enum-4.c
+#objdump: --ctf
+#ld: -shared
+#name: Semioverlapping enumerators
+
+.*: +file format .*
+
+Contents of CTF section .ctf:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+  Types:
+    0x1: \(kind 8\) enum day_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+         Monday: 0
+         Tuesday: 1
+         Wednesday: 2
+         Thursday: 3
+         Friday: 4
+         Saturday: 5
+         Sunday: 6
+#...
+  Strings:
+#...
+CTF archive member: .*enum.*\.c:
+#...
+  Types:
+    0x80000001: \(kind 8\) enum intersecting_days_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+                Montag: 1
+                Tuesday: 2
+
+  Strings:
+#...
diff --git a/ld/testsuite/ld-ctf/overlapping-enums.d b/ld/testsuite/ld-ctf/overlapping-enums.d
new file mode 100644
index 00000000000..7cf57d62452
--- /dev/null
+++ b/ld/testsuite/ld-ctf/overlapping-enums.d
@@ -0,0 +1,35 @@
+#as:
+#source: enum.c
+#source: enum-3.c
+#objdump: --ctf
+#ld: -shared
+#name: Overlapping enumerators
+
+.*: +file format .*
+
+Contents of CTF section .ctf:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+  Types:
+    0x1: \(kind 8\) enum day_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+         Monday: 0
+         Tuesday: 1
+         Wednesday: 2
+         Thursday: 3
+         Friday: 4
+         Saturday: 5
+         Sunday: 6
+#...
+  Strings:
+#...
+CTF archive member: .*enum.*\.c:
+#...
+  Types:
+    0x80000001: \(kind 8\) enum first_day_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+                Sunday: 0
+
+  Strings:
+#...
diff --git a/libctf/ctf-dedup.c b/libctf/ctf-dedup.c
index c7db6ab4965..dd234945462 100644
--- a/libctf/ctf-dedup.c
+++ b/libctf/ctf-dedup.c
@@ -1149,6 +1149,9 @@ ctf_dedup_hash_type (ctf_dict_t *fp, ctf_dict_t *input,
   return NULL;
 }
 
+static int
+ctf_dedup_count_name (ctf_dict_t *fp, const char *name, void *id);
+
 /* Populate a number of useful mappings not directly used by the hashing
    machinery: the output mapping, the cd_name_counts mapping from name -> hash
    -> count of hashval deduplication state for a given hashed type, and the
@@ -1164,8 +1167,6 @@ ctf_dedup_populate_mappings (ctf_dict_t *fp, ctf_dict_t *input _libctf_unused_,
 {
   ctf_dedup_t *d = &fp->ctf_dedup;
   ctf_dynset_t *type_ids;
-  ctf_dynhash_t *name_counts;
-  long int count;
 
 #ifdef ENABLE_LIBCTF_HASH_DEBUGGING
   ctf_dprintf ("Hash %s, %s, into output mapping for %i/%lx @ %s\n",
@@ -1258,24 +1259,53 @@ ctf_dedup_populate_mappings (ctf_dict_t *fp, ctf_dict_t *input _libctf_unused_,
       && ctf_dynset_insert (type_ids, id) < 0)
     return ctf_set_errno (fp, errno);
 
+  if (ctf_type_kind_unsliced (input, type) == CTF_K_ENUM)
+    {
+      ctf_next_t *i = NULL;
+      const char *enumerator;
+
+      while ((enumerator = ctf_enum_next (input, type, &i, NULL)) != NULL)
+	{
+	  if (ctf_dedup_count_name (fp, enumerator, id) < 0)
+	    {
+	      ctf_next_destroy (i);
+	      return -1;
+	    }
+	}
+      if (ctf_errno (input) != ECTF_NEXT_END)
+	return ctf_set_errno (fp, ctf_errno (input));
+    }
+
   /* The rest only needs to happen for types with names.  */
   if (!decorated_name)
     return 0;
 
+  if (ctf_dedup_count_name (fp, decorated_name, id) < 0)
+    return -1;					/* errno is set for us. */
+
+  return 0;
+}
+
+static int
+ctf_dedup_count_name (ctf_dict_t *fp, const char *name, void *id)
+{
+  ctf_dedup_t *d = &fp->ctf_dedup;
+  ctf_dynhash_t *name_counts;
+  long int count;
+  const char *hval;
+
   /* Count the number of occurrences of the hash value for this GID.  */
 
   hval = ctf_dynhash_lookup (d->cd_type_hashes, id);
 
   /* Mapping from name -> hash(hashval, count) not already present?  */
-  if ((name_counts = ctf_dynhash_lookup (d->cd_name_counts,
-					 decorated_name)) == NULL)
+  if ((name_counts = ctf_dynhash_lookup (d->cd_name_counts, name)) == NULL)
     {
       if ((name_counts = ctf_dynhash_create (ctf_hash_string,
 					     ctf_hash_eq_string,
 					     NULL, NULL)) == NULL)
 	  return ctf_set_errno (fp, errno);
-      if (ctf_dynhash_cinsert (d->cd_name_counts, decorated_name,
-			       name_counts) < 0)
+      if (ctf_dynhash_cinsert (d->cd_name_counts, name, name_counts) < 0)
 	{
 	  ctf_dynhash_destroy (name_counts);
 	  return ctf_set_errno (fp, errno);
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index 03e1a66416a..eb89f8b4645 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -294,7 +294,8 @@ typedef struct ctf_dedup
   ctf_dynhash_t *cd_decorated_names[4];
 
   /* Map type names to a hash from type hash value -> number of times each value
-     has appeared.  */
+     has appeared.  Enumeration constants are tracked via the enum they appear
+     in.  */
   ctf_dynhash_t *cd_name_counts;
 
   /* Map global type IDs to type hash values.  Used to determine if types are
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 05/11] libctf: don't leak enums if ctf_add_type fails
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (3 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 04/11] libctf: dedup: enums with overlapping enumerators are conflicting Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 06/11] libctf: fix dict leak on archive-wide symbol lookup error path Nick Alcock
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

If ctf_add_type failed in the middle of enumerator addition, the
destination would end up containing the source enum type and some
but not all of its enumerator constants.

Use snapshots to roll back the enum addition as a whole if this happens.
Before now, it's been pretty unlikely, but in an upcoming commit we will ban
addition of enumerators that already exist in a given dict, making failure
of ctf_add_enumerator and thus of this part of ctf_add_type much more
likely.

libctf/
	* ctf-create.c (ctf_add_type_internal): Roll back if enum or
          enumerator addition fails.
---
 libctf/ctf-create.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index ee79e49794d..073006b24ea 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -1944,10 +1944,15 @@ ctf_add_type_internal (ctf_dict_t *dst_fp, ctf_dict_t *src_fp, ctf_id_t src_type
 	}
       else
 	{
+	  ctf_snapshot_id_t snap = ctf_snapshot (dst_fp);
+
 	  dst_type = ctf_add_enum (dst_fp, flag, name);
 	  if ((dst.ctb_type = dst_type) == CTF_ERR
 	      || ctf_enum_iter (src_fp, src_type, enumadd, &dst))
-	    return CTF_ERR;			/* errno is set for us */
+	    {
+	      ctf_rollback (dst_fp, snap);
+	      return CTF_ERR;			/* errno is set for us */
+	    }
 	}
       break;
 
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 06/11] libctf: fix dict leak on archive-wide symbol lookup error path
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (4 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 05/11] libctf: don't leak enums if ctf_add_type fails Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 07/11] libctf: suppress spurious failure of malloc-counting tests under valgrind Nick Alcock
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

If a lookup fails for a reason unrelated to a lack of type data for this
symbol, we return with an error; but we fail to close the dict we opened
most recently, which is leaked.

libctf/
	* ctf-archive.c (ctf_arc_lookup_sym_or_name): Close dict.
---
 libctf/ctf-archive.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libctf/ctf-archive.c b/libctf/ctf-archive.c
index f459c02e702..4744cb7a828 100644
--- a/libctf/ctf-archive.c
+++ b/libctf/ctf-archive.c
@@ -957,6 +957,7 @@ ctf_arc_lookup_sym_or_name (ctf_archive_t *wrapper, unsigned long symidx,
 	{
 	  if (errp)
 	    *errp = ctf_errno (fp);
+	  ctf_dict_close (fp);
 	  ctf_next_destroy (i);
 	  return NULL;				/* errno is set for us.  */
 	}
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 07/11] libctf: suppress spurious failure of malloc-counting tests under valgrind
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (5 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 06/11] libctf: fix dict leak on archive-wide symbol lookup error path Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 08/11] libctf: prohibit addition of enums with overlapping enumerator constants Nick Alcock
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

The libctf-regression/open-error-free.c test works by interposing malloc
and counting mallocs and frees across libctf operations.  This only
works under suitably-interposable mallocs on systems supporting
dlsym (RTLD_NEXT, ...), so its operation is restricted to glibc
systems for now, but also it interacts badly with valgrind, which
interposes malloc itself.  Detect a running valgrind and skip the test.

Add new facilities allowing libctf lookup tests to declare themselves
unsupported, by printing "UNSUPPORTED: " and then some meaningful
message instead of their normal output.

libctf/
	* configure.ac: Check for <valgrind/valgrind.h>.
	* config.h.in: Regenerate.
	* configure: Likewise.
	* testsuite/lib/ctf-lib.exp (run_lookup_test): Add support for
	UNSUPPORTED tests.
	* testsuite/libctf-regression/open-error-free.c: When running
	under valgrind, this test is unsupported.
---
 libctf/config.h.in                                  |  3 +++
 libctf/configure                                    |  2 +-
 libctf/configure.ac                                 |  2 +-
 libctf/testsuite/lib/ctf-lib.exp                    |  5 +++++
 .../testsuite/libctf-regression/open-error-free.c   | 13 +++++++++++++
 5 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/libctf/config.h.in b/libctf/config.h.in
index 794779eab5a..359f1f5ae1d 100644
--- a/libctf/config.h.in
+++ b/libctf/config.h.in
@@ -115,6 +115,9 @@
 /* Define to 1 if you have the <unistd.h> header file. */
 #undef HAVE_UNISTD_H
 
+/* Define to 1 if you have the <valgrind/valgrind.h> header file. */
+#undef HAVE_VALGRIND_VALGRIND_H
+
 /* Define to the sub-directory in which libtool stores uninstalled libraries.
    */
 #undef LT_OBJDIR
diff --git a/libctf/configure b/libctf/configure
index 1faadefa068..7466d56a18b 100755
--- a/libctf/configure
+++ b/libctf/configure
@@ -16552,7 +16552,7 @@ $as_echo "#define AC_APPLE_UNIVERSAL_BUILD 1" >>confdefs.h
  presetting ac_cv_c_bigendian=no (or yes) will help" "$LINENO" 5 ;;
  esac
 
-for ac_header in byteswap.h endian.h
+for ac_header in byteswap.h endian.h valgrind/valgrind.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_mongrel "$LINENO" "$ac_header" "$as_ac_Header" "$ac_includes_default"
diff --git a/libctf/configure.ac b/libctf/configure.ac
index cf0e988f7fb..64544b83a9f 100644
--- a/libctf/configure.ac
+++ b/libctf/configure.ac
@@ -109,7 +109,7 @@ if test $ac_cv_libctf_bfd_elf = yes; then
 fi
 
 AC_C_BIGENDIAN
-AC_CHECK_HEADERS(byteswap.h endian.h)
+AC_CHECK_HEADERS(byteswap.h endian.h valgrind/valgrind.h)
 AC_CHECK_FUNCS(pread)
 
 dnl Check for bswap_{16,32,64}
diff --git a/libctf/testsuite/lib/ctf-lib.exp b/libctf/testsuite/lib/ctf-lib.exp
index eb2b738498d..cfd36bee339 100644
--- a/libctf/testsuite/lib/ctf-lib.exp
+++ b/libctf/testsuite/lib/ctf-lib.exp
@@ -270,6 +270,11 @@ proc run_lookup_test { name } {
 	set results [run_host_cmd "$opts(wrapper) tmpdir/lookup" $lookup_output]
     }
 
+    if { [regexp {^UNSUPPORTED: (.*)$} $results -> reason] } {
+	unsupported "$testname: $reason"
+	return 0
+    }
+
     set f [open "tmpdir/lookup.out" "w"]
     puts $f $results
     close $f
diff --git a/libctf/testsuite/libctf-regression/open-error-free.c b/libctf/testsuite/libctf-regression/open-error-free.c
index 5e4874444e4..edc5f348d5f 100644
--- a/libctf/testsuite/libctf-regression/open-error-free.c
+++ b/libctf/testsuite/libctf-regression/open-error-free.c
@@ -1,6 +1,7 @@
 /* Make sure that, on error, an opened dict is properly freed.  */
 
 #define _GNU_SOURCE 1
+#include "config.h"
 #include <dlfcn.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -8,6 +9,10 @@
 #include <ctf-api.h>
 #include <ctf.h>
 
+#ifdef HAVE_VALGRIND_VALGRIND_H
+#include <valgrind/valgrind.h>
+#endif
+
 static unsigned long long malloc_count;
 static unsigned long long free_count;
 
@@ -111,6 +116,14 @@ int main (void)
   ctf_next_t *it = NULL;
   unsigned long long frozen_malloc_count, frozen_free_count;
 
+#ifdef HAVE_VALGRIND_VALGRIND_H
+  if (RUNNING_ON_VALGRIND)
+    {
+      printf ("UNSUPPORTED: valgrind interferes with malloc counting\n");
+      return 0;
+    }
+#endif
+
   if ((fp = ctf_create (&err)) == NULL)
     goto open_err;
 
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 08/11] libctf: prohibit addition of enums with overlapping enumerator constants
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (6 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 07/11] libctf: suppress spurious failure of malloc-counting tests under valgrind Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 09/11] libctf: make the ctf_next ctn_fp non-const Nick Alcock
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

libctf has long prohibited addition of enums with overlapping constants in a
single enum, but now that we are properly considering enums with overlapping
constants to be conflciting types, we can go further and prohibit addition
of enumeration constants to a dict if they already exist in any enum in that
dict: the same rules as C itself.

We do this in a fashion vaguely similar to what we just did in the
deduplicator, by considering enumeration constants as identifiers and adding
them to the core type/identifier namespace, ctf_dict_t.ctf_names.  This is a
little fiddly, because we do not want to prohibit opening of existing dicts
into which the deduplicator has stuffed enums with overlapping constants!
We just want to prohibit the addition of *new* enumerators that violate that
rule.  Even then, it's fine to add overlapping enumerator constants as long
as at least one of them is in a non-root type.  (This is essential for
proper deduplicator operation in cu-mapped mode, where multiple compilation
units can be smashed into one dict, with conflicting types marked as
hidden: these types may well contain overlapping enumerators.)

So, at open time, keep track of all enums observed, then do a third pass
through the enums alone, adding each enumerator either to the ctf_names
table as a mapping from the enumerator name to the enum it is part of (if
not already present), or to a new ctf_conflicting_enums hashtable that
tracks observed duplicates. (The latter is not used yet, but will be soon.)

(We need to do a third pass because it's quite possible to have an enum
containing an enumerator FOO followed by a type FOO: since they're processed
in order, the enumerator would be processed before the type, and at that
stage it seems nonconflicting.  The easiest fix is to run through the
enumerators after all type names are interned.)

At ctf_add_enumerator time, if the enumerator to which we are adding a type
is root-visible, check for an already-present name and error out if found,
then intern the new name in the ctf_names table as is done at open time.

(We retain the existing code which scans the enum itself for duplicates
because it is still an error to add an enumerator twice to a
non-root-visible enum type; but we only need to do this if the enum is
non-root-visible, so the cost of enum addition is reduced.)

Tested in an upcoming commit.

libctf/
	* ctf-impl.h (ctf_dict_t) <ctf_names>: Augment comment.
        <ctf_conflicting_enums>: New.
	(ctf_dynset_elements): New.
	* ctf-hash.c (ctf_dynset_elements): Implement it.
	* ctf-open.c (init_static_types): Split body into...
        (init_static_types_internal): ... here.  Count enumerators;
        keep track of observed enums in pass 2; populate ctf_names and
        ctf_conflicting_enums with enumerators in a third pass.
	(ctf_dict_close): Free ctf_conflicting_enums.
	* ctf-create.c (ctf_add_enumerator): Prohibit addition of duplicate
        enumerators in root-visible enum types.

include/
	* ctf-api.h (CTF_ADD_NONROOT): Describe what non-rootness
        means for enumeration constants.
	(ctf_add_enumerator):  The name is not a misnomer.
        We now require that enumerators have unique names.
        Document the non-rootness of enumerators.
---
 include/ctf-api.h   |  13 +++--
 libctf/ctf-create.c |  41 +++++++++++---
 libctf/ctf-hash.c   |   6 ++
 libctf/ctf-impl.h   |   5 +-
 libctf/ctf-open.c   | 134 +++++++++++++++++++++++++++++++++++++++-----
 5 files changed, 171 insertions(+), 28 deletions(-)

diff --git a/include/ctf-api.h b/include/ctf-api.h
index 85734afcac2..d7bdbdd0dac 100644
--- a/include/ctf-api.h
+++ b/include/ctf-api.h
@@ -268,9 +268,11 @@ _CTF_ERRORS
 #endif
 
 /* Dynamic CTF containers can be created using ctf_create.  The ctf_add_*
-   routines can be used to add new definitions to the dynamic container.
-   New types are labeled as root or non-root to determine whether they are
-   visible at the top-level program scope when subsequently doing a lookup.  */
+   routines can be used to add new definitions to the dynamic container.  New
+   types are labeled as root or non-root to determine whether they are visible
+   at the top-level program scope when subsequently doing a lookup.
+   (Identifiers contained within non-root types, like enumeration constants, are
+   also not visible.)  */
 
 #define	CTF_ADD_NONROOT	0	/* Type only visible in nested scope.  */
 #define	CTF_ADD_ROOT	1	/* Type visible at top-level scope.  */
@@ -785,9 +787,8 @@ extern ctf_id_t ctf_add_union_sized (ctf_dict_t *, uint32_t, const char *,
 extern ctf_id_t ctf_add_unknown (ctf_dict_t *, uint32_t, const char *);
 extern ctf_id_t ctf_add_volatile (ctf_dict_t *, uint32_t, ctf_id_t);
 
-/* Add an enumerator to an enum (the name is a misnomer).  We do not currently
-   validate that enumerators have unique names, even though C requires it: in
-   future this may change.  */
+/* Add an enumerator to an enum.  If the enum is non-root, so are all the
+   constants added to it by ctf_add_enumerator.  */
 
 extern int ctf_add_enumerator (ctf_dict_t *, ctf_id_t, const char *, int);
 
diff --git a/libctf/ctf-create.c b/libctf/ctf-create.c
index 073006b24ea..d6746030918 100644
--- a/libctf/ctf-create.c
+++ b/libctf/ctf-create.c
@@ -1048,7 +1048,6 @@ ctf_add_enumerator (ctf_dict_t *fp, ctf_id_t enid, const char *name,
   ctf_dtdef_t *dtd = ctf_dtd_lookup (fp, enid);
   unsigned char *old_vlen;
   ctf_enum_t *en;
-  size_t i;
 
   uint32_t kind, vlen, root;
 
@@ -1068,6 +1067,12 @@ ctf_add_enumerator (ctf_dict_t *fp, ctf_id_t enid, const char *name,
   root = LCTF_INFO_ISROOT (fp, dtd->dtd_data.ctt_info);
   vlen = LCTF_INFO_VLEN (fp, dtd->dtd_data.ctt_info);
 
+  /* Enumeration constant names are only added, and only checked for duplicates,
+     if the enum they are part of is a root-visible type.  */
+
+  if (root == CTF_ADD_ROOT && ctf_dynhash_lookup (fp->ctf_names, name))
+    return (ctf_set_errno (ofp, ECTF_DUPLICATE));
+
   if (kind != CTF_K_ENUM)
     return (ctf_set_errno (ofp, ECTF_NOTENUM));
 
@@ -1075,24 +1080,46 @@ ctf_add_enumerator (ctf_dict_t *fp, ctf_id_t enid, const char *name,
     return (ctf_set_errno (ofp, ECTF_DTFULL));
 
   old_vlen = dtd->dtd_vlen;
+
   if (ctf_grow_vlen (fp, dtd, sizeof (ctf_enum_t) * (vlen + 1)) < 0)
     return -1;					/* errno is set for us.  */
+
   en = (ctf_enum_t *) dtd->dtd_vlen;
 
   /* Remove refs in the old vlen region and reapply them.  */
 
   ctf_str_move_refs (fp, old_vlen, sizeof (ctf_enum_t) * vlen, dtd->dtd_vlen);
 
-  for (i = 0; i < vlen; i++)
-    if (strcmp (ctf_strptr (fp, en[i].cte_name), name) == 0)
-      return (ctf_set_errno (ofp, ECTF_DUPLICATE));
+  /* Check for constant duplication within any given enum: only needed for
+     non-root-visible types, since the duplicate detection above does the job
+     for root-visible types just fine.  */
 
-  en[i].cte_name = ctf_str_add_movable_ref (fp, name, &en[i].cte_name);
-  en[i].cte_value = value;
+  if (root == CTF_ADD_NONROOT)
+    {
+      size_t i;
 
-  if (en[i].cte_name == 0 && name != NULL && name[0] != '\0')
+      for (i = 0; i < vlen; i++)
+	if (strcmp (ctf_strptr (fp, en[i].cte_name), name) == 0)
+	  return (ctf_set_errno (ofp, ECTF_DUPLICATE));
+    }
+
+  en[vlen].cte_name = ctf_str_add_movable_ref (fp, name, &en[vlen].cte_name);
+  en[vlen].cte_value = value;
+
+  if (en[vlen].cte_name == 0 && name != NULL && name[0] != '\0')
     return (ctf_set_errno (ofp, ctf_errno (fp)));
 
+  /* Put the newly-added enumerator name into the name table if this type is
+     root-visible.  */
+
+  if (root == CTF_ADD_ROOT)
+    {
+      if (ctf_dynhash_insert (fp->ctf_names,
+			      (char *) ctf_strptr (fp, en[vlen].cte_name),
+			      (void *) (uintptr_t) enid) < 0)
+	return ctf_set_errno (fp, ENOMEM);
+    }
+
   dtd->dtd_data.ctt_info = CTF_TYPE_INFO (kind, root, vlen + 1);
 
   return 0;
diff --git a/libctf/ctf-hash.c b/libctf/ctf-hash.c
index 77b8478479e..a52f96db105 100644
--- a/libctf/ctf-hash.c
+++ b/libctf/ctf-hash.c
@@ -651,6 +651,12 @@ ctf_dynset_remove (ctf_dynset_t *hp, const void *key)
   htab_remove_elt ((struct htab *) hp, key_to_internal (key));
 }
 
+size_t
+ctf_dynset_elements (ctf_dynset_t *hp)
+{
+  return htab_elements ((struct htab *) hp);
+}
+
 void
 ctf_dynset_destroy (ctf_dynset_t *hp)
 {
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index eb89f8b4645..ec0b4feb328 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -387,7 +387,8 @@ struct ctf_dict
   ctf_dynhash_t *ctf_structs;	    /* Hash table of struct types.  */
   ctf_dynhash_t *ctf_unions;	    /* Hash table of union types.  */
   ctf_dynhash_t *ctf_enums;	    /* Hash table of enum types.  */
-  ctf_dynhash_t *ctf_names;	    /* Hash table of remaining type names.  */
+  ctf_dynhash_t *ctf_names;	    /* Hash table of remaining types, plus
+				       enumeration constants.  */
   ctf_lookup_t ctf_lookups[5];	    /* Pointers to nametabs for name lookup.  */
   ctf_strs_t ctf_str[2];	    /* Array of string table base and bounds.  */
   ctf_strs_writable_t *ctf_dynstrtab; /* Dynamically allocated string table, if any. */
@@ -407,6 +408,7 @@ struct ctf_dict
   uint32_t *ctf_pptrtab;	  /* Parent types pointed to by child dicts.  */
   size_t ctf_pptrtab_len;	  /* Num types storable in pptrtab currently.  */
   uint32_t ctf_pptrtab_typemax;	  /* Max child type when pptrtab last updated.  */
+  ctf_dynset_t *ctf_conflicting_enums;	/* Tracks enum constants that conflict.  */
   uint32_t *ctf_funcidx_names;	  /* Name of each function symbol in symtypetab
 				     (if indexed).  */
   uint32_t *ctf_objtidx_names;	  /* Likewise, for object symbols.  */
@@ -669,6 +671,7 @@ extern int ctf_dynhash_next_sorted (ctf_dynhash_t *, ctf_next_t **,
 extern ctf_dynset_t *ctf_dynset_create (htab_hash, htab_eq, ctf_hash_free_fun);
 extern int ctf_dynset_insert (ctf_dynset_t *, void *);
 extern void ctf_dynset_remove (ctf_dynset_t *, const void *);
+extern size_t ctf_dynset_elements (ctf_dynset_t *);
 extern void ctf_dynset_destroy (ctf_dynset_t *);
 extern void *ctf_dynset_lookup (ctf_dynset_t *, const void *);
 extern int ctf_dynset_exists (ctf_dynset_t *, const void *key,
diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index 59c6ed0622a..2ae0a696c3a 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -670,24 +670,50 @@ upgrade_types (ctf_dict_t *fp, ctf_header_t *cth)
   return 0;
 }
 
+static int
+init_static_types_internal (ctf_dict_t *fp, ctf_header_t *cth,
+			    ctf_dynset_t *all_enums);
+
 /* Populate statically-defined types (those loaded from a saved buffer).
 
    Initialize the type ID translation table with the byte offset of each type,
    and initialize the hash tables of each named type.  Upgrade the type table to
    the latest supported representation in the process, if needed, and if this
-   recension of libctf supports upgrading.  */
+   recension of libctf supports upgrading.
+
+   This is a wrapper to simplify memory allocation on error in the _internal
+   function that does all the actual work.  */
 
 static int
 init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
+{
+  ctf_dynset_t *all_enums;
+  int err;
+
+  if ((all_enums = ctf_dynset_create (htab_hash_pointer, htab_eq_pointer,
+				      NULL)) == NULL)
+    return ENOMEM;
+
+  err = init_static_types_internal (fp, cth, all_enums);
+  ctf_dynset_destroy (all_enums);
+  return err;
+}
+
+static int
+init_static_types_internal (ctf_dict_t *fp, ctf_header_t *cth,
+			    ctf_dynset_t *all_enums)
 {
   const ctf_type_t *tbuf;
   const ctf_type_t *tend;
 
   unsigned long pop[CTF_K_MAX + 1] = { 0 };
+  int pop_enumerators = 0;
   const ctf_type_t *tp;
   uint32_t id;
   uint32_t *xp;
   unsigned long typemax = 0;
+  ctf_next_t *i = NULL;
+  void *k;
 
   /* We determine whether the dict is a child or a parent based on the value of
      cth_parname.  */
@@ -706,8 +732,10 @@ init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
   tbuf = (ctf_type_t *) (fp->ctf_buf + cth->cth_typeoff);
   tend = (ctf_type_t *) (fp->ctf_buf + cth->cth_stroff);
 
-  /* We make two passes through the entire type section.  In this first
-     pass, we count the number of each type and the total number of types.  */
+  /* We make two passes through the entire type section, and one third pass
+     through part of it.  In this first pass, we count the number of each type
+     and type-like identifier (like enumerators) and the total number of
+     types.  */
 
   for (tp = tbuf; tp < tend; typemax++)
     {
@@ -728,6 +756,9 @@ init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
 
       tp = (ctf_type_t *) ((uintptr_t) tp + increment + vbytes);
       pop[kind]++;
+
+      if (kind == CTF_K_ENUM)
+	pop_enumerators += vlen;
     }
 
   if (child)
@@ -765,11 +796,16 @@ init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
 				   pop[CTF_K_POINTER] +
 				   pop[CTF_K_VOLATILE] +
 				   pop[CTF_K_CONST] +
-				   pop[CTF_K_RESTRICT],
+				   pop[CTF_K_RESTRICT] +
+				   pop_enumerators,
 				   ctf_hash_string,
 				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
+  if ((fp->ctf_conflicting_enums
+       = ctf_dynset_create (htab_hash_string, htab_eq_string, NULL)) == NULL)
+    return ENOMEM;
+
   /* The ptrtab and txlate can be appropriately sized for precisely this set
      of types: the txlate because it is only used to look up static types,
      so dynamic types added later will never go through it, and the ptrtab
@@ -793,6 +829,8 @@ init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
   /* In the second pass through the types, we fill in each entry of the
      type and pointer tables and add names to the appropriate hashes.
 
+     (Not all names are added in this pass, only type names.  See below.)
+
      Bump ctf_typemax as we go, but keep it one higher than normal, so that
      the type being read in is considered a valid type and it is at least
      barely possible to run simple lookups on it.  */
@@ -902,16 +940,25 @@ init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
 	  break;
 
 	case CTF_K_ENUM:
-	  if (!isroot)
+	  {
+	    if (!isroot)
+	      break;
+
+	    err = ctf_dynhash_insert_type (fp, fp->ctf_enums,
+					   LCTF_INDEX_TO_TYPE (fp, id, child),
+					   tp->ctt_name);
+
+	    if (err != 0)
+	      return err;
+
+	    /* Remember all enums for later rescanning.  */
+
+	    err = ctf_dynset_insert (all_enums, (void *) (ptrdiff_t)
+				     LCTF_INDEX_TO_TYPE (fp, id, child));
+	    if (err != 0)
+	      return err;
 	    break;
-
-	  err = ctf_dynhash_insert_type (fp, fp->ctf_enums,
-					 LCTF_INDEX_TO_TYPE (fp, id, child),
-					 tp->ctt_name);
-
-	  if (err != 0)
-	    return err;
-	  break;
+	  }
 
 	case CTF_K_TYPEDEF:
 	  if (!isroot)
@@ -976,13 +1023,71 @@ init_static_types (ctf_dict_t *fp, ctf_header_t *cth)
   assert (fp->ctf_typemax == typemax);
 
   ctf_dprintf ("%lu total types processed\n", fp->ctf_typemax);
+
+  /* In the third pass, we traverse the enums we spotted earlier and add all
+     the enumeration constants therein either to the types table (if no
+     type exists with that name) or to ctf_conflciting_enums (otherwise).
+
+     Doing this in a third pass is necessary to avoid the case where an
+     enum appears with a constant FOO, then later a type named FOO appears,
+     too late to spot the conflict by checking the enum's constants.  */
+
+  while ((err = ctf_dynset_next (all_enums, &i, &k)) == 0)
+    {
+      ctf_id_t enum_id = (uintptr_t) k;
+      ctf_next_t *i_constants = NULL;
+      const char *cte_name;
+
+      while ((cte_name = ctf_enum_next (fp, enum_id, &i_constants, NULL)) != NULL)
+	{
+	  /* Add all the enumeration constants as identifiers.  They all appear
+	     as types that cite the original enum.
+
+	     Constants that appear in more than one enum, or which are already
+	     the names of types, appear in ctf_conflicting_enums as well.  */
+
+	  if (ctf_dynhash_lookup_type (fp->ctf_names, cte_name) == 0)
+	    {
+	      uint32_t name = ctf_str_add (fp, cte_name);
+
+	      if (name == 0)
+		goto enum_err;
+
+	      err = ctf_dynhash_insert_type (fp, fp->ctf_names, enum_id, name);
+	    }
+	  else
+	    {
+	      err = ctf_dynset_insert (fp->ctf_conflicting_enums, (void *)
+				       cte_name);
+
+	      if (err != 0)
+		goto enum_err;
+	    }
+	  continue;
+
+	enum_err:
+	  ctf_next_destroy (i_constants);
+	  ctf_next_destroy (i);
+	  return ctf_errno (fp);
+	}
+      if (ctf_errno (fp) != ECTF_NEXT_END)
+	{
+	  ctf_next_destroy (i);
+	  return ctf_errno (fp);
+	}
+    }
+  if (err != ECTF_NEXT_END)
+    return err;
+
   ctf_dprintf ("%zu enum names hashed\n",
 	       ctf_dynhash_elements (fp->ctf_enums));
+  ctf_dprintf ("%zu conflicting enumerators identified\n",
+	       ctf_dynset_elements (fp->ctf_conflicting_enums));
   ctf_dprintf ("%zu struct names hashed (%d long)\n",
 	       ctf_dynhash_elements (fp->ctf_structs), nlstructs);
   ctf_dprintf ("%zu union names hashed (%d long)\n",
 	       ctf_dynhash_elements (fp->ctf_unions), nlunions);
-  ctf_dprintf ("%zu base type names hashed\n",
+  ctf_dprintf ("%zu base type names and identifiers hashed\n",
 	       ctf_dynhash_elements (fp->ctf_names));
 
   return 0;
@@ -1786,6 +1891,7 @@ ctf_dict_close (ctf_dict_t *fp)
     }
   ctf_dynhash_destroy (fp->ctf_dthash);
 
+  ctf_dynset_destroy (fp->ctf_conflicting_enums);
   ctf_dynhash_destroy (fp->ctf_structs);
   ctf_dynhash_destroy (fp->ctf_unions);
   ctf_dynhash_destroy (fp->ctf_enums);
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 09/11] libctf: make the ctf_next ctn_fp non-const
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (7 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 08/11] libctf: prohibit addition of enums with overlapping enumerator constants Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 10/11] include: libctf: comment improvements Nick Alcock
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

This was always an error, because the ctn_fp routinely has errors set on it,
which is not something you can (or should) do to a const object.

libctf/
	* ctf-impl.h (ctf_next_) <cu.ctn_fp>: Make non-const.
---
 libctf/ctf-impl.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index ec0b4feb328..299d981a718 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -560,11 +560,13 @@ struct ctf_next
     void **ctn_hash_slot;
   } u;
 
-  /* This union is of various sorts of dict we can iterate over:
-     currently dictionaries and archives, dynhashes, and dynsets.  */
+  /* This union is of various sorts of dict we can iterate over: currently
+     archives, dictionaries, dynhashes, and dynsets.  ctn_fp is non-const
+     because we need to set errors on it.  */
+
   union
   {
-    const ctf_dict_t *ctn_fp;
+    ctf_dict_t *ctn_fp;
     const ctf_archive_t *ctn_arc;
     const ctf_dynhash_t *ctn_h;
     const ctf_dynset_t *ctn_s;
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 10/11] include: libctf: comment improvements
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (8 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 09/11] libctf: make the ctf_next ctn_fp non-const Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-13 18:54 ` [PATCH 11/11] libctf, include: new functions for looking up enumerators Nick Alcock
  2024-06-14 14:14 ` [PATCH libctf 00/11] enumerator query API, plus some bugfixes Tom Tromey
  11 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

Describe a bit more clearly what effects a type being non-root-
visible has.  More consistently use the term non-root-visible
rather than hidden.  Document ctf_enum_iter.

include/
	* ctf-api.h (ctf_enum_iter): Document.
        (ctf_type_iter): Hidden, not non-root.  Mention that
        parent dictionaries are not traversed.
---
 include/ctf-api.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/ctf-api.h b/include/ctf-api.h
index d7bdbdd0dac..d67db8be13f 100644
--- a/include/ctf-api.h
+++ b/include/ctf-api.h
@@ -663,6 +663,8 @@ extern int ctf_member_iter (ctf_dict_t *, ctf_id_t, ctf_member_f *, void *);
 extern ssize_t ctf_member_next (ctf_dict_t *, ctf_id_t, ctf_next_t **,
 				const char **name, ctf_id_t *membtype,
 				int flags);
+
+/* Return all enumeration constants in a given enum type.  */
 extern int ctf_enum_iter (ctf_dict_t *, ctf_id_t, ctf_enum_f *, void *);
 extern const char *ctf_enum_next (ctf_dict_t *, ctf_id_t, ctf_next_t **,
 				  int *);
@@ -672,8 +674,9 @@ extern const char *ctf_enum_next (ctf_dict_t *, ctf_id_t, ctf_next_t **,
    CTF_ADD_ROOT was passed).  All such types are returned, even if they are
    things like pointers that intrinsically have no name: this is the only effect
    of CTF_ADD_ROOT for such types.  ctf_type_next allows you to choose whether
-   to see hidden types or not with the want_hidden arg: if set, the flag (if
-   passed) returns the hidden state of each type in turn.  */
+   to see non-root types or not with the want_hidden arg: if set, the flag (if
+   passed) returns the non-root state of each type in turn.  Types in parent
+   dictionaries are not returned.  */
 
 extern int ctf_type_iter (ctf_dict_t *, ctf_type_f *, void *);
 extern int ctf_type_iter_all (ctf_dict_t *, ctf_type_all_f *, void *);
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 11/11] libctf, include: new functions for looking up enumerators
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (9 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 10/11] include: libctf: comment improvements Nick Alcock
@ 2024-06-13 18:54 ` Nick Alcock
  2024-06-14 17:51   ` David Faust
  2024-06-14 14:14 ` [PATCH libctf 00/11] enumerator query API, plus some bugfixes Tom Tromey
  11 siblings, 1 reply; 16+ messages in thread
From: Nick Alcock @ 2024-06-13 18:54 UTC (permalink / raw)
  To: binutils; +Cc: stephen.brennan

Three new functions for looking up the enum type containing a given
enumeration constant, and optionally that constant's value.

The simplest, ctf_lookup_enumerator, looks up a root-visible enumerator by
name in one dict: if the dict contains multiple such constants (which is
possible for dicts created by older versions of hte libctf deduplicator),
ECTF_DUPLICATE is returned.

The next simplest, ctf_lookup_enumerator_next, is an iterator which returns
all enumerators with ag iven name in a given dict, whether root-visible or
not.

The most elaborate, ctf_arc_lookup_enumerator_next, finds all
enumerators with a given name across all dicts in an entire CTF archive,
whether root-visible or not, starting looking in the shared parent dict;
opened dicts are cached (as with all other ctf_arc_*lookup functions) so
that repeated use does not incur repeated opening costs.

All three of these return enumerator values as int64_t: unfortunately, API
compatibilty concerns prevent us from doing the same with the other older
enum-related functions, which all return enumerator constant values as ints.
We may be forced to add symbol-versioning compatibility aliases that fix the
other functions in due course, bumping the soname for platforms that do not
support such things.

ctf_arc_lookup_enumerator_next is implemented as a nested ctf_archive_next
iterator, and inside that, a nested ctf_lookup_enumerator_next iterator
within each dict.  To aid in this, add support to ctf_next_t iterators for
iterators that are implemented in terms of two simultaneous nested iterators
at once.  (It has always been possible for callers to use as many nested or
semi-overlapping ctf_next_t iterators as they need, which is one of the
advantages of this style over the _iter style that calls a function for each
thing iterated over: the iterator change here permits *ctf_next_t iterators
themselves* to be implemented by iterating using multiple other iterators as
part of their internal operation, transparently to the caller.)

Also add a testcase that tests all these functions (which is fairly easy
because ctf_arc_lookup_enumerator_next is implemented in terms of
ctf_lookup_enumerator_next) in addition to enumeration addition in
ctf_open()ed dicts, ctf_add_enumerator duplicate enumerator addition, and
conflicting enumerator constant deduplication.

include/
	* ctf-api.h (ctf_lookup_enumerator): New.
	(ctf_lookup_enumerator_next): Likewise.
	(ctf_arc_lookup_enumerator_next): Likewise.

libctf/
	* libctf.ver: Add them.
	* ctf-impl.h (ctf_next_t) <ctn_next_inner>: New.
	* ctf-util.c (ctf_next_copy): Copy it.
        (ctf_next_destroy): Destroy it.
	* ctf-lookup.c (ctf_lookup_enumerator): New.
	(ctf_lookup_enumerator_next): New.
	* ctf-archive.c (ctf_arc_lookup_enumerator_next): New.
	* testsuite/libctf-lookup/enumerator-iteration.*: New test.
	* testsuite/libctf-lookup/enum-ctf-2.c: New test CTF, used by the
	  above.
---
 include/ctf-api.h                             |  37 ++++
 libctf/ctf-archive.c                          | 107 +++++++++++
 libctf/ctf-impl.h                             |  12 +-
 libctf/ctf-lookup.c                           | 145 +++++++++++++++
 libctf/ctf-util.c                             |  29 ++-
 libctf/libctf.ver                             |   7 +
 libctf/testsuite/libctf-lookup/enum-ctf-2.c   |   6 +
 .../libctf-lookup/enumerator-iteration.c      | 168 ++++++++++++++++++
 .../libctf-lookup/enumerator-iteration.lk     |  17 ++
 9 files changed, 519 insertions(+), 9 deletions(-)
 create mode 100644 libctf/testsuite/libctf-lookup/enum-ctf-2.c
 create mode 100644 libctf/testsuite/libctf-lookup/enumerator-iteration.c
 create mode 100644 libctf/testsuite/libctf-lookup/enumerator-iteration.lk

diff --git a/include/ctf-api.h b/include/ctf-api.h
index d67db8be13f..d3bd1b586d8 100644
--- a/include/ctf-api.h
+++ b/include/ctf-api.h
@@ -25,6 +25,7 @@
 #define	_CTF_API_H
 
 #include <sys/types.h>
+#include <inttypes.h>
 #include <ctf.h>
 #include <zlib.h>
 
@@ -538,6 +539,16 @@ extern ctf_id_t ctf_lookup_by_name (ctf_dict_t *, const char *);
 
 extern ctf_id_t ctf_lookup_variable (ctf_dict_t *, const char *);
 
+/* Look up a single enumerator by enumeration constant name.  Returns the ID of
+   the enum it is contained within and optionally its value.  Error out with
+   ECTF_DUPLICATE if multiple exist (which can happen in some older dicts).  See
+   ctf_lookup_enumerator_next in that case.  Enumeration constants in non-root
+   types are not returned, but constants in parents are, if not overridden by
+   an enum in the child.  */
+
+extern ctf_id_t ctf_lookup_enumerator (ctf_dict_t *, const char *,
+				       int64_t *enum_value);
+
 /* Type lookup functions.  */
 
 /* Strip qualifiers and typedefs off a type, returning the base type.
@@ -669,6 +680,32 @@ extern int ctf_enum_iter (ctf_dict_t *, ctf_id_t, ctf_enum_f *, void *);
 extern const char *ctf_enum_next (ctf_dict_t *, ctf_id_t, ctf_next_t **,
 				  int *);
 
+/* Return all enumeration constants with a given name in a given dict, similar
+   to ctf_lookup_enumerator above but capable of returning multiple values.
+   Enumerators in parent dictionaries are not returned: enumerators in non-root
+   types *are* returned.  This operation internally iterates over all types in
+   the dict, so is relatively expensive in large dictionaries.
+
+   There is nothing preventing NAME from being changed by the caller in the
+   middle of iteration: the results might be slightly confusing, but they are
+   well-defined.  */
+
+extern ctf_id_t ctf_lookup_enumerator_next (ctf_dict_t *, const char *name,
+					    ctf_next_t **, int64_t *enum_value);
+
+/* Likewise, across all dicts in an archive (parent first).  The DICT and ERRP
+   arguments are not optional: without the one you can't tell which dict the
+   returned type is in, and without the other you can't distinguish real errors
+   from end-of-iteration.  DICT should be NULL before the first call and is set
+   to NULL after the last and on error: on successful call it is the caller's
+   responsibility to ctf_dict_close() it.  The caller should  otherwise pass it
+   back in unchanged (do not reassign it during iteration, just as with the
+   ctf_next_t iterator itself).  */
+
+extern ctf_id_t ctf_arc_lookup_enumerator_next (ctf_archive_t *, const char *name,
+						ctf_next_t **, int64_t *enum_value,
+						ctf_dict_t **dict, int *errp);
+
 /* Iterate over all types in a dict.  ctf_type_iter_all recurses over all types:
    ctf_type_iter recurses only over types with user-visible names (for which
    CTF_ADD_ROOT was passed).  All such types are returned, even if they are
diff --git a/libctf/ctf-archive.c b/libctf/ctf-archive.c
index 4744cb7a828..3ba6a3efabb 100644
--- a/libctf/ctf-archive.c
+++ b/libctf/ctf-archive.c
@@ -1011,6 +1011,113 @@ ctf_arc_lookup_symbol_name (ctf_archive_t *wrapper, const char *symname,
   return ctf_arc_lookup_sym_or_name (wrapper, 0, symname, typep, errp);
 }
 
+/* Return all enumeration constants with a given name across all dicts in an
+   archive, similar to ctf_lookup_enumerator_next.  The dict is cached, so
+   opening costs are paid only once, but (unlike ctf_arc_lookup_symbol*
+   above) the results of the iterations are not cached.  dict and errp are
+   not optional.  */
+
+ctf_id_t
+ctf_arc_lookup_enumerator_next (ctf_archive_t *arc, const char *name,
+				ctf_next_t **it, int64_t *enum_value,
+				ctf_dict_t **dict, int *errp)
+{
+  ctf_next_t *i = *it;
+  ctf_id_t type;
+  int opened_this_time = 0;
+  int err;
+
+  /* We have two nested iterators in here: ctn_next tracks archives, while
+     within it ctn_next_inner tracks enumerators within an archive.  We
+     keep track of the dict by simply reusing the passed-in arg: if it's
+     changed by the caller, the caller will get an ECTF_WRONGFP error,
+     so this is quite safe and means we don't have to track the arc and fp
+     simultaneously in the ctf_next_t.  */
+
+  if (!i)
+    {
+      if ((i = ctf_next_create ()) == NULL)
+	{
+	  err = ENOMEM;
+	  goto err;
+	}
+      i->ctn_iter_fun = (void (*) (void)) ctf_arc_lookup_enumerator_next;
+      i->cu.ctn_arc = arc;
+      *it = i;
+    }
+
+  if ((void (*) (void)) ctf_arc_lookup_enumerator_next != i->ctn_iter_fun)
+    {
+      err = ECTF_NEXT_WRONGFUN;
+      goto err;
+    }
+
+  if (arc != i->cu.ctn_arc)
+    {
+      err = ECTF_NEXT_WRONGFP;
+      goto err;
+    }
+
+  /* Prevent any earlier end-of-iteration on this dict from confusing the
+     test below.  */
+  if (i->ctn_next != NULL)
+    ctf_set_errno (*dict, 0);
+
+  do
+    {
+      /* At end of one dict, or not started any iterations yet?
+	 Traverse to next dict.  If we never returned this dict to the
+	 caller, close it ourselves: the caller will never see it and cannot
+	 do so.  */
+
+      if (i->ctn_next == NULL || ctf_errno (*dict) == ECTF_NEXT_END)
+	{
+	  if (opened_this_time)
+	    {
+	      ctf_dict_close (*dict);
+	      *dict = NULL;
+	      opened_this_time = 0;
+	    }
+
+	  *dict = ctf_archive_next (arc, &i->ctn_next, NULL, 0, &err);
+	  if (!*dict)
+	    goto err;
+	  opened_this_time = 1;
+	}
+
+      type = ctf_lookup_enumerator_next (*dict, name, &i->ctn_next_inner,
+					 enum_value);
+    }
+  while (type == CTF_ERR && ctf_errno (*dict) == ECTF_NEXT_END);
+
+  if (type == CTF_ERR)
+    {
+      err = ctf_errno (*dict);
+      goto err;
+    }
+
+  /* If this dict is being reused from the previous iteration, bump its
+     refcnt: the caller is going to close it and has no idea that we didn't
+     open it this time round.  */
+  if (!opened_this_time)
+    ctf_ref (*dict);
+
+  return type;
+
+ err:						/* Also ECTF_NEXT_END. */
+  if (opened_this_time)
+    {
+      ctf_dict_close (*dict);
+      *dict = NULL;
+    }
+
+  ctf_next_destroy (i);
+  *it = NULL;
+  if (errp)
+    *errp = err;
+  return CTF_ERR;
+}
+
 /* Raw iteration over all CTF files in an archive.  We pass the raw data for all
    CTF files in turn to the specified callback function.  */
 static int
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index 299d981a718..0a362b6b17c 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -544,13 +544,15 @@ struct ctf_next
   uint32_t ctn_n;
 
   /* Some iterators contain other iterators, in addition to their other
-     state.  */
+     state.  We allow for inner and outer iterators, for two-layer nested loops
+     like those found in ctf_arc_lookup_enumerator_next.  */
   ctf_next_t *ctn_next;
+  ctf_next_t *ctn_next_inner;
 
-  /* We can save space on this side of things by noting that a dictionary is
-     either dynamic or not, as a whole, and a given iterator can only iterate
-     over one kind of thing at once: so we can overlap the DTD and non-DTD
-     members, and the structure, variable and enum members, etc.  */
+  /* We can save space on this side of things by noting that a type is either
+     dynamic or not, as a whole, and a given iterator can only iterate over one
+     kind of thing at once: so we can overlap the DTD and non-DTD members, and
+     the structure, variable and enum members, etc.  */
   union
   {
     unsigned char *ctn_vlen;
diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
index e4d18bec112..8accb2ed99e 100644
--- a/libctf/ctf-lookup.c
+++ b/libctf/ctf-lookup.c
@@ -413,6 +413,151 @@ ctf_lookup_variable (ctf_dict_t *fp, const char *name)
   return type;
 }
 
+/* Look up a single enumerator by enumeration constant name.  Returns the ID of
+   the enum it is contained within and optionally its value.  Error out with
+   ECTF_DUPLICATE if multiple exist (which can happen in some older dicts).  See
+   ctf_lookup_enumerator_next in that case.  Enumeration constants in non-root
+   types are not returned, but constants in parents are, if not overridden by
+   an enum in the child..  */
+
+ctf_id_t
+ctf_lookup_enumerator (ctf_dict_t *fp, const char *name, int64_t *enum_value)
+{
+  ctf_id_t type;
+  int enum_int_value;
+
+  if (ctf_dynset_lookup (fp->ctf_conflicting_enums, name))
+    return (ctf_set_typed_errno (fp, ECTF_DUPLICATE));
+
+  /* CTF_K_UNKNOWN suffices for things like enumeration constants that aren't
+     actually types at all (ending up in the global name table).  */
+  type = ctf_lookup_by_rawname (fp, CTF_K_UNKNOWN, name);
+  /* Nonexistent type? It may be in the parent.  */
+  if (type == 0 && fp->ctf_parent)
+    {
+      if ((type = ctf_lookup_enumerator (fp->ctf_parent, name, enum_value)) == 0)
+	return ctf_set_typed_errno (fp, ECTF_NOENUMNAM);
+      return type;
+    }
+
+  /* Nothing more to do if this type didn't exist or we don't have to look up
+     the enum value.  */
+  if (type == 0)
+    return ctf_set_typed_errno (fp, ECTF_NOENUMNAM);
+
+  if (enum_value == NULL)
+    return type;
+
+  if (ctf_enum_value (fp, type, name, &enum_int_value) < 0)
+    return CTF_ERR;
+  *enum_value = enum_int_value;
+
+  return type;
+}
+
+/* Return all enumeration constants with a given name in a given dict, similar
+   to ctf_lookup_enumerator above but capable of returning multiple values.
+   Enumerators in parent dictionaries are not returned: enumerators in
+   hidden types *are* returned.  */
+
+ctf_id_t
+ctf_lookup_enumerator_next (ctf_dict_t *fp, const char *name,
+			    ctf_next_t **it, int64_t *val)
+{
+  ctf_next_t *i = *it;
+  int found = 0;
+
+  /* We use ctf_type_next() to iterate across all types, but then traverse each
+     enumerator found by hand: traversing enumerators is very easy, and it would
+     probably be more confusing to use two nested iterators than to do it this
+     way.  We use ctn_next to work over enums, then ctn_en and ctn_n to work
+     over enumerators within each enum.  */
+  if (!i)
+    {
+      if ((i = ctf_next_create ()) == NULL)
+	return ctf_set_typed_errno (fp, ENOMEM);
+
+      i->cu.ctn_fp = fp;
+      i->ctn_iter_fun = (void (*) (void)) ctf_lookup_enumerator_next;
+      i->ctn_increment = 0;
+      i->ctn_tp = NULL;
+      i->u.ctn_en = NULL;
+      i->ctn_n = 0;
+      *it = i;
+    }
+
+  if ((void (*) (void)) ctf_lookup_enumerator_next != i->ctn_iter_fun)
+    return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFUN));
+
+  if (fp != i->cu.ctn_fp)
+    return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFP));
+
+  do
+    {
+      const char *this_name;
+
+      /* At end of enum? Traverse to next one, if any are left.  */
+
+      if (i->u.ctn_en == NULL || i->ctn_n == 0)
+	{
+	  const ctf_type_t *tp;
+	  ctf_dtdef_t *dtd;
+
+	  do
+	    i->ctn_type = ctf_type_next (i->cu.ctn_fp, &i->ctn_next, NULL, 1);
+	  while (i->ctn_type != CTF_ERR
+		 && ctf_type_kind_unsliced (i->cu.ctn_fp, i->ctn_type)
+		 != CTF_K_ENUM);
+
+	  if (i->ctn_type == CTF_ERR)
+	    {
+	      /* Conveniently, when the iterator over all types is done, so is the
+		 iteration as a whole: so we can just pass all errors from the
+		 internal iterator straight back out..  */
+	      ctf_next_destroy (i);
+	      *it = NULL;
+	      return CTF_ERR;			/* errno is set for us.  */
+	    }
+
+	  if ((tp = ctf_lookup_by_id (&fp, i->ctn_type)) == NULL)
+	    return CTF_ERR;			/* errno is set for us.  */
+	  i->ctn_n = LCTF_INFO_VLEN (fp, tp->ctt_info);
+
+	  dtd = ctf_dynamic_type (fp, i->ctn_type);
+
+	  if (dtd == NULL)
+	    {
+	      (void) ctf_get_ctt_size (fp, tp, NULL, &i->ctn_increment);
+	      i->u.ctn_en = (const ctf_enum_t *) ((uintptr_t) tp +
+						  i->ctn_increment);
+	    }
+	  else
+	    i->u.ctn_en = (const ctf_enum_t *) dtd->dtd_vlen;
+	}
+
+      this_name = ctf_strptr (fp, i->u.ctn_en->cte_name);
+
+      i->ctn_n--;
+
+      if (strcmp (name, this_name) == 0)
+	{
+	  if (val)
+	    *val = i->u.ctn_en->cte_value;
+	  found = 1;
+
+	  /* Constant found in this enum: try the next one.  (Constant names
+	     cannot be duplicated within a given enum.)  */
+
+	  i->ctn_n = 0;
+	}
+
+      i->u.ctn_en++;
+    }
+  while (!found);
+
+  return i->ctn_type;
+}
+
 typedef struct ctf_symidx_sort_arg_cb
 {
   ctf_dict_t *fp;
diff --git a/libctf/ctf-util.c b/libctf/ctf-util.c
index 3ea6de9e86f..f75b1bfb01a 100644
--- a/libctf/ctf-util.c
+++ b/libctf/ctf-util.c
@@ -262,6 +262,8 @@ ctf_next_destroy (ctf_next_t *i)
     free (i->u.ctn_sorted_hkv);
   if (i->ctn_next)
     ctf_next_destroy (i->ctn_next);
+  if (i->ctn_next_inner)
+    ctf_next_destroy (i->ctn_next_inner);
   free (i);
 }
 
@@ -276,16 +278,35 @@ ctf_next_copy (ctf_next_t *i)
     return NULL;
   memcpy (i2, i, sizeof (struct ctf_next));
 
+  if (i2->ctn_next)
+    {
+      i2->ctn_next = ctf_next_copy (i2->ctn_next);
+      if (i2->ctn_next == NULL)
+	goto err_next;
+    }
+
+  if (i2->ctn_next_inner)
+    {
+      i2->ctn_next_inner = ctf_next_copy (i2->ctn_next_inner);
+      if (i2->ctn_next_inner == NULL)
+	goto err_next_inner;
+    }
+
   if (i2->ctn_iter_fun == (void (*) (void)) ctf_dynhash_next_sorted)
     {
       size_t els = ctf_dynhash_elements ((ctf_dynhash_t *) i->cu.ctn_h);
       if ((i2->u.ctn_sorted_hkv = calloc (els, sizeof (ctf_next_hkv_t))) == NULL)
-	{
-	  free (i2);
-	  return NULL;
-	}
+	goto err_sorted_hkv;
       memcpy (i2->u.ctn_sorted_hkv, i->u.ctn_sorted_hkv,
 	      els * sizeof (ctf_next_hkv_t));
     }
   return i2;
+
+ err_sorted_hkv:
+  ctf_next_destroy (i2->ctn_next_inner);
+ err_next_inner:
+  ctf_next_destroy (i2->ctn_next);
+ err_next:
+  ctf_next_destroy (i2);
+  return NULL;
 }
diff --git a/libctf/libctf.ver b/libctf/libctf.ver
index 6e7345be66b..e6c31ff37aa 100644
--- a/libctf/libctf.ver
+++ b/libctf/libctf.ver
@@ -198,3 +198,10 @@ LIBCTF_1.2 {
 	ctf_arc_lookup_symbol_name;
 	ctf_add_unknown;
 } LIBCTF_1.1;
+
+LIBCTF_1.3 {
+    global:
+	ctf_lookup_enumerator;
+	ctf_lookup_enumerator_next;
+	ctf_arc_lookup_enumerator_next;
+} LIBCTF_1.2;
diff --git a/libctf/testsuite/libctf-lookup/enum-ctf-2.c b/libctf/testsuite/libctf-lookup/enum-ctf-2.c
new file mode 100644
index 00000000000..39c9865e528
--- /dev/null
+++ b/libctf/testsuite/libctf-lookup/enum-ctf-2.c
@@ -0,0 +1,6 @@
+enum e { ENUMSAMPLE_1 = 6, ENUMSAMPLE_2 = 7 };
+
+enum ie2 { IENUMSAMPLE2_1 = -10, IENUMSAMPLE2_2 };
+
+enum e baz;
+enum ie2 quux;
diff --git a/libctf/testsuite/libctf-lookup/enumerator-iteration.c b/libctf/testsuite/libctf-lookup/enumerator-iteration.c
new file mode 100644
index 00000000000..e46dad6dc70
--- /dev/null
+++ b/libctf/testsuite/libctf-lookup/enumerator-iteration.c
@@ -0,0 +1,168 @@
+/* Test enumerator iteration and querying.  Because
+   ctf_arc_lookup_enumerator_next uses ctf_lookup_enumerator_next internally, we
+   only need to test the former.  */
+
+#include "config.h"
+#include <ctf-api.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+static void
+print_constants (ctf_archive_t *ctf, const char *name)
+{
+  ctf_next_t *i = NULL;
+  int err;
+  ctf_dict_t *fp;
+  ctf_id_t type;
+  int64_t val;
+
+  while ((type = ctf_arc_lookup_enumerator_next (ctf, name, &i,
+						 &val, &fp, &err)) != CTF_ERR)
+    {
+      char *foo;
+
+      printf ("%s in %s has value %i\n", name,
+	      foo = ctf_type_aname (fp, type), val);
+      free (foo);
+
+      ctf_dict_close (fp);
+    }
+  if (err != ECTF_NEXT_END)
+    {
+      fprintf (stderr, "iteration failed: %s\n", ctf_errmsg (err));
+      exit (1);
+    }
+}
+
+int
+main (int argc, char *argv[])
+{
+  ctf_archive_t *ctf;
+  ctf_dict_t *fp;
+  int err;
+  ctf_id_t type;
+  ctf_next_t *i = NULL;
+  const char *name;
+  int64_t val;
+  int counter = 0;
+
+  if (argc != 2)
+    {
+      fprintf (stderr, "Syntax: %s PROGRAM\n", argv[0]);
+      exit(1);
+    }
+
+  if ((ctf = ctf_open (argv[1], NULL, &err)) == NULL)
+    goto open_err;
+
+  /* Look for all instances of ENUMSAMPLE2_1, and add some new enums to all
+     dicts found, to test dynamic enum iteration as well as static.
+
+     Add two enums with a different name and constants to any that should
+     already be there (one hidden), and one with the same constants, but hidden,
+     to test ctf_lookup_enumerator_next()'s multiple-lookup functionality and
+     ctf_lookup_enumerator() in the presence of hidden types.  */
+
+  printf ("First iteration: addition of enums.\n");
+  while ((type = ctf_arc_lookup_enumerator_next (ctf, "IENUMSAMPLE2_2", &i,
+						 &val, &fp, &err)) != CTF_ERR)
+    {
+      char *foo;
+
+      printf ("IENUMSAMPLE2_2 in %s has value %i\n",
+	      foo = ctf_type_aname (fp, type), val);
+      free (foo);
+
+      if ((type = ctf_add_enum (fp, CTF_ADD_ROOT, "ie3")) == CTF_ERR)
+	goto enum_add_err;
+
+      if (ctf_add_enumerator (fp, type, "DYNADD", counter += 10) < 0)
+	goto enumerator_add_err;
+      if (ctf_add_enumerator (fp, type, "DYNADD2", counter += 10) < 0)
+	goto enumerator_add_err;
+
+      /* Make sure that overlapping enumerator addition fails as it should.  */
+
+      if (ctf_add_enumerator (fp, type, "IENUMSAMPLE2_2", 666) >= 0
+	  || ctf_errno (fp) != ECTF_DUPLICATE)
+	fprintf (stderr, "Duplicate enumerator addition did not fail as it ought to\n");
+
+      if ((type = ctf_add_enum (fp, CTF_ADD_NONROOT, "ie4_hidden")) == CTF_ERR)
+	goto enum_add_err;
+
+      if (ctf_add_enumerator (fp, type, "DYNADD3", counter += 10) < 0)
+	goto enumerator_add_err;
+      if (ctf_add_enumerator (fp, type, "DYNADD4", counter += 10) < 0)
+	goto enumerator_add_err;
+
+      if ((type = ctf_add_enum (fp, CTF_ADD_NONROOT, "ie3_hidden")) == CTF_ERR)
+	goto enum_add_err;
+
+      if (ctf_add_enumerator (fp, type, "DYNADD", counter += 10) < 0)
+	goto enumerator_add_err;
+      if (ctf_add_enumerator (fp, type, "DYNADD2", counter += 10) < 0)
+	goto enumerator_add_err;
+
+      /* Look them up via ctf_lookup_enumerator.  */
+
+      if (ctf_lookup_enumerator (fp, "DYNADD", &val) == CTF_ERR)
+	goto enumerator_lookup_err;
+      printf ("direct lookup: DYNADD value: %i\n", (int) val);
+
+      if ((type = ctf_lookup_enumerator (fp, "DYNADD3", &val) != CTF_ERR) ||
+	  ctf_errno (fp) != ECTF_NOENUMNAM)
+	{
+	  if (type != CTF_ERR)
+	    {
+	      char *foo;
+	      printf ("direct lookup: hidden lookup did not return ECTF_NOENUMNAM but rather %i in %s\n",
+		      val, foo = ctf_type_aname (fp, type));
+	      free (foo);
+	    }
+	  else
+	    printf ("direct lookup: hidden lookup did not return ECTF_NOENUMNAM but rather %s\n",
+		    ctf_errno (fp));
+	}
+
+      ctf_dict_close (fp);
+    }
+  if (err != ECTF_NEXT_END)
+    {
+      fprintf (stderr, "iteration failed: %s\n", ctf_errmsg (err));
+      return 1;
+    }
+
+  /* Look for (and print out) some enumeration constants.  */
+
+  printf ("Second iteration: printing of enums.\n");
+
+  print_constants (ctf, "ENUMSAMPLE_1");
+  print_constants (ctf, "IENUMSAMPLE_1");
+  print_constants (ctf, "ENUMSAMPLE_2");
+  print_constants (ctf, "DYNADD");
+  print_constants (ctf, "DYNADD3");
+
+  ctf_close (ctf);
+
+  printf ("All done.\n");
+
+  return 0;
+
+ open_err:
+  fprintf (stderr, "%s: cannot open: %s\n", argv[0], ctf_errmsg (err));
+  return 1;
+ enum_add_err:
+  fprintf (stderr, "Cannot add enum to dict \"%s\": %s\n",
+	   ctf_cuname (fp) ? ctf_cuname (fp) : "(null: parent)", ctf_errmsg (ctf_errno (fp)));
+  return 1;
+ enumerator_add_err:
+  fprintf (stderr, "Cannot add enumerator to dict \"%s\": %s\n",
+	   ctf_cuname (fp) ? ctf_cuname (fp) : "(null: parent)", ctf_errmsg (ctf_errno (fp)));
+  return 1;
+ enumerator_lookup_err:
+  fprintf (stderr, "Cannot look up enumerator in dict \"%s\": %s\n",
+	   ctf_cuname (fp) ? ctf_cuname (fp) : "(null: parent)", ctf_errmsg (ctf_errno (fp)));
+  return 1;
+}
diff --git a/libctf/testsuite/libctf-lookup/enumerator-iteration.lk b/libctf/testsuite/libctf-lookup/enumerator-iteration.lk
new file mode 100644
index 00000000000..0c3cbfbf15f
--- /dev/null
+++ b/libctf/testsuite/libctf-lookup/enumerator-iteration.lk
@@ -0,0 +1,17 @@
+# lookup: enumerator-iteration.c
+# source: enum-ctf.c
+# source: enum-ctf-2.c
+# link: on
+First iteration: addition of enums.
+IENUMSAMPLE2_2 in enum ie2 has value -9
+direct lookup: DYNADD value: 10
+Second iteration: printing of enums.
+ENUMSAMPLE_1 in enum e has value 6
+ENUMSAMPLE_1 in enum e has value 0
+IENUMSAMPLE_1 in enum ie has value -10
+ENUMSAMPLE_2 in enum e has value 7
+ENUMSAMPLE_2 in enum e has value 1
+DYNADD in enum ie3 has value 10
+DYNADD in enum ie3_hidden has value 50
+DYNADD3 in enum ie4_hidden has value 30
+All done.
-- 
2.45.1.275.g567cb0950c


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH libctf 00/11] enumerator query API, plus some bugfixes
  2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
                   ` (10 preceding siblings ...)
  2024-06-13 18:54 ` [PATCH 11/11] libctf, include: new functions for looking up enumerators Nick Alcock
@ 2024-06-14 14:14 ` Tom Tromey
  2024-06-14 17:57   ` Nick Alcock
  11 siblings, 1 reply; 16+ messages in thread
From: Tom Tromey @ 2024-06-14 14:14 UTC (permalink / raw)
  To: Nick Alcock; +Cc: binutils, stephen.brennan

>>>>> "Nick" == Nick Alcock <nick.alcock@oracle.com> writes:

Nick>   We also improve the deduplicator to consider
Nick> enums to be conflicting types if they are distinct but have enumeration
Nick> constants in common, and adapt ctf_add_enumerator() to prohibit the
Nick> addition of such enumerators (which are invalid C anyway, so anything
Nick> trying to add them is probably buggy).

I don't know libctf at all but  it seems to me that  a.c can use
enum { X = 1 } and b.c can use enum { X = 2 } without any problem.

Tom

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 11/11] libctf, include: new functions for looking up enumerators
  2024-06-13 18:54 ` [PATCH 11/11] libctf, include: new functions for looking up enumerators Nick Alcock
@ 2024-06-14 17:51   ` David Faust
  2024-06-17 13:43     ` Nick Alcock
  0 siblings, 1 reply; 16+ messages in thread
From: David Faust @ 2024-06-14 17:51 UTC (permalink / raw)
  To: Nick Alcock; +Cc: binutils, stephen.brennan


On 6/13/24 11:54, Nick Alcock wrote:
> Three new functions for looking up the enum type containing a given
> enumeration constant, and optionally that constant's value.
> 

The new functions make sense to me, though I don't have much experience
as a user of libctf or it's APIs.

Just a few small comments/nits below.

> The simplest, ctf_lookup_enumerator, looks up a root-visible enumerator by
> name in one dict: if the dict contains multiple such constants (which is
> possible for dicts created by older versions of hte libctf deduplicator),
> ECTF_DUPLICATE is returned.

typo: hte

> 
> The next simplest, ctf_lookup_enumerator_next, is an iterator which returns
> all enumerators with ag iven name in a given dict, whether root-visible or
> not.

typo: ag iven

> 
> The most elaborate, ctf_arc_lookup_enumerator_next, finds all
> enumerators with a given name across all dicts in an entire CTF archive,
> whether root-visible or not, starting looking in the shared parent dict;
> opened dicts are cached (as with all other ctf_arc_*lookup functions) so
> that repeated use does not incur repeated opening costs.
> 
> All three of these return enumerator values as int64_t: unfortunately, API
> compatibilty concerns prevent us from doing the same with the other older

typo: compatibilty

> enum-related functions, which all return enumerator constant values as ints.
> We may be forced to add symbol-versioning compatibility aliases that fix the
> other functions in due course, bumping the soname for platforms that do not
> support such things.
> 
> ctf_arc_lookup_enumerator_next is implemented as a nested ctf_archive_next
> iterator, and inside that, a nested ctf_lookup_enumerator_next iterator
> within each dict.  To aid in this, add support to ctf_next_t iterators for
> iterators that are implemented in terms of two simultaneous nested iterators
> at once.  (It has always been possible for callers to use as many nested or
> semi-overlapping ctf_next_t iterators as they need, which is one of the
> advantages of this style over the _iter style that calls a function for each
> thing iterated over: the iterator change here permits *ctf_next_t iterators
> themselves* to be implemented by iterating using multiple other iterators as
> part of their internal operation, transparently to the caller.)
> 
> Also add a testcase that tests all these functions (which is fairly easy
> because ctf_arc_lookup_enumerator_next is implemented in terms of
> ctf_lookup_enumerator_next) in addition to enumeration addition in
> ctf_open()ed dicts, ctf_add_enumerator duplicate enumerator addition, and
> conflicting enumerator constant deduplication.
> 
> include/
> 	* ctf-api.h (ctf_lookup_enumerator): New.
> 	(ctf_lookup_enumerator_next): Likewise.
> 	(ctf_arc_lookup_enumerator_next): Likewise.
> 
> libctf/
> 	* libctf.ver: Add them.
> 	* ctf-impl.h (ctf_next_t) <ctn_next_inner>: New.
> 	* ctf-util.c (ctf_next_copy): Copy it.
>         (ctf_next_destroy): Destroy it.
> 	* ctf-lookup.c (ctf_lookup_enumerator): New.
> 	(ctf_lookup_enumerator_next): New.
> 	* ctf-archive.c (ctf_arc_lookup_enumerator_next): New.
> 	* testsuite/libctf-lookup/enumerator-iteration.*: New test.
> 	* testsuite/libctf-lookup/enum-ctf-2.c: New test CTF, used by the
> 	  above.
> ---
>  include/ctf-api.h                             |  37 ++++
>  libctf/ctf-archive.c                          | 107 +++++++++++
>  libctf/ctf-impl.h                             |  12 +-
>  libctf/ctf-lookup.c                           | 145 +++++++++++++++
>  libctf/ctf-util.c                             |  29 ++-
>  libctf/libctf.ver                             |   7 +
>  libctf/testsuite/libctf-lookup/enum-ctf-2.c   |   6 +
>  .../libctf-lookup/enumerator-iteration.c      | 168 ++++++++++++++++++
>  .../libctf-lookup/enumerator-iteration.lk     |  17 ++
>  9 files changed, 519 insertions(+), 9 deletions(-)
>  create mode 100644 libctf/testsuite/libctf-lookup/enum-ctf-2.c
>  create mode 100644 libctf/testsuite/libctf-lookup/enumerator-iteration.c
>  create mode 100644 libctf/testsuite/libctf-lookup/enumerator-iteration.lk
> 
> diff --git a/include/ctf-api.h b/include/ctf-api.h
> index d67db8be13f..d3bd1b586d8 100644
> --- a/include/ctf-api.h
> +++ b/include/ctf-api.h
> @@ -25,6 +25,7 @@
>  #define	_CTF_API_H
>  
>  #include <sys/types.h>
> +#include <inttypes.h>
>  #include <ctf.h>
>  #include <zlib.h>
>  
> @@ -538,6 +539,16 @@ extern ctf_id_t ctf_lookup_by_name (ctf_dict_t *, const char *);
>  
>  extern ctf_id_t ctf_lookup_variable (ctf_dict_t *, const char *);
>  
> +/* Look up a single enumerator by enumeration constant name.  Returns the ID of
> +   the enum it is contained within and optionally its value.  Error out with
> +   ECTF_DUPLICATE if multiple exist (which can happen in some older dicts).  See
> +   ctf_lookup_enumerator_next in that case.  Enumeration constants in non-root
> +   types are not returned, but constants in parents are, if not overridden by
> +   an enum in the child.  */
> +
> +extern ctf_id_t ctf_lookup_enumerator (ctf_dict_t *, const char *,
> +				       int64_t *enum_value);
> +
>  /* Type lookup functions.  */
>  
>  /* Strip qualifiers and typedefs off a type, returning the base type.
> @@ -669,6 +680,32 @@ extern int ctf_enum_iter (ctf_dict_t *, ctf_id_t, ctf_enum_f *, void *);
>  extern const char *ctf_enum_next (ctf_dict_t *, ctf_id_t, ctf_next_t **,
>  				  int *);
>  
> +/* Return all enumeration constants with a given name in a given dict, similar
> +   to ctf_lookup_enumerator above but capable of returning multiple values.
> +   Enumerators in parent dictionaries are not returned: enumerators in non-root
> +   types *are* returned.  This operation internally iterates over all types in
> +   the dict, so is relatively expensive in large dictionaries.
> +
> +   There is nothing preventing NAME from being changed by the caller in the
> +   middle of iteration: the results might be slightly confusing, but they are
> +   well-defined.  */
> +
> +extern ctf_id_t ctf_lookup_enumerator_next (ctf_dict_t *, const char *name,
> +					    ctf_next_t **, int64_t *enum_value);
> +
> +/* Likewise, across all dicts in an archive (parent first).  The DICT and ERRP
> +   arguments are not optional: without the one you can't tell which dict the
> +   returned type is in, and without the other you can't distinguish real errors

Maybe "without the former... and without the latter..." would be
more clear?

> +   from end-of-iteration.  DICT should be NULL before the first call and is set
> +   to NULL after the last and on error: on successful call it is the caller's
> +   responsibility to ctf_dict_close() it.  The caller should  otherwise pass it

IMO it would be good to also clearly say that on success DICT is set to
point to the dictionary containing the returned type.  (Which has been
implicitly opened and therefore must be closed via ctf_dict_close by the
caller).


> +   back in unchanged (do not reassign it during iteration, just as with the
> +   ctf_next_t iterator itself).  */
> +
> +extern ctf_id_t ctf_arc_lookup_enumerator_next (ctf_archive_t *, const char *name,
> +						ctf_next_t **, int64_t *enum_value,
> +						ctf_dict_t **dict, int *errp);
> +
>  /* Iterate over all types in a dict.  ctf_type_iter_all recurses over all types:
>     ctf_type_iter recurses only over types with user-visible names (for which
>     CTF_ADD_ROOT was passed).  All such types are returned, even if they are
> diff --git a/libctf/ctf-archive.c b/libctf/ctf-archive.c
> index 4744cb7a828..3ba6a3efabb 100644
> --- a/libctf/ctf-archive.c
> +++ b/libctf/ctf-archive.c
> @@ -1011,6 +1011,113 @@ ctf_arc_lookup_symbol_name (ctf_archive_t *wrapper, const char *symname,
>    return ctf_arc_lookup_sym_or_name (wrapper, 0, symname, typep, errp);
>  }
>  
> +/* Return all enumeration constants with a given name across all dicts in an
> +   archive, similar to ctf_lookup_enumerator_next.  The dict is cached, so
> +   opening costs are paid only once, but (unlike ctf_arc_lookup_symbol*
> +   above) the results of the iterations are not cached.  dict and errp are
> +   not optional.  */

Little inconsistency in dict and errp not being capitalized like above.

> +
> +ctf_id_t
> +ctf_arc_lookup_enumerator_next (ctf_archive_t *arc, const char *name,
> +				ctf_next_t **it, int64_t *enum_value,
> +				ctf_dict_t **dict, int *errp)
> +{
> +  ctf_next_t *i = *it;
> +  ctf_id_t type;
> +  int opened_this_time = 0;
> +  int err;
> +
> +  /* We have two nested iterators in here: ctn_next tracks archives, while
> +     within it ctn_next_inner tracks enumerators within an archive.  We
> +     keep track of the dict by simply reusing the passed-in arg: if it's
> +     changed by the caller, the caller will get an ECTF_WRONGFP error,
> +     so this is quite safe and means we don't have to track the arc and fp
> +     simultaneously in the ctf_next_t.  */
> +
> +  if (!i)
> +    {
> +      if ((i = ctf_next_create ()) == NULL)
> +	{
> +	  err = ENOMEM;
> +	  goto err;
> +	}
> +      i->ctn_iter_fun = (void (*) (void)) ctf_arc_lookup_enumerator_next;
> +      i->cu.ctn_arc = arc;
> +      *it = i;
> +    }
> +
> +  if ((void (*) (void)) ctf_arc_lookup_enumerator_next != i->ctn_iter_fun)
> +    {
> +      err = ECTF_NEXT_WRONGFUN;
> +      goto err;
> +    }
> +
> +  if (arc != i->cu.ctn_arc)
> +    {
> +      err = ECTF_NEXT_WRONGFP;
> +      goto err;
> +    }
> +
> +  /* Prevent any earlier end-of-iteration on this dict from confusing the
> +     test below.  */
> +  if (i->ctn_next != NULL)
> +    ctf_set_errno (*dict, 0);
> +
> +  do
> +    {
> +      /* At end of one dict, or not started any iterations yet?
> +	 Traverse to next dict.  If we never returned this dict to the
> +	 caller, close it ourselves: the caller will never see it and cannot
> +	 do so.  */
> +
> +      if (i->ctn_next == NULL || ctf_errno (*dict) == ECTF_NEXT_END)
> +	{
> +	  if (opened_this_time)
> +	    {
> +	      ctf_dict_close (*dict);
> +	      *dict = NULL;
> +	      opened_this_time = 0;
> +	    }
> +
> +	  *dict = ctf_archive_next (arc, &i->ctn_next, NULL, 0, &err);
> +	  if (!*dict)
> +	    goto err;
> +	  opened_this_time = 1;
> +	}
> +
> +      type = ctf_lookup_enumerator_next (*dict, name, &i->ctn_next_inner,
> +					 enum_value);
> +    }
> +  while (type == CTF_ERR && ctf_errno (*dict) == ECTF_NEXT_END);
> +
> +  if (type == CTF_ERR)
> +    {
> +      err = ctf_errno (*dict);
> +      goto err;
> +    }
> +
> +  /* If this dict is being reused from the previous iteration, bump its
> +     refcnt: the caller is going to close it and has no idea that we didn't
> +     open it this time round.  */
> +  if (!opened_this_time)
> +    ctf_ref (*dict);
> +
> +  return type;
> +
> + err:						/* Also ECTF_NEXT_END. */
> +  if (opened_this_time)
> +    {
> +      ctf_dict_close (*dict);
> +      *dict = NULL;
> +    }
> +
> +  ctf_next_destroy (i);
> +  *it = NULL;
> +  if (errp)
> +    *errp = err;
> +  return CTF_ERR;
> +}
> +
>  /* Raw iteration over all CTF files in an archive.  We pass the raw data for all
>     CTF files in turn to the specified callback function.  */
>  static int
> diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
> index 299d981a718..0a362b6b17c 100644
> --- a/libctf/ctf-impl.h
> +++ b/libctf/ctf-impl.h
> @@ -544,13 +544,15 @@ struct ctf_next
>    uint32_t ctn_n;
>  
>    /* Some iterators contain other iterators, in addition to their other
> -     state.  */
> +     state.  We allow for inner and outer iterators, for two-layer nested loops
> +     like those found in ctf_arc_lookup_enumerator_next.  */
>    ctf_next_t *ctn_next;
> +  ctf_next_t *ctn_next_inner;
>  
> -  /* We can save space on this side of things by noting that a dictionary is
> -     either dynamic or not, as a whole, and a given iterator can only iterate
> -     over one kind of thing at once: so we can overlap the DTD and non-DTD
> -     members, and the structure, variable and enum members, etc.  */
> +  /* We can save space on this side of things by noting that a type is either
> +     dynamic or not, as a whole, and a given iterator can only iterate over one
> +     kind of thing at once: so we can overlap the DTD and non-DTD members, and
> +     the structure, variable and enum members, etc.  */
>    union
>    {
>      unsigned char *ctn_vlen;
> diff --git a/libctf/ctf-lookup.c b/libctf/ctf-lookup.c
> index e4d18bec112..8accb2ed99e 100644
> --- a/libctf/ctf-lookup.c
> +++ b/libctf/ctf-lookup.c
> @@ -413,6 +413,151 @@ ctf_lookup_variable (ctf_dict_t *fp, const char *name)
>    return type;
>  }
>  
> +/* Look up a single enumerator by enumeration constant name.  Returns the ID of
> +   the enum it is contained within and optionally its value.  Error out with
> +   ECTF_DUPLICATE if multiple exist (which can happen in some older dicts).  See
> +   ctf_lookup_enumerator_next in that case.  Enumeration constants in non-root
> +   types are not returned, but constants in parents are, if not overridden by
> +   an enum in the child..  */
> +
> +ctf_id_t
> +ctf_lookup_enumerator (ctf_dict_t *fp, const char *name, int64_t *enum_value)
> +{
> +  ctf_id_t type;
> +  int enum_int_value;
> +
> +  if (ctf_dynset_lookup (fp->ctf_conflicting_enums, name))
> +    return (ctf_set_typed_errno (fp, ECTF_DUPLICATE));
> +
> +  /* CTF_K_UNKNOWN suffices for things like enumeration constants that aren't
> +     actually types at all (ending up in the global name table).  */
> +  type = ctf_lookup_by_rawname (fp, CTF_K_UNKNOWN, name);
> +  /* Nonexistent type? It may be in the parent.  */
> +  if (type == 0 && fp->ctf_parent)
> +    {
> +      if ((type = ctf_lookup_enumerator (fp->ctf_parent, name, enum_value)) == 0)
> +	return ctf_set_typed_errno (fp, ECTF_NOENUMNAM);
> +      return type;
> +    }
> +
> +  /* Nothing more to do if this type didn't exist or we don't have to look up
> +     the enum value.  */
> +  if (type == 0)
> +    return ctf_set_typed_errno (fp, ECTF_NOENUMNAM);
> +
> +  if (enum_value == NULL)
> +    return type;
> +
> +  if (ctf_enum_value (fp, type, name, &enum_int_value) < 0)
> +    return CTF_ERR;
> +  *enum_value = enum_int_value;
> +
> +  return type;
> +}
> +
> +/* Return all enumeration constants with a given name in a given dict, similar
> +   to ctf_lookup_enumerator above but capable of returning multiple values.
> +   Enumerators in parent dictionaries are not returned: enumerators in
> +   hidden types *are* returned.  */
> +
> +ctf_id_t
> +ctf_lookup_enumerator_next (ctf_dict_t *fp, const char *name,
> +			    ctf_next_t **it, int64_t *val)
> +{
> +  ctf_next_t *i = *it;
> +  int found = 0;
> +
> +  /* We use ctf_type_next() to iterate across all types, but then traverse each
> +     enumerator found by hand: traversing enumerators is very easy, and it would
> +     probably be more confusing to use two nested iterators than to do it this
> +     way.  We use ctn_next to work over enums, then ctn_en and ctn_n to work
> +     over enumerators within each enum.  */
> +  if (!i)
> +    {
> +      if ((i = ctf_next_create ()) == NULL)
> +	return ctf_set_typed_errno (fp, ENOMEM);
> +
> +      i->cu.ctn_fp = fp;
> +      i->ctn_iter_fun = (void (*) (void)) ctf_lookup_enumerator_next;
> +      i->ctn_increment = 0;
> +      i->ctn_tp = NULL;
> +      i->u.ctn_en = NULL;
> +      i->ctn_n = 0;
> +      *it = i;
> +    }
> +
> +  if ((void (*) (void)) ctf_lookup_enumerator_next != i->ctn_iter_fun)
> +    return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFUN));
> +
> +  if (fp != i->cu.ctn_fp)
> +    return (ctf_set_typed_errno (fp, ECTF_NEXT_WRONGFP));
> +
> +  do
> +    {
> +      const char *this_name;
> +
> +      /* At end of enum? Traverse to next one, if any are left.  */
> +
> +      if (i->u.ctn_en == NULL || i->ctn_n == 0)
> +	{
> +	  const ctf_type_t *tp;
> +	  ctf_dtdef_t *dtd;
> +
> +	  do
> +	    i->ctn_type = ctf_type_next (i->cu.ctn_fp, &i->ctn_next, NULL, 1);
> +	  while (i->ctn_type != CTF_ERR
> +		 && ctf_type_kind_unsliced (i->cu.ctn_fp, i->ctn_type)
> +		 != CTF_K_ENUM);
> +
> +	  if (i->ctn_type == CTF_ERR)
> +	    {
> +	      /* Conveniently, when the iterator over all types is done, so is the
> +		 iteration as a whole: so we can just pass all errors from the
> +		 internal iterator straight back out..  */
> +	      ctf_next_destroy (i);
> +	      *it = NULL;
> +	      return CTF_ERR;			/* errno is set for us.  */
> +	    }
> +
> +	  if ((tp = ctf_lookup_by_id (&fp, i->ctn_type)) == NULL)
> +	    return CTF_ERR;			/* errno is set for us.  */
> +	  i->ctn_n = LCTF_INFO_VLEN (fp, tp->ctt_info);
> +
> +	  dtd = ctf_dynamic_type (fp, i->ctn_type);
> +
> +	  if (dtd == NULL)
> +	    {
> +	      (void) ctf_get_ctt_size (fp, tp, NULL, &i->ctn_increment);
> +	      i->u.ctn_en = (const ctf_enum_t *) ((uintptr_t) tp +
> +						  i->ctn_increment);
> +	    }
> +	  else
> +	    i->u.ctn_en = (const ctf_enum_t *) dtd->dtd_vlen;
> +	}
> +
> +      this_name = ctf_strptr (fp, i->u.ctn_en->cte_name);
> +
> +      i->ctn_n--;
> +
> +      if (strcmp (name, this_name) == 0)
> +	{
> +	  if (val)
> +	    *val = i->u.ctn_en->cte_value;
> +	  found = 1;
> +
> +	  /* Constant found in this enum: try the next one.  (Constant names
> +	     cannot be duplicated within a given enum.)  */
> +
> +	  i->ctn_n = 0;
> +	}
> +
> +      i->u.ctn_en++;
> +    }
> +  while (!found);
> +
> +  return i->ctn_type;
> +}
> +
>  typedef struct ctf_symidx_sort_arg_cb
>  {
>    ctf_dict_t *fp;
> diff --git a/libctf/ctf-util.c b/libctf/ctf-util.c
> index 3ea6de9e86f..f75b1bfb01a 100644
> --- a/libctf/ctf-util.c
> +++ b/libctf/ctf-util.c
> @@ -262,6 +262,8 @@ ctf_next_destroy (ctf_next_t *i)
>      free (i->u.ctn_sorted_hkv);
>    if (i->ctn_next)
>      ctf_next_destroy (i->ctn_next);
> +  if (i->ctn_next_inner)
> +    ctf_next_destroy (i->ctn_next_inner);
>    free (i);
>  }
>  
> @@ -276,16 +278,35 @@ ctf_next_copy (ctf_next_t *i)
>      return NULL;
>    memcpy (i2, i, sizeof (struct ctf_next));
>  
> +  if (i2->ctn_next)
> +    {
> +      i2->ctn_next = ctf_next_copy (i2->ctn_next);
> +      if (i2->ctn_next == NULL)
> +	goto err_next;
> +    }
> +
> +  if (i2->ctn_next_inner)
> +    {
> +      i2->ctn_next_inner = ctf_next_copy (i2->ctn_next_inner);
> +      if (i2->ctn_next_inner == NULL)
> +	goto err_next_inner;
> +    }
> +
>    if (i2->ctn_iter_fun == (void (*) (void)) ctf_dynhash_next_sorted)
>      {
>        size_t els = ctf_dynhash_elements ((ctf_dynhash_t *) i->cu.ctn_h);
>        if ((i2->u.ctn_sorted_hkv = calloc (els, sizeof (ctf_next_hkv_t))) == NULL)
> -	{
> -	  free (i2);
> -	  return NULL;
> -	}
> +	goto err_sorted_hkv;
>        memcpy (i2->u.ctn_sorted_hkv, i->u.ctn_sorted_hkv,
>  	      els * sizeof (ctf_next_hkv_t));
>      }
>    return i2;
> +
> + err_sorted_hkv:
> +  ctf_next_destroy (i2->ctn_next_inner);
> + err_next_inner:
> +  ctf_next_destroy (i2->ctn_next);
> + err_next:
> +  ctf_next_destroy (i2);
> +  return NULL;
>  }
> diff --git a/libctf/libctf.ver b/libctf/libctf.ver
> index 6e7345be66b..e6c31ff37aa 100644
> --- a/libctf/libctf.ver
> +++ b/libctf/libctf.ver
> @@ -198,3 +198,10 @@ LIBCTF_1.2 {
>  	ctf_arc_lookup_symbol_name;
>  	ctf_add_unknown;
>  } LIBCTF_1.1;
> +
> +LIBCTF_1.3 {
> +    global:
> +	ctf_lookup_enumerator;
> +	ctf_lookup_enumerator_next;
> +	ctf_arc_lookup_enumerator_next;
> +} LIBCTF_1.2;
> diff --git a/libctf/testsuite/libctf-lookup/enum-ctf-2.c b/libctf/testsuite/libctf-lookup/enum-ctf-2.c
> new file mode 100644
> index 00000000000..39c9865e528
> --- /dev/null
> +++ b/libctf/testsuite/libctf-lookup/enum-ctf-2.c
> @@ -0,0 +1,6 @@
> +enum e { ENUMSAMPLE_1 = 6, ENUMSAMPLE_2 = 7 };
> +
> +enum ie2 { IENUMSAMPLE2_1 = -10, IENUMSAMPLE2_2 };
> +
> +enum e baz;
> +enum ie2 quux;
> diff --git a/libctf/testsuite/libctf-lookup/enumerator-iteration.c b/libctf/testsuite/libctf-lookup/enumerator-iteration.c
> new file mode 100644
> index 00000000000..e46dad6dc70
> --- /dev/null
> +++ b/libctf/testsuite/libctf-lookup/enumerator-iteration.c
> @@ -0,0 +1,168 @@
> +/* Test enumerator iteration and querying.  Because
> +   ctf_arc_lookup_enumerator_next uses ctf_lookup_enumerator_next internally, we
> +   only need to test the former.  */
> +
> +#include "config.h"
> +#include <ctf-api.h>
> +#include <inttypes.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +static void
> +print_constants (ctf_archive_t *ctf, const char *name)
> +{
> +  ctf_next_t *i = NULL;
> +  int err;
> +  ctf_dict_t *fp;
> +  ctf_id_t type;
> +  int64_t val;
> +
> +  while ((type = ctf_arc_lookup_enumerator_next (ctf, name, &i,
> +						 &val, &fp, &err)) != CTF_ERR)
> +    {
> +      char *foo;
> +
> +      printf ("%s in %s has value %i\n", name,
> +	      foo = ctf_type_aname (fp, type), val);
> +      free (foo);
> +
> +      ctf_dict_close (fp);
> +    }
> +  if (err != ECTF_NEXT_END)
> +    {
> +      fprintf (stderr, "iteration failed: %s\n", ctf_errmsg (err));
> +      exit (1);
> +    }
> +}
> +
> +int
> +main (int argc, char *argv[])
> +{
> +  ctf_archive_t *ctf;
> +  ctf_dict_t *fp;
> +  int err;
> +  ctf_id_t type;
> +  ctf_next_t *i = NULL;
> +  const char *name;
> +  int64_t val;
> +  int counter = 0;
> +
> +  if (argc != 2)
> +    {
> +      fprintf (stderr, "Syntax: %s PROGRAM\n", argv[0]);
> +      exit(1);
> +    }
> +
> +  if ((ctf = ctf_open (argv[1], NULL, &err)) == NULL)
> +    goto open_err;
> +
> +  /* Look for all instances of ENUMSAMPLE2_1, and add some new enums to all
> +     dicts found, to test dynamic enum iteration as well as static.
> +
> +     Add two enums with a different name and constants to any that should
> +     already be there (one hidden), and one with the same constants, but hidden,
> +     to test ctf_lookup_enumerator_next()'s multiple-lookup functionality and
> +     ctf_lookup_enumerator() in the presence of hidden types.  */
> +
> +  printf ("First iteration: addition of enums.\n");
> +  while ((type = ctf_arc_lookup_enumerator_next (ctf, "IENUMSAMPLE2_2", &i,
> +						 &val, &fp, &err)) != CTF_ERR)
> +    {
> +      char *foo;
> +
> +      printf ("IENUMSAMPLE2_2 in %s has value %i\n",
> +	      foo = ctf_type_aname (fp, type), val);
> +      free (foo);
> +
> +      if ((type = ctf_add_enum (fp, CTF_ADD_ROOT, "ie3")) == CTF_ERR)
> +	goto enum_add_err;
> +
> +      if (ctf_add_enumerator (fp, type, "DYNADD", counter += 10) < 0)
> +	goto enumerator_add_err;
> +      if (ctf_add_enumerator (fp, type, "DYNADD2", counter += 10) < 0)
> +	goto enumerator_add_err;
> +
> +      /* Make sure that overlapping enumerator addition fails as it should.  */
> +
> +      if (ctf_add_enumerator (fp, type, "IENUMSAMPLE2_2", 666) >= 0
> +	  || ctf_errno (fp) != ECTF_DUPLICATE)
> +	fprintf (stderr, "Duplicate enumerator addition did not fail as it ought to\n");
> +
> +      if ((type = ctf_add_enum (fp, CTF_ADD_NONROOT, "ie4_hidden")) == CTF_ERR)
> +	goto enum_add_err;
> +
> +      if (ctf_add_enumerator (fp, type, "DYNADD3", counter += 10) < 0)
> +	goto enumerator_add_err;
> +      if (ctf_add_enumerator (fp, type, "DYNADD4", counter += 10) < 0)
> +	goto enumerator_add_err;
> +
> +      if ((type = ctf_add_enum (fp, CTF_ADD_NONROOT, "ie3_hidden")) == CTF_ERR)
> +	goto enum_add_err;
> +
> +      if (ctf_add_enumerator (fp, type, "DYNADD", counter += 10) < 0)
> +	goto enumerator_add_err;
> +      if (ctf_add_enumerator (fp, type, "DYNADD2", counter += 10) < 0)
> +	goto enumerator_add_err;
> +
> +      /* Look them up via ctf_lookup_enumerator.  */
> +
> +      if (ctf_lookup_enumerator (fp, "DYNADD", &val) == CTF_ERR)
> +	goto enumerator_lookup_err;
> +      printf ("direct lookup: DYNADD value: %i\n", (int) val);
> +
> +      if ((type = ctf_lookup_enumerator (fp, "DYNADD3", &val) != CTF_ERR) ||
> +	  ctf_errno (fp) != ECTF_NOENUMNAM)
> +	{
> +	  if (type != CTF_ERR)
> +	    {
> +	      char *foo;
> +	      printf ("direct lookup: hidden lookup did not return ECTF_NOENUMNAM but rather %i in %s\n",
> +		      val, foo = ctf_type_aname (fp, type));
> +	      free (foo);
> +	    }
> +	  else
> +	    printf ("direct lookup: hidden lookup did not return ECTF_NOENUMNAM but rather %s\n",
> +		    ctf_errno (fp));
> +	}
> +
> +      ctf_dict_close (fp);
> +    }
> +  if (err != ECTF_NEXT_END)
> +    {
> +      fprintf (stderr, "iteration failed: %s\n", ctf_errmsg (err));
> +      return 1;
> +    }
> +
> +  /* Look for (and print out) some enumeration constants.  */
> +
> +  printf ("Second iteration: printing of enums.\n");
> +
> +  print_constants (ctf, "ENUMSAMPLE_1");
> +  print_constants (ctf, "IENUMSAMPLE_1");
> +  print_constants (ctf, "ENUMSAMPLE_2");
> +  print_constants (ctf, "DYNADD");
> +  print_constants (ctf, "DYNADD3");
> +
> +  ctf_close (ctf);
> +
> +  printf ("All done.\n");
> +
> +  return 0;
> +
> + open_err:
> +  fprintf (stderr, "%s: cannot open: %s\n", argv[0], ctf_errmsg (err));
> +  return 1;
> + enum_add_err:
> +  fprintf (stderr, "Cannot add enum to dict \"%s\": %s\n",
> +	   ctf_cuname (fp) ? ctf_cuname (fp) : "(null: parent)", ctf_errmsg (ctf_errno (fp)));
> +  return 1;
> + enumerator_add_err:
> +  fprintf (stderr, "Cannot add enumerator to dict \"%s\": %s\n",
> +	   ctf_cuname (fp) ? ctf_cuname (fp) : "(null: parent)", ctf_errmsg (ctf_errno (fp)));
> +  return 1;
> + enumerator_lookup_err:
> +  fprintf (stderr, "Cannot look up enumerator in dict \"%s\": %s\n",
> +	   ctf_cuname (fp) ? ctf_cuname (fp) : "(null: parent)", ctf_errmsg (ctf_errno (fp)));
> +  return 1;
> +}
> diff --git a/libctf/testsuite/libctf-lookup/enumerator-iteration.lk b/libctf/testsuite/libctf-lookup/enumerator-iteration.lk
> new file mode 100644
> index 00000000000..0c3cbfbf15f
> --- /dev/null
> +++ b/libctf/testsuite/libctf-lookup/enumerator-iteration.lk
> @@ -0,0 +1,17 @@
> +# lookup: enumerator-iteration.c
> +# source: enum-ctf.c
> +# source: enum-ctf-2.c
> +# link: on
> +First iteration: addition of enums.
> +IENUMSAMPLE2_2 in enum ie2 has value -9
> +direct lookup: DYNADD value: 10
> +Second iteration: printing of enums.
> +ENUMSAMPLE_1 in enum e has value 6
> +ENUMSAMPLE_1 in enum e has value 0
> +IENUMSAMPLE_1 in enum ie has value -10
> +ENUMSAMPLE_2 in enum e has value 7
> +ENUMSAMPLE_2 in enum e has value 1
> +DYNADD in enum ie3 has value 10
> +DYNADD in enum ie3_hidden has value 50
> +DYNADD3 in enum ie4_hidden has value 30
> +All done.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH libctf 00/11] enumerator query API, plus some bugfixes
  2024-06-14 14:14 ` [PATCH libctf 00/11] enumerator query API, plus some bugfixes Tom Tromey
@ 2024-06-14 17:57   ` Nick Alcock
  0 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-14 17:57 UTC (permalink / raw)
  To: Tom Tromey; +Cc: binutils, stephen.brennan

On 14 Jun 2024, Tom Tromey uttered the following:

>>>>>> "Nick" == Nick Alcock <nick.alcock@oracle.com> writes:
>
> Nick>   We also improve the deduplicator to consider
> Nick> enums to be conflicting types if they are distinct but have enumeration
> Nick> constants in common, and adapt ctf_add_enumerator() to prohibit the
> Nick> addition of such enumerators (which are invalid C anyway, so anything
> Nick> trying to add them is probably buggy).
>
> I don't know libctf at all but  it seems to me that  a.c can use
> enum { X = 1 } and b.c can use enum { X = 2 } without any problem.

Yes indeed -- but those are *distinct translation units*, which roughly
correspond to one CTF child dictionary.


It looks like I need to bloviate a bit about what the end goal of
deduplicating CTF types is (and what we expect CTF types to look like
from the user's perspective). I'll stuff in some crude kinda-like
objdump --ctf output to demonstrate.


The compiler initially delivers us a pile of types, one in each TU,
nearly all more or less duplicated (from headers), some (opaque types)
"cut off" and not traversable because their definitions are not visible.
Here's a couple of tiny TUs as an example:

A.c: type 0x1: enum not_conflicting { X, Y, Z };
     type 0x2: struct not_conflicting; /* opaque */
     type 0x3: struct A -> 0x2: struct visible_in_B foo;
                           0x1: enum not_conflicting bar;
     type 0x4: enum should_be_conflicting { a = 1, b }

B.c: type 0x1: int
     type 0x2: enum not_conflicting { X, Y, Z };
     type 0x3: struct visible_in_B -> 0x1: int bar;
                                      0x2: enum not_conflicting baz;
                                      0x4: enum yes_should_be_conflicting quux;
     type 0x3: enum yes_should_be_conflicting { a, b, c }

(Hopefully the type names make things a bit clear here.)

The duplication and opacity of some types are not desirable properties,
so the deduplicator tries to fix that up, while keeping the set of types
visible from each TU's dict at least as big as it was before (adding
more types is fine, taking them away is not).

After deduplication, we want to end up with a situation in which we have
one child dictionary per translation unit (as before, but with empty
ones removed), where looking at a per-CU dict (plus the shared parent,
which is included in all the children) gives you every type that is in
scope in that child plus every other type we can pull out of the program
which doesn't conflict with those types, for ease of wandering over type
graphs in a debugger (because all the opaque types have been
"de-opacified").

But we want the types visible from a given child dict to still be more
or less what C considers a valid type system! It'll just have more types
visible in it than the original TU (without using any more space to do
so, because most of those types have been moved into the parent dict by
the deduplicator and thus will be visible from all child TUs). So,
before this change, we'd turn the above TUs into this, all in the shared
dict with no per-TU child dicts:

.ctf: type 0x1: enum not_conflicting { X, Y, Z };
      type 0x2: int
      type 0x3: struct visible_in_B -> 0x2: int bar;
                                       0x1: enum not_conflicting baz;
      type 0x4: struct A -> 0x3: struct visible_in_B foo;
                            0x1: enum not_conflicting bar;
                            0x6: enum yes_should_be_conflicting quux;
      type 0x5: enum should_be_conflicting { a = 1, b }
-->   type 0x6: enum yes_should_be_conflicting { a, b, c }

Clearly it is impossible to write one C translation unit that contains
the last two enums at once!

(There are no child dicts here, because child dicts that are empty are
simply dropped: users are expected to just look in .ctf directly in that
case.)

After this change, ctf_add_enumerator() is probibited from adding
clashing enum identifiers within *one single TU's dict*, so this now
corresponds with the C rules where before it did not. You can still add
enum { X = 1 } in one TU and enum { X = 2 } in another one just fine.
You can't add both of them in the TU corresponding to a.c, and if you
try you now get an error.

The related conflicting-type addition means that the deduplicator will
no longer look at enum { X = 1 } and enum { X = 2 } in two of its input
TUs and say, oh, those can both get pushed into the shared .ctf dict,
they don't conflict at all!

Oh yes they do, and if we hadn't fixed that the deduplicator would have
tripped the ctf_add_enumerator() error above: so now the enum that is
referenced most often by other types will end up in the shared dict,
while the other stays in per-translation-unit child dicts specific to
the TUs in which it appears. So in the case above you'd get this:

.ctf: type 0x1: enum not_conflicting { X, Y, Z };
      type 0x2: int
      type 0x3: struct visible_in_B -> 0x2: int bar;
                                       0x1: enum not_conflicting baz;
      type 0x4: struct A -> 0x3: struct visible_in_B foo;
                            0x1: enum not_conflicting bar;
                            0x5: enum yes_should_be_conflicting quux;
      type 0x5: enum yes_should_be_conflicting { a, b, c }

A.c: type 0x80000001: enum should_be_conflicting { a = 1, b }

After this change, .ctf is entirely valid as a TU you can write in C if
you like. A.c is more contentious: it's only valid as a TU you can write
in C if you screen out yes_should_be_conflicting from the parent, but
still you can only find clashes there if you start comparing types from
the parent and the child and hunting for conflicts that way, and they're
easily resolved by always preferring the child when you look things up
in the child, which is what the enumerator lookup functions added here
do (if you're looking at A.c and you look up enumerand X, it'll return
type 0x1: enum not_conflicting, value 0; if you look up enumerand a,
it'll return type 0x80000001: enum should_be_conflicting, value 1.)

(This "most referenced" popularity contest thing is something applied to
all types already. We do it because parent types cannot refer to child
types; so if a type B refers to a type A that is in a child dict, B
*also* has to be duplicated into that child dict, and so do all types
that reference *that*, and so on. If we can keep the most-heavily-
referenced types in the shared parent dict, this keeps the number of
types in child dicts down, which has a huge impact on CTF space usage.)

btw, this change has almost no impact on the .ctf section sizes of
anything I've been able to test, a few percent at most. Conflicting enum
identifiers do exist, but clearly they're not too wildly common.

-- 
NULL && (void)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 11/11] libctf, include: new functions for looking up enumerators
  2024-06-14 17:51   ` David Faust
@ 2024-06-17 13:43     ` Nick Alcock
  0 siblings, 0 replies; 16+ messages in thread
From: Nick Alcock @ 2024-06-17 13:43 UTC (permalink / raw)
  To: David Faust; +Cc: binutils, stephen.brennan

On 14 Jun 2024, David Faust uttered the following:

> On 6/13/24 11:54, Nick Alcock wrote:
>> Three new functions for looking up the enum type containing a given
>> enumeration constant, and optionally that constant's value.
>
> The new functions make sense to me, though I don't have much experience
> as a user of libctf or it's APIs.

The APIs I got help on from Stephen Brennan and Indu Bhagat earlier. I'm
fairly confident that at least *one* user will like them. :)

> Just a few small comments/nits below.

Thanks! All fixed as you suggested.

>> +   from end-of-iteration.  DICT should be NULL before the first call and is set
>> +   to NULL after the last and on error: on successful call it is the caller's
>> +   responsibility to ctf_dict_close() it.  The caller should  otherwise pass it
>
> IMO it would be good to also clearly say that on success DICT is set to
> point to the dictionary containing the returned type.  (Which has been
> implicitly opened and therefore must be closed via ctf_dict_close by the
> caller).

True! I, uh, assumed that would go without saying, which in
documentation is probably a bad stance :)

The "pass in DICT unchanged, we actually use it to store state" thing is
a bit weird but seems more or less harmless (when I get dict leaks, it's
not because of that, it's because I forget to close them on error
paths), and without it I'd need to add the ability for ctf_next_t
iterators to hold archives and dicts at the same time, which does get
quite complex. (I tried, and honestly the _next iterators are complex
enough to write as it is!)

-- 
NULL && (void)

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-06-17 13:43 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-13 18:54 [PATCH libctf 00/11] enumerator query API, plus some bugfixes Nick Alcock
2024-06-13 18:54 ` [PATCH 01/11] libctf: strtab corruption when strings are added to ctf_open()ed dicts Nick Alcock
2024-06-13 18:54 ` [PATCH 02/11] include: fix libctf ECTF_NOENUMNAM error message Nick Alcock
2024-06-13 18:54 ` [PATCH 03/11] libctf: doc: fix ctf_stype_t typedef string in spec Nick Alcock
2024-06-13 18:54 ` [PATCH 04/11] libctf: dedup: enums with overlapping enumerators are conflicting Nick Alcock
2024-06-13 18:54 ` [PATCH 05/11] libctf: don't leak enums if ctf_add_type fails Nick Alcock
2024-06-13 18:54 ` [PATCH 06/11] libctf: fix dict leak on archive-wide symbol lookup error path Nick Alcock
2024-06-13 18:54 ` [PATCH 07/11] libctf: suppress spurious failure of malloc-counting tests under valgrind Nick Alcock
2024-06-13 18:54 ` [PATCH 08/11] libctf: prohibit addition of enums with overlapping enumerator constants Nick Alcock
2024-06-13 18:54 ` [PATCH 09/11] libctf: make the ctf_next ctn_fp non-const Nick Alcock
2024-06-13 18:54 ` [PATCH 10/11] include: libctf: comment improvements Nick Alcock
2024-06-13 18:54 ` [PATCH 11/11] libctf, include: new functions for looking up enumerators Nick Alcock
2024-06-14 17:51   ` David Faust
2024-06-17 13:43     ` Nick Alcock
2024-06-14 14:14 ` [PATCH libctf 00/11] enumerator query API, plus some bugfixes Tom Tromey
2024-06-14 17:57   ` Nick Alcock

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).