public inbox for binutils@sourceware.org
From: Nick Alcock <nick.alcock@oracle.com>
To: binutils@sourceware.org
Subject: [PATCH 04/22] libctf: fix name lookup in dicts containing base-type bitfields
Date: Wed, 17 Apr 2024 21:20:00 +0100
Message-ID: <20240417202018.34966-5-nick.alcock@oracle.com>
In-Reply-To: <20240417202018.34966-1-nick.alcock@oracle.com>

The name lookup code was meant to return a non-bitfield basic type for a
given name whenever one exists, and to fall back to a bitfield type with
that name only when none does.  Unfortunately, the code as written only
does this if the base type has a type ID higher than all bitfield types
with that name, which is most unlikely (the opposite is almost always
the case).
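
To make the failure concrete, here is a minimal standalone sketch of the
rule the removed code effectively applied: the first type with a name is
recorded, and every later root-visible type with that name overwrites it,
whatever its encoding.  The struct and function names are hypothetical and
are not part of libctf; only the decision logic is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one same-named basic type: its bit-width,
   bit offset, and root-visibility flag.  Not a libctf structure.  */
struct old_named_type
{
  unsigned int bits;
  unsigned int offset;
  int isroot;
};

/* Return the index of the type the name maps to after all N same-named
   types have been seen in type-ID order, under the old rule: record the
   first type, then let any later root-visible type replace it.  */
static size_t
old_name_winner (const struct old_named_type *types, size_t n)
{
  size_t winner = 0;
  size_t i;

  assert (n > 0);
  for (i = 1; i < n; i++)
    if (types[i].isroot)
      winner = i;
  return winner;
}
```

Under this rule a 32-bit zero-offset "int" followed by two root-visible
bitfields named "int" resolves to the last bitfield, which is the case the
commit message describes as almost always happening.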

Adjust it so that what ends up in the name table is the widest
zero-offset type with a given name, if any such type exists, and failing
that the first type with that name we see, no matter its offset.  (We
don't define *which* bitfield type you get, after all, so we might as
well just stuff in the first we find.)
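
The replacement rule can be sketched outside libctf as follows.  This is
an illustration only: the struct stands in for the cte_bits and cte_offset
fields of ctf_encoding_t, the function names are hypothetical, every type
is assumed to be root-visible, and the case where the existing type's
encoding cannot be read is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two ctf_encoding_t fields the rule
   looks at.  */
struct enc
{
  unsigned int bits;
  unsigned int offset;
};

/* Nonzero if CAND should displace CUR in the name table: CAND must be
   zero-offset, and CUR must either be offset or strictly narrower.  */
static int
displaces (const struct enc *cur, const struct enc *cand)
{
  return cand->offset == 0
    && (cur->offset != 0 || cur->bits < cand->bits);
}

/* Return the index of the type the name maps to after all N same-named
   types have been seen in type-ID order.  The first type is always
   recorded; later ones replace it only if displaces() says so.  */
static size_t
name_winner (const struct enc *types, size_t n)
{
  size_t winner = 0;
  size_t i;

  assert (n > 0);
  for (i = 1; i < n; i++)
    if (displaces (&types[winner], &types[i]))
      winner = i;
  return winner;
}
```

Because a native type is at least as wide as any bitfield based on it,
repeatedly keeping the widest zero-offset candidate converges on the
native type wherever it sits in ID order.  If no zero-offset type exists,
nothing displaces the first entry.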

Reported by Stephen Brennan <stephen.brennan@oracle.com>.

libctf/

	* ctf-open.c (init_types): Modify to allow some lookups during open;
	detect bitfield name reuse and prefer less bitfieldy types.
	* testsuite/libctf-writable/libctf-bitfield-name-lookup.*: New test.
---
 libctf/ctf-open.c                             |  73 ++++++----
 .../libctf-bitfield-name-lookup.c             | 136 ++++++++++++++++++
 .../libctf-bitfield-name-lookup.lk            |   1 +
 3 files changed, 186 insertions(+), 24 deletions(-)
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
 create mode 100644 libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk

diff --git a/libctf/ctf-open.c b/libctf/ctf-open.c
index 2945228ff2a..87b0f74367a 100644
--- a/libctf/ctf-open.c
+++ b/libctf/ctf-open.c
@@ -685,6 +685,7 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   const ctf_type_t *tp;
   uint32_t id;
   uint32_t *xp;
+  unsigned long typemax = 0;
 
   /* We determine whether the dict is a child or a parent based on the value of
      cth_parname.  */
@@ -708,7 +709,7 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   /* We make two passes through the entire type section.  In this first
      pass, we count the number of each type and the total number of types.  */
 
-  for (tp = tbuf; tp < tend; fp->ctf_typemax++)
+  for (tp = tbuf; tp < tend; typemax++)
     {
       unsigned short kind = LCTF_INFO_KIND (fp, tp->ctt_info);
       unsigned long vlen = LCTF_INFO_VLEN (fp, tp->ctt_info);
@@ -769,8 +770,8 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 				   ctf_hash_eq_string, NULL, NULL)) == NULL)
     return ENOMEM;
 
-  fp->ctf_txlate = malloc (sizeof (uint32_t) * (fp->ctf_typemax + 1));
-  fp->ctf_ptrtab_len = fp->ctf_typemax + 1;
+  fp->ctf_txlate = malloc (sizeof (uint32_t) * (typemax + 1));
+  fp->ctf_ptrtab_len = typemax + 1;
   fp->ctf_ptrtab = malloc (sizeof (uint32_t) * fp->ctf_ptrtab_len);
 
   if (fp->ctf_txlate == NULL || fp->ctf_ptrtab == NULL)
@@ -779,13 +780,17 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
   xp = fp->ctf_txlate;
   *xp++ = 0;			/* Type id 0 is used as a sentinel value.  */
 
-  memset (fp->ctf_txlate, 0, sizeof (uint32_t) * (fp->ctf_typemax + 1));
-  memset (fp->ctf_ptrtab, 0, sizeof (uint32_t) * (fp->ctf_typemax + 1));
+  memset (fp->ctf_txlate, 0, sizeof (uint32_t) * (typemax + 1));
+  memset (fp->ctf_ptrtab, 0, sizeof (uint32_t) * (typemax + 1));
 
   /* In the second pass through the types, we fill in each entry of the
-     type and pointer tables and add names to the appropriate hashes.  */
+     type and pointer tables and add names to the appropriate hashes.
 
-  for (id = 1, tp = tbuf; tp < tend; xp++, id++)
+     Bump ctf_typemax as we go, but keep it one higher than normal, so that
+     the type being read in is considered a valid type and it is at least
+     barely possible to run simple lookups on it.  */
+
+  for (id = 1, fp->ctf_typemax = 1, tp = tbuf; tp < tend; xp++, id++, fp->ctf_typemax++)
     {
       unsigned short kind = LCTF_INFO_KIND (fp, tp->ctt_info);
       unsigned short isroot = LCTF_INFO_ISROOT (fp, tp->ctt_info);
@@ -799,27 +804,47 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
       /* Cannot fail: shielded by call in loop above.  */
       vbytes = LCTF_VBYTES (fp, kind, size, vlen);
 
+      *xp = (uint32_t) ((uintptr_t) tp - (uintptr_t) fp->ctf_buf);
+
       switch (kind)
 	{
 	case CTF_K_UNKNOWN:
 	case CTF_K_INTEGER:
 	case CTF_K_FLOAT:
-	  /* Names are reused by bit-fields, which are differentiated by their
-	     encodings, and so typically we'd record only the first instance of
-	     a given intrinsic.  However, we replace an existing type with a
-	     root-visible version so that we can be sure to find it when
-	     checking for conflicting definitions in ctf_add_type().  */
+	  {
+	    ctf_id_t existing;
+	    ctf_encoding_t existing_en;
+	    ctf_encoding_t this_en;
 
-	  if (((ctf_dynhash_lookup_type (fp->ctf_names, name)) == 0)
-	      || isroot)
-	    {
-	      err = ctf_dynhash_insert_type (fp, fp->ctf_names,
-					  LCTF_INDEX_TO_TYPE (fp, id, child),
-					  tp->ctt_name);
-	      if (err != 0)
-		return err;
-	    }
-	  break;
+	    if (!isroot)
+	      break;
+
+	    /* Names are reused by bitfields, which are differentiated by
+	       their encodings.  So check for the type already existing, and
+	       iff the new type is a root-visible non-bitfield, replace the
+	       old one.  It's a little hard to figure out whether a type is
+	       a non-bitfield without already knowing that type's native
+	       width, but we can converge on it by replacing an existing
+	       type as long as the new type is zero-offset and has a
+	       bit-width wider than the existing one, since the native type
+	       must necessarily have a bit-width at least as wide as any
+	       bitfield based on it.  */
+
+	    if (((existing = ctf_dynhash_lookup_type (fp->ctf_names, name)) == 0)
+		|| ctf_type_encoding (fp, existing, &existing_en) != 0
+		|| (ctf_type_encoding (fp, LCTF_INDEX_TO_TYPE (fp, id, child), &this_en) == 0
+		    && this_en.cte_offset == 0
+		    && (existing_en.cte_offset != 0
+			|| existing_en.cte_bits < this_en.cte_bits)))
+	      {
+		err = ctf_dynhash_insert_type (fp, fp->ctf_names,
+					       LCTF_INDEX_TO_TYPE (fp, id, child),
+					       tp->ctt_name);
+		if (err != 0)
+		  return err;
+	      }
+	    break;
+	  }
 
 	  /* These kinds have no name, so do not need interning into any
 	     hashtables.  */
@@ -938,10 +963,10 @@ init_types (ctf_dict_t *fp, ctf_header_t *cth)
 			_("init_static_types(): unhandled CTF kind: %x"), kind);
 	  return ECTF_CORRUPT;
 	}
-
-      *xp = (uint32_t) ((uintptr_t) tp - (uintptr_t) fp->ctf_buf);
       tp = (ctf_type_t *) ((uintptr_t) tp + increment + vbytes);
     }
+  fp->ctf_typemax--;
+  assert (fp->ctf_typemax == typemax);
 
   ctf_dprintf ("%lu total types processed\n", fp->ctf_typemax);
   ctf_dprintf ("%zu enum names hashed\n",
diff --git a/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
new file mode 100644
index 00000000000..1554ca2d626
--- /dev/null
+++ b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.c
@@ -0,0 +1,136 @@
+/* Verify that name lookup of basic types including old-style bitfield types
+   yields the non-bitfield.  */
+
+#include <ctf-api.h>
+#include <inttypes.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+int bitfieldery (int count, int up, int pos)
+{
+  unsigned char *ctf_written;
+  size_t size;
+  ctf_dict_t *dict;
+  const char *err = "opening";
+  int open_err;
+  ctf_encoding_t en;
+  ctf_encoding_t basic;
+  ctf_id_t type;
+  size_t i;
+
+  /* This is rendered annoying by two factors: old-style bitfields are not
+     generated by current compilers, so we need to build a suitable dict by
+     hand; and this is an open-time bug, so we need to serialize it and then
+     load it back in again.  */
+
+  if ((dict = ctf_create (&open_err)) == NULL)
+    goto open_err;
+
+  /* Populate with a pile of bitfields of increasing/decreasing size, with a
+     single basic type dropped in at position POS.  Oscillate the offset
+     between 0 and 1.  */
+
+  basic.cte_bits = count;
+  basic.cte_offset = 0;
+  basic.cte_format = CTF_INT_SIGNED;
+
+  en.cte_bits = up ? 0 : count - 1;
+  en.cte_offset = 0;
+  en.cte_format = CTF_INT_SIGNED;
+
+  for (i = 0; i < count; i++)
+    {
+      if (i == pos)
+	{
+	  err = "populating with basic type";
+	  if (ctf_add_integer (dict, CTF_ADD_ROOT, "int", &basic) < 0)
+	    goto err;
+	}
+
+      err = "populating";
+      if (ctf_add_integer (dict, CTF_ADD_ROOT, "int", &en) < 0)
+	goto err;
+
+      en.cte_bits += up ? 1 : -1;
+      if (en.cte_offset == 0)
+	en.cte_offset = 1;
+      else
+	en.cte_offset = 0;
+    }
+
+  /* Possibly populate with at-end basic type.  */
+  if (i == pos)
+    {
+      err = "populating with basic type";
+      if (ctf_add_integer (dict, CTF_ADD_ROOT, "int", &basic) < 0)
+	goto err;
+    }
+
+  err = "writing";
+  if ((ctf_written = ctf_write_mem (dict, &size, 4096)) == NULL)
+    goto err;
+  ctf_dict_close (dict);
+
+  err = "opening";
+  if ((dict = ctf_simple_open ((char *) ctf_written, size, NULL, 0,
+			       0, NULL, 0, &open_err)) == NULL)
+    goto open_err;
+
+  err = "looking up";
+  if ((type = ctf_lookup_by_name (dict, "int")) == CTF_ERR)
+    goto err;
+
+  err = "encoding check";
+  if (ctf_type_encoding (dict, type, &en) < 0)
+    goto err;
+
+  if (en.cte_bits < count || en.cte_offset != 0) {
+    fprintf (stderr, "Name lookup with count %i, pos %i, counting %s "
+	     "gave bitfield ID %lx with bits %i, offset %i\n", count, pos,
+	     up ? "up" : "down", type, en.cte_bits, en.cte_offset);
+    return 1;
+  }
+  ctf_dict_close (dict);
+  free (ctf_written);
+
+  return 0;
+
+ open_err:
+  fprintf (stdout, "Error %s: %s\n", err, ctf_errmsg (open_err));
+  return 1;
+
+ err:
+  fprintf (stdout, "Error %s: %s\n", err, ctf_errmsg (ctf_errno (dict)));
+  return 1;
+}
+
+/* Do a bunch of tests with a type of a given size: up and down, basic type
+   at and near the start and end, and in the middle.  */
+
+void mass_bitfieldery (long size)
+{
+  size *= 8;
+  bitfieldery (size, 1, 0);
+  bitfieldery (size, 0, 0);
+  bitfieldery (size, 1, 1);
+  bitfieldery (size, 0, 1);
+  bitfieldery (size, 1, size / 2);
+  bitfieldery (size, 0, size / 2);
+  bitfieldery (size, 1, size - 1);
+  bitfieldery (size, 0, size - 1);
+  bitfieldery (size, 1, size);
+  bitfieldery (size, 0, size);
+}
+
+int main (void)
+{
+  mass_bitfieldery (sizeof (char));
+  mass_bitfieldery (sizeof (short));
+  mass_bitfieldery (sizeof (int));
+  mass_bitfieldery (sizeof (long));
+  mass_bitfieldery (sizeof (uint64_t));
+
+  printf ("All done.\n");
+
+  return 0;
+}
diff --git a/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk
new file mode 100644
index 00000000000..b944f73d013
--- /dev/null
+++ b/libctf/testsuite/libctf-writable/libctf-bitfield-name-lookup.lk
@@ -0,0 +1 @@
+All done.
-- 
2.44.0.273.ge0bd14271f

