From: Nick Alcock
To: binutils-cvs@sourceware.org
Subject: [binutils-gdb] libctf: dedup: enums with overlapping enumerators are conflicting
X-Act-Checkin: binutils-gdb
X-Git-Author: Nick Alcock
X-Git-Refname: refs/heads/master
X-Git-Oldrev: 0c5f03a9d5ed35731742df2c98cc5ec6aa738828
X-Git-Newrev: f8da1a05db64d8c5c700e07a008a1938858a7adf
Message-Id: <20240618125526.23D1C3882AE0@sourceware.org>
Date: Tue, 18 Jun 2024 12:55:26 +0000 (GMT)

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=f8da1a05db64d8c5c700e07a008a1938858a7adf

commit f8da1a05db64d8c5c700e07a008a1938858a7adf
Author: Nick Alcock
Date:   Tue Jun 11 19:51:33 2024 +0100

    libctf: dedup: enums with overlapping enumerators are conflicting

    The CTF deduplicator was not considering enumerators inside enum types to
    be things that caused type conflicts, so if the following two TUs were linked
    together, you would end up with the following in the resulting dict:

    1.c:
    enum foo { A, B };

    2.c:
    enum bar { A, B };

    linked:

    enum foo { A, B };
    enum bar { A, B };

    This does work -- but it's not something that's valid C, and the general
    point of the shared dict is that it is something that you could potentially
    get from any valid C TU.

    So consider such types to be conflicting, but obviously don't consider
    actually identical enums to be conflicting, even though they too have (all)
    their identifiers in common.  This involves surprisingly little code.  The
    deduplicator detects conflicting types by counting types in a hash table of
    hash tables:

      decorated identifier -> (type hash -> count)

    where the COUNT is the number of times a given hash has been observed: any
    name with more than one hash associated with it is considered conflicting
    (the count is used to identify the most common such name for promotion to
    the shared dict).

    Before now, those identifiers were all the identifiers of types (possibly
    decorated with their namespace on the front for enumerator identifiers), but
    we can equally well put *enumeration constant names* in there, undecorated
    like the identifiers of types in the global namespace, with the type hash
    being the hash of each enum containing that enumerator.  The existing
    conflicting-type-detection code will then accurately identify distinct enums
    with enumeration constants in common.  The enum that contains the most
    commonly-appearing enumerators will be promoted to the shared dict.

    libctf/
            * ctf-impl.h (ctf_dedup_t) : Extend comment.
            * ctf-dedup.c (ctf_dedup_count_name): New, split out of...
            (ctf_dedup_populate_mappings): ... here.  Call it for all
            * enumeration constants in an enum as well as types.

    ld/
            * testsuite/ld-ctf/enum-3.c: New test CTF.
            * testsuite/ld-ctf/enum-4.c: Likewise.
            * testsuite/ld-ctf/overlapping-enums.d: New test.
            * testsuite/ld-ctf/overlapping-enums-2.d: Likewise.
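
The counting scheme described in the commit message can be illustrated with a small standalone sketch. This is not the libctf code: it swaps the ctf_dynhash_t tables for fixed-size arrays, and every name in it (struct name_table, count_name, count_enum, name_is_conflicting) is hypothetical, as are the "e foo" style decorated names and the "hash-foo" style type hashes. It only shows that recording each enumerator under the hash of its containing enum makes a shared enumerator name pick up two distinct hashes, which the "more than one hash per name" rule then reports as a conflict.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAMES 16    /* Illustrative limits; libctf uses dynamic hashes.  */
#define MAX_HASHES 4

/* Inner table for one name: type hash -> count.  */
struct name_counts
{
  const char *name;
  const char *hashes[MAX_HASHES];
  long counts[MAX_HASHES];
  int n_hashes;
};

/* Outer table: name -> (type hash -> count).  */
struct name_table
{
  struct name_counts names[MAX_NAMES];
  int n_names;
};

static struct name_counts *
find_name (struct name_table *t, const char *name)
{
  for (int i = 0; i < t->n_names; i++)
    if (strcmp (t->names[i].name, name) == 0)
      return &t->names[i];
  return NULL;
}

/* Record one sighting of NAME belonging to the type hashed as TYPE_HASH.  */
static void
count_name (struct name_table *t, const char *name, const char *type_hash)
{
  struct name_counts *nc = find_name (t, name);

  if (nc == NULL)
    {
      assert (t->n_names < MAX_NAMES);
      nc = &t->names[t->n_names++];
      nc->name = name;
      nc->n_hashes = 0;
    }

  for (int i = 0; i < nc->n_hashes; i++)
    if (strcmp (nc->hashes[i], type_hash) == 0)
      {
        nc->counts[i]++;
        return;
      }

  assert (nc->n_hashes < MAX_HASHES);
  nc->hashes[nc->n_hashes] = type_hash;
  nc->counts[nc->n_hashes] = 1;
  nc->n_hashes++;
}

/* Count an enum: every enumerator (undecorated) and then the decorated
   type name are all recorded against the enum's own type hash.  */
static void
count_enum (struct name_table *t, const char *decorated_name,
            const char *type_hash, const char *const *enumerators, int n)
{
  for (int i = 0; i < n; i++)
    count_name (t, enumerators[i], type_hash);
  count_name (t, decorated_name, type_hash);
}

/* A name is conflicting if more than one type hash was seen for it.  */
static int
name_is_conflicting (struct name_table *t, const char *name)
{
  struct name_counts *nc = find_name (t, name);
  return nc != NULL && nc->n_hashes > 1;
}
```

Feeding in enum foo { A, B } and enum bar { A, B } with different type hashes marks A and B as conflicting while leaving the two type names unconflicted; seeing the identical enum foo a second time only bumps the existing counts and creates no conflict.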
Diff:
---
 ld/testsuite/ld-ctf/enum-3.c              |  3 +++
 ld/testsuite/ld-ctf/enum-4.c              |  3 +++
 ld/testsuite/ld-ctf/overlapping-enums-2.d | 36 ++++++++++++++++++++++++++
 ld/testsuite/ld-ctf/overlapping-enums.d   | 35 ++++++++++++++++++++++++++
 libctf/ctf-dedup.c                        | 42 ++++++++++++++++++++++++++-----
 libctf/ctf-impl.h                         |  3 ++-
 6 files changed, 115 insertions(+), 7 deletions(-)

diff --git a/ld/testsuite/ld-ctf/enum-3.c b/ld/testsuite/ld-ctf/enum-3.c
new file mode 100644
index 00000000000..c365aaff564
--- /dev/null
+++ b/ld/testsuite/ld-ctf/enum-3.c
@@ -0,0 +1,3 @@
+enum first_day_of_the_week {Sunday = 0};
+
+static enum first_day_of_the_week day __attribute__((used));
diff --git a/ld/testsuite/ld-ctf/enum-4.c b/ld/testsuite/ld-ctf/enum-4.c
new file mode 100644
index 00000000000..00634244107
--- /dev/null
+++ b/ld/testsuite/ld-ctf/enum-4.c
@@ -0,0 +1,3 @@
+enum intersecting_days_of_the_week {Montag = 1, Tuesday = 2};
+
+static enum intersecting_days_of_the_week day __attribute__((used));
diff --git a/ld/testsuite/ld-ctf/overlapping-enums-2.d b/ld/testsuite/ld-ctf/overlapping-enums-2.d
new file mode 100644
index 00000000000..1adfd86b89b
--- /dev/null
+++ b/ld/testsuite/ld-ctf/overlapping-enums-2.d
@@ -0,0 +1,36 @@
+#as:
+#source: enum.c
+#source: enum-4.c
+#objdump: --ctf
+#ld: -shared
+#name: Semioverlapping enumerators
+
+.*: +file format .*
+
+Contents of CTF section .ctf:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+  Types:
+    0x1: \(kind 8\) enum day_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+         Monday: 0
+         Tuesday: 1
+         Wednesday: 2
+         Thursday: 3
+         Friday: 4
+         Saturday: 5
+         Sunday: 6
+#...
+  Strings:
+#...
+CTF archive member: .*enum.*\.c:
+#...
+  Types:
+    0x80000001: \(kind 8\) enum intersecting_days_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+         Montag: 1
+         Tuesday: 2
+
+  Strings:
+#...
diff --git a/ld/testsuite/ld-ctf/overlapping-enums.d b/ld/testsuite/ld-ctf/overlapping-enums.d
new file mode 100644
index 00000000000..7cf57d62452
--- /dev/null
+++ b/ld/testsuite/ld-ctf/overlapping-enums.d
@@ -0,0 +1,35 @@
+#as:
+#source: enum.c
+#source: enum-3.c
+#objdump: --ctf
+#ld: -shared
+#name: Overlapping enumerators
+
+.*: +file format .*
+
+Contents of CTF section .ctf:
+
+  Header:
+    Magic number: 0xdff2
+    Version: 4 \(CTF_VERSION_3\)
+#...
+  Types:
+    0x1: \(kind 8\) enum day_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+         Monday: 0
+         Tuesday: 1
+         Wednesday: 2
+         Thursday: 3
+         Friday: 4
+         Saturday: 5
+         Sunday: 6
+#...
+  Strings:
+#...
+CTF archive member: .*enum.*\.c:
+#...
+  Types:
+    0x80000001: \(kind 8\) enum first_day_of_the_week \(size 0x[0-9a-f]*\) \(aligned at 0x[0-9a-f]*\)
+         Sunday: 0
+
+  Strings:
+#...
diff --git a/libctf/ctf-dedup.c b/libctf/ctf-dedup.c
index c7db6ab4965..dd234945462 100644
--- a/libctf/ctf-dedup.c
+++ b/libctf/ctf-dedup.c
@@ -1149,6 +1149,9 @@ ctf_dedup_hash_type (ctf_dict_t *fp, ctf_dict_t *input,
   return NULL;
 }
 
+static int
+ctf_dedup_count_name (ctf_dict_t *fp, const char *name, void *id);
+
 /* Populate a number of useful mappings not directly used by the hashing
    machinery: the output mapping, the cd_name_counts mapping from name -> hash
    -> count of hashval deduplication state for a given hashed type, and the
@@ -1164,8 +1167,6 @@ ctf_dedup_populate_mappings (ctf_dict_t *fp, ctf_dict_t *input _libctf_unused_,
 {
   ctf_dedup_t *d = &fp->ctf_dedup;
   ctf_dynset_t *type_ids;
-  ctf_dynhash_t *name_counts;
-  long int count;
 
 #ifdef ENABLE_LIBCTF_HASH_DEBUGGING
   ctf_dprintf ("Hash %s, %s, into output mapping for %i/%lx @ %s\n",
@@ -1258,24 +1259,53 @@ ctf_dedup_populate_mappings (ctf_dict_t *fp, ctf_dict_t *input _libctf_unused_,
       && ctf_dynset_insert (type_ids, id) < 0)
     return ctf_set_errno (fp, errno);
 
+  if (ctf_type_kind_unsliced (input, type) == CTF_K_ENUM)
+    {
+      ctf_next_t *i = NULL;
+      const char *enumerator;
+
+      while ((enumerator = ctf_enum_next (input, type, &i, NULL)) != NULL)
+        {
+          if (ctf_dedup_count_name (fp, enumerator, id) < 0)
+            {
+              ctf_next_destroy (i);
+              return -1;
+            }
+        }
+      if (ctf_errno (input) != ECTF_NEXT_END)
+        return ctf_set_errno (fp, ctf_errno (input));
+    }
+
   /* The rest only needs to happen for types with names.  */
   if (!decorated_name)
     return 0;
 
+  if (ctf_dedup_count_name (fp, decorated_name, id) < 0)
+    return -1;                                  /* errno is set for us.  */
+
+  return 0;
+}
+
+static int
+ctf_dedup_count_name (ctf_dict_t *fp, const char *name, void *id)
+{
+  ctf_dedup_t *d = &fp->ctf_dedup;
+  ctf_dynhash_t *name_counts;
+  long int count;
+  const char *hval;
+
   /* Count the number of occurrences of the hash value for this GID.  */
 
   hval = ctf_dynhash_lookup (d->cd_type_hashes, id);
 
   /* Mapping from name -> hash(hashval, count) not already present?  */
-  if ((name_counts = ctf_dynhash_lookup (d->cd_name_counts,
-                                         decorated_name)) == NULL)
+  if ((name_counts = ctf_dynhash_lookup (d->cd_name_counts, name)) == NULL)
     {
       if ((name_counts = ctf_dynhash_create (ctf_hash_string,
                                              ctf_hash_eq_string,
                                              NULL, NULL)) == NULL)
           return ctf_set_errno (fp, errno);
-      if (ctf_dynhash_cinsert (d->cd_name_counts, decorated_name,
-                               name_counts) < 0)
+      if (ctf_dynhash_cinsert (d->cd_name_counts, name, name_counts) < 0)
         {
           ctf_dynhash_destroy (name_counts);
           return ctf_set_errno (fp, errno);
diff --git a/libctf/ctf-impl.h b/libctf/ctf-impl.h
index 03e1a66416a..eb89f8b4645 100644
--- a/libctf/ctf-impl.h
+++ b/libctf/ctf-impl.h
@@ -294,7 +294,8 @@ typedef struct ctf_dedup
   ctf_dynhash_t *cd_decorated_names[4];
 
   /* Map type names to a hash from type hash value -> number of times each value
-     has appeared.  */
+     has appeared.  Enumeration constants are tracked via the enum they appear
+     in.  */
   ctf_dynhash_t *cd_name_counts;
 
   /* Map global type IDs to type hash values.  Used to determine if types are