public inbox for binutils@sourceware.org
From: Indu Bhagat <indu.bhagat@oracle.com>
To: Nick Alcock <nick.alcock@oracle.com>,
	Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: jose.marchesi@oracle.com, indu.bhagat@oracle.com,
	binutils@sourceware.org
Subject: Re: libctf: new enum-related API functions: request for better names
Date: Wed, 22 May 2024 12:50:39 -0700
Message-ID: <8ae5d1f6-ca86-e3a9-8c3b-f942f3f1c292@oracle.com>
In-Reply-To: <87ikz71nhh.fsf@esperi.org.uk>

On 5/21/24 3:13 AM, Nick Alcock wrote:
> [Jose, Indu: your only interest here might be my musing about
>   identifying what mysterious values in memory dumps are: search for
>   "#define". If this were to happen it wouldn't be soon, would definitely
>   not be on by default, but it would need compiler help to do something
>   much like -g3 does now.]
> 
> On 20 May 2024, Stephen Brennan spake thusly:
> 
>> Hi Nick,
>>
>> I'm not subscribed here but found the mailto: link with In-Reply-To
>> header set on the archive page; hopefully this reply works as expected.
> It does! I like your suggested API and am switching straight to that
> instead.
> 
> It's such a good API that my eye skipped over one function and I thought
> 'oh, we're missing that' and proposed a new one with the exact same name
> and parameters in the same order before noticing it was already in your
> list. :)
> 
>> Nick Alcock writes:
>>> So Stephen Brennan pointed out many years ago that libctf's handling of
>>> enumeration constants is needlessly unhelpful: it treats them as if they
>>> are scoped within a given enum: you can only query from constant name ->
>>> value and back within a given enum's scope, so if you don't already know
>>> what enum something is part of you have to walk over every enum in the
>>> dict hunting for it.
>>>
>>> Worse yet, we do not consider enum constants with clashing values to be
>>> a sign of a type conflict, so can easily end up with multiple distinct
>>> enums containing enumeration constants with the *same name* appearing
>>> in the shared dict. This definitely violates the principle of least
>>> surprise and the (largely unstated) assumption that the shared dict
>>> should be "as if" the entire C program's non-conflicting types were
>>> declared in a single giant file which was compiled with -gctf: you can't
>>> write a C file that declares the same enumeration constant twice!
>>>
>>> Half of this is easy to fix: libctf, and in particular the deduplicator,
>>> should track enumeration constant names just like it does all other
>>> identifiers, and push enums with clashing names into child dicts. (This
>>> might eat a lot of space when the enums have many other enumerators, but
>>> most of that space is identical strings, which means we can win nearly
>>> all the space back in v4 via the string-saving trick that is the second
>>> entry in <https://sourceware.org/binutils/wiki/CTF/Todo/Compactness>.)
>>>
>>> But I'm having trouble figuring out names for the new API functions
>>> we'll need for the rest.  Right now libctf has these:
>>>
>>> /* Convert the specified value to the corresponding enum tag name, if a
>>>     matching name can be found.  Otherwise NULL is returned.  */
>>>
>>> const char *ctf_enum_name (ctf_dict_t *fp, ctf_id_t type, int value);
>>>
>>> /* Convert the specified enum tag name to the corresponding value, if a
>>>     matching name can be found.  Otherwise CTF_ERR is returned.  */
>>>
>>> int ctf_enum_value (ctf_dict_t *fp, ctf_id_t type, const char *name,
>>> 		    int *valp);
>>>
>>> /* Iterate over the members of an ENUM.  We pass the string name and
>>>     associated integer value of each enum element to the specified callback
>>>     function.  */
>>>
>>> int ctf_enum_iter (ctf_dict_t *fp, ctf_id_t type, ctf_enum_f *func, void *arg);
>>>
>>> /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or
>>>     NULL at end of iteration or error, and optionally passing back the
>>>     enumerand's integer VALue.  */
>>>
>>> const char *ctf_enum_next (ctf_dict_t *fp, ctf_id_t type, ctf_next_t **it,
>>>      	                   int *val);
>>>
>>> At the very least we want something like dict-wide equivalents of the
>>> first two: but ctf_enum_name has the very annoying behaviour of just
>>> picking the first name if there are multiple conflicting ones with the
>>> same value, and on a dict-wide basis there will be huge numbers of these
>>> (can you imagine how many enumeration constants have the value 1? :) )
>> I've never personally had a use-case for ctf_enum_name(), looking up an
>> enumerator by the integer value. However, I can understand why you might
>> want to do it if you know the type ID already (e.g. a debugger may want
>> to represent an enum variable with the symbolic name).
> I mostly put it there for completeness -- it's something you can't do
> without a global view of the type system, which you *can* do with one.
> 
>> But I can't imagine a case where:
>>
>>    a. I have an integer value, and I know it's an enum, but
>>    b. I don't know which enum type it belongs to, and yet
> Indeed -- this alone suggests you have no idea what it's being passed
> to, so where did you get it from? Usually in my case this is "memory
> dumps but I'm not quite sure what type it is" and I want to know if some
> huge mysterious magic number is actually an enum -- but for this to be
> really useful we also need to translate #defines of integers into
> something enum-like (a single big "enum" named "#DEFINE" perhaps, or
> some other C-invalid name). Hmmmm...
> 

I am not convinced this should be done in CTF at all. This would fall in
the category of supporting "debugging" in general (which opens up a
whole different Pandora's box of other things), not "type inspection /
introspection".  IOW, strictly speaking, this isn't type information.
Having to use fake types is not palatable either.

Anyway, IIUC, this discussion has already evolved enough to agree that
we don't need such an API. That's good :)
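
To make the ambiguity discussed in the quoted text concrete, here is a
minimal sketch in C. It does not use libctf: every type and function
name in it (toy_enum, toy_lookup_enumerator, toy_count_value,
toy_first_name_for_value) is a hypothetical stand-in invented for
illustration. It assumes a "dict" that is just an array of enums, and
shows that a dict-wide name -> value lookup is well defined once names
are unique, while a dict-wide value -> name lookup generally has many
candidates and a "first match" rule picks one arbitrarily.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of a dict holding several enums.
   These are NOT libctf's data structures or functions.  */
struct toy_enumerator
{
  const char *name;
  int value;
};

struct toy_enum
{
  const char *name;
  const struct toy_enumerator *members;
  size_t n_members;
};

static const struct toy_enumerator color_members[] =
  { { "RED", 0 }, { "GREEN", 1 } };
static const struct toy_enumerator state_members[] =
  { { "IDLE", 0 }, { "RUNNING", 1 } };

static const struct toy_enum toy_dict[] =
  {
    { "color", color_members, 2 },
    { "state", state_members, 2 },
  };

#define TOY_DICT_LEN (sizeof (toy_dict) / sizeof (toy_dict[0]))

/* Dict-wide name -> value: walk every enum hunting for NAME.
   Returns 0 and fills *ENUM_NAME and *VALP on success, -1 if absent.  */
static int
toy_lookup_enumerator (const char *name, const char **enum_name, int *valp)
{
  assert (name != NULL && enum_name != NULL && valp != NULL);

  for (size_t i = 0; i < TOY_DICT_LEN; i++)
    for (size_t j = 0; j < toy_dict[i].n_members; j++)
      if (strcmp (toy_dict[i].members[j].name, name) == 0)
        {
          *enum_name = toy_dict[i].name;
          *valp = toy_dict[i].members[j].value;
          return 0;
        }
  return -1;
}

/* Count how many enumerators in the whole dict carry VALUE.  */
static size_t
toy_count_value (int value)
{
  size_t count = 0;

  for (size_t i = 0; i < TOY_DICT_LEN; i++)
    for (size_t j = 0; j < toy_dict[i].n_members; j++)
      if (toy_dict[i].members[j].value == value)
        count++;
  return count;
}

/* Dict-wide value -> name with a "first match wins" rule.
   Returns NULL if no enumerator has VALUE.  */
static const char *
toy_first_name_for_value (int value)
{
  for (size_t i = 0; i < TOY_DICT_LEN; i++)
    for (size_t j = 0; j < toy_dict[i].n_members; j++)
      if (toy_dict[i].members[j].value == value)
        return toy_dict[i].members[j].name;
  return NULL;
}
```

With this toy data, looking up "RUNNING" finds it in enum "state" with
value 1, but two enumerators share the value 1, so the first-match
lookup reports "GREEN" purely because of iteration order.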
