public inbox for
 help / color / mirror / Atom feed
* debuginfod metadata patch review
@ 2022-08-31 21:59 Frank Ch. Eigler
  0 siblings, 0 replies; only message in thread
From: Frank Ch. Eigler @ 2022-08-31 21:59 UTC (permalink / raw)
  To: rgoldber; +Cc: elfutils-devel

Hi, Ryan -

Sorry it's been taking time.  These are some notes for what I believe
is the newest snapshot in git: users/rgoldber/try-metadata_query.

Overall things look fine.  The gist of it (for the sake of audience)
is the introduction of a new webapi to query debuginfod servers for
metadata about files matching a glob pattern.  So, fetching


would return a json document with an array of metadata kind of like

[ { "atype": "e", "buildid":
"f17a29b5a25bd4960531d82aa6b07c8abe84fa66", "mtime": 98298348,
"stype": "R", "source0": "/foo/D/hithere_1.0-1_amd64.deb",
"source1": "/usr/bin/hithere", "artifactsrc": "", "source0ref": ""
} ]

based on the current index of the debuginfod server, plus all of its
upstream federated peers.  So it'd be a really easy way of finding all
known versions of "/lib*/" for example.  Corresponding C api
& cli included.  Consumers can use this to map from path names to
buildids (such as for a forthcoming systemtap feature).

Quibble 1: the webapi quoting is a bit clumsy.  The testcase says:$PORT2/metadata/%2Fusr%2Abin%2Ahi%2A
Let's switch to something like:$PORT2/metadata?glob=/usr/bin/*hi*
because this leaves room for future expandability (beyond "glob="),
and does not require quite as much unnatural urlencoding.

Quibble 2: the C api has to be super clear about the lifetime of
the returned string: the m parameter here.

+int debuginfod_find_metadata (debuginfod_client *c,
+                              const char* p, char** m) { return -ENOSYS; }

In the other _find_ routines, the result is written to the *m 
pointer (if non-NULL!) and as a copied string that the caller is expected
to free.  That's fine here too, but then debuginfod_find_metadata 
cannot possibly just set   *metadata = "[ ]"; which is not a free'able
copied string.

Quibble 3: autoconf json-c can just feed HAVE_JSON_C rather than

Quibble 4: in the client, metadata_callback() could be simplified,
as realloc can take NULL first parameter, so the
   if (data->*data == NULL)
isn't needed.  Just use realloc.  (I see the same code in header_callback,
so same simplification can work there too.)

Quibble 5: In debuginfod-find, call PATH a GLOB, or something like that

Quibble 6: in debuginfod.cxx handle_metadata(), you don't really need
three separate sqlite queries.  One can combine the three via

      select XYZ from ABC
      union all
      select XYZ from DEF
      union all           
      select XYZ from GHI

Why the order by?

Quibble 7: Same function, bottom.  In case the metadata queries are
not supported, it's better not to give fake data "[ ]" back.  Older
servers will just reject the query anyway with a 503 or similar via an
exception.  It's okay to do the same here.

Quibble 8: Server, handler_cb().  There is no need for "after you"
type synchronization for metadata.  These queries are not worth
delaying, for "after you" purposes, because near-concurrent query
results r or prefetches are not retained in a cache.

Quibble 9: debuginfod.h doc for new function, spell out that the
**metadata must be non-NULL to receive the string.  Don't bother spell
out that things are "ESCAPED", it's just a normal json array string.

Quibble 10.  We need to decide how much of the individual json
metadata objects to document, i.e., which ones we promise to apps to
expect.  We can leave some fuzziness too, like "extra fields may
appear".  Absent fields are probably better than empty-string fields.
We should document whether there is any duplicate elimination provided
by the client, I think the answer is no, but let's be clear.

Quibble 11.  The test case test-compiles a C file to look for json-c.
It could probably just run `pkg-config json-c` to see if suceeds or
fails.  And heck the main autoconf could use
PKG_CHECK_MODULE([json-c]...)  also actually.  No big deal.

Quibble 12.  The test case should process the resulting files with jq
(really easy to extract exactly all the buildids from a json
array-of-objects), or else get some guarantee from debuginfod-find
that the output is formatted in a specific way and then grep or
whatever.  The test case shouldn't rely on undocumented formatting
accidents from debuginfod.

- FChE

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2022-08-31 21:59 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-31 21:59 debuginfod metadata patch review Frank Ch. Eigler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).