From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gnu.wildebeest.org (gnu.wildebeest.org [45.83.234.184]) by sourceware.org (Postfix) with ESMTPS id 6A986385AFBA for ; Fri, 7 Jul 2023 14:26:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6A986385AFBA Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org Received: from r6.localdomain (82-217-174-174.cable.dynamic.v4.ziggo.nl [82.217.174.174]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by gnu.wildebeest.org (Postfix) with ESMTPSA id 49211304F827; Fri, 7 Jul 2023 16:26:29 +0200 (CEST) Received: by r6.localdomain (Postfix, from userid 1000) id 04513340276; Fri, 7 Jul 2023 16:26:29 +0200 (CEST) Message-ID: Subject: Re: PR29472: debuginfod metadata query From: Mark Wielaard To: "Frank Ch. Eigler" , elfutils-devel@sourceware.org Date: Fri, 07 Jul 2023 16:26:28 +0200 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.48.3 (3.48.3-1.fc38) MIME-Version: 1.0 X-Spam-Status: No, score=-3028.0 required=5.0 tests=BAYES_00,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,RCVD_IN_BARRACUDACENTRAL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Frank, On Wed, 2023-04-12 at 16:31 -0400, Frank Ch. Eigler via Elfutils-devel wrote: > At long last, for your review, an updated & merged patch for $SUBJECT. > "git diff -s" against git/master follows, code also on > users/fche/try-pr29472e. >=20 >=20 > Author: Frank Ch. Eigler > Date: Tue Apr 11 23:35:25 2023 -0400 >=20 > PR29472: debuginfod: add metadata query webapi, C api, client > =20 > This patch extends the debuginfod API with a "metadata query" > operation. It allows clients to request an enumeration of file names > known to debuginfod servers, returning a JSON response including the > matching buildids. This lets clients later download debuginfo for a > range of versions of the same named binaries, in case they need to to > prospective work (like systemtap-based live-patching). It also lets > server operators implement prefetch triggering operations for popular > but slow debuginfo slivers like kernel vdso.debug files on fedora. > =20 > Implementation requires a modern enough json-c library. > =20 > % debuginfod-find metadata file /bin/ls > % debuginfod-find metadata glob "/usr/local/bin/c*" > =20 > Documentation and testing are included. > =20 > Signed-off-by: Ryan Goldberg > Signed-off-by: Frank Ch. Eigler Some higher level comments. - Take my comments on the sql commands and json-c reference counting with a grain of salt, I might not fully understand. - It might be better to just require the json-c library than make it all optional (when building the debuginfod code). If possible I would like to see the debuginfod-client code be usable without json-c (the merging might be possible by simple text manipulation, but if not lets just require it also). Optional features are a bit of a pain imho. - This could really use a couple of concrete examples. The interface is really abstract (just a provide a key and value and some json will come out). Which is good because that makes it extensible, but imho really confusing without some concrete examples how to use it. - Unfortunately debuginfod-find is not great for this right now, because it outputs results into the cache only (and with a file name that needs shell escaping). Might be a good idea to output pretty printed json to stdout by default. Then you could show some simple example easily. And it would make it easier for users to try some things out quickly. - Currently it seems we expect all results to be json arrays with the empty array representing either no results or error. I think it might be better to output a json object. Then you can have a object representing (empty) results as: { "paths": [ 'frob/baz', 'foo/bar/baz' ] } Or { "results": [ { buildid: "hexhexhex", file: "/path/to/file", archive: "" } ] Or return an error as: { "error": "Server unreachable code: 554" } - I don't really know how to reason properly about the merging done for results from multiple federated servers. But it seems to make it impossible to distinguish between no results or an error fetching results. Maybe creating one big results array as is done is the most sane way. Have you though of other ways to represent "merged" results? Cheers, Mark