From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id A4DF63858C55 for ; Wed, 31 Aug 2022 21:59:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A4DF63858C55 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661983164; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type; bh=949sMdEDOag9qkhMo3hsXYEd/0WMI3JX7WcA5vS0IB0=; b=VeG7/QJr2MgSXfdGLXdo9tBO5OQ1NJT71ObS62tzuJodz1rimuk6b41fiFCayQmOpNb2ap rsQwaXETMvGHZrHLif00mdOewcVHOn4npLqlwsHh+ulFq2bEhqBFgDjN63/M7AdF9les14 VcTAUX1jx13XJAJt2M3GkuwCpGcOt50= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-664-n1FwURM2O56T00hZDN341A-1; Wed, 31 Aug 2022 17:59:23 -0400 X-MC-Unique: n1FwURM2O56T00hZDN341A-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C57451C00137 for ; Wed, 31 Aug 2022 21:59:22 +0000 (UTC) Received: from redhat.com (unknown [10.2.17.198]) by smtp.corp.redhat.com (Postfix) with ESMTPS id AC9E3492C3B; Wed, 31 Aug 2022 21:59:22 +0000 (UTC) Received: from fche by redhat.com with local (Exim 4.94.2) (envelope-from ) id 1oTVk1-0003bq-UN; Wed, 31 Aug 2022 17:59:21 -0400 Date: Wed, 31 Aug 2022 17:59:21 -0400 From: "Frank Ch. Eigler" To: rgoldber@redhat.com Cc: elfutils-devel@sourceware.org Subject: debuginfod metadata patch review Message-ID: <20220831215921.GB12014@redhat.com> MIME-Version: 1.0 User-Agent: Mutt/1.12.0 (2019-05-25) X-Scanned-By: MIMEDefang 2.85 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-8.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_BADIPHTTP,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi, Ryan - Sorry it's been taking time. These are some notes for what I believe is the newest snapshot in git: users/rgoldber/try-metadata_query. Overall things look fine. The gist of it (for the sake of audience) is the introduction of a new webapi to query debuginfod servers for metadata about files matching a glob pattern. So, fetching /metadata/GLOBPATTERN would return a json document with an array of metadata kind of like [ { "atype": "e", "buildid": "f17a29b5a25bd4960531d82aa6b07c8abe84fa66", "mtime": 98298348, "stype": "R", "source0": "/foo/D/hithere_1.0-1_amd64.deb", "source1": "/usr/bin/hithere", "artifactsrc": "", "source0ref": "" } ] based on the current index of the debuginfod server, plus all of its upstream federated peers. So it'd be a really easy way of finding all known versions of "/lib*/libc.so.6" for example. Corresponding C api & cli included. Consumers can use this to map from path names to buildids (such as for a forthcoming systemtap feature). Quibble 1: the webapi quoting is a bit clumsy. The testcase says: http://127.0.0.1:$PORT2/metadata/%2Fusr%2Abin%2Ahi%2A Let's switch to something like: http://127.0.0.1:$PORT2/metadata?glob=/usr/bin/*hi* because this leaves room for future expandability (beyond "glob="), and does not require quite as much unnatural urlencoding. Quibble 2: the C api has to be super clear about the lifetime of the returned string: the m parameter here. +int debuginfod_find_metadata (debuginfod_client *c, + const char* p, char** m) { return -ENOSYS; } In the other _find_ routines, the result is written to the *m pointer (if non-NULL!) and as a copied string that the caller is expected to free. That's fine here too, but then debuginfod_find_metadata cannot possibly just set *metadata = "[ ]"; which is not a free'able copied string. Quibble 3: autoconf json-c can just feed HAVE_JSON_C rather than ENABLE_FIND_METADATA. Quibble 4: in the client, metadata_callback() could be simplified, as realloc can take NULL first parameter, so the if (data->*data == NULL) malloc() else realloc() isn't needed. Just use realloc. (I see the same code in header_callback, so same simplification can work there too.) Quibble 5: In debuginfod-find, call PATH a GLOB, or something like that Quibble 6: in debuginfod.cxx handle_metadata(), you don't really need three separate sqlite queries. One can combine the three via select XYZ from ABC union all select XYZ from DEF union all select XYZ from GHI Why the order by? Quibble 7: Same function, bottom. In case the metadata queries are not supported, it's better not to give fake data "[ ]" back. Older servers will just reject the query anyway with a 503 or similar via an exception. It's okay to do the same here. Quibble 8: Server, handler_cb(). There is no need for "after you" type synchronization for metadata. These queries are not worth delaying, for "after you" purposes, because near-concurrent query results r or prefetches are not retained in a cache. Quibble 9: debuginfod.h doc for new function, spell out that the **metadata must be non-NULL to receive the string. Don't bother spell out that things are "ESCAPED", it's just a normal json array string. Quibble 10. We need to decide how much of the individual json metadata objects to document, i.e., which ones we promise to apps to expect. We can leave some fuzziness too, like "extra fields may appear". Absent fields are probably better than empty-string fields. We should document whether there is any duplicate elimination provided by the client, I think the answer is no, but let's be clear. Quibble 11. The test case test-compiles a C file to look for json-c. It could probably just run `pkg-config json-c` to see if suceeds or fails. And heck the main autoconf could use PKG_CHECK_MODULE([json-c]...) also actually. No big deal. Quibble 12. The test case should process the resulting files with jq (really easy to extract exactly all the buildids from a json array-of-objects), or else get some guarantee from debuginfod-find that the output is formatted in a specific way and then grep or whatever. The test case shouldn't rely on undocumented formatting accidents from debuginfod. - FChE