From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dd14210.kasserver.com (dd14210.kasserver.com [85.13.138.83]) by sourceware.org (Postfix) with ESMTPS id 6B1B63858D28 for ; Fri, 8 Apr 2022 21:56:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6B1B63858D28 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=milianw.de Authentication-Results: sourceware.org; spf=none smtp.mailfrom=milianw.de Received: from milian-workstation.localnet (p54a1bbed.dip0.t-ipconnect.de [84.161.187.237]) by dd14210.kasserver.com (Postfix) with ESMTPSA id 3E605240115; Fri, 8 Apr 2022 23:56:16 +0200 (CEST) From: Milian Wolff To: Aaron Merey Cc: Mark Wielaard , elfutils-devel@sourceware.org Subject: Re: caching failed lookups of debuginfo? Date: Fri, 08 Apr 2022 23:56:15 +0200 Message-ID: <2180828.1FMjDaRedj@milian-workstation> In-Reply-To: References: <4448277.fIUe8AKecr@milian-workstation> <19455068.ZVunK98tCv@milian-workstation> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart2186349.NSmFNhNV5k"; micalg="pgp-sha256"; protocol="application/pgp-signature" X-Spam-Status: No, score=1.1 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.4 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: elfutils-devel@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Elfutils-devel mailing list List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Apr 2022 21:56:21 -0000 --nextPart2186349.NSmFNhNV5k Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii"; protected-headers="v1" From: Milian Wolff To: Aaron Merey Cc: Mark Wielaard , elfutils-devel@sourceware.org Subject: Re: caching failed lookups of debuginfo? Date: Fri, 08 Apr 2022 23:56:15 +0200 Message-ID: <2180828.1FMjDaRedj@milian-workstation> On Freitag, 8. April 2022 23:34:06 CEST Aaron Merey wrote: > Hi Milian, > > On Fri, Apr 8, 2022 at 5:08 PM Milian Wolff wrote: > > I can reproduce it now suddenly with debuginfod-find too: > > > > ``` > > $ > > debuginfod-find debuginfo 85766e9d8458b16e9c7ce6e07c712c02b8471dbc > > debuginfod_find_debuginfo 85766e9d8458b16e9c7ce6e07c712c02b8471dbc > > server urls "https://debuginfod.archlinux.org/" > > checking build-id > > checking cache dir /home/milian/.cache/debuginfod_client > > using timeout 90 > > init server 0 https://debuginfod.archlinux.org/buildid > > url 0 https://debuginfod.archlinux.org/buildid/ > > 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo > > query 1 urls in parallel > > server response HTTP response code said error > > url 0 The requested URL returned error: 404 > > not found No such file or directory (err=-2) > > Server query failed: No such file or directory > > ``` > > > > I do see an empty `/home/milian/.cache/debuginfod_client/ > > 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo` file. But the server > > is > > still queried (i.e. rerunning the above command always produces the same > > > output for me). The lookup is costly too at ~70ms overall: > I am not able to reproduce this on Fedora 35 using commit 8db849976f070. > > The first time I run debuginfod-find I get similar output to you: > ``` > debuginfod_find_debuginfo 85766e9d8458b16e9c7ce6e07c712c02b8471dbc > server urls "https://debuginfod.archlinux.org/" > checking build-id > checking cache dir /home/amerey/.cache/debuginfod_client > using timeout 90 > init server 0 https://debuginfod.archlinux.org/buildid > url 0 > https://debuginfod.archlinux.org/buildid/85766e9d8458b16e9c7ce6e07c712c02b8 > 471dbc/debuginfo query 1 urls in parallel > server response HTTP response code said error > url 0 The requested URL returned error: 404 > not found No such file or directory (err=-2) > Server query failed: No such file or directory > ``` > > But when I rerun debuginfod-find I get output indicating that the cache was > checked: > ``` > debuginfod_find_debuginfo 85766e9d8458b16e9c7ce6e07c712c02b8471dbc > server urls "https://debuginfod.archlinux.org/" > checking build-id > checking cache dir /home/amerey/.cache/debuginfod_client > not found No such file or directory (err=-2) > Server query failed: No such file or directory > ``` > > No calls to curl_multi_wait occur. Is there a "cache_miss_s" file in the top > level of your debuginfod cache? It should contain the number of seconds > that must elapse after a failed attempt before another query would be > attempted (default is 600 seconds). Yes that file exists and it contains 600. Running through strace, I see that it actually removes the empty file first. This looks totally bogus: ``` strace -e file -f debuginfod-find debuginfo 85766e9d8458b16e9c7ce6e07c712c02b8471dbc |& grep "cache/debuginfod_client" checking cache dir /home/milian/.cache/debuginfod_client newfstatat(AT_FDCWD, "/home/milian/.cache/debuginfod_client", {st_mode=S_IFDIR|0755, st_size=12288, ...}, 0) = 0 newfstatat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/ cache_clean_interval_s", {st_mode=S_IFREG|0644, st_size=5, ...}, 0) = 0 newfstatat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/ cache_clean_interval_s", {st_mode=S_IFREG|0644, st_size=5, ...}, 0) = 0 openat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/ cache_clean_interval_s", O_RDONLY) = 5 newfstatat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo", {st_mode=S_IFREG|000, st_size=0, ...}, 0) = 0 newfstatat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/cache_miss_s", {st_mode=S_IFREG|0644, st_size=3, ...}, 0) = 0 openat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/cache_miss_s", O_RDONLY) = 5 unlink("/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo") = 0 openat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo", O_RDONLY) = -1 ENOENT (No such file or directory) mkdir("/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc", 0700) = -1 EEXIST (File exists) openat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo.KdSnXl", O_RDWR|O_CREAT| O_EXCL, 0600) = 5 openat(AT_FDCWD, "/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo", O_RDONLY|O_CREAT|O_EXCL, 000) = 7 unlink("/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo.KdSnXl") = 0 rmdir("/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc") = -1 ENOTEMPTY (Directory not empty) ``` Note how there's an unlink first followed by an openat of the file that just got removed? Then a mkdir for a folder that still exists. Then a tmpfile for the download. Then follows the creation of the empty file. And then it tries to rmdir the whole folder, which fails, but that still sounds wrong too, no? Backtraces for the unlink: ``` unlink("/home/milian/.cache/debuginfod_client/ 85766e9d8458b16e9c7ce6e07c712c02b8471dbc/debuginfo") = 0 > /usr/lib/libc.so.6(unlink+0xb) [0x103aab] > /usr/lib/libdebuginfod-0.186.so(debuginfod_add_http_header+0x1821) [0x3de1] > /usr/bin/debuginfod-find() [0x122f] > /usr/lib/libc.so.6(__libc_start_call_main+0x80) [0x2d310] > /usr/lib/libc.so.6(__libc_start_main+0x81) [0x2d3c1] > /usr/bin/debuginfod-find() [0x1495] ``` That is: ``` eu-addr2line -fe /usr/lib/libdebuginfod-0.186.so -a 0x3de1 0x0000000000003de1 debuginfod_query_server /usr/src/debug/elfutils-0.186/debuginfod/debuginfod-client.c:791:9 ``` Which in turn points at the code that does cache cleanup in `debuginfod_query_server`. I now used `rr` to record such a bogus run and I clearly see that `(time(NULL) - st.st_mtime <= cache_miss)` is false and it goes into the unlink case. I'll try to debug this further now - I definitely do not wait 600s inbetween these runs here... -- Milian Wolff mail@milianw.de http://milianw.de --nextPart2186349.NSmFNhNV5k Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part. Content-Transfer-Encoding: 7Bit -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEezawi1aUvUGg3A1+8zYW/HGdOX8FAmJQr38ACgkQ8zYW/HGd OX/3RA/6A/JuBeBspHLlynfVDS5rkiZGzK+mOcR849W+cP9xjARWYsslkBj4hilP QiGxu3+TaSinbcitc6xQJoPC+vDqaDqBX8hayjeefDfJDqWQ9EmM1wWmf7gweZak i9cskt4x9OJXghmGTPuBPrw265WxM2OijhswZA1Ti4PEuVZrzocpa0DNTFG3MYNq tfT64Y9k+ZuehPT4WsUjJIl9dEyyHFJRRDhFfoz3ohiX+4mv3JC5wSd4eLYGf3VD Y66XYcZjMS/nmrBx3e7W7olb9MgOaVEr4eGvbjqELiW3ak97oOfXxEWIqAIgb6P2 q7BN/PR6kEtV1lKb8Ajicpm+PcrsOlAjcweSh3c3q9mHdMhpMoaklmAnnvueeSis hnVZ9ul6TFIOO5wODGGjRYmQViFHKWDTMpATAboSuUJCAD+zrPq9AIwahoeJRtLK R0Mhil4zYVeFuzptJ9GpW0bNmFpiUtsg8qzoau9f+R9utOJVBCrf5GQEJ8woe8KS g0JXVGHAFLKfMOzq9oMupfqp7PTnWh61G04aIMtK+Pgv//c+BBe+h+Y+8EmwHdHY yvNKol169oKYjcFYjKHuaCp5RPxwRIMVJ7Uuu7tKyBHJwcsaZtqCiRME5Zn0BzLd IlNM7PBhHTill1uoJOAiJ+qy5A3S0l4hCZ0a5FJCU95Vki0Gcqg= =k70E -----END PGP SIGNATURE----- --nextPart2186349.NSmFNhNV5k--