From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from gnu.wildebeest.org (wildebeest.demon.nl [212.238.236.112])
    by sourceware.org (Postfix) with ESMTPS id B846A3858402
    for ; Tue, 14 Sep 2021 11:05:48 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B846A3858402
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=klomp.org
Received: from tarox.wildebeest.org (83-87-18-245.cable.dynamic.v4.ziggo.nl [83.87.18.245])
    (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by gnu.wildebeest.org (Postfix) with ESMTPSA id 96E1630002C0;
    Tue, 14 Sep 2021 13:05:46 +0200 (CEST)
Received: by tarox.wildebeest.org (Postfix, from userid 1000)
    id EF2E8413CE02; Tue, 14 Sep 2021 13:05:45 +0200 (CEST)
Message-ID: <30a9c8913db444c514f022f555c3424c5a1cd7ba.camel@klomp.org>
Subject: Re: Buildbot failure in Wildebeest Builder on whole buildset
From: Mark Wielaard
To: buildbot@builder.wildebeest.org
Cc: elfutils-devel@sourceware.org
Date: Tue, 14 Sep 2021 13:05:45 +0200
In-Reply-To:
References: <20210912231609.E04C480EA29@builder.wildebeest.org>
Content-Type: multipart/mixed; boundary="=-MKOi1Q90/edxO5JK6W2A"
X-Mailer: Evolution 3.28.5 (3.28.5-10.el7)
Mime-Version: 1.0
X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL,
    KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no
    autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: elfutils-devel@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Elfutils-devel mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 14 Sep 2021 11:05:51 -0000

--=-MKOi1Q90/edxO5JK6W2A
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hi,

On Mon, 2021-09-13 at 11:06 +0200, Mark Wielaard wrote:
> On Sun, Sep 12, 2021 at 11:16:09PM +0000,
> buildbot@builder.wildebeest.org wrote:
> > The Buildbot has detected a new failure on builder
> > elfutils-fedora-s390x while building elfutils.
> > Full details are available at:
> >     https://builder.wildebeest.org/buildbot/#builders/10/builds/795
> >
> > Buildbot URL: https://builder.wildebeest.org/buildbot/
> >
> > Worker for this Build: fedora-s390x
>
> This is the same failure we saw on fedora-ppc64 and centos-x86_64
> yesterday.
> https://builder.wildebeest.org/buildbot/#/builders/10/builds/795/steps/8/logs/test-suite_log
>
> I still don't understand why. In the logs we can see (for the PORT2
> server):
>
> [Sun Sep 12 22:56:26 2021] (1493056/1493066): recorded buildid=a0a48245eb29786f7b6853df68ab23cb608b344b file=/home/mjw/bb/wildebeest/elfutils-fedora-s390x/build/tests/dwfllines mtime=1631486319 atype=ED
>
> But then, 2 seconds later:
> [Sun Sep 12 22:56:28 2021] (1493056/1493388): searching for buildid=a0a48245eb29786f7b6853df68ab23cb608b344b artifacttype=debuginfo suffix=
> [Sun Sep 12 22:56:28 2021] (1493056/1493388): not found
> [Sun Sep 12 22:56:28 2021] (1493056/1493388): 127.0.0.1:47886 UA:elfutils/0.185,Linux/s390x,fedora/34 XFF: GET /buildid/a0a48245eb29786f7b6853df68ab23cb608b344b/debuginfo 404 9 0+2ms
>
> Somewhere inbetween the buildid seems to have been forgotten. But I
> cannot figure out why or where. It is clearly non-deterministic since
> normally the tests PASS.

So the issue is triggered by this part in groom ():

  // delete buildids with no references in _r_de or _f_de tables;
  // cascades to _r_sref & _f_s records
  sqlite_ps buildids_del (db, "nuke orphan buildids",
                          "delete from " BUILDIDS "_buildids "
                          "where not exists (select 1 from " BUILDIDS "_f_de d where " BUILDIDS "_buildids.id = d.buildid) "
                          "and not exists (select 1 from " BUILDIDS "_r_de d where " BUILDIDS "_buildids.id = d.buildid)");
  buildids_del.reset().step_ok_done();

With that commented out I can run the tests (or a simplified version
using just one server and one client request, as attached) 30000
times without issue. With groom executing that part of the code, the
test fails after a couple hundred cycles.

Now the question is whether it is reasonable that groom removes the
buildid here. Is that because of the way the test is written? Or is
this a real bug where there is a bad interaction between a (partial?)
scan run and a groom cycle?

Cheers,

Mark

--=-MKOi1Q90/edxO5JK6W2A
Content-Type: application/x-shellscript; name="run-debuginfod-federation-link.sh"
Content-Disposition: inline; filename="run-debuginfod-federation-link.sh"
Content-Transfer-Encoding: base64

IyEvdXNyL2Jpbi9lbnYgYmFzaAojCiMgQ29weXJpZ2h0IChDKSAyMDE5LTIwMjEgUmVkIEhhdCwg SW5jLgojIFRoaXMgZmlsZSBpcyBwYXJ0IG9mIGVsZnV0aWxzLgojCiMgVGhpcyBmaWxlIGlzIGZy ZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkKIyBpdCB1 bmRlciB0aGUgdGVybXMgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGFzIHB1Ymxp c2hlZCBieQojIHRoZSBGcmVlIFNvZnR3YXJlIEZvdW5kYXRpb247IGVpdGhlciB2ZXJzaW9uIDMg b2YgdGhlIExpY2Vuc2UsIG9yCiMgKGF0IHlvdXIgb3B0aW9uKSBhbnkgbGF0ZXIgdmVyc2lvbi4K IwojIGVsZnV0aWxzIGlzIGRpc3RyaWJ1dGVkIGluIHRoZSBob3BlIHRoYXQgaXQgd2lsbCBiZSB1 c2VmdWwsIGJ1dAojIFdJVEhPVVQgQU5ZIFdBUlJBTlRZOyB3aXRob3V0IGV2ZW4gdGhlIGltcGxp ZWQgd2FycmFudHkgb2YKIyBNRVJDSEFOVEFCSUxJVFkgb3IgRklUTkVTUyBGT1IgQSBQQVJUSUNV TEFSIFBVUlBPU0UuICBTZWUgdGhlCiMgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UgZm9yIG1v
cmUgZGV0YWlscy4KIwojIFlvdSBzaG91bGQgaGF2ZSByZWNlaXZlZCBhIGNvcHkgb2YgdGhlIEdO VSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlCiMgYWxvbmcgd2l0aCB0aGlzIHByb2dyYW0uICBJZiBu b3QsIHNlZSA8aHR0cDovL3d3dy5nbnUub3JnL2xpY2Vuc2VzLz4uCgouICRzcmNkaXIvZGVidWdp bmZvZC1zdWJyLnNoCgojIGZvciB0ZXN0IGNhc2UgZGVidWdnaW5nLCB1bmNvbW1lbnQ6CnNldCAt eAp1bnNldCBWQUxHUklORF9DTUQKCkRCPSR7UFdEfS8uZGVidWdpbmZvZF90bXAuc3FsaXRlCmV4 cG9ydCBERUJVR0lORk9EX0NBQ0hFX1BBVEg9JHtQV0R9Ly5jbGllbnRfY2FjaGUKZXhwb3J0IERF QlVHSU5GT0RfVElNRU9VVD0xMAp0ZW1wZmlsZXMgJERCCgojIENsZWFuIG9sZCBkaXJpY3Rvcmll cwpta2RpciBEIEwgRgpta2RpciAtcCAkREVCVUdJTkZPRF9DQUNIRV9QQVRICiMgbm90IHRlbXBm aWxlcyBGIFIgTCBEIFogLSB0aGV5IGFyZSBkaXJlY3RvcmllcyB3aGljaCB3ZSBjbGVhbiB1cCBt YW51YWxseQpsbiAtcyAke2Fic19idWlsZGRpcn0vZHdmbGxpbmVzIEwvZm9vICAgIyBhbnkgcHJv Z3JhbSBub3QgdXNlZCBlbHNld2hlcmUgaW4gdGhpcyB0ZXN0CiMgVGhpcyB2YXJpYWJsZSBpcyBl c3NlbnRpYWwgYW5kIGVuc3VyZXMgbm8gdGltZS1yYWNlIGZvciBjbGFpbWluZyBwb3J0cyBvY2N1 cnMKIyBzZXQgYmFzZSB0byBhIHVuaXF1ZSBtdWx0aXBsZSBvZiAxMDAgbm90IHVzZWQgaW4gYW55 IG90aGVyICdydW4tZGVidWdpbmZvZC0qJyB0ZXN0CmJhc2U9ODkwMApnZXRfcG9ydHMKIyBOQjog cnVuIGluIC1MIHN5bWxpbmstZm9sbG93aW5nIG1vZGUgZm9yIHRoZSBMIHN1YmRpcgplbnYgTERf TElCUkFSWV9QQVRIPSRsZHBhdGggJHthYnNfYnVpbGRkaXJ9Ly4uL2RlYnVnaW5mb2QvZGVidWdp bmZvZCAkVkVSQk9TRSAtZCAke0RCfSAtRiAtVSAtdDAgLWcwIC1wICRQT1JUMSAtTCBMIEQgRiA+ IHZsb2ckUE9SVDEgMj4mMSAmClBJRDE9JCEKdGVtcGZpbGVzIHZsb2ckUE9SVDEKZXJyZmlsZXMg dmxvZyRQT1JUMQoKd2FpdF9yZWFkeSAkUE9SVDEgJ3JlYWR5JyAxCiMgTWFrZSBzdXJlIGluaXRp YWwgc2NhbiB3YXMgZG9uZQp3YWl0X3JlYWR5ICRQT1JUMSAndGhyZWFkX3dvcmtfdG90YWx7cm9s ZT0idHJhdmVyc2UifScgMQp3YWl0X3JlYWR5ICRQT1JUMSAndGhyZWFkX3dvcmtfcGVuZGluZ3ty b2xlPSJzY2FuIn0nIDAKd2FpdF9yZWFkeSAkUE9SVDEgJ3RocmVhZF9idXN5e3JvbGU9InNjYW4i fScgMAoKIyBoYXZlIGNsaWVudHMgY29udGFjdCB0aGUgbmV3IHNlcnZlcgpleHBvcnQgREVCVUdJ TkZPRF9VUkxTPWh0dHA6Ly8xMjcuMC4wLjE6JFBPUlQxCiMgVXNlIGZyZXNoIGNhY2hlIGZvciBk ZWJ1Z2luZm9kLWZpbmQgY2xpZW50IHJlcXVlc3RzCmV4cG9ydCBERUJVR0lORk9EX0NBQ0hFX1BB 
VEg9JHtQV0R9Ly5jbGllbnRfY2FjaGUyCm1rZGlyIC1wICRERUJVR0lORk9EX0NBQ0hFX1BBVEgK CkJVSUxESUQ9YGVudiBMRF9MSUJSQVJZX1BBVEg9JGxkcGF0aCAke2Fic19idWlsZGRpcn0vLi4v c3JjL3JlYWRlbGYgXAogICAgICAgICAtYSBML2ZvbyB8IGdyZXAgJ0J1aWxkIElEJyB8IGN1dCAt ZCAnICcgLWYgN2AKZmlsZSBML2ZvbwpmaWxlIC1MIEwvZm9vCnRlc3RydW4gJHthYnNfdG9wX2J1 aWxkZGlyfS9kZWJ1Z2luZm9kL2RlYnVnaW5mb2QtZmluZCBkZWJ1Z2luZm8gJEJVSUxESUQKCmtp bGwgJFBJRDEKd2FpdCAkUElEMQpQSUQxPTAKCmV4aXQgMAo= --=-MKOi1Q90/edxO5JK6W2A--
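
The orphan-buildid cleanup quoted from groom () in the message can be
tried in isolation. What follows is a minimal sketch, not debuginfod's
actual code: it uses Python's built-in sqlite3 module instead of the C++
sqlite_ps wrapper, and the table names (buildids, f_de, r_de), the helper
functions, and the two-step scan are simplified assumptions made purely
for illustration. It shows one hypothetical interleaving in which a groom
pass that runs between a scanner's two inserts deletes the still
unreferenced buildid, so a later lookup finds nothing. Whether this is
what really happens on the builders is the open question raised in the
message.

```python
import sqlite3

# Same shape as the "nuke orphan buildids" statement quoted in the mail,
# with the BUILDIDS prefix dropped and simplified table names (assumption).
NUKE_ORPHANS = """
    delete from buildids
    where not exists (select 1 from f_de d where buildids.id = d.buildid)
    and not exists (select 1 from r_de d where buildids.id = d.buildid)
"""


def make_db():
    """Create an in-memory database with a hypothetical, reduced schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        create table buildids (id integer primary key, hex text unique not null);
        create table f_de (buildid integer not null, file text not null);
        create table r_de (buildid integer not null, rpm text not null);
        """
    )
    return db


def scan_record_buildid(db, hex_id):
    """Hypothetical scan step 1: intern the buildid."""
    db.execute("insert or ignore into buildids (hex) values (?)", (hex_id,))


def scan_record_file(db, hex_id, path):
    """Hypothetical scan step 2: reference the buildid from f_de.

    Returns False when the buildid row no longer exists.
    """
    row = db.execute("select id from buildids where hex = ?", (hex_id,)).fetchone()
    if row is None:
        return False
    db.execute("insert into f_de (buildid, file) values (?, ?)", (row[0], path))
    return True


def groom(db):
    """Delete buildids that nothing in f_de or r_de refers to."""
    db.execute(NUKE_ORPHANS)


def lookup(db, hex_id):
    """Return the file recorded for a buildid, or None if not found."""
    row = db.execute(
        "select f.file from buildids b join f_de f on f.buildid = b.id "
        "where b.hex = ?",
        (hex_id,),
    ).fetchone()
    return row[0] if row else None


def simulate(groom_midway):
    """Run one scan; optionally let groom run between the two scan steps."""
    db = make_db()
    hex_id = "a0a48245eb29786f7b6853df68ab23cb608b344b"
    scan_record_buildid(db, hex_id)
    if groom_midway:
        groom(db)  # the buildid has no f_de/r_de reference yet
    scan_record_file(db, hex_id, "L/foo")
    if not groom_midway:
        groom(db)  # the buildid is referenced now, so it survives
    return lookup(db, hex_id)


if __name__ == "__main__":
    print("groom after scan:  ", simulate(groom_midway=False))
    print("groom during scan: ", simulate(groom_midway=True))
```

In this model the buildid only survives when the groom pass sees the
f_de reference. If the real scanner and groomer can interleave in a
comparable way, serializing them or making the buildid and reference
inserts a single transaction would be the obvious things to try, but
that is a guess and not something the logs confirm.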