From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from gnu.wildebeest.org (wildebeest.demon.nl [212.238.236.112])
	by sourceware.org (Postfix) with ESMTPS id B1AB7385DC30
	for ; Wed, 14 Apr 2021 12:14:54 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org B1AB7385DC30
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=klomp.org
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mark@klomp.org
Received: from librem (ip-213-127-40-55.ip.prioritytelecom.net [213.127.40.55])
	(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by gnu.wildebeest.org (Postfix) with ESMTPSA id EB09B30291BE;
	Wed, 14 Apr 2021 14:14:52 +0200 (CEST)
Received: by librem (Postfix, from userid 1000)
	id 1878AC2C8F; Wed, 14 Apr 2021 14:13:33 +0200 (CEST)
Date: Wed, 14 Apr 2021 14:13:33 +0200
From: Mark Wielaard
To: buildbot@builder.wildebeest.org
Cc: elfutils-devel@sourceware.org
Subject: Re: Buildbot failure in Wildebeest Builder on whole buildset
Message-ID: <20210414121333.GT3953@wildebeest.org>
References: <20210413165413.6B9DB8028D9@builder.wildebeest.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20210413165413.6B9DB8028D9@builder.wildebeest.org>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS,
	SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org
X-BeenThere: elfutils-devel@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Elfutils-devel mailing list
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 14 Apr 2021 12:14:57 -0000

Hi,

On Tue, Apr 13, 2021 at 04:54:13PM +0000, buildbot@builder.wildebeest.org wrote:
> The Buildbot has detected a failed build on builder whole buildset while building elfutils.
> Full details are available at:
> https://builder.wildebeest.org/buildbot/#builders/11/builds/702
>
> Buildbot URL: https://builder.wildebeest.org/buildbot/
>
> Worker for this Build: fedora-ppc64le

This was a different issue:

FAIL: run-backtrace-native-core.sh
==================================

/usr/bin/coredumpctl
Hint: You are currently not seeing messages from other users and the system.
      Users in groups 'adm', 'systemd-journal', 'wheel' can see all messages.
      Pass -q to turn off this notice.
           PID: 10643 (backtrace-child)
           UID: 1002 (mjw)
           GID: 1002 (mjw)
        Signal: 6 (ABRT)
     Timestamp: Tue 2021-04-13 12:59:04 UTC (2s ago)
  Command Line: /home/mjw/wildebeest/buildbot/elfutils-fedora-ppc64le/build/tests/backtrace-child --gencore
    Executable: /home/mjw/wildebeest/buildbot/elfutils-fedora-ppc64le/build/tests/backtrace-child
 Control Group: /user.slice/user-1002.slice/session-2.scope
          Unit: session-2.scope
         Slice: user-1002.slice
       Session: 2
     Owner UID: 1002 (mjw)
       Boot ID: 4684256e966845baad90ffbef2d3c976
    Machine ID: fa20d94f66194772a93b94464bd75866
      Hostname: rh-power-vm60.fit.vutbr.cz
       Storage: /var/lib/systemd/coredump/core.backtrace-child.1002.4684256e966845baad90ffbef2d3c976.10643.1618318744000000.lz4
       Message: Process 10643 (backtrace-child) of user 1002 dumped core.

                Stack trace of thread 10644:
                #0  0x00007fffadc28d48 raise (libpthread.so.0)
                #1  0x000000012d5e14a4 n/a (/home/mjw/wildebeest/buildbot/elfutils-fedora-ppc64le/build/tests/backtrace-child)
                #2  0x000000012d5e15cc n/a (/home/mjw/wildebeest/buildbot/elfutils-fedora-ppc64le/build/tests/backtrace-child)
                #3  0x000000012d5e161c n/a (/home/mjw/wildebeest/buildbot/elfutils-fedora-ppc64le/build/tests/backtrace-child)
                #4  0x000000012d5e1648 n/a (/home/mjw/wildebeest/buildbot/elfutils-fedora-ppc64le/build/tests/backtrace-child)
                #5  0x00007fffadc18c10 start_thread (libpthread.so.0)
                #6  0x00007fffadb2d8a8 __clone (libc.so.6)
0x7fffae620000  0x7fffae630000  linux-vdso64.so.1
0x7fffae640000  0x7fffae681108  ld64.so.2
0x7fffad590000  0x7fffad5c0428  libgcc_s.so.1
0x7fffad5d0000  0x7fffad6f0128  libm.so.6
0x7fffad700000  0x7fffad9955f8  libstdc++.so.6
0x7fffad9a0000  0x7fffad9c0320  librt.so.1
0x7fffad9d0000  0x7fffad9f0108  libdl.so.2
0x7fffada00000  0x7fffadc05378  libc.so.6
0x7fffadc10000  0x7fffadc54520  libpthread.so.0
0x7fffadc60000  0x7fffae5f51c8  libubsan.so.1
0x12d5e0000     0x12d6001c0     backtrace-child
TID 10644:
# 0 0x7fffadc28d48      raise
# 1 0x12d5e14a4 - 1     sigusr2
# 2 0x12d5e15cc - 1     stdarg
# 3 0x12d5e161c - 1     backtracegen
# 4 0x12d5e1648 - 1     start
# 5 0x7fffadc18c10 - 1  start_thread
# 6 0x7fffadb2d8a8 - 1  __clone
/home/mjw/wildebeest/buildbot/elfutils-fedora-ppc64le/build/tests/backtrace: dwfl_thread_getframes: address out of range
backtrace: backtrace.c:81: callback_verify: Assertion `seen_main' failed.
./test-subr.sh: line 84: 10904 Aborted (core dumped) LD_LIBRARY_PATH="${built_library_path}${LD_LIBRARY_PATH:+:}$LD_LIBRARY_PATH" $VALGRIND_CMD "$@"
backtrace-child-core.10643: no main
rmdir: failed to remove 'test-10634': Directory not empty
FAIL run-backtrace-native-core.sh (exit status: 1)

It disappeared on a rebuild...

It looks like in the failure case the child thread was unwound
correctly, but the main thread couldn't be.
It is unclear to me whether this was because of a bug in the unwinder
or because systemd left us a bad core file. The corresponding change
(commit 879513ab - nm: Fix file descriptor leak on dwfl_begin failure.)
really couldn't have caused this IMHO. So it is a bit of a mystery.

Sigh,

Mark