public inbox for elfutils@sourceware.org
* [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release
@ 2018-09-17 10:59 mliska at suse dot cz
  2018-09-17 11:41 ` [Bug tools/23673] " mark at klomp dot org
                   ` (23 more replies)
  0 siblings, 24 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-17 10:59 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

            Bug ID: 23673
           Summary: TEST ./tests/backtrace-dwarf fails on s390x in 0.174
                    release
           Product: elfutils
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: tools
          Assignee: unassigned at sourceware dot org
          Reporter: mliska at suse dot cz
                CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

The following test case fails:

$ ./tests/backtrace-dwarf
0x3ffbd840622   raise
0x3ffbd823ce2   abort
./tests/backtrace-dwarf: dwfl_thread_getframes: no error

Fortunately I have access to an s390x machine, so I can help with debugging.

The binary is built with GCC 8.1.1.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
@ 2018-09-17 11:41 ` mark at klomp dot org
  2018-09-17 11:45 ` mliska at suse dot cz
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-09-17 11:41 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #1 from Mark Wielaard <mark at klomp dot org> ---
Note that we have an s390x fedora buildbot worker that also uses GCC 8.1.1:
https://builder.wildebeest.org/buildbot/#/workers/5
That one is green.

So I suspect it is either a different binutils or glibc (the above buildbot
worker has glibc 2.27 and binutils 2.29.1) or different build/CFLAGS/defaults.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
  2018-09-17 11:41 ` [Bug tools/23673] " mark at klomp dot org
@ 2018-09-17 11:45 ` mliska at suse dot cz
  2018-09-17 19:44 ` mark at klomp dot org
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-17 11:45 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #2 from Martin Liska <mliska at suse dot cz> ---
$ ld --version
GNU ld (GNU Binutils; openSUSE:Factory:zSystems) 2.31

$ /lib64/libc.so.6
GNU C Library (GNU libc) stable release version 2.28 (git 3c03baca37fd).


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
  2018-09-17 11:41 ` [Bug tools/23673] " mark at klomp dot org
  2018-09-17 11:45 ` mliska at suse dot cz
@ 2018-09-17 19:44 ` mark at klomp dot org
  2018-09-18  7:38 ` mliska at suse dot cz
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-09-17 19:44 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #3 from Mark Wielaard <mark at klomp dot org> ---
It does seem to work correctly on Fedora 29 with gcc 8.2, binutils 2.31 and
glibc 2.28: 

https://kojipkgs.fedoraproject.org//packages/elfutils/0.174/1.fc29/data/logs/s390x/build.log

PASS: run-backtrace-dwarf.sh

So it is probably some difference in default/build flags.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (2 preceding siblings ...)
  2018-09-17 19:44 ` mark at klomp dot org
@ 2018-09-18  7:38 ` mliska at suse dot cz
  2018-09-18 15:21 ` mark at klomp dot org
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-18  7:38 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #4 from Martin Liska <mliska at suse dot cz> ---
Created attachment 11257
  --> https://sourceware.org/bugzilla/attachment.cgi?id=11257&action=edit
openSUSE build log

I'm attaching my build log. In general, I guess the following flags are used:

-std=gnu99 -Wall -Wshadow -Wformat=2 -Wold-style-definition -Wstrict-prototypes
-Wlogical-op -Wduplicated-cond -Wnull-dereference -Wimplicit-fallthrough=5
-Werror -Wunused -Wextra -Wstack-usage=262144   -fPIC -O2 -g -m64
-fmessage-length=0 -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables
-fasynchronous-unwind-tables -g


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (3 preceding siblings ...)
  2018-09-18  7:38 ` mliska at suse dot cz
@ 2018-09-18 15:21 ` mark at klomp dot org
  2018-09-19  8:44 ` mliska at suse dot cz
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-09-18 15:21 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #5 from Mark Wielaard <mark at klomp dot org> ---
We reviewed this on irc and came to the surprising conclusion that this was
caused by ptrace TRACEME failing with EPERM. That is really odd. But not a bug
in elfutils IMHO.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (4 preceding siblings ...)
  2018-09-18 15:21 ` mark at klomp dot org
@ 2018-09-19  8:44 ` mliska at suse dot cz
  2018-09-19  8:49 ` [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173 mliska at suse dot cz
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-19  8:44 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

Martin Liska <mliska at suse dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|INVALID                     |---

--- Comment #6 from Martin Liska <mliska at suse dot cz> ---
I've just played with that and found that I had made a mistake: one can't use
ptrace (PTRACE_TRACEME) in an executable that is being run under gdb, because
the process is already traced. That is what caused the EPERM errno.
So the issue is still valid in my opinion.
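
A quick way to rule out the "already traced" case before blaming the kernel is
to look at the TracerPid field of /proc/self/status, which is nonzero while a
debugger such as gdb is attached. The following is a minimal sketch, not part
of the elfutils test; the helper names are hypothetical and it assumes a Linux
/proc filesystem:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Extract the TracerPid value from the text of a /proc/<pid>/status file.
   Returns the tracer's PID (0 means "not traced"), or -1 if the field is
   missing or has no number after it.  (Hypothetical helper.)  */
long
tracer_pid_from_status (const char *text)
{
  static const char key[] = "TracerPid:";
  const char *line = text;

  while (line != NULL && *line != '\0')
    {
      if (strncmp (line, key, sizeof key - 1) == 0)
        {
          const char *value = line + sizeof key - 1;
          char *end;
          long pid = strtol (value, &end, 10);
          return end == value ? -1 : pid;
        }
      /* Advance to the start of the next line, if there is one.  */
      const char *nl = strchr (line, '\n');
      line = nl != NULL ? nl + 1 : NULL;
    }
  return -1;
}

/* Read /proc/self/status and report who, if anyone, is tracing us.
   Returns -1 if the file cannot be read.  (Hypothetical helper.)  */
long
current_tracer_pid (void)
{
  char buf[4096];
  FILE *f = fopen ("/proc/self/status", "r");
  if (f == NULL)
    return -1;
  size_t n = fread (buf, 1, sizeof buf - 1, f);
  fclose (f);
  buf[n] = '\0';
  return tracer_pid_from_status (buf);
}
```

If current_tracer_pid () returns a positive value right before the
PTRACE_TRACEME call, an EPERM from that call is expected and says nothing about
the unwinder.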


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (5 preceding siblings ...)
  2018-09-19  8:44 ` mliska at suse dot cz
@ 2018-09-19  8:49 ` mliska at suse dot cz
  2018-09-19  9:29 ` ldv at sourceware dot org
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-19  8:49 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

Martin Liska <mliska at suse dot cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|TEST                        |TEST
                   |./tests/backtrace-dwarf     |./tests/backtrace-dwarf
                   |fails on s390x in 0.174     |fails on s390x in at least
                   |release                     |0.173

--- Comment #7 from Martin Liska <mliska at suse dot cz> ---
Note that it's not specific to 0.174. I can also see it in 0.173, so as Mark
mentioned it depends on glibc, binutils, ...


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (6 preceding siblings ...)
  2018-09-19  8:49 ` [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173 mliska at suse dot cz
@ 2018-09-19  9:29 ` ldv at sourceware dot org
  2018-09-19  9:49 ` mliska at suse dot cz
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ldv at sourceware dot org @ 2018-09-19  9:29 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #8 from Dmitry V. Levin <ldv at sourceware dot org> ---
If a process is not being traced and PTRACE_TRACEME fails with EPERM, then it
must be a kernel issue.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (7 preceding siblings ...)
  2018-09-19  9:29 ` ldv at sourceware dot org
@ 2018-09-19  9:49 ` mliska at suse dot cz
  2018-09-19 10:32 ` ldv at sourceware dot org
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-19  9:49 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #9 from Martin Liska <mliska at suse dot cz> ---
Hm, on x86_64 (on trunk) I see all tests OK, but:

$ ./backtrace-dwarf 
backtrace-dwarf: backtrace-dwarf.c:146: main: Assertion `errno == 0' failed.
0x7ffff7a4f08b  raise
0x7ffff7a384e9  abort
0x7ffff7a383c1  __assert_fail_base.cold.0
0x7ffff7a476f2  __assert_fail
0x40135a        main

which should not happen. On my machine I see errno == 2.

I would expect the test to fail with:

diff --git a/tests/backtrace-dwarf.c b/tests/backtrace-dwarf.c
index e1eb4928..273d2b5e 100644
--- a/tests/backtrace-dwarf.c
+++ b/tests/backtrace-dwarf.c
@@ -143,8 +143,8 @@ main (int argc __attribute__ ((unused)), char **argv)
       abort ();
     case 0:;
       long l = ptrace (PTRACE_TRACEME, 0, NULL, NULL);
-      assert (errno == 0);
-      assert (l == 0);
+      if (errno != 0 || l != 0)
+        return -1;
       cleanup_13_main ();
       abort ();
     default:

but it's still fine, while:
./backtrace-dwarf 
backtrace-dwarf: backtrace-dwarf.c:159: main: Assertion `WIFSTOPPED (status)'
failed.
Aborted (core dumped)

That said, the test looks very fragile to me.
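
One possible explanation for the errno == 2 above (an assumption, not
something confirmed in this report) is a stale errno: errno is only meaningful
after a call has reported failure through its return value, and for
PTRACE_TRACEME that means a return of -1. A minimal sketch of the rule, using
close () as a stand-in because it fails deterministically on a bad descriptor;
both helper names are hypothetical:

```c
#include <errno.h>
#include <unistd.h>

/* Close FD and return 0 on success or the errno value on failure.
   errno is consulted only after close () has returned -1.  */
int
close_checked (int fd)
{
  if (close (fd) == -1)
    return errno;
  return 0;
}

/* Plant a leftover ENOENT (2) in errno, then perform a call that succeeds.
   Returns 1 if the return-value check reports success despite the stale
   errno, 0 otherwise.  */
int
success_despite_stale_errno (void)
{
  int fds[2];
  if (pipe (fds) == -1)
    return 0;

  errno = ENOENT;                    /* stale value from "some earlier call" */
  int rc = close_checked (fds[0]);   /* succeeds; errno is not consulted */
  close (fds[1]);
  return rc == 0;
}
```

An assert (errno == 0) placed after the successful call would have fired here
even though nothing went wrong, which is the fragility described above.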


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (8 preceding siblings ...)
  2018-09-19  9:49 ` mliska at suse dot cz
@ 2018-09-19 10:32 ` ldv at sourceware dot org
  2018-09-19 10:50 ` mliska at suse dot cz
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ldv at sourceware dot org @ 2018-09-19 10:32 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

Dmitry V. Levin <ldv at sourceware dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ldv at sourceware dot org

--- Comment #10 from Dmitry V. Levin <ldv at sourceware dot org> ---
I'd suggest the following change to enhance error diagnostics:

diff --git a/tests/backtrace-dwarf.c b/tests/backtrace-dwarf.c
index 35f25ed6..3a22db31 100644
--- a/tests/backtrace-dwarf.c
+++ b/tests/backtrace-dwarf.c
@@ -143,9 +143,8 @@ main (int argc __attribute__ ((unused)), char **argv)
     case -1:
       abort ();
     case 0:;
-      long l = ptrace (PTRACE_TRACEME, 0, NULL, NULL);
-      assert (errno == 0);
-      assert (l == 0);
+      if (ptrace (PTRACE_TRACEME, 0, NULL, NULL))
+        _exit(errno ?: -1);
       cleanup_13_main ();
       abort ();
     default:
@@ -155,10 +154,12 @@ main (int argc __attribute__ ((unused)), char **argv)
   errno = 0;
   int status;
   pid_t got = waitpid (pid, &status, 0);
-  assert (errno == 0);
-  assert (got == pid);
-  assert (WIFSTOPPED (status));
-  assert (WSTOPSIG (status) == SIGABRT);
+  if (got != pid)
+    error (1, errno, "waitpid returned %d", got);
+  if (!WIFSTOPPED (status))
+    error (1, 0, "unexpected wait status %u", status);
+  if (WSTOPSIG (status) != SIGABRT)
+    error (1, 0, "unexpected signal %u", WSTOPSIG (status));

   Dwfl *dwfl = pid_to_dwfl (pid);
   dwfl_getthreads (dwfl, thread_callback, NULL);
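
With the change above, a child whose PTRACE_TRACEME fails exits with errno as
its exit code, so the parent would report "unexpected wait status N" rather
than tripping an assert. As a rough sketch of how such a status could be
decoded back into the child's errno (the helper names are hypothetical and this
is not part of the proposed patch):

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with CODE and return its raw wait
   status, or -1 if fork or waitpid failed.  */
int
status_of_exiting_child (int code)
{
  pid_t pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0)
    _exit (code);

  int status;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return status;
}

/* If STATUS describes a normal exit, return the exit code (0..255);
   otherwise return -1.  */
int
exit_code_from_status (int status)
{
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Only the low 8 bits of the exit code survive, so the -1 fallback in
_exit (errno ?: -1) shows up as 255 in the parent, while an EPERM shows up as
EPERM itself.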


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (9 preceding siblings ...)
  2018-09-19 10:32 ` ldv at sourceware dot org
@ 2018-09-19 10:50 ` mliska at suse dot cz
  2018-09-19 11:01 ` ldv at sourceware dot org
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-19 10:50 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #11 from Martin Liska <mliska at suse dot cz> ---
With the suggested patch I see the following in test-suite.log on s390x:

[   86s] + cat tests/test-suite.log
[   86s] ==========================================
[   86s]    elfutils 0.174: tests/test-suite.log
[   86s] ==========================================
[   86s] 
[   86s] # TOTAL: 202
[   86s] # PASS:  194
[   86s] # SKIP:  7
[   86s] # XFAIL: 0
[   86s] # FAIL:  1
[   86s] # XPASS: 0
[   86s] # ERROR: 0
[   86s] 
[   86s] .. contents:: :depth: 2
[   86s] 
[   86s] SKIP: run-addr2line-i-demangle-test.sh
[   86s] ======================================
[   86s] 
[   86s] demangler unsupported
[   86s] SKIP run-addr2line-i-demangle-test.sh (exit status: 77)
[   86s] 
[   86s] SKIP: run-backtrace-data.sh
[   86s] ===========================
[   86s] 
[   86s] /home/abuild/rpmbuild/BUILD/elfutils-0.174/tests/backtrace-data:
Unwinding not supported for this architecture
[   86s] data: arch not supported
[   86s] SKIP run-backtrace-data.sh (exit status: 77)
[   86s] 
[   86s] FAIL: run-backtrace-dwarf.sh
[   86s] ============================
[   86s] 
[   86s] 0x3ffbda40622  raise
[   86s] 0x3ffbda23ce2  abort
[   86s] /home/abuild/rpmbuild/BUILD/elfutils-0.174/tests/backtrace-dwarf:
dwfl_thread_getframes: no error
[   86s] dwarf: no main
[   86s] FAIL run-backtrace-dwarf.sh (exit status: 1)
[   86s] 
[   86s] SKIP: run-backtrace-native-core.sh
[   86s] ==================================
[   86s] 
[   86s] No core.12202 file generated
[   86s] SKIP run-backtrace-native-core.sh (exit status: 77)
[   86s] 
[   86s] SKIP: run-backtrace-native-core-biarch.sh
[   86s] =========================================
[   86s] 
[   86s] No core.12218 file generated
[   86s] SKIP run-backtrace-native-core-biarch.sh (exit status: 77)
[   86s] 
[   86s] SKIP: run-backtrace-demangle.sh
[   86s] ===============================
[   86s] 
[   86s] demangler unsupported
[   86s] SKIP run-backtrace-demangle.sh (exit status: 77)
[   86s] 
[   86s] SKIP: run-stack-demangled-test.sh
[   86s] =================================
[   86s] 
[   86s] demangler unsupported
[   86s] SKIP run-stack-demangled-test.sh (exit status: 77)
[   86s] 
[   86s] SKIP: run-lfs-symbols.sh
[   86s] ========================
[   86s] 
[   86s] LFS testing is irrelevent on this system
[   86s] SKIP run-lfs-symbols.sh (exit status: 77)
[   86s]


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (10 preceding siblings ...)
  2018-09-19 10:50 ` mliska at suse dot cz
@ 2018-09-19 11:01 ` ldv at sourceware dot org
  2018-09-19 11:09 ` mliska at suse dot cz
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ldv at sourceware dot org @ 2018-09-19 11:01 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #12 from Dmitry V. Levin <ldv at sourceware dot org> ---
(In reply to Martin Liska from comment #11)
> With the suggested patch I see following in test-suite.log on s390x:
[...]
> [   86s] FAIL: run-backtrace-dwarf.sh
> [   86s] ============================
> [   86s] 
> [   86s] 0x3ffbda40622	raise
> [   86s] 0x3ffbda23ce2	abort
> [   86s] /home/abuild/rpmbuild/BUILD/elfutils-0.174/tests/backtrace-dwarf:
> dwfl_thread_getframes: no error
> [   86s] dwarf: no main
> [   86s] FAIL run-backtrace-dwarf.sh (exit status: 1)

This doesn't look like a PTRACE_TRACEME failing with EPERM, abort() has
actually been invoked by the tracee.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (11 preceding siblings ...)
  2018-09-19 11:01 ` ldv at sourceware dot org
@ 2018-09-19 11:09 ` mliska at suse dot cz
  2018-09-19 12:44 ` mark at klomp dot org
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-19 11:09 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #13 from Martin Liska <mliska at suse dot cz> ---
(In reply to Dmitry V. Levin from comment #12)
> (In reply to Martin Liska from comment #11)
> > With the suggested patch I see following in test-suite.log on s390x:
> [...]
> > [   86s] FAIL: run-backtrace-dwarf.sh
> > [   86s] ============================
> > [   86s] 
> > [   86s] 0x3ffbda40622	raise
> > [   86s] 0x3ffbda23ce2	abort
> > [   86s] /home/abuild/rpmbuild/BUILD/elfutils-0.174/tests/backtrace-dwarf:
> > dwfl_thread_getframes: no error
> > [   86s] dwarf: no main
> > [   86s] FAIL run-backtrace-dwarf.sh (exit status: 1)
> 
> This doesn't look like a PTRACE_TRACEME failing with EPERM, abort() has
> actually been invoked by the tracee.

Agreed; the question is how to debug it. Any ideas?


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (12 preceding siblings ...)
  2018-09-19 11:09 ` mliska at suse dot cz
@ 2018-09-19 12:44 ` mark at klomp dot org
  2018-09-21  8:18 ` mliska at suse dot cz
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-09-19 12:44 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #14 from Mark Wielaard <mark at klomp dot org> ---
The test case does use assert and abort too much. How about we extend Dmitry's
patch to get rid of them all? (The only abort that should be there is the one
in cleanup-13.c.)

diff --git a/tests/backtrace-dwarf.c b/tests/backtrace-dwarf.c
index 35f25ed..498416f 100644
--- a/tests/backtrace-dwarf.c
+++ b/tests/backtrace-dwarf.c
@@ -16,7 +16,6 @@
    along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

 #include <config.h>
-#include <assert.h>
 #include <inttypes.h>
 #include <stdio_ext.h>
 #include <locale.h>
@@ -141,13 +140,18 @@ main (int argc __attribute__ ((unused)), char **argv)
   switch (pid)
   {
     case -1:
-      abort ();
+      perror ("fork failed");
+      exit (-1);
     case 0:;
       long l = ptrace (PTRACE_TRACEME, 0, NULL, NULL);
-      assert (errno == 0);
-      assert (l == 0);
+      if (l != 0)
+       {
+         perror ("PTRACE_TRACEME failed");
+         exit (-1);
+       }
       cleanup_13_main ();
-      abort ();
+      printf ("cleanup_13_main returned, impossible...\n");
+      exit (-1);
     default:
       break;
   }
@@ -155,16 +159,20 @@ main (int argc __attribute__ ((unused)), char **argv)
   errno = 0;
   int status;
   pid_t got = waitpid (pid, &status, 0);
-  assert (errno == 0);
-  assert (got == pid);
-  assert (WIFSTOPPED (status));
-  assert (WSTOPSIG (status) == SIGABRT);
+  if (got != pid)
+    error (1, errno, "waitpid returned %d", got);
+  if (!WIFSTOPPED (status))
+    error (1, 0, "unexpected wait status %u", status);
+  if (WSTOPSIG (status) != SIGABRT)
+    error (1, 0, "unexpected signal %u", WSTOPSIG (status));

   Dwfl *dwfl = pid_to_dwfl (pid);
-  dwfl_getthreads (dwfl, thread_callback, NULL);
+  if (dwfl_getthreads (dwfl, thread_callback, NULL) == -1)
+    error (1, 0, "dwfl_getthreads: %s", dwfl_errmsg (-1));

   /* There is an exit (0) call if we find the "main" frame,  */
-  error (1, 0, "dwfl_getthreads: %s", dwfl_errmsg (-1));
+  printf ("dwfl_getthreads returned, main not found\n");
+  exit (-1);
 }

 #endif /* ! __linux__ */


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (13 preceding siblings ...)
  2018-09-19 12:44 ` mark at klomp dot org
@ 2018-09-21  8:18 ` mliska at suse dot cz
  2018-09-21  9:06 ` mark at klomp dot org
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-21  8:18 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #15 from Martin Liska <mliska at suse dot cz> ---
Thanks Mark, I installed the patch but I still see the same. For now, I'm
leaving it; I'm not that interested in s390x ;)


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (14 preceding siblings ...)
  2018-09-21  8:18 ` mliska at suse dot cz
@ 2018-09-21  9:06 ` mark at klomp dot org
  2018-09-21  9:20 ` mliska at suse dot cz
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-09-21  9:06 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #16 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Martin Liska from comment #15)
> Thanks Mark, I installed the patch but I see still the same.

The output was exactly the same? That is surprising. So there is no additional
output that explains which failure path was taken? I would have expected at
least a message about the dwfl_getthreads call.

> For now, I'm
> leaving that, I'm not so much interested in s390x ;)

Understood if it is too much work to track down. We have other s390x setups
that seem fine. But I still don't fully understand the issue.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (15 preceding siblings ...)
  2018-09-21  9:06 ` mark at klomp dot org
@ 2018-09-21  9:20 ` mliska at suse dot cz
  2018-09-21 11:38 ` mark at klomp dot org
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-09-21  9:20 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #17 from Martin Liska <mliska at suse dot cz> ---
(In reply to Mark Wielaard from comment #16)
> (In reply to Martin Liska from comment #15)
> > Thanks Mark, I installed the patch but I see still the same.
> 
> The output was exactly the same? That is surprising. So there is no
> additional output that explains which failure path was taken? I would have
> expected at least a message about the dwfl_getthreads call.

Yes:

$ ./backtrace-dwarf 
0x3ff8a9c0622   raise
0x3ff8a9a3ce2   abort
./backtrace-dwarf: dwfl_thread_getframes: no error

It looks like the child correctly triggers the abort.

> 
> > For now, I'm
> > leaving that, I'm not so much interested in s390x ;)
> 
> Understood if it is too much work to track down. We have other s390x setups
> that seems fine. But I still don't fully understand the issue.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (16 preceding siblings ...)
  2018-09-21  9:20 ` mliska at suse dot cz
@ 2018-09-21 11:38 ` mark at klomp dot org
  2018-10-16  0:13 ` michael.hudson at canonical dot com
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-09-21 11:38 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #18 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Martin Liska from comment #17)
> (In reply to Mark Wielaard from comment #16)
> > (In reply to Martin Liska from comment #15)
> > > Thanks Mark, I installed the patch but I see still the same.
> > 
> > The output was exactly the same? That is surprising. So there is no
> > additional output that explains which failure path was taken? I would have
> > expected at least a message about the dwfl_getthreads call.
> 
> Yes:
> 
> $ ./backtrace-dwarf 
> 0x3ff8a9c0622	raise
> 0x3ff8a9a3ce2	abort
> ./backtrace-dwarf: dwfl_thread_getframes: no error
> 
> Looks that child correctly triggers assert.

Aha, ok, yes, I missed that dwfl_getthreads just calls dwfl_thread_getframes
(there is only one thread) and this does indeed not find the main frame. I'll
tweak the testcase a bit more to make it show that.

But we now know for sure that it isn't the test infrastructure failing, but
that the unwinder really seems to not unwind through abort and so doesn't find
main. I still don't know what is happening though.
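
To spell out the pass/fail criterion: the test only succeeds if the frame walk
reaches a frame named "main"; a walk that stops after raise and abort is what
the wrapper script reports as "dwarf: no main". A simplified sketch of that
check over frame names taken from the outputs in this report (the function and
array names are hypothetical, and the real test inspects frames through
callbacks rather than an array):

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if any of the NFRAMES symbol names in FRAMES is "main",
   0 otherwise.  NULL entries (frames without a symbol) are skipped.  */
int
backtrace_reaches_main (const char *const *frames, size_t nframes)
{
  for (size_t i = 0; i < nframes; i++)
    if (frames[i] != NULL && strcmp (frames[i], "main") == 0)
      return 1;
  return 0;
}

/* Frame names as printed on the failing s390x system.  */
const char *const s390x_frames[] = { "raise", "abort" };

/* Frame names as printed on x86_64 in comment #9.  */
const char *const x86_64_frames[] = {
  "raise", "abort", "__assert_fail_base.cold.0", "__assert_fail", "main"
};
```

On the s390x frames the check fails, on the x86_64 frames it succeeds, which
matches the observed test results.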


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (17 preceding siblings ...)
  2018-09-21 11:38 ` mark at klomp dot org
@ 2018-10-16  0:13 ` michael.hudson at canonical dot com
  2018-10-17 20:41 ` mark at klomp dot org
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: michael.hudson at canonical dot com @ 2018-10-16  0:13 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

Michael Hudson-Doyle <michael.hudson at canonical dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |michael.hudson at canonical dot co
                   |                            |m

--- Comment #19 from Michael Hudson-Doyle <michael.hudson at canonical dot com> ---
I see a similar looking failure on arm64 on Ubuntu 18.10:

  
https://launchpadlibrarian.net/391377304/buildlog_ubuntu-cosmic-arm64.elfutils_0.170-0.5_BUILDING.txt.gz

I've gdb-ed this to the point that the key difference between a working system
(Ubuntu 18.04) and the failing one is that libc.so.6 has a lot more entries in
.eh_frame_hdr on the failing system. On 18.04 it fails to find an FDE for
abort() (or raise, I think), unwinds using .debug_frame instead, and that
succeeds. On 18.10 it finds an FDE for both raise and abort but fails to unwind
past abort using it. I don't know why the newer libc.so.6 has a bigger
.eh_frame_hdr (it is glibc 2.28 vs 2.27, but also built with newer gcc and
binutils), nor why unwinding using the .eh_frame info fails.
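
For readers unfamiliar with .eh_frame_hdr: it carries a table of (initial
location, FDE address) pairs sorted by initial location, so that an unwinder
can binary-search for the FDE covering a PC. The sketch below is a simplified
model of that lookup, not the elfutils implementation; it assumes the entries
have already been decoded to absolute addresses, and the struct, function and
sample values are all hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded entry of the binary search table: the first address covered by
   a function's FDE, and where that FDE lives.  */
struct fde_table_entry
{
  uint64_t initial_loc;
  uint64_t fde_addr;
};

/* Return the index of the last entry whose initial_loc is <= PC, or -1 if PC
   lies below every entry.  TABLE must be sorted by initial_loc.  The caller
   still has to check that PC falls inside the range recorded in the FDE.  */
long
find_fde_index (const struct fde_table_entry *table, size_t count,
                uint64_t pc)
{
  size_t lo = 0, hi = count;   /* entries before LO have initial_loc <= PC */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (table[mid].initial_loc <= pc)
        lo = mid + 1;
      else
        hi = mid;
    }
  return (long) lo - 1;
}

/* Made-up sample table for illustration only.  */
const struct fde_table_entry sample_table[] = {
  { 0x1000, 0x9000 },
  { 0x2000, 0x9040 },
  { 0x3000, 0x9080 },
};
```

A library with fewer table entries makes this lookup miss for more functions,
which is consistent with the older libc.so.6 falling back to .debug_frame for
abort.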


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (18 preceding siblings ...)
  2018-10-16  0:13 ` michael.hudson at canonical dot com
@ 2018-10-17 20:41 ` mark at klomp dot org
  2018-10-18  2:18 ` michael.hudson at canonical dot com
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-10-17 20:41 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #20 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Michael Hudson-Doyle from comment #19)
> I see a similar looking failure on arm64 on Ubuntu 18.10:
>   
> https://launchpadlibrarian.net/391377304/buildlog_ubuntu-cosmic-arm64.
> elfutils_0.170-0.5_BUILDING.txt.gz

So, if possible, could you build with current git or 0.174 + the patch from
comment #14 or commit 69d6e67eee30c483ba53a8e1da1b3568033e3dde?

> I've gdb-ed this to the point that the key difference between a working
> system (Ubuntu 18.04) and the failing one is that libc.so.6 has a lot more
> entries in .eh_frame_hdr in the failing system. On 18.04 it fails to find a
> fde for abort() (or raise, I think) and unwinds using .debug_frame and that
> succeeds. On 18.10 it finds a fde for both raise and abort but fails to
> successfully unwind past abort using it. I don't know either why the newer
> libc.so.6 has a bigger eh_frame_hdr (it is glibc 2.28 vs 2.27 but also built
> with newer gcc and binutils) or why unwinding using eh_frame info fails.

In principle the .eh_frame and .debug_frame should provide the same CFI,
although encoded slightly differently. Maybe there is a difference? You should
be able to find both with eu-readelf --debug-dump=frame
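
One way to do that comparison is to pull the CFI program for a function out of each dump and compare the rule-setting instructions. The following is a rough sketch that assumes the text layout eu-readelf prints for FDEs; the helper names are made up for illustration, and the two inputs are the abort FDEs quoted in comment #21 of this thread. It drops advance_loc and nop, so it shows which rules differ but not where they apply.

```python
import re

DEBUG_FRAME_DUMP = """
 [  3d28] FDE length=36 cie=[  3d18]
   CIE_pointer:              15640
   initial_location:         +0x0000000000033760 <abort>
   address_range:            0x228

   Program:
     advance_loc 1 to 0x4
     def_cfa_offset 320
     offset r29 (x29) at cfa-320
     offset r30 (x30) at cfa-312
     advance_loc 2 to 0xc
     def_cfa_register r29 (x29)
     advance_loc 1 to 0x10
     offset r19 (x19) at cfa-304
     offset r20 (x20) at cfa-296
"""

EH_FRAME_DUMP = """
 [  2b08] FDE length=28 cie=[     0]
   CIE_pointer:              11020
   initial_location:         +0x00000000000207d8 <abort> (offset: 0x207d8)
   address_range:            0x214 (end offset: 0x209ec)

   Program:
     advance_loc 1 to 0x207dc
     def_cfa_offset 320
     offset r29 (x29) at cfa-320
     offset r30 (x30) at cfa-312
     advance_loc 4 to 0x207ec
     offset r19 (x19) at cfa-304
     offset r20 (x20) at cfa-296
     nop
     nop
"""


def parse_programs(dump):
    """Map function name -> list of CFI instruction lines for each FDE."""
    programs, name, in_program = {}, None, False
    for line in dump.splitlines():
        text = line.strip()
        match = re.search(r"initial_location:.*<([^>]+)>", text)
        if match:
            name, in_program = match.group(1), False
            programs[name] = []
        elif text == "Program:":
            in_program = True
        elif "FDE length=" in text or "CIE length=" in text:
            name, in_program = None, False
        elif in_program and name and text:
            programs[name].append(text)
    return programs


def rules_only(program):
    """Drop advance_loc/nop, which carry addresses or padding, not rules."""
    return [ins for ins in program if not ins.startswith(("advance_loc", "nop"))]


def only_in_first(first, second):
    """Rule instructions present in `first` but missing from `second`."""
    other = rules_only(second)
    return [ins for ins in rules_only(first) if ins not in other]


if __name__ == "__main__":
    debug = parse_programs(DEBUG_FRAME_DUMP)["abort"]
    eh = parse_programs(EH_FRAME_DUMP)["abort"]
    print(only_in_first(debug, eh))
    print(only_in_first(eh, debug))
```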


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (19 preceding siblings ...)
  2018-10-17 20:41 ` mark at klomp dot org
@ 2018-10-18  2:18 ` michael.hudson at canonical dot com
  2018-10-18  6:27 ` mark at klomp dot org
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: michael.hudson at canonical dot com @ 2018-10-18  2:18 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #21 from Michael Hudson-Doyle <michael.hudson at canonical dot com> ---
(In reply to Mark Wielaard from comment #20)
> (In reply to Michael Hudson-Doyle from comment #19)
> > I see a similar looking failure on arm64 on Ubuntu 18.10:
> >   
> > https://launchpadlibrarian.net/391377304/buildlog_ubuntu-cosmic-arm64.
> > elfutils_0.170-0.5_BUILDING.txt.gz
> 
> So, if possible could you build with current git or 0.174 + the patch from
> comment #14 or commit 69d6e67eee30c483ba53a8e1da1b3568033e3dde

Oh hmm current git passes!  Sorry for the noise.

Oh and obviously f881459ffc95b6fad51aa055a158ee14814073aa fixes this (somehow I
failed to read the git log correctly and had to bisect to find it but there's
no real excuse for that).

> > I've gdb-ed this to the point that the key difference between a working
> > system (Ubuntu 18.04) and the failing one is that libc.so.6 has a lot more
> > entries in .eh_frame_hdr in the failing system. On 18.04 it fails to find a
> > fde for abort() (or raise, I think) and unwinds using .debug_frame and that
> > succeeds. On 18.10 it finds a fde for both raise and abort but fails to
> > successfully unwind past abort using it. I don't know either why the newer
> > libc.so.6 has a bigger eh_frame_hdr (it is glibc 2.28 vs 2.27 but also built
> > with newer gcc and binutils) or why unwinding using eh_frame info fails.
> 
> In principle the .eh_frame and .debug_frame should provide the same CFI,
> although encoded slightly differently. Maybe there is a difference? You
> should be able to find both with eu-readelf --debug-dump=frame

I wrote most of what follows while waiting for the test run above to complete
but for the record...

So something I forgot to mention is that the newer glibc has no .debug_frame
(not even in the /usr/lib/debug file that has the other debug data). So in a
sense the fact that elfutils is trying to unwind using eh_frame and not trying
the debug_frame data at all is actually not relevant here.

That said, here is the debug_frame CFI from libc in the working environment:

 [  3d28] FDE length=36 cie=[  3d18]
   CIE_pointer:              15640
   initial_location:         +0x0000000000033760 <abort>
   address_range:            0x228

   Program:
     advance_loc 1 to 0x4
     def_cfa_offset 320
     offset r29 (x29) at cfa-320
     offset r30 (x30) at cfa-312
     advance_loc 2 to 0xc
     def_cfa_register r29 (x29)
     advance_loc 1 to 0x10
     offset r19 (x19) at cfa-304
     offset r20 (x20) at cfa-296

And here is the eh_frame CFI from the libc that fails:

 [  2b08] FDE length=28 cie=[     0]
   CIE_pointer:              11020
   initial_location:         +0x00000000000207d8 <abort> (offset: 0x207d8)
   address_range:            0x214 (end offset: 0x209ec)

   Program:
     advance_loc 1 to 0x207dc
     def_cfa_offset 320
     offset r29 (x29) at cfa-320
     offset r30 (x30) at cfa-312
     advance_loc 4 to 0x207ec
     offset r19 (x19) at cfa-304
     offset r20 (x20) at cfa-296
     nop
     nop

I guess it's the lack of the def_cfa_register r29 in the eh_frame data that is
making the difference.
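
To make the effect of that instruction concrete, here is a small sketch that replays both programs and reports the CFA rule in force at a given offset into abort. It is not how elfutils evaluates CFI; it handles only the opcodes that appear in the two dumps, and it rests on two assumptions not shown in the thread: the CIE starts with CFA = sp + 0, and the code alignment factor is 4 (inferred from "advance_loc 1 to 0x4").

```python
def cfa_rule_at(program, pc_offset, code_align=4, initial_cfa=("sp", 0)):
    """Return (register, offset) describing the CFA at pc_offset into the function.

    `program` is a list of (op, arg) tuples. The initial CFA rule and the
    code alignment factor are assumptions; the CIEs are not quoted in the thread.
    """
    reg, off = initial_cfa
    loc = 0
    for op, arg in program:
        if op == "advance_loc":
            loc += arg * code_align
            if loc > pc_offset:
                break  # everything after this applies to later addresses
        elif op == "def_cfa_offset":
            off = arg
        elif op == "def_cfa_register":
            reg = arg
        # "offset" (register save slots) and "nop" leave the CFA rule unchanged
    return reg, off


# The .debug_frame program for abort from the working libc.
DEBUG_FRAME_ABORT = [
    ("advance_loc", 1),
    ("def_cfa_offset", 320),
    ("offset", ("x29", -320)),
    ("offset", ("x30", -312)),
    ("advance_loc", 2),
    ("def_cfa_register", "x29"),
    ("advance_loc", 1),
    ("offset", ("x19", -304)),
    ("offset", ("x20", -296)),
]

# The .eh_frame program for abort from the failing libc.
EH_FRAME_ABORT = [
    ("advance_loc", 1),
    ("def_cfa_offset", 320),
    ("offset", ("x29", -320)),
    ("offset", ("x30", -312)),
    ("advance_loc", 4),
    ("offset", ("x19", -304)),
    ("offset", ("x20", -296)),
    ("nop", None),
    ("nop", None),
]

if __name__ == "__main__":
    for name, prog in (("debug_frame", DEBUG_FRAME_ABORT), ("eh_frame", EH_FRAME_ABORT)):
        print(name, cfa_rule_at(prog, 0x10))
```

Under those assumptions the .debug_frame program computes the CFA from x29 once the prologue is done, while the .eh_frame program keeps it relative to sp for the whole function. That shows how the two descriptions differ; it does not by itself establish why the unwind past abort failed.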


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (20 preceding siblings ...)
  2018-10-18  2:18 ` michael.hudson at canonical dot com
@ 2018-10-18  6:27 ` mark at klomp dot org
  2018-11-16 13:33 ` mliska at suse dot cz
  2018-11-16 14:05 ` mark at klomp dot org
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-10-18  6:27 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #22 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Michael Hudson-Doyle from comment #21)
> (In reply to Mark Wielaard from comment #20)
> > (In reply to Michael Hudson-Doyle from comment #19)
> > > I see a similar looking failure on arm64 on Ubuntu 18.10:
> > >   
> > > https://launchpadlibrarian.net/391377304/buildlog_ubuntu-cosmic-arm64.
> > > elfutils_0.170-0.5_BUILDING.txt.gz
> > 
> > So, if possible could you build with current git or 0.174 + the patch from
> > comment #14 or commit 69d6e67eee30c483ba53a8e1da1b3568033e3dde
> 
> Oh hmm current git passes!  Sorry for the noise.
> 
> Oh and obviously f881459ffc95b6fad51aa055a158ee14814073aa fixes this

Cool. So this is different from the s390x issue, which we sadly don't yet
understand.

But if that happens again on s390x an inspection of the CFI and whether it
comes from .eh_frame or .debug_frame might be helpful.


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (21 preceding siblings ...)
  2018-10-18  6:27 ` mark at klomp dot org
@ 2018-11-16 13:33 ` mliska at suse dot cz
  2018-11-16 14:05 ` mark at klomp dot org
  23 siblings, 0 replies; 25+ messages in thread
From: mliska at suse dot cz @ 2018-11-16 13:33 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

--- Comment #23 from Martin Liska <mliska at suse dot cz> ---
Just for the record, as of version 0.175 the test works fine on all targets I
can test (including s390x).


* [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173
  2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
                   ` (22 preceding siblings ...)
  2018-11-16 13:33 ` mliska at suse dot cz
@ 2018-11-16 14:05 ` mark at klomp dot org
  23 siblings, 0 replies; 25+ messages in thread
From: mark at klomp dot org @ 2018-11-16 14:05 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=23673

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #24 from Mark Wielaard <mark at klomp dot org> ---
(In reply to Martin Liska from comment #23)
> Just for the record, as of version 0.175 the test works fine on all targets
> I can test (including s390x).

Let's close this for now. It can be reopened if we have a new test failure.


Thread overview: 25+ messages
2018-09-17 10:59 [Bug tools/23673] New: TEST ./tests/backtrace-dwarf fails on s390x in 0.174 release mliska at suse dot cz
2018-09-17 11:41 ` [Bug tools/23673] " mark at klomp dot org
2018-09-17 11:45 ` mliska at suse dot cz
2018-09-17 19:44 ` mark at klomp dot org
2018-09-18  7:38 ` mliska at suse dot cz
2018-09-18 15:21 ` mark at klomp dot org
2018-09-19  8:44 ` mliska at suse dot cz
2018-09-19  8:49 ` [Bug tools/23673] TEST ./tests/backtrace-dwarf fails on s390x in at least 0.173 mliska at suse dot cz
2018-09-19  9:29 ` ldv at sourceware dot org
2018-09-19  9:49 ` mliska at suse dot cz
2018-09-19 10:32 ` ldv at sourceware dot org
2018-09-19 10:50 ` mliska at suse dot cz
2018-09-19 11:01 ` ldv at sourceware dot org
2018-09-19 11:09 ` mliska at suse dot cz
2018-09-19 12:44 ` mark at klomp dot org
2018-09-21  8:18 ` mliska at suse dot cz
2018-09-21  9:06 ` mark at klomp dot org
2018-09-21  9:20 ` mliska at suse dot cz
2018-09-21 11:38 ` mark at klomp dot org
2018-10-16  0:13 ` michael.hudson at canonical dot com
2018-10-17 20:41 ` mark at klomp dot org
2018-10-18  2:18 ` michael.hudson at canonical dot com
2018-10-18  6:27 ` mark at klomp dot org
2018-11-16 13:33 ` mliska at suse dot cz
2018-11-16 14:05 ` mark at klomp dot org
