public inbox for elfutils@sourceware.org
* Re: Spurious backtrace-native test failures on ubuntu package builders
@ 2016-08-30  7:34 Matthias Klose
From: Matthias Klose @ 2016-08-30  7:34 UTC
  To: elfutils-devel

On 12.08.2016 13:14, Mark Wielaard wrote:
> Hi Matthias, elfutils hackers,
> 
> The Ubuntu package builders sometimes see failures in the backtrace-native
> and/or backtrace-native-biarch tests. The failures come from an assert
> in tests/backtrace.c (see_exec_module):
> 
> struct see_exec_module
> {
>   Dwfl_Module *mod;
>   char selfpath[PATH_MAX + 1];
> };
> 
> static int
> see_exec_module (Dwfl_Module *mod, void **userdata __attribute__ ((unused)),
>                  const char *name __attribute__ ((unused)),
>                  Dwarf_Addr start __attribute__ ((unused)), void *arg)
> {
>   struct see_exec_module *data = arg;
>   if (strcmp (name, data->selfpath) != 0)
>     return DWARF_CB_OK;
>   assert (data->mod == NULL);
>   data->mod = mod;
>   return DWARF_CB_OK;
> }
> 
> The assert makes sure that we only see one module with the same
> "selfpath". The selfpath is set up as follows:
> 
>   char *selfpathname;
>   int i = asprintf (&selfpathname, "/proc/%ld/exe", (long) pid);
>   assert (i > 0);
>   ssize_t ssize = readlink (selfpathname, data.selfpath,
>                             sizeof (data.selfpath));
>   free (selfpathname);
>   assert (ssize > 0 && ssize < (ssize_t) sizeof (data.selfpath));
>   data.selfpath[ssize] = '\0';
>   data.mod = NULL;
>   ptrdiff_t ptrdiff = dwfl_getmodules (dwfl, see_exec_module, &data, 0);
>   assert (ptrdiff == 0);
>   assert (data.mod != NULL);
> 
> The dwfl is set up with dwfl_linux_proc_report (dwfl, pid).
> So it could be a bug in our /proc/PID/maps reader. But it could also
> be that for some reason the exec file is actually mapped twice and
> two separate Dwfl_Modules are created for it.
> 
> I have been unable to recreate the failure and so don't really understand
> what is going wrong. Has anybody else seen a failure with backtrace-native
> and/or backtrace-native-biarch in see_exec_module?
> 
> This is probably the wrong place to do this sanity check and we should
> have a separate testcase for it, so it is clearer why/what is going
> wrong/tested. It probably would also be a good idea to add a
> dwfl_mainmodule () function that gives you the main exec or kernel
> module since this is something people seem to often want. In this case
> we really just want the first Dwfl_Module with the given path anyway
> and we could just remove the assert and skip further probing once we
> find the requested main module.
> 
> But before we do, it would be good to understand why the failure is
> happening. Matthias, would you be able to replicate the issue somehow
> with the attached patch added to give us a bit more information?

after a few retries the build succeeded "unfortunately".


* Re: Spurious backtrace-native test failures on ubuntu package builders
@ 2016-08-30 11:50 Mark Wielaard
From: Mark Wielaard @ 2016-08-30 11:50 UTC
  To: elfutils-devel

On Tue, 2016-08-30 at 09:34 +0200, Matthias Klose wrote:
> after a few retries the build succeeded "unfortunately".

Always annoying when we cannot replicate an issue. Thanks for trying.
For 0.167 we did remove this specific assert from the testcase since it
didn't really provide any useful feedback on what was failing (and in
theory it isn't really "wrong", but it certainly is odd and unexpected).
Let's see if we get different reports about the tests for 0.167.

Thanks,

Mark


* Spurious backtrace-native test failures on ubuntu package builders
@ 2016-08-12 11:14 Mark Wielaard
From: Mark Wielaard @ 2016-08-12 11:14 UTC
  To: elfutils-devel

Hi Matthias, elfutils hackers,

The Ubuntu package builders sometimes see failures in the backtrace-native
and/or backtrace-native-biarch tests. The failures come from an assert
in tests/backtrace.c (see_exec_module):

struct see_exec_module
{
  Dwfl_Module *mod;
  char selfpath[PATH_MAX + 1];
};

static int
see_exec_module (Dwfl_Module *mod, void **userdata __attribute__ ((unused)),
                 const char *name __attribute__ ((unused)),
                 Dwarf_Addr start __attribute__ ((unused)), void *arg)
{
  struct see_exec_module *data = arg;
  if (strcmp (name, data->selfpath) != 0)
    return DWARF_CB_OK;
  assert (data->mod == NULL);
  data->mod = mod;
  return DWARF_CB_OK;
}

The assert makes sure that we only see one module with the same
"selfpath". The selfpath is set up as follows:

  char *selfpathname;
  int i = asprintf (&selfpathname, "/proc/%ld/exe", (long) pid);
  assert (i > 0);
  ssize_t ssize = readlink (selfpathname, data.selfpath,
                            sizeof (data.selfpath));
  free (selfpathname);
  assert (ssize > 0 && ssize < (ssize_t) sizeof (data.selfpath));
  data.selfpath[ssize] = '\0';
  data.mod = NULL;
  ptrdiff_t ptrdiff = dwfl_getmodules (dwfl, see_exec_module, &data, 0);
  assert (ptrdiff == 0);
  assert (data.mod != NULL);
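
For readers unfamiliar with the idiom, here is a minimal standalone sketch
of resolving the /proc/PID/exe symlink with readlink(2). The helper name
resolve_exe_path is hypothetical and not part of tests/backtrace.c; it
returns an error code where the test uses asserts. readlink neither
NUL-terminates nor reports truncation, which is why a completely filled
buffer is treated as a failure.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of tests/backtrace.c: resolve the
   /proc/PID/exe symlink into BUF (SIZE bytes) and NUL-terminate it.
   Returns the path length, or -1 on error or possible truncation.  */
static ssize_t
resolve_exe_path (pid_t pid, char *buf, size_t size)
{
  char link[64];
  int n = snprintf (link, sizeof link, "/proc/%ld/exe", (long) pid);
  if (n < 0 || (size_t) n >= sizeof link)
    return -1;

  /* readlink does not NUL-terminate and silently truncates, so a
     result that fills the whole buffer is treated as a failure.  */
  ssize_t len = readlink (link, buf, size);
  if (len <= 0 || (size_t) len >= size)
    return -1;
  buf[len] = '\0';
  return len;
}
```

A caller would pass a buffer of PATH_MAX + 1 bytes, as the test does, and
compare the result against the module names reported by dwfl_getmodules.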

The dwfl is set up with dwfl_linux_proc_report (dwfl, pid).
So it could be a bug in our /proc/PID/maps reader. But it could also
be that for some reason the exec file is actually mapped twice and
two separate Dwfl_Modules are created for it.
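
One rough way to check the "mapped twice" theory from outside libdwfl is
to count how many separate runs of consecutive /proc/PID/maps lines name
the executable. The sketch below does only that. It is an assumption-laden
approximation, not the grouping logic libdwfl itself uses, and
count_path_runs is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count runs of consecutive lines in a
   /proc/PID/maps style stream whose pathname field equals PATH.
   This only approximates "how many separate mappings of this file
   exist"; it does not reproduce libdwfl's own grouping.  */
static int
count_path_runs (FILE *maps, const char *path)
{
  char line[4096 + 128];
  int runs = 0;
  int in_run = 0;

  while (fgets (line, sizeof line, maps) != NULL)
    {
      line[strcspn (line, "\n")] = '\0';

      /* Fields: start-end perms offset major:minor inode [pathname].
         %n records where the pathname starts; OFF stays 0 when the
         line does not parse that far.  */
      int off = 0;
      sscanf (line, "%*lx-%*lx %*4s %*lx %*x:%*x %*lu %n", &off);

      int match = off > 0 && strcmp (line + off, path) == 0;
      if (match && !in_run)
        runs++;
      in_run = match;
    }
  return runs;
}
```

A result greater than one for the resolved selfpath would at least show
that the executable's mappings are interrupted by something else in the
maps file.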

I have been unable to recreate the failure and so don't really understand
what is going wrong. Has anybody else seen a failure with backtrace-native
and/or backtrace-native-biarch in see_exec_module?

This is probably the wrong place to do this sanity check and we should
have a separate testcase for it, so it is clearer why/what is going
wrong/tested. It probably would also be a good idea to add a
dwfl_mainmodule () function that gives you the main exec or kernel
module since this is something people seem to often want. In this case
we really just want the first Dwfl_Module with the given path anyway
and we could just remove the assert and skip further probing once we
find the requested main module.
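
The "take the first match and stop" idea can be sketched without elfutils
headers as follows. Everything here is a stand-in: struct module,
walk_modules, and the CB_OK/CB_ABORT values are hypothetical and only
mimic a dwfl_getmodules walk with DWARF_CB_OK/DWARF_CB_ABORT. With the
real API, as far as I recall, an aborted walk returns a non-zero offset,
so the caller's assert (ptrdiff == 0) would need relaxing as well.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins for libdw's DWARF_CB_OK / DWARF_CB_ABORT so the
   sketch compiles without elfutils headers.  */
enum { CB_OK = 0, CB_ABORT = 1 };

/* Hypothetical module record; a real callback gets a Dwfl_Module *.  */
struct module
{
  const char *name;
};

struct find_main
{
  const struct module *mod;  /* First module whose name matched.  */
  const char *selfpath;      /* Path to look for.  */
  int visited;               /* Callback invocations, for illustration.  */
};

/* Remember the first module named SELFPATH and stop the walk.  */
static int
find_main_cb (const struct module *mod, void *arg)
{
  struct find_main *data = arg;
  data->visited++;
  if (strcmp (mod->name, data->selfpath) != 0)
    return CB_OK;
  data->mod = mod;
  return CB_ABORT;
}

/* Hypothetical iterator in the spirit of dwfl_getmodules: it stops as
   soon as the callback returns something other than CB_OK.  */
static void
walk_modules (const struct module *mods, size_t n,
              int (*cb) (const struct module *, void *), void *arg)
{
  for (size_t i = 0; i < n; i++)
    if (cb (&mods[i], arg) != CB_OK)
      break;
}
```

With this shape a second module carrying the same path is never visited,
so no assert on data->mod is needed inside the callback.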

But before we do, it would be good to understand why the failure is
happening. Matthias, would you be able to replicate the issue somehow
with the attached patch added to give us a bit more information?

Thanks,

Mark

[-- Attachment #2: see_exec_module.patch --]
[-- Type: text/plain, Size: 1380 bytes --]

diff --git a/tests/backtrace.c b/tests/backtrace.c
index 1247643..2e3f7b8 100644
--- a/tests/backtrace.c
+++ b/tests/backtrace.c
@@ -229,6 +229,7 @@ dump (Dwfl *dwfl)
 
 struct see_exec_module
 {
+  pid_t pid;
   Dwfl_Module *mod;
   char selfpath[PATH_MAX + 1];
 };
@@ -241,7 +242,22 @@ see_exec_module (Dwfl_Module *mod, void **userdata __attribute__ ((unused)),
   struct see_exec_module *data = arg;
   if (strcmp (name, data->selfpath) != 0)
     return DWARF_CB_OK;
-  assert (data->mod == NULL);
+  if (data->mod != NULL)
+    {
+      char *selfmaps;
+      char buf[4096];
+      FILE *file;
+      size_t nread;
+      fprintf (stderr, "Saw two modules with the same selfpath: %s\n",
+	       data->selfpath);
+      asprintf (&selfmaps, "/proc/%ld/maps", (long) data->pid);
+      fprintf (stderr, "  %s:\n", selfmaps);
+      file = fopen (selfmaps, "r");
+      while ((nread = fread (buf, 1, sizeof buf, file)) > 0)
+        fwrite (buf, 1, nread, stderr);
+      fclose(file);
+      exit (-1);
+    }
   data->mod = mod;
   return DWARF_CB_OK;
 }
@@ -370,6 +386,7 @@ exec_dump (const char *exec)
   assert (ssize > 0 && ssize < (ssize_t) sizeof (data.selfpath));
   data.selfpath[ssize] = '\0';
   data.mod = NULL;
+  data.pid = pid;
   ptrdiff_t ptrdiff = dwfl_getmodules (dwfl, see_exec_module, &data, 0);
   assert (ptrdiff == 0);
   assert (data.mod != NULL);
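
The diagnostic hunk above calls asprintf and fopen without checking their
results, which is fine for a throwaway debugging patch. A slightly more
defensive variant of the same dump could look like the sketch below. The
helper name dump_proc_maps is hypothetical and the code is not part of
the attached patch.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>  /* getpid, for callers dumping their own maps.  */

/* Hypothetical helper: copy /proc/PID/maps to OUT.
   Returns 0 on success, -1 if the file cannot be opened or read.  */
static int
dump_proc_maps (pid_t pid, FILE *out)
{
  char path[64];
  int n = snprintf (path, sizeof path, "/proc/%ld/maps", (long) pid);
  if (n < 0 || (size_t) n >= sizeof path)
    return -1;

  FILE *file = fopen (path, "r");
  if (file == NULL)
    return -1;

  /* Copy the file in fixed-size chunks, as the patch does.  */
  char buf[4096];
  size_t nread;
  while ((nread = fread (buf, 1, sizeof buf, file)) > 0)
    fwrite (buf, 1, nread, out);

  int failed = ferror (file);
  fclose (file);
  return failed ? -1 : 0;
}
```

In the callback this would be invoked as dump_proc_maps (data->pid, stderr)
before exiting.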
