public inbox for gdb-prs@sourceware.org
* [Bug threads/10048] New: Apparent race in gdbserver causes it lose control of inferior
@ 2009-04-08 20:16 ppluzhnikov at google dot com
  2009-04-08 21:16 ` [Bug threads/10048] " pedro at codesourcery dot com
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: ppluzhnikov at google dot com @ 2009-04-08 20:16 UTC (permalink / raw)
  To: gdb-prs

The symptom is that the inferior dies with SIGSEGV without GDB stopping at
the point of the crash:

  Program terminated with signal SIGSEGV, Segmentation fault.
  The program no longer exists.

This happens intermittently (but pretty regularly) on the following
test case:

/// --- cut ---
/// compile with "gcc -g -pthread pthread-crash3.c -o pthread-crash3"
#include <assert.h>
#include <stdio.h>
#include <pthread.h>
#include <syscall.h>

void *crash(void *p)
{
  char *cp = NULL;
  fprintf (stderr, "thread %p (LWP %d) about to crash\n",
           pthread_self (), syscall (SYS_gettid));
  cp[1] = 'a';
  return p;
}

void *fn(void *p)
{
  pthread_t tid;
  fprintf (stderr, "thread %p (LWP %d) about to create new thread\n",
          pthread_self (), syscall (SYS_gettid));
  pthread_create (&tid, NULL, crash, NULL);
  pthread_join (tid, NULL);
  return 0;
}

int am_I_being_traced_p ()
{
  char buf[BUFSIZ];
  FILE *fp = fopen("/proc/self/status", "r");
  int tracer = 0;

  assert (fp != NULL);
  while (fgets(buf, sizeof(buf), fp) != NULL) {
    if (sscanf (buf, "TracerPid:\t%d", &tracer) == 1)
      break;
  }
  fclose (fp);
  return tracer;
}

int main(int argc, char *argv[])
{
  pthread_t tid;
  while (!am_I_being_traced_p ()) {
    sleep (1);
  }
  fprintf(stderr, "main thread (LWP %d) has been attached\n",
          syscall (SYS_gettid));
  pthread_create (&tid, 0, fn, NULL);
  pthread_join (tid, 0);
  return 0;
}
/// --- cut ---

Here is the trace of failure:

  ./pthread-crash3 &
  sleep 1; gdbserver/gdbserver --attach :12345 $(pgrep pthread-crash3) &
  sleep 1; ./gdb -ex 'target remote :12345' -ex 'set debug infrun 1' -ex cont -ex quit ./pthread-crash3
  [1] 23306
  [2] 23308
  Attached; pid = 23306
  Listening on port 12345
  GNU gdb (GDB) 6.8.50.20090406-cvs
  Copyright (C) 2009 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu".
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>...
  Remote debugging from host 127.0.0.1
  warning: Can not parse XML target description; XML support was disabled at compile time
  0x00007fa16f0a1a42 in __nanosleep_nocancel () from /usr/grte/v1/lib64/libc.so.6
  0x00007fa16f0a1a42 <__nanosleep_nocancel+9>:    cmp    $0xfffffffffffff001,%rax
  infrun: clear_proceed_status_thread (Thread 23306)
  infrun: proceed (addr=0xffffffffffffffff, signal=144, step=0)
  infrun: resume (step=0, signal=0), trap_expected=0
  infrun: wait_for_inferior (treat_exec_as_sigtrap=0)
  main thread (LWP 23306) has been attached
  thread 0x40d77960 (LWP 23315) about to create new thread
  thread 0x41578960 (LWP 23316) about to crash

  Child terminated with signal = 0xb (SIGSEGV)
  GDBserver exiting
  infrun: target_wait (-1, status) =
  infrun:   42000 [process 42000],
  infrun:   status->kind = signalled, signal = SIGSEGV
  infrun: infwait_normal_state
  infrun: TARGET_WAITKIND_SIGNALLED

  Program terminated with signal SIGSEGV, Segmentation fault.
  The program no longer exists.

Here is the same trace when GDB works correctly:

  ./pthread-crash3 &
  sleep 1; gdbserver/gdbserver --attach :12345 $(pgrep pthread-crash3) &
  sleep 1; ./gdb -ex 'target remote :12345' -ex 'set debug infrun 1' -ex cont -ex quit ./pthread-crash3
  [1] 24050
  [2] 24052
  Attached; pid = 24050
  Listening on port 12345
  GNU gdb (GDB) 6.8.50.20090406-cvs
  Copyright (C) 2009 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu".
  For bug reporting instructions, please see:
  <http://www.gnu.org/software/gdb/bugs/>...
  Remote debugging from host 127.0.0.1
  warning: Can not parse XML target description; XML support was disabled at compile time
  0x00007f48b8b08a42 in __nanosleep_nocancel () from /usr/grte/v1/lib64/libc.so.6
  0x00007f48b8b08a42 <__nanosleep_nocancel+9>:    cmp    $0xfffffffffffff001,%rax
  infrun: clear_proceed_status_thread (Thread 24050)
  infrun: proceed (addr=0xffffffffffffffff, signal=144, step=0)
  infrun: resume (step=0, signal=0), trap_expected=0
  main thread (LWP 24050) has been attached
  thread 0x40a44960 (LWP 24059) about to create new thread
  thread 0x41e01960 (LWP 24060) about to crash
  infrun: wait_for_inferior (treat_exec_as_sigtrap=0)
  [New Thread 24060]
  infrun: target_wait (-1, status) =
  infrun:   42000 [Thread 24060],
  infrun:   status->kind = stopped, signal = SIGSEGV
  infrun: infwait_normal_state
  infrun: TARGET_WAITKIND_STOPPED
  infrun: stop_pc = 0x4003aa
  infrun: context switch
  infrun: Switching context from Thread 24050 to Thread 24060
  infrun: random signal 11

  Program received signal SIGSEGV, Segmentation fault.
  infrun: stop_stepping
  [Switching to Thread 24060]
  0x00000000004003aa in crash (p=0x0) at pthread-crash3.c:12
  12        cp[1] = 'a';
  Detaching from process 24050

I observed this with the gdb-6.8 that ships with Fedora 9 on i686, and also
with CVS HEAD on x86_64.

Attaching to an already running process appears to be required; I could never
reproduce this when the inferior runs under gdbserver from the start.

Also, I couldn't reproduce the failure when the crashing thread is created
from a thread that GDB already knows about (e.g. the main thread). It appears
that creating 2 threads in rapid succession is required to trigger the bug.

-- 
           Summary: Apparent race in gdbserver causes it lose control of
                    inferior
           Product: gdb
           Version: 6.8
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: threads
        AssignedTo: unassigned at sourceware dot org
        ReportedBy: ppluzhnikov at google dot com
                CC: gdb-prs at sourceware dot org
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://sourceware.org/bugzilla/show_bug.cgi?id=10048

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.



* [Bug threads/10048] Apparent race in gdbserver causes it lose control of inferior
  2009-04-08 20:16 [Bug threads/10048] New: Apparent race in gdbserver causes it lose control of inferior ppluzhnikov at google dot com
@ 2009-04-08 21:16 ` pedro at codesourcery dot com
  2009-04-08 22:01 ` ppluzhnikov at google dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: pedro at codesourcery dot com @ 2009-04-08 21:16 UTC (permalink / raw)
  To: gdb-prs



------- Additional Comments From pedro at codesourcery dot com  2009-04-08 21:16 -------
Subject: Re:  New: Apparent race in gdbserver causes it lose control of inferior

On Wednesday 08 April 2009 21:16:08, ppluzhnikov at google dot com wrote:
>   main thread (LWP 23306) has been attached
>   thread 0x40d77960 (LWP 23315) about to create new thread
>   thread 0x41578960 (LWP 23316) about to crash
> 
>   Child terminated with signal = 0xb (SIGSEGV)
>   GDBserver exiting
>   infrun: target_wait (-1, status) =
>   infrun:   42000 [process 42000],
>   infrun:   status->kind = signalled, signal = SIGSEGV
>   infrun: infwait_normal_state
>   infrun: TARGET_WAITKIND_SIGNALLED

So, probably gdbserver didn't attach to the LWP that crashed.  Assuming
your kernel has support for PTRACE_EVENT_CLONE, gdbserver should have
seen a clone event, unless you are tripping on this FIXME ... :

 /* Attach to an inferior process.  */

 static void
 linux_attach_lwp_1 (unsigned long lwpid, int initial)
 {
...
   /* FIXME: This intermittently fails.
      We need to wait for SIGSTOP first.  */
   ptrace (PTRACE_SETOPTIONS, lwpid, 0, PTRACE_O_TRACECLONE);

If that is failing, gdbserver would not notice the new threads
being created, and the new threads wouldn't be traced, thus,
gdbserver would not be able to catch the SIGSEGV.

Does a patch-let like this:

 -  ptrace (PTRACE_SETOPTIONS, lwpid, 0, PTRACE_O_TRACECLONE);
 +  must_set_ptrace_flags = 1;

... make any difference?

Running gdbserver under strace would probably show what's going on.  Running
gdbserver with 'gdbserver --debug' also often helps to spot the issue.
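
The ordering that the FIXME asks for can be sketched as follows.  This is
a minimal standalone example, not gdbserver code: the helper names
attach_then_set_options and demo_attach are made up for illustration, and
the demo traces a forked child rather than an unrelated process.  It
attaches, waits for the stop that PTRACE_ATTACH provokes, and only then
sets PTRACE_O_TRACECLONE.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from linux-low.c): attach to PID, wait
   until it has actually stopped, and only then set the ptrace
   options.  Returns 0 on success, -1 on failure.  */
int
attach_then_set_options (pid_t pid)
{
  int status;

  assert (pid > 0);

  if (ptrace (PTRACE_ATTACH, pid, NULL, NULL) != 0)
    return -1;

  /* PTRACE_ATTACH queues a SIGSTOP for the tracee.  Wait for the
     tracee to report a stop before touching its options.  */
  if (waitpid (pid, &status, __WALL) != pid || !WIFSTOPPED (status))
    return -1;

  /* The tracee is now in a ptrace stop, so this is safe to issue.  */
  if (ptrace (PTRACE_SETOPTIONS, pid, NULL,
              (void *) (long) PTRACE_O_TRACECLONE) != 0)
    return -1;

  return 0;
}

/* Demo driver: fork a child that just sleeps, attach to it with the
   helper above, then kill and reap it.  Returns the helper's result.  */
int
demo_attach (void)
{
  pid_t child = fork ();
  int rc;

  if (child < 0)
    return -1;

  if (child == 0)
    {
      for (;;)
        pause ();
    }

  rc = attach_then_set_options (child);

  /* SIGKILL terminates the child even while it is ptrace-stopped.  */
  kill (child, SIGKILL);
  waitpid (child, NULL, 0);
  return rc;
}
```

Whether attaching is permitted at all depends on the system's ptrace
restrictions, so demo_attach may return -1 in a locked-down environment
even though the ordering is right.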




* [Bug threads/10048] Apparent race in gdbserver causes it lose control of inferior
  2009-04-08 20:16 [Bug threads/10048] New: Apparent race in gdbserver causes it lose control of inferior ppluzhnikov at google dot com
  2009-04-08 21:16 ` [Bug threads/10048] " pedro at codesourcery dot com
@ 2009-04-08 22:01 ` ppluzhnikov at google dot com
  2009-05-04 10:18 ` [Bug server/10048] " pedro at codesourcery dot com
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: ppluzhnikov at google dot com @ 2009-04-08 22:01 UTC (permalink / raw)
  To: gdb-prs


------- Additional Comments From ppluzhnikov at google dot com  2009-04-08 22:01 -------
(In reply to comment #1)

> Does a patch-let like this:
> 
>  -  ptrace (PTRACE_SETOPTIONS, lwpid, 0, PTRACE_O_TRACECLONE);
>  +  must_set_ptrace_flags = 1;
> 
> ... make any difference?

Yes: with above, I have 0 failures in 100 tries.
When I revert it, I get 62 failures in 100 tries.




* [Bug server/10048] Apparent race in gdbserver causes it lose control of inferior
  2009-04-08 20:16 [Bug threads/10048] New: Apparent race in gdbserver causes it lose control of inferior ppluzhnikov at google dot com
  2009-04-08 21:16 ` [Bug threads/10048] " pedro at codesourcery dot com
  2009-04-08 22:01 ` ppluzhnikov at google dot com
@ 2009-05-04 10:18 ` pedro at codesourcery dot com
  2009-05-06 17:33 ` cvs-commit at gcc dot gnu dot org
  2009-05-06 17:38 ` pedro at codesourcery dot com
  4 siblings, 0 replies; 6+ messages in thread
From: pedro at codesourcery dot com @ 2009-05-04 10:18 UTC (permalink / raw)
  To: gdb-prs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|threads                     |server



* [Bug server/10048] Apparent race in gdbserver causes it lose control of inferior
  2009-04-08 20:16 [Bug threads/10048] New: Apparent race in gdbserver causes it lose control of inferior ppluzhnikov at google dot com
                   ` (2 preceding siblings ...)
  2009-05-04 10:18 ` [Bug server/10048] " pedro at codesourcery dot com
@ 2009-05-06 17:33 ` cvs-commit at gcc dot gnu dot org
  2009-05-06 17:38 ` pedro at codesourcery dot com
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu dot org @ 2009-05-06 17:33 UTC (permalink / raw)
  To: gdb-prs


------- Additional Comments From cvs-commit at gcc dot gnu dot org  2009-05-06 17:33 -------
Subject: Bug 10048

CVSROOT:	/cvs/src
Module name:	src
Changes by:	palves@sourceware.org	2009-05-06 17:33:00

Modified files:
	gdb/gdbserver  : ChangeLog linux-low.c linux-low.h 

Log message:
	PR server/10048
	
	* linux-low.c (must_set_ptrace_flags): Delete.
	(linux_create_inferior): Set `lwp->must_set_ptrace_flags' instead
	of the global.
	(linux_attach_lwp_1): Don't set PTRACE_SETOPTIONS here.  Set
	`lwp->must_set_ptrace_flags' instead.
	(linux_wait_for_event_1): Set ptrace options here.
	(linux_wait_1): ... not here.

Patches:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/gdbserver/ChangeLog.diff?cvsroot=src&r1=1.263&r2=1.264
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/gdbserver/linux-low.c.diff?cvsroot=src&r1=1.98&r2=1.99
http://sources.redhat.com/cgi-bin/cvsweb.cgi/src/gdb/gdbserver/linux-low.h.diff?cvsroot=src&r1=1.28&r2=1.29
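
The log message above describes replacing a global flag with a per-LWP
one and applying the ptrace options from the wait path instead of at
attach time.  A minimal sketch of that pattern follows; it is not the
actual linux-low.c change.  The struct and function names (lwp_sketch,
sketch_attach, sketch_handle_stop, sketch_demo) are hypothetical, and the
options_set field stands in for a real PTRACE_SETOPTIONS call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for gdbserver's per-LWP bookkeeping.  */
struct lwp_sketch
{
  unsigned long lwpid;
  int stopped;                /* Nonzero once a stop has been seen.  */
  int must_set_ptrace_flags;  /* Options still have to be applied.  */
  int options_set;            /* Stand-in for PTRACE_SETOPTIONS.  */
};

/* Attach path: do not set the options here, only record per LWP
   that they must be set later.  */
void
sketch_attach (struct lwp_sketch *lwp, unsigned long lwpid)
{
  assert (lwp != NULL);
  lwp->lwpid = lwpid;
  lwp->stopped = 0;
  lwp->must_set_ptrace_flags = 1;
  lwp->options_set = 0;
}

/* Wait path: called when a stop has been reported for LWP.  The LWP
   is known to be stopped at this point, so the deferred options can
   be applied.  */
void
sketch_handle_stop (struct lwp_sketch *lwp)
{
  assert (lwp != NULL);
  lwp->stopped = 1;

  if (lwp->must_set_ptrace_flags)
    {
      /* Real code would call ptrace (PTRACE_SETOPTIONS, ...) here.  */
      lwp->options_set = 1;
      lwp->must_set_ptrace_flags = 0;
    }
}

/* Returns 1 if the options were applied only after the first stop.  */
int
sketch_demo (void)
{
  struct lwp_sketch lwp;
  int deferred;

  sketch_attach (&lwp, 1234);
  deferred = (lwp.options_set == 0 && lwp.must_set_ptrace_flags == 1);

  sketch_handle_stop (&lwp);
  return deferred && lwp.options_set == 1
         && lwp.must_set_ptrace_flags == 0;
}
```

Keeping the flag per LWP rather than global means that each newly
attached thread gets its options applied at its own first stop,
independently of the others.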




* [Bug server/10048] Apparent race in gdbserver causes it lose control of inferior
  2009-04-08 20:16 [Bug threads/10048] New: Apparent race in gdbserver causes it lose control of inferior ppluzhnikov at google dot com
                   ` (3 preceding siblings ...)
  2009-05-06 17:33 ` cvs-commit at gcc dot gnu dot org
@ 2009-05-06 17:38 ` pedro at codesourcery dot com
  4 siblings, 0 replies; 6+ messages in thread
From: pedro at codesourcery dot com @ 2009-05-06 17:38 UTC (permalink / raw)
  To: gdb-prs


------- Additional Comments From pedro at codesourcery dot com  2009-05-06 17:38 -------
http://sourceware.org/ml/gdb-patches/2009-05/msg00055.html

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

