[PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
@ 2014-04-03  8:12 Hui Zhu
  2014-05-28 19:19 ` Pedro Alves
  0 siblings, 1 reply; 11+ messages in thread
From: Hui Zhu @ 2014-04-03  8:12 UTC (permalink / raw)
  To: gdb-patches ml

Got gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw)
(timeout) with Linux 2.6.32 and older version.

The rootcause is after the test use "set can-use-hw-watchpoints 0" let GDB
doesn't use hardware breakpoint and set a watchpoint on "global", GDB
continue will keep single step inside function "vfork".
The Linux 2.6.32 and older version doesn't have commit
6580807da14c423f0d0a708108e6df6ebc8bc83d (get more info please goto
http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=6580807da14c423f0d0a708108e6df6ebc8bc83d).
When the function "vfork" do syscall, the single step flag TIF_SINGLESTEP
will copy to child process.
Then GDB detach it, child process and parent process will be hanged.

So I make a patch that do a single step before detach.  Then TIF_SINGLESTEP
of child process in old Linux kernel will be cleared before detach.
Child process in new Linux kernel will not be affected by this single step.

The patch was tested and pass regression in new linux
kernel (3.13.6-200.fc20.x86_64) and old Linux kernel (2.6.32-38-server).

Please help me review it.

Thanks,
Hui

2014-04-03  Hui Zhu  <hui@codesourcery.com>

	* linux-nat.c (linux_child_follow_fork): do a single step before
	detach.

--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -442,6 +442,26 @@ holding the child stopped.  Try \"set de

  	  if (linux_nat_prepare_to_resume != NULL)
  	    linux_nat_prepare_to_resume (child_lp);
+
+	  /* When debug a inferior in the architecture that support
+	     hardware single step and the Linux kernel without commit
+	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
+	     process will starts with TIF_SINGLESTEP/X86_EFLAGS_TF bits
+	     if the parent process has it.
+	     So let child process do a single step under GDB control
+	     before detach it to remove this flags.  */
+
+	  if (!gdbarch_software_single_step_p (target_thread_architecture
+						   (child_lp->ptid)))
+	    {
+	      int status;
+
+	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
+		perror_with_name (_("Couldn't do single step"));
+	      if (my_waitpid (child_pid, &status, 0) < 0)
+		perror_with_name (_("Couldn't wait vfork process"));
+	    }
+
  	  ptrace (PTRACE_DETACH, child_pid, 0, 0);

  	  do_cleanups (old_chain);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
  2014-04-03  8:12 [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version Hui Zhu
@ 2014-05-28 19:19 ` Pedro Alves
  2014-06-04  8:43   ` Hui Zhu
  2014-06-05  7:48   ` Hui Zhu
  0 siblings, 2 replies; 11+ messages in thread
From: Pedro Alves @ 2014-05-28 19:19 UTC (permalink / raw)
  To: Hui Zhu, gdb-patches ml

On 04/03/2014 09:12 AM, Hui Zhu wrote:
> Got gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw)
> (timeout) with Linux 2.6.32 and older version.
> 
> The rootcause is after the test use "set can-use-hw-watchpoints 0" let GDB
> doesn't use hardware breakpoint and set a watchpoint on "global", GDB
> continue will keep single step inside function "vfork".
> The Linux 2.6.32 and older version doesn't have commit
> 6580807da14c423f0d0a708108e6df6ebc8bc83d (get more info please goto
> http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=6580807da14c423f0d0a708108e6df6ebc8bc83d).
> When the function "vfork" do syscall, the single step flag TIF_SINGLESTEP
> will copy to child process.
> Then GDB detach it, child process and parent process will be hanged.
> 
> So I make a patch that do a single step before detach.  Then TIF_SINGLESTEP
> of child process in old Linux kernel will be cleared before detach.
> Child process in new Linux kernel will not be affected by this single step.
> 
> The patch was tested and pass regression in new linux
> kernel (3.13.6-200.fc20.x86_64) and old Linux kernel (2.6.32-38-server).
> 
> Please help me review it.

Thanks.

> 2014-04-03  Hui Zhu  <hui@codesourcery.com>
> 
> 	* linux-nat.c (linux_child_follow_fork): do a single step before
> 	detach.
> 
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -442,6 +442,26 @@ holding the child stopped.  Try \"set de
>   
>   	  if (linux_nat_prepare_to_resume != NULL)
>   	    linux_nat_prepare_to_resume (child_lp);
> +
> +	  /* When debug a inferior in the architecture that support
> +	     hardware single step and the Linux kernel without commit
> +	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
> +	     process will starts with TIF_SINGLESTEP/X86_EFLAGS_TF bits
> +	     if the parent process has it.
> +	     So let child process do a single step under GDB control
> +	     before detach it to remove this flags.  */

From the kernel patch's looks, this doesn't sound like architecture
specific, otherwise I'd suggest clearing TF instead.

So it sounds like a good solution.

I suggested this updated comment, copy/edited a bit from yours:

	  /* When debugging an inferior in an architecture that supports
	     hardware single stepping on a kernel without commit
	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
	     process starts with the TIF_SINGLESTEP/X86_EFLAGS_TF bits
	     set if the parent process had them set.
	     To work around this, single step the child process
	     once before detaching to clear the flags.  */

> +
> +	  if (!gdbarch_software_single_step_p (target_thread_architecture
> +						   (child_lp->ptid)))
> +	    {
> +	      int status;
> +
> +	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
> +		perror_with_name (_("Couldn't do single step"));
> +	      if (my_waitpid (child_pid, &status, 0) < 0)
> +		perror_with_name (_("Couldn't wait vfork process"));

If the child gets a signal here, we should pass it on to the child.

> +	    }
> +
>   	  ptrace (PTRACE_DETACH, child_pid, 0, 0);

That is:

      ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));

And I think we should disable all ptrace options in the child
before stepping it, in case some event is reported right
at that point, and we mishandle it.  Otherwise we'd need to
make sure we didn't get an extended wait status before passing
it on.  But disabling events is just safer.

>   
>   	  do_cleanups (old_chain);
> 

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
  2014-05-28 19:19 ` Pedro Alves
@ 2014-06-04  8:43   ` Hui Zhu
  2014-06-04 16:11     ` Pedro Alves
  2014-06-05  7:48   ` Hui Zhu
  1 sibling, 1 reply; 11+ messages in thread
From: Hui Zhu @ 2014-06-04  8:43 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches ml

On 05/29/14 03:19, Pedro Alves wrote:
> On 04/03/2014 09:12 AM, Hui Zhu wrote:
>> Got gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw)
>> (timeout) with Linux 2.6.32 and older version.
>>
>> The rootcause is after the test use "set can-use-hw-watchpoints 0" let GDB
>> doesn't use hardware breakpoint and set a watchpoint on "global", GDB
>> continue will keep single step inside function "vfork".
>> The Linux 2.6.32 and older version doesn't have commit
>> 6580807da14c423f0d0a708108e6df6ebc8bc83d (get more info please goto
>> http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=6580807da14c423f0d0a708108e6df6ebc8bc83d).
>> When the function "vfork" do syscall, the single step flag TIF_SINGLESTEP
>> will copy to child process.
>> Then GDB detach it, child process and parent process will be hanged.
>>
>> So I make a patch that do a single step before detach.  Then TIF_SINGLESTEP
>> of child process in old Linux kernel will be cleared before detach.
>> Child process in new Linux kernel will not be affected by this single step.
>>
>> The patch was tested and pass regression in new linux
>> kernel (3.13.6-200.fc20.x86_64) and old Linux kernel (2.6.32-38-server).
>>
>> Please help me review it.
>
> Thanks.
>
>> 2014-04-03  Hui Zhu  <hui@codesourcery.com>
>>
>> 	* linux-nat.c (linux_child_follow_fork): do a single step before
>> 	detach.
>>
>> --- a/gdb/linux-nat.c
>> +++ b/gdb/linux-nat.c
>> @@ -442,6 +442,26 @@ holding the child stopped.  Try \"set de
>>
>>    	  if (linux_nat_prepare_to_resume != NULL)
>>    	    linux_nat_prepare_to_resume (child_lp);
>> +
>> +	  /* When debug a inferior in the architecture that support
>> +	     hardware single step and the Linux kernel without commit
>> +	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
>> +	     process will starts with TIF_SINGLESTEP/X86_EFLAGS_TF bits
>> +	     if the parent process has it.
>> +	     So let child process do a single step under GDB control
>> +	     before detach it to remove this flags.  */
>
>  From the kernel patch's looks, this doesn't sound like architecture
> specific, otherwise I'd suggest clearing TF instead.
>
> So it sounds like a good solution.
>
> I suggested this updated comment, copy/edited a bit from yours:
>
> 	  /* When debugging an inferior in an architecture that supports
> 	     hardware single stepping on a kernel without commit
> 	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
> 	     process starts with the TIF_SINGLESTEP/X86_EFLAGS_TF bits
> 	     set if the parent process had them set.
> 	     To work around this, single step the child process
> 	     once before detaching to clear the flags.  */
>
>> +
>> +	  if (!gdbarch_software_single_step_p (target_thread_architecture
>> +						   (child_lp->ptid)))
>> +	    {
>> +	      int status;
>> +
>> +	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
>> +		perror_with_name (_("Couldn't do single step"));
>> +	      if (my_waitpid (child_pid, &status, 0) < 0)
>> +		perror_with_name (_("Couldn't wait vfork process"));
>
> If the child gets a signal here, we should pass it on to the child.
>
>> +	    }
>> +
>>    	  ptrace (PTRACE_DETACH, child_pid, 0, 0);
>
> That is:
>
>        ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));
>
> And I think we should disable all ptrace options in the child
> before stepping it, in case some event is reported right
> at that point, and we mishandle it.  Otherwise we'd need to
> make sure we didn't get an extended wait status before passing
> it on.  But disabling events is just safer.

Could you give me some help on this part?
I don't know how to disable all ptrace options.

Thanks,
Hui

>
>>
>>    	  do_cleanups (old_chain);
>>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
  2014-06-04  8:43   ` Hui Zhu
@ 2014-06-04 16:11     ` Pedro Alves
  0 siblings, 0 replies; 11+ messages in thread
From: Pedro Alves @ 2014-06-04 16:11 UTC (permalink / raw)
  To: Hui Zhu, gdb-patches ml

On 06/04/2014 09:43 AM, Hui Zhu wrote:
> On 05/29/14 03:19, Pedro Alves wrote:

>> And I think we should disable all ptrace options in the child
>> before stepping it, in case some event is reported right
>> at that point, and we mishandle it.  Otherwise we'd need to
>> make sure we didn't get an extended wait status before passing
>> it on.  But disabling events is just safer.
> 
> Could you give me some help on this part?
> I don't know how to disable all ptrace options.

There's a linux_enable_event_reporting function in common/linux-ptrace.c.
Add a linux_disable_event_reporting counterpart, and call that.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
  2014-05-28 19:19 ` Pedro Alves
  2014-06-04  8:43   ` Hui Zhu
@ 2014-06-05  7:48   ` Hui Zhu
  2014-06-05  8:43     ` Pedro Alves
  1 sibling, 1 reply; 11+ messages in thread
From: Hui Zhu @ 2014-06-05  7:48 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches ml

Hi Pedro,

Thanks for your help.

On 05/29/14 03:19, Pedro Alves wrote:
> On 04/03/2014 09:12 AM, Hui Zhu wrote:
>> Got gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw)
>> (timeout) with Linux 2.6.32 and older version.
>>
>> The rootcause is after the test use "set can-use-hw-watchpoints 0" let GDB
>> doesn't use hardware breakpoint and set a watchpoint on "global", GDB
>> continue will keep single step inside function "vfork".
>> The Linux 2.6.32 and older version doesn't have commit
>> 6580807da14c423f0d0a708108e6df6ebc8bc83d (get more info please goto
>> http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=6580807da14c423f0d0a708108e6df6ebc8bc83d).
>> When the function "vfork" do syscall, the single step flag TIF_SINGLESTEP
>> will copy to child process.
>> Then GDB detach it, child process and parent process will be hanged.
>>
>> So I make a patch that do a single step before detach.  Then TIF_SINGLESTEP
>> of child process in old Linux kernel will be cleared before detach.
>> Child process in new Linux kernel will not be affected by this single step.
>>
>> The patch was tested and pass regression in new linux
>> kernel (3.13.6-200.fc20.x86_64) and old Linux kernel (2.6.32-38-server).
>>
>> Please help me review it.
>
> Thanks.
>
>> 2014-04-03  Hui Zhu  <hui@codesourcery.com>
>>
>> 	* linux-nat.c (linux_child_follow_fork): do a single step before
>> 	detach.
>>
>> --- a/gdb/linux-nat.c
>> +++ b/gdb/linux-nat.c
>> @@ -442,6 +442,26 @@ holding the child stopped.  Try \"set de
>>
>>    	  if (linux_nat_prepare_to_resume != NULL)
>>    	    linux_nat_prepare_to_resume (child_lp);
>> +
>> +	  /* When debug a inferior in the architecture that support
>> +	     hardware single step and the Linux kernel without commit
>> +	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
>> +	     process will starts with TIF_SINGLESTEP/X86_EFLAGS_TF bits
>> +	     if the parent process has it.
>> +	     So let child process do a single step under GDB control
>> +	     before detach it to remove this flags.  */
>
>  From the kernel patch's looks, this doesn't sound like architecture
> specific, otherwise I'd suggest clearing TF instead.
>
> So it sounds like a good solution.
>
> I suggested this updated comment, copy/edited a bit from yours:
>
> 	  /* When debugging an inferior in an architecture that supports
> 	     hardware single stepping on a kernel without commit
> 	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
> 	     process starts with the TIF_SINGLESTEP/X86_EFLAGS_TF bits
> 	     set if the parent process had them set.
> 	     To work around this, single step the child process
> 	     once before detaching to clear the flags.  */
>

Updated.

>> +
>> +	  if (!gdbarch_software_single_step_p (target_thread_architecture
>> +						   (child_lp->ptid)))
>> +	    {
>> +	      int status;
>> +
>> +	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
>> +		perror_with_name (_("Couldn't do single step"));
>> +	      if (my_waitpid (child_pid, &status, 0) < 0)
>> +		perror_with_name (_("Couldn't wait vfork process"));
>
> If the child gets a signal here, we should pass it on to the child.
>
>> +	    }
>> +
>>    	  ptrace (PTRACE_DETACH, child_pid, 0, 0);
>
> That is:
>
>        ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));
>

Fixed.

> And I think we should disable all ptrace options in the child
> before stepping it, in case some event is reported right
> at that point, and we mishandle it.  Otherwise we'd need to
> make sure we didn't get an extended wait status before passing
> it on.  But disabling events is just safer.
>

> There's a linux_enable_event_reporting function in common/linux-ptrace.c.
> Add a linux_disable_event_reporting counterpart, and call that.
>


Added a new function linux_disable_event_reporting and call it in the 
part before let child do single step.

>>
>>    	  do_cleanups (old_chain);
>>
>

This is the new patch for the issue.  Please help me review it.

Best,
Hui

2014-06-05  Hui Zhu  <hui@codesourcery.com>

	* common/linux-ptrace.c (linux_disable_event_reporting): New.
	* common/linux-ptrace.h (linux_disable_event_reporting): New extern.
	* linux-nat.c (linux_child_follow_fork): do a single step before
	detach

--- a/gdb/common/linux-ptrace.c
+++ b/gdb/common/linux-ptrace.c
@@ -476,6 +476,15 @@ linux_enable_event_reporting (pid_t pid)
  	  (PTRACE_TYPE_ARG4) (uintptr_t) current_ptrace_options);
  }

+/* Disable reporting of all currently supported ptrace events.  */
+
+void
+linux_disable_event_reporting (pid_t pid)
+{
+  /* Set the options.  */
+  ptrace (PTRACE_SETOPTIONS, pid, (PTRACE_TYPE_ARG3) 0, 0);
+}
+
  /* Returns non-zero if PTRACE_OPTIONS is contained within
     CURRENT_PTRACE_OPTIONS, therefore supported.  Returns 0
     otherwise.  */
--- a/gdb/common/linux-ptrace.h
+++ b/gdb/common/linux-ptrace.h
@@ -86,6 +86,7 @@ struct buffer;
  extern void linux_ptrace_attach_fail_reason (pid_t pid, struct buffer 
*buffer);
  extern void linux_ptrace_init_warnings (void);
  extern void linux_enable_event_reporting (pid_t pid);
+extern void linux_disable_event_reporting (pid_t pid);
  extern int linux_supports_tracefork (void);
  extern int linux_supports_traceclone (void);
  extern int linux_supports_tracevforkdone (void);
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -414,6 +414,7 @@ holding the child stopped.  Try \"set de
        if (detach_fork)
  	{
  	  struct cleanup *old_chain;
+	  int status = 0;

  	  /* Before detaching from the child, remove all breakpoints
  	     from it.  If we forked, then this has already been taken
@@ -447,7 +448,28 @@ holding the child stopped.  Try \"set de

  	  if (linux_nat_prepare_to_resume != NULL)
  	    linux_nat_prepare_to_resume (child_lp);
-	  ptrace (PTRACE_DETACH, child_pid, 0, 0);
+
+	  /* When debugging an inferior in an architecture that supports
+	     hardware single stepping on a kernel without commit
+	     6580807da14c423f0d0a708108e6df6ebc8bc83d, the vfork child
+	     process starts with the TIF_SINGLESTEP/X86_EFLAGS_TF bits
+	     set if the parent process had them set.
+	     To work around this, single step the child process
+	     once before detaching to clear the flags.  */
+
+	  if (!gdbarch_software_single_step_p (target_thread_architecture
+						   (child_lp->ptid)))
+	    {
+	      int status;
+
+	      linux_disable_event_reporting (child_pid);
+	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
+		perror_with_name (_("Couldn't do single step"));
+	      if (my_waitpid (child_pid, &status, 0) < 0)
+		perror_with_name (_("Couldn't wait vfork process"));
+	    }
+
+	  ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));

  	  do_cleanups (old_chain);
  	}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
  2014-06-05  7:48   ` Hui Zhu
@ 2014-06-05  8:43     ` Pedro Alves
  2014-06-08 11:16       ` Hui Zhu
  0 siblings, 1 reply; 11+ messages in thread
From: Pedro Alves @ 2014-06-05  8:43 UTC (permalink / raw)
  To: Hui Zhu, gdb-patches ml

On 06/05/2014 08:47 AM, Hui Zhu wrote:

> 
> 2014-06-05  Hui Zhu  <hui@codesourcery.com>
> 
> 	* common/linux-ptrace.c (linux_disable_event_reporting): New.
> 	* common/linux-ptrace.h (linux_disable_event_reporting): New extern.

You're not adding an extern to an existing function, but adding new declaration
that happens to declare a function with extern linkage.

> 	* linux-nat.c (linux_child_follow_fork): do a single step before
> 	detach

Capitalization, and full stop both missing.  Write instead:

	* common/linux-ptrace.c (linux_disable_event_reporting): New function.
	* common/linux-ptrace.h (linux_disable_event_reporting): New declaration.
	* linux-nat.c (linux_child_follow_fork): Do a single step before
	detach.

> +	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
> +		perror_with_name (_("Couldn't do single step"));
> +	      if (my_waitpid (child_pid, &status, 0) < 0)
> +		perror_with_name (_("Couldn't wait vfork process"));
> +	    }
> +
> +	  ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));

The child could have exited with that single-step.  So:

	  if (WIFSTOPPED (status))
	    ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));

OK with that change.

-- 
Pedro Alves

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
  2014-06-05  8:43     ` Pedro Alves
@ 2014-06-08 11:16       ` Hui Zhu
  2014-06-09 13:58         ` [pushed] Fix a bunch of fork related regressions. (was: Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version) Pedro Alves
  2014-07-03 16:24         ` [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version Hui Zhu
  0 siblings, 2 replies; 11+ messages in thread
From: Hui Zhu @ 2014-06-08 11:16 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches ml

On 06/05/14 16:43, Pedro Alves wrote:
> On 06/05/2014 08:47 AM, Hui Zhu wrote:
>
>>
>> 2014-06-05  Hui Zhu  <hui@codesourcery.com>
>>
>> 	* common/linux-ptrace.c (linux_disable_event_reporting): New.
>> 	* common/linux-ptrace.h (linux_disable_event_reporting): New extern.
>
> You're not adding an extern to an existing function, but adding new declaration
> that happens to declare a function with extern linkage.
>
>> 	* linux-nat.c (linux_child_follow_fork): do a single step before
>> 	detach
>
> Capitalization, and full stop both missing.  Write instead:
>
> 	* common/linux-ptrace.c (linux_disable_event_reporting): New function.
> 	* common/linux-ptrace.h (linux_disable_event_reporting): New declaration.
> 	* linux-nat.c (linux_child_follow_fork): Do a single step before
> 	detach.

Updated.

>
>> +	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
>> +		perror_with_name (_("Couldn't do single step"));
>> +	      if (my_waitpid (child_pid, &status, 0) < 0)
>> +		perror_with_name (_("Couldn't wait vfork process"));
>> +	    }
>> +
>> +	  ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));
>
> The child could have exited with that single-step.  So:
>
> 	  if (WIFSTOPPED (status))
> 	    ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));

Updated.

>
> OK with that change.
>

Thanks for your help.  Committed as 
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=c077881afaedb9b74063bee992b3e472b4b6e9ca

Best,
Hui

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pushed] Fix a bunch of fork related regressions. (was: Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version)
  2014-06-08 11:16       ` Hui Zhu
@ 2014-06-09 13:58         ` Pedro Alves
  2014-07-03 16:24         ` [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version Hui Zhu
  1 sibling, 0 replies; 11+ messages in thread
From: Pedro Alves @ 2014-06-09 13:58 UTC (permalink / raw)
  To: Hui Zhu, gdb-patches ml

On 06/08/2014 12:16 PM, Hui Zhu wrote:

> Thanks for your help.  Committed as 
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=c077881afaedb9b74063bee992b3e472b4b6e9ca

This caused a bunch of failures and time outs.
Hui, that patch clearly wasn't tested... ;-)
I've pushed the patch below to fix this.

8<--------------
[PATCH] Fix a bunch of fork related regressions.

I'm seeing a ton of new FAILs in fork-related tests.  Like, these and
many more:

 +FAIL: gdb.base/disp-step-syscall.exp: vfork: continue to vfork (2nd time) (timeout)
 +FAIL: gdb.base/disp-step-syscall.exp: vfork: display/i $pc (timeout)
 ...
 -PASS: gdb.base/foll-vfork.exp: exec: vfork parent follow, through step: step
 +FAIL: gdb.base/foll-vfork.exp: exec: vfork parent follow, through step: step (timeout)
 -PASS: gdb.base/foll-vfork.exp: exec: vfork parent follow, to bp: continue to bp
 +FAIL: gdb.base/foll-vfork.exp: exec: vfork parent follow, to bp: continue to bp (timeout)
  ...
  FAIL: gdb.threads/watchpoint-fork.exp: parent: multithreaded: breakpoint (A) after the first fork (timeout)
  FAIL: gdb.threads/watchpoint-fork.exp: parent: multithreaded: watchpoint A after the first fork (timeout)
  FAIL: gdb.base/fileio.exp: System(3) call (timeout)
  FAIL: gdb.threads/watchpoint-fork.exp: parent: multithreaded: watchpoint B after the first fork (timeout)
 -PASS: gdb.base/multi-forks.exp: run to exit 2
 +FAIL: gdb.base/multi-forks.exp: run to exit 2 (timeout)
  ...
  PASS: gdb.base/watch-vfork.exp: Watchpoint on global variable (hw)
 -PASS: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (hw)
 +FAIL: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (hw) (timeout)
  PASS: gdb.base/watch-vfork.exp: Watchpoint on global variable (sw)
 -PASS: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw)
 +FAIL: gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout)

Three issues with
https://sourceware.org/ml/gdb-patches/2014-06/msg00348.html
(c077881a).

 - The inner 'status' local is shadowing the outer 'status' local,
   thus PTRACE_DETACH is never seeing the status it intends to pass on
   the inferior.

 - With that fixed, we then try to pass down the SIGTRAP that results
   from the step to the inferior.  Need to filter out signals that are
   in nopass state.

 - For software single-step archs, the current code is equivalent to:

      int status = 0;
      if (WIFSTOPPED (status))
        ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));

   ... and status == 0 is WIFEXITED, not WIFSTOPPED, so we're never
   detaching.

gdb/
2014-06-09  Pedro Alves  <palves@redhat.com>

	* linux-nat.c (linux_child_follow_fork): Initialize status with
	W_STOPCODE (0) instead of 0.  Remove shodowing 'status' local from
	inner block.  Only pass the signal to PTRACE_DETACH if in pass
	state.
---
 gdb/testsuite/ChangeLog |  7 +++++++
 gdb/linux-nat.c         | 14 ++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/gdb/testsuite/ChangeLog b/gdb/testsuite/ChangeLog
index 2b5cca3..6c78bdd 100644
--- a/gdb/testsuite/ChangeLog
+++ b/gdb/testsuite/ChangeLog
@@ -1,3 +1,10 @@
+2014-06-09  Pedro Alves  <palves@redhat.com>
+
+	* linux-nat.c (linux_child_follow_fork): Initialize status with
+	W_STOPCODE (0) instead of 0.  Remove shodowing 'status' local from
+	inner block.  Only pass the signal to PTRACE_DETACH if in pass
+	state.
+
 2014-06-09  Gary Benson  <gbenson@redhat.com>
 
 	* gdb.base/sigall.c [Functions to send signals]: Reorder to
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 3b117b5..c9677ca 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -414,7 +414,7 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
       if (detach_fork)
 	{
 	  struct cleanup *old_chain;
-	  int status = 0;
+	  int status = W_STOPCODE (0);
 
 	  /* Before detaching from the child, remove all breakpoints
 	     from it.  If we forked, then this has already been taken
@@ -460,8 +460,6 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	  if (!gdbarch_software_single_step_p (target_thread_architecture
 						   (child_lp->ptid)))
 	    {
-	      int status;
-
 	      linux_disable_event_reporting (child_pid);
 	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
 		perror_with_name (_("Couldn't do single step"));
@@ -470,7 +468,15 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	    }
 
 	  if (WIFSTOPPED (status))
-	    ptrace (PTRACE_DETACH, child_pid, 0, WSTOPSIG (status));
+	    {
+	      int signo;
+
+	      signo = WSTOPSIG (status);
+	      if (signo != 0
+		  && !signal_pass_state (gdb_signal_from_host (signo)))
+		signo = 0;
+	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
+	    }
 
 	  do_cleanups (old_chain);
 	}
-- 
1.9.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version
  2014-06-08 11:16       ` Hui Zhu
  2014-06-09 13:58         ` [pushed] Fix a bunch of fork related regressions. (was: Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version) Pedro Alves
@ 2014-07-03 16:24         ` Hui Zhu
  2014-07-04 17:51           ` [PATCH] Handle signals sent to a fork/vfork child before it has a chance to first run (Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version) Pedro Alves
  1 sibling, 1 reply; 11+ messages in thread
From: Hui Zhu @ 2014-07-03 16:24 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches ml

>
>>
>> OK with that change.
>>
>
> Thanks for your help.  Committed as
> https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=c077881afaedb9b74063bee992b3e472b4b6e9ca
>
>
> Best,
> Hui

Hi Pedro,

After this patch, I still got fail with this test in Linux 2.6.32.
The cause this:
	      signo = WSTOPSIG (status);
#In linux 2.6.32, signo will be SIGSTOP.
	      if (signo != 0
		  && !signal_pass_state (gdb_signal_from_host (signo)))
		signo = 0;
#SIGSTOP will send to child and make it stop.
	      ptrace (PTRACE_DETACH, child_pid, 0, signo);

So I make a patch to fix it.

Thanks,
Hui

2014-07-04  Hui Zhu  <hui@codesourcery.com>

	* linux-nat.c(linux_child_follow_fork): Add check if signo
	is SIGSTOP.

--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -472,8 +472,9 @@ holding the child stopped.  Try \"set de
  	      int signo;

  	      signo = WSTOPSIG (status);
-	      if (signo != 0
-		  && !signal_pass_state (gdb_signal_from_host (signo)))
+	      if (signo == SIGSTOP
+		  || (signo != 0
+		      && !signal_pass_state (gdb_signal_from_host (signo))))
  		signo = 0;
  	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
  	    }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] Handle signals sent to a fork/vfork child before it has a chance to first run (Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version)
  2014-07-03 16:24         ` [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version Hui Zhu
@ 2014-07-04 17:51           ` Pedro Alves
  2014-07-05  6:08             ` Hui Zhu
  0 siblings, 1 reply; 11+ messages in thread
From: Pedro Alves @ 2014-07-04 17:51 UTC (permalink / raw)
  To: Hui Zhu, gdb-patches ml

On 07/03/2014 05:24 PM, Hui Zhu wrote:

> After this patch, I still got fail with this test in Linux 2.6.32.
> The cause this:
> 	      signo = WSTOPSIG (status);
> #In linux 2.6.32, signo will be SIGSTOP.
> 	      if (signo != 0
> 		  && !signal_pass_state (gdb_signal_from_host (signo)))
> 		signo = 0;
> #SIGSTOP will send to child and make it stop.
> 	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
> 
> So I make a patch to fix it.
> 
> Thanks,
> Hui
> 
> 2014-07-04  Hui Zhu  <hui@codesourcery.com>
> 
> 	* linux-nat.c(linux_child_follow_fork): Add check if signo
> 	is SIGSTOP.
> 
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -472,8 +472,9 @@ holding the child stopped.  Try \"set de
>   	      int signo;
> 
>   	      signo = WSTOPSIG (status);
> -	      if (signo != 0
> -		  && !signal_pass_state (gdb_signal_from_host (signo)))
> +	      if (signo == SIGSTOP
> +		  || (signo != 0
> +		      && !signal_pass_state (gdb_signal_from_host (signo))))
>   		signo = 0;
>   	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
>   	    }
> 

Hmm.  We should actually be passing the signal the fork child had stopped
for earlier, when we step it for the workaround.  But we lose that signal...

/me hacks.

Does this patch fix the issue too ?

8<-----------
From 4184b4d0ffca89776d06d6b7d1241e9c0d01017a Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Fri, 4 Jul 2014 17:31:41 +0100
Subject: [PATCH] Handle signals sent to a fork/vfork child before it has a
 chance to first run

This is a WIP.

We handle this for clone, but not for fork/vfork.  For clone it's
easier as we just store the status in the new child's lwp immediately.
But for fork/vfork we don't add the child to the list until the next
resume, when we either follow or detach the child, depending on
follow-fork mode, in linux_child_follow_fork.  So store the child's
status in a new simple_pid_list list.

I haven't tried to write a test yet.  I think we have one for clones
somewhere, which we can model on.

Tested on x86_64 Fedora 20 with no regressions.

gdb/
2014-07-04  Pedro Alves  <palves@redhat.com>

	* linux-nat.c (forked_pids): New global.
	(save_new_child_stop_status): New function.
	(get_status_pass_signal): New function.
	(linux_child_follow_fork): Retrieve the child's stop status from
	the forked pids list, and pass the signal to the child if
	detaching from it, or leave it pending if not detaching.
	(cancel_pending_sigstop): New function, factored out from
	detach_callback.
	(detach_callback): Adjust to use cancel_pending_sigstop.
	(linux_handle_extended_wait): Store the fork/vfork child's
	status.  Adjust comment.
---
 gdb/linux-nat.c | 143 ++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 118 insertions(+), 25 deletions(-)

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index 0ab0362..1435079 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -223,6 +223,10 @@ struct simple_pid_list
 };
 struct simple_pid_list *stopped_pids;
 
+/* The status of forked pids we've already seen the
+   PTRACE_EVENT_FORK/VFORK event for, but haven't followed yet.  */
+static struct simple_pid_list *forked_pids;
+
 /* Async mode support.  */
 
 /* The read/write ends of the pipe registered as waitable file in the
@@ -371,12 +375,71 @@ delete_lwp_cleanup (void *lp_voidp)
   delete_lwp (lp->ptid);
 }
 
+/* CHILD_LP is a new clone/fork child that stopped with STATUS.  If
+   STATUS is not SIGSTOP, store the status for later, and mark the lwp
+   as signalled (there's a SIGSTOP pending in the kernel queue).  This
+   can happen if someone starts sending signals to the new thread
+   before it gets a chance to run, which have a lower number than
+   SIGSTOP (e.g. SIGUSR1).  */
+
+static void
+save_new_child_stop_status (struct lwp_info *child_lp, int *status_p)
+{
+  int status = *status_p;
+
+  if (status != 0)
+    {
+      gdb_assert (WIFSTOPPED (status));
+
+      if (WSTOPSIG (status) != SIGSTOP)
+	{
+	  child_lp->signalled = 1;
+
+	  /* Save the wait status to report later.  */
+	  if (debug_linux_nat)
+	    fprintf_unfiltered (gdb_stdlog,
+				"SCSS: waitpid of new LWP %ld, "
+				"saving status %s\n",
+				ptid_get_lwp (child_lp->ptid),
+				status_to_str (status));
+	}
+      else
+	status = 0;
+    }
+
+  child_lp->status = status;
+  *status_p = status;
+}
+
+/* We're about to detach from a child that had stopped with STATUS.
+   Return the host signal number to pass, if any.  */
+
+static int
+get_pass_signal_from_status (int status)
+{
+  if (status != 0 && WIFSTOPPED (status))
+    {
+      int signo;
+
+      signo = WSTOPSIG (status);
+      if (signo != 0
+	  && !signal_pass_state (gdb_signal_from_host (signo)))
+	signo = 0;
+      return signo;
+    }
+
+  return 0;
+}
+
+static void cancel_pending_sigstop (struct lwp_info *lp);
+
 static int
 linux_child_follow_fork (struct target_ops *ops, int follow_child,
 			 int detach_fork)
 {
   int has_vforked;
   int parent_pid, child_pid;
+  int child_status;
 
   has_vforked = (inferior_thread ()->pending_follow.kind
 		 == TARGET_WAITKIND_VFORKED);
@@ -404,6 +467,11 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
       return 1;
     }
 
+  /* If we haven't already seen the new PID stop, wait for it now.  */
+  if (!pull_pid_from_list (&forked_pids, child_pid, &child_status))
+    internal_error (__FILE__, __LINE__,
+		    _("forked pid %d not found in forked list"), child_pid);
+
   if (! follow_child)
     {
       struct lwp_info *child_lp = NULL;
@@ -414,7 +482,6 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
       if (detach_fork)
 	{
 	  struct cleanup *old_chain;
-	  int status = W_STOPCODE (0);
 
 	  /* Before detaching from the child, remove all breakpoints
 	     from it.  If we forked, then this has already been taken
@@ -444,8 +511,12 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	  child_lp = add_lwp (inferior_ptid);
 	  child_lp->stopped = 1;
 	  child_lp->last_resume_kind = resume_stop;
+	  save_new_child_stop_status (child_lp, &child_status);
 	  make_cleanup (delete_lwp_cleanup, child_lp);
 
+	  /* If there is a pending SIGSTOP, get rid of it.  */
+	  cancel_pending_sigstop (child_lp);
+
 	  if (linux_nat_prepare_to_resume != NULL)
 	    linux_nat_prepare_to_resume (child_lp);
 
@@ -455,26 +526,26 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	     process starts with the TIF_SINGLESTEP/X86_EFLAGS_TF bits
 	     set if the parent process had them set.
 	     To work around this, single step the child process
-	     once before detaching to clear the flags.  */
+	     once before detaching to clear the flags.
+	     Be careful not to lose any signal though.  */
 
 	  if (!gdbarch_software_single_step_p (target_thread_architecture
-						   (child_lp->ptid)))
+					       (child_lp->ptid)))
 	    {
+	      int signo = get_pass_signal_from_status (child_status);
+
 	      linux_disable_event_reporting (child_pid);
-	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
+	      signo = get_pass_signal_from_status (child_status);
+	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, signo) < 0)
 		perror_with_name (_("Couldn't do single step"));
-	      if (my_waitpid (child_pid, &status, 0) < 0)
+	      if (my_waitpid (child_pid, &child_status, 0) < 0)
 		perror_with_name (_("Couldn't wait vfork process"));
 	    }
 
-	  if (WIFSTOPPED (status))
+	  if (child_status != 0 && WIFSTOPPED (child_status))
 	    {
-	      int signo;
+	      int signo = get_pass_signal_from_status (child_status);
 
-	      signo = WSTOPSIG (status);
-	      if (signo != 0
-		  && !signal_pass_state (gdb_signal_from_host (signo)))
-		signo = 0;
 	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
 	    }
 
@@ -502,6 +573,7 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
 	  child_lp = add_lwp (inferior_ptid);
 	  child_lp->stopped = 1;
 	  child_lp->last_resume_kind = resume_stop;
+	  save_new_child_stop_status (child_lp, &child_status);
 	  child_inf->symfile_flags = SYMFILE_NO_READ;
 
 	  /* If this is a vfork child, then the address-space is
@@ -696,6 +768,7 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
       child_lp = add_lwp (inferior_ptid);
       child_lp->stopped = 1;
       child_lp->last_resume_kind = resume_stop;
+      save_new_child_stop_status (child_lp, &child_status);
 
       /* If this is a vfork child, then the address-space is shared
 	 with the parent.  If we detached from the parent, then we can
@@ -1510,27 +1583,41 @@ get_pending_status (struct lwp_info *lp, int *status)
   return 0;
 }
 
-static int
-detach_callback (struct lwp_info *lp, void *data)
-{
-  gdb_assert (lp->status == 0 || WIFSTOPPED (lp->status));
-
-  if (debug_linux_nat && lp->status)
-    fprintf_unfiltered (gdb_stdlog, "DC:  Pending %s for %s on detach.\n",
-			strsignal (WSTOPSIG (lp->status)),
-			target_pid_to_str (lp->ptid));
+/* If LP was signalled, can't the pending SIGSTOP with a SIGCONT.  */
 
+static void
+cancel_pending_sigstop (struct lwp_info *lp)
+{
   /* If there is a pending SIGSTOP, get rid of it.  */
   if (lp->signalled)
     {
+      /* This can happen if someone starts sending signals to
+	 the new thread before it gets a chance to run, which
+	 have a lower number than SIGSTOP (e.g. SIGUSR1).  */
+
+      /* There is a pending SIGSTOP; get rid of it.  */
       if (debug_linux_nat)
 	fprintf_unfiltered (gdb_stdlog,
-			    "DC: Sending SIGCONT to %s\n",
+			    "Sending SIGCONT to %s\n",
 			    target_pid_to_str (lp->ptid));
 
       kill_lwp (ptid_get_lwp (lp->ptid), SIGCONT);
       lp->signalled = 0;
     }
+}
+
+static int
+detach_callback (struct lwp_info *lp, void *data)
+{
+  gdb_assert (lp->status == 0 || WIFSTOPPED (lp->status));
+
+  if (debug_linux_nat && lp->status)
+    fprintf_unfiltered (gdb_stdlog, "DC:  Pending %s for %s on detach.\n",
+			strsignal (WSTOPSIG (lp->status)),
+			target_pid_to_str (lp->ptid));
+
+  /* If there is a pending SIGSTOP, get rid of it.  */
+  cancel_pending_sigstop (lp);
 
   /* We don't actually detach from the LWP that has an id equal to the
      overall process id just yet.  */
@@ -2061,6 +2148,14 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
 	  return 0;
 	}
 
+      if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
+	{
+	  /* The child may have stopped with a signal other than
+	     SIGSTOP.  Store the status for later, so we don't lose
+	     the signal.  */
+	  add_to_pid_list (&forked_pids, new_pid, status);
+	}
+
       if (event == PTRACE_EVENT_FORK)
 	ourstatus->kind = TARGET_WAITKIND_FORKED;
       else if (event == PTRACE_EVENT_VFORK)
@@ -2086,10 +2181,8 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
 	      /* This can happen if someone starts sending signals to
 		 the new thread before it gets a chance to run, which
 		 have a lower number than SIGSTOP (e.g. SIGUSR1).
-		 This is an unlikely case, and harder to handle for
-		 fork / vfork than for clone, so we do not try - but
-		 we handle it for clone events here.  We'll send
-		 the other signal on to the thread below.  */
+		 We'll send the other signal on to the thread
+		 below.  */
 
 	      new_lp->signalled = 1;
 	    }
-- 
1.9.3


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Handle signals sent to a fork/vfork child before it has a chance to first run (Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version)
  2014-07-04 17:51           ` [PATCH] Handle signals sent to a fork/vfork child before it has a chance to first run (Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version) Pedro Alves
@ 2014-07-05  6:08             ` Hui Zhu
  0 siblings, 0 replies; 11+ messages in thread
From: Hui Zhu @ 2014-07-05  6:08 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches ml



On 07/05/14 01:50, Pedro Alves wrote:
> On 07/03/2014 05:24 PM, Hui Zhu wrote:
>
>> After this patch, I still got fail with this test in Linux 2.6.32.
>> The cause this:
>> 	      signo = WSTOPSIG (status);
>> #In linux 2.6.32, signo will be SIGSTOP.
>> 	      if (signo != 0
>> 		  && !signal_pass_state (gdb_signal_from_host (signo)))
>> 		signo = 0;
>> #SIGSTOP will send to child and make it stop.
>> 	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
>>
>> So I make a patch to fix it.
>>
>> Thanks,
>> Hui
>>
>> 2014-07-04  Hui Zhu  <hui@codesourcery.com>
>>
>> 	* linux-nat.c(linux_child_follow_fork): Add check if signo
>> 	is SIGSTOP.
>>
>> --- a/gdb/linux-nat.c
>> +++ b/gdb/linux-nat.c
>> @@ -472,8 +472,9 @@ holding the child stopped.  Try \"set de
>>    	      int signo;
>>
>>    	      signo = WSTOPSIG (status);
>> -	      if (signo != 0
>> -		  && !signal_pass_state (gdb_signal_from_host (signo)))
>> +	      if (signo == SIGSTOP
>> +		  || (signo != 0
>> +		      && !signal_pass_state (gdb_signal_from_host (signo))))
>>    		signo = 0;
>>    	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
>>    	    }
>>
>
> Hmm.  We should actually be passing the signal the fork child had stopped
> for earlier, when we step it for the workaround.  But we lose that signal...
>
> /me hacks.
>
> Does this patch fix the issue too ?
>
> 8<-----------
>  From 4184b4d0ffca89776d06d6b7d1241e9c0d01017a Mon Sep 17 00:00:00 2001
> From: Pedro Alves <palves@redhat.com>
> Date: Fri, 4 Jul 2014 17:31:41 +0100
> Subject: [PATCH] Handle signals sent to a fork/vfork child before it has a
>   chance to first run
>
> This is a WIP.
>
> We handle this for clone, but not for fork/vfork.  For clone it's
> easier as we just store the status in the new child's lwp immediately.
> But for fork/vfork we don't add the child to the list until the next
> resume, when we either follow or detach the child, depending on
> follow-fork mode, in linux_child_follow_fork.  So store the child's
> status in a new simple_pid_list list.
>
> I haven't tried to write a test yet.  I think we have one for clones
> somewhere, which we can model on.
>
> Tested on x86_64 Fedora 20 with no regressions.

Hi Pedro,

Thanks for your work.
The patch fixed the issue and pass regression test in Ubuntu 10.04 that 
use Linux 2.6.32.

Best,
Hui


>
> gdb/
> 2014-07-04  Pedro Alves  <palves@redhat.com>
>
> 	* linux-nat.c (forked_pids): New global.
> 	(save_new_child_stop_status): New function.
> 	(get_status_pass_signal): New function.
> 	(linux_child_follow_fork): Retrieve the child's stop status from
> 	the forked pids list, and pass the signal to the child if
> 	detaching from it, or leave it pending if not detaching.
> 	(cancel_pending_sigstop): New function, factored out from
> 	detach_callback.
> 	(detach_callback): Adjust to use cancel_pending_sigstop.
> 	(linux_handle_extended_wait): Store the fork/vfork child's
> 	status.  Adjust comment.
> ---
>   gdb/linux-nat.c | 143 ++++++++++++++++++++++++++++++++++++++++++++++----------
>   1 file changed, 118 insertions(+), 25 deletions(-)
>
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index 0ab0362..1435079 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -223,6 +223,10 @@ struct simple_pid_list
>   };
>   struct simple_pid_list *stopped_pids;
>
> +/* The status of forked pids we've already seen the
> +   PTRACE_EVENT_FORK/VFORK event for, but haven't followed yet.  */
> +static struct simple_pid_list *forked_pids;
> +
>   /* Async mode support.  */
>
>   /* The read/write ends of the pipe registered as waitable file in the
> @@ -371,12 +375,71 @@ delete_lwp_cleanup (void *lp_voidp)
>     delete_lwp (lp->ptid);
>   }
>
> +/* CHILD_LP is a new clone/fork child that stopped with STATUS.  If
> +   STATUS is not SIGSTOP, store the status for later, and mark the lwp
> +   as signalled (there's a SIGSTOP pending in the kernel queue).  This
> +   can happen if someone starts sending signals to the new thread
> +   before it gets a chance to run, which have a lower number than
> +   SIGSTOP (e.g. SIGUSR1).  */
> +
> +static void
> +save_new_child_stop_status (struct lwp_info *child_lp, int *status_p)
> +{
> +  int status = *status_p;
> +
> +  if (status != 0)
> +    {
> +      gdb_assert (WIFSTOPPED (status));
> +
> +      if (WSTOPSIG (status) != SIGSTOP)
> +	{
> +	  child_lp->signalled = 1;
> +
> +	  /* Save the wait status to report later.  */
> +	  if (debug_linux_nat)
> +	    fprintf_unfiltered (gdb_stdlog,
> +				"SCSS: waitpid of new LWP %ld, "
> +				"saving status %s\n",
> +				ptid_get_lwp (child_lp->ptid),
> +				status_to_str (status));
> +	}
> +      else
> +	status = 0;
> +    }
> +
> +  child_lp->status = status;
> +  *status_p = status;
> +}
> +
> +/* We're about to detach from a child that had stopped with STATUS.
> +   Return the host signal number to pass, if any.  */
> +
> +static int
> +get_pass_signal_from_status (int status)
> +{
> +  if (status != 0 && WIFSTOPPED (status))
> +    {
> +      int signo;
> +
> +      signo = WSTOPSIG (status);
> +      if (signo != 0
> +	  && !signal_pass_state (gdb_signal_from_host (signo)))
> +	signo = 0;
> +      return signo;
> +    }
> +
> +  return 0;
> +}
> +
> +static void cancel_pending_sigstop (struct lwp_info *lp);
> +
>   static int
>   linux_child_follow_fork (struct target_ops *ops, int follow_child,
>   			 int detach_fork)
>   {
>     int has_vforked;
>     int parent_pid, child_pid;
> +  int child_status;
>
>     has_vforked = (inferior_thread ()->pending_follow.kind
>   		 == TARGET_WAITKIND_VFORKED);
> @@ -404,6 +467,11 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>         return 1;
>       }
>
> +  /* If we haven't already seen the new PID stop, wait for it now.  */
> +  if (!pull_pid_from_list (&forked_pids, child_pid, &child_status))
> +    internal_error (__FILE__, __LINE__,
> +		    _("forked pid %d not found in forked list"), child_pid);
> +
>     if (! follow_child)
>       {
>         struct lwp_info *child_lp = NULL;
> @@ -414,7 +482,6 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>         if (detach_fork)
>   	{
>   	  struct cleanup *old_chain;
> -	  int status = W_STOPCODE (0);
>
>   	  /* Before detaching from the child, remove all breakpoints
>   	     from it.  If we forked, then this has already been taken
> @@ -444,8 +511,12 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>   	  child_lp = add_lwp (inferior_ptid);
>   	  child_lp->stopped = 1;
>   	  child_lp->last_resume_kind = resume_stop;
> +	  save_new_child_stop_status (child_lp, &child_status);
>   	  make_cleanup (delete_lwp_cleanup, child_lp);
>
> +	  /* If there is a pending SIGSTOP, get rid of it.  */
> +	  cancel_pending_sigstop (child_lp);
> +
>   	  if (linux_nat_prepare_to_resume != NULL)
>   	    linux_nat_prepare_to_resume (child_lp);
>
> @@ -455,26 +526,26 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>   	     process starts with the TIF_SINGLESTEP/X86_EFLAGS_TF bits
>   	     set if the parent process had them set.
>   	     To work around this, single step the child process
> -	     once before detaching to clear the flags.  */
> +	     once before detaching to clear the flags.
> +	     Be careful not to lose any signal though.  */
>
>   	  if (!gdbarch_software_single_step_p (target_thread_architecture
> -						   (child_lp->ptid)))
> +					       (child_lp->ptid)))
>   	    {
> +	      int signo = get_pass_signal_from_status (child_status);
> +
>   	      linux_disable_event_reporting (child_pid);
> -	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, 0) < 0)
> +	      signo = get_pass_signal_from_status (child_status);
> +	      if (ptrace (PTRACE_SINGLESTEP, child_pid, 0, signo) < 0)
>   		perror_with_name (_("Couldn't do single step"));
> -	      if (my_waitpid (child_pid, &status, 0) < 0)
> +	      if (my_waitpid (child_pid, &child_status, 0) < 0)
>   		perror_with_name (_("Couldn't wait vfork process"));
>   	    }
>
> -	  if (WIFSTOPPED (status))
> +	  if (child_status != 0 && WIFSTOPPED (child_status))
>   	    {
> -	      int signo;
> +	      int signo = get_pass_signal_from_status (child_status);
>
> -	      signo = WSTOPSIG (status);
> -	      if (signo != 0
> -		  && !signal_pass_state (gdb_signal_from_host (signo)))
> -		signo = 0;
>   	      ptrace (PTRACE_DETACH, child_pid, 0, signo);
>   	    }
>
> @@ -502,6 +573,7 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>   	  child_lp = add_lwp (inferior_ptid);
>   	  child_lp->stopped = 1;
>   	  child_lp->last_resume_kind = resume_stop;
> +	  save_new_child_stop_status (child_lp, &child_status);
>   	  child_inf->symfile_flags = SYMFILE_NO_READ;
>
>   	  /* If this is a vfork child, then the address-space is
> @@ -696,6 +768,7 @@ holding the child stopped.  Try \"set detach-on-fork\" or \
>         child_lp = add_lwp (inferior_ptid);
>         child_lp->stopped = 1;
>         child_lp->last_resume_kind = resume_stop;
> +      save_new_child_stop_status (child_lp, &child_status);
>
>         /* If this is a vfork child, then the address-space is shared
>   	 with the parent.  If we detached from the parent, then we can
> @@ -1510,27 +1583,41 @@ get_pending_status (struct lwp_info *lp, int *status)
>     return 0;
>   }
>
> -static int
> -detach_callback (struct lwp_info *lp, void *data)
> -{
> -  gdb_assert (lp->status == 0 || WIFSTOPPED (lp->status));
> -
> -  if (debug_linux_nat && lp->status)
> -    fprintf_unfiltered (gdb_stdlog, "DC:  Pending %s for %s on detach.\n",
> -			strsignal (WSTOPSIG (lp->status)),
> -			target_pid_to_str (lp->ptid));
> +/* If LP was signalled, can't the pending SIGSTOP with a SIGCONT.  */
>
> +static void
> +cancel_pending_sigstop (struct lwp_info *lp)
> +{
>     /* If there is a pending SIGSTOP, get rid of it.  */
>     if (lp->signalled)
>       {
> +      /* This can happen if someone starts sending signals to
> +	 the new thread before it gets a chance to run, which
> +	 have a lower number than SIGSTOP (e.g. SIGUSR1).  */
> +
> +      /* There is a pending SIGSTOP; get rid of it.  */
>         if (debug_linux_nat)
>   	fprintf_unfiltered (gdb_stdlog,
> -			    "DC: Sending SIGCONT to %s\n",
> +			    "Sending SIGCONT to %s\n",
>   			    target_pid_to_str (lp->ptid));
>
>         kill_lwp (ptid_get_lwp (lp->ptid), SIGCONT);
>         lp->signalled = 0;
>       }
> +}
> +
> +static int
> +detach_callback (struct lwp_info *lp, void *data)
> +{
> +  gdb_assert (lp->status == 0 || WIFSTOPPED (lp->status));
> +
> +  if (debug_linux_nat && lp->status)
> +    fprintf_unfiltered (gdb_stdlog, "DC:  Pending %s for %s on detach.\n",
> +			strsignal (WSTOPSIG (lp->status)),
> +			target_pid_to_str (lp->ptid));
> +
> +  /* If there is a pending SIGSTOP, get rid of it.  */
> +  cancel_pending_sigstop (lp);
>
>     /* We don't actually detach from the LWP that has an id equal to the
>        overall process id just yet.  */
> @@ -2061,6 +2148,14 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
>   	  return 0;
>   	}
>
> +      if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK)
> +	{
> +	  /* The child may have stopped with a signal other than
> +	     SIGSTOP.  Store the status for later, so we don't lose
> +	     the signal.  */
> +	  add_to_pid_list (&forked_pids, new_pid, status);
> +	}
> +
>         if (event == PTRACE_EVENT_FORK)
>   	ourstatus->kind = TARGET_WAITKIND_FORKED;
>         else if (event == PTRACE_EVENT_VFORK)
> @@ -2086,10 +2181,8 @@ linux_handle_extended_wait (struct lwp_info *lp, int status,
>   	      /* This can happen if someone starts sending signals to
>   		 the new thread before it gets a chance to run, which
>   		 have a lower number than SIGSTOP (e.g. SIGUSR1).
> -		 This is an unlikely case, and harder to handle for
> -		 fork / vfork than for clone, so we do not try - but
> -		 we handle it for clone events here.  We'll send
> -		 the other signal on to the thread below.  */
> +		 We'll send the other signal on to the thread
> +		 below.  */
>
>   	      new_lp->signalled = 1;
>   	    }
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-07-05  6:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-03  8:12 [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version Hui Zhu
2014-05-28 19:19 ` Pedro Alves
2014-06-04  8:43   ` Hui Zhu
2014-06-04 16:11     ` Pedro Alves
2014-06-05  7:48   ` Hui Zhu
2014-06-05  8:43     ` Pedro Alves
2014-06-08 11:16       ` Hui Zhu
2014-06-09 13:58         ` [pushed] Fix a bunch of fork related regressions. (was: Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version) Pedro Alves
2014-07-03 16:24         ` [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version Hui Zhu
2014-07-04 17:51           ` [PATCH] Handle signals sent to a fork/vfork child before it has a chance to first run (Re: [PATCH] Fix gdb.base/watch-vfork.exp: Watchpoint triggers after vfork (sw) (timeout) with Linux 2.6.32 and older version) Pedro Alves
2014-07-05  6:08             ` Hui Zhu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).