public inbox for gdb-patches@sourceware.org
From: Don Breazeal <donb@codesourcery.com>
To: <gdb-patches@sourceware.org>, <palves@redhat.com>
Subject: Re: [PATCH v2 2/3] PR remote/19496, interrupted syscall in forking-threads-plus-bkpt
Date: Thu, 25 Feb 2016 17:28:00 -0000
Message-ID: <56CF39CF.5080007@codesourcery.com>
In-Reply-To: <1455150484-12600-1-git-send-email-donb@codesourcery.com>

Ping
Thanks,
--Don

On 2/10/2016 4:28 PM, Don Breazeal wrote:
> Hi Pedro,
> 
> On 2/1/2016 11:38 AM, Pedro Alves wrote:
>> On 01/28/2016 12:48 AM, Don Breazeal wrote:
>>> This patch addresses "fork:Interrupted system call" (or wait:) failures
>>> in gdb.threads/forking-threads-plus-breakpoint.exp.
>>>
>>> The test program spawns ten threads, each of which does ten fork/waitpid
>>> sequences.  The cause of the problem was that when one of the fork
>>> children exited before the corresponding fork parent could initiate its
>>> waitpid for that child, a SIGCHLD was delivered and interrupted a fork
>>> or waitpid in another thread.
> 
> In fact, I think my diagnosis here was incorrect, or at least incorrect
> in some cases.  I believe at least some of the interruptions are caused
> by SIGSTOP, sent by GDB when stopping all the threads.  More below.
> 
>>>
>>> The fix was to wrap the system calls in a loop to retry the call if
>>> it was interrupted, like:
>>>
>>> do
>>>   {
>>>     pid = fork ();
>>>   }
>>> while (pid == -1 && errno == EINTR);
>>>
>>> Since this is a Linux-only test I figure it is OK to use errno and EINTR.
>>>
>>> Tested on Nios II Linux target with x86 Linux host.
>>
>> I'd prefer to avoid this if possible.  These loops potentially hide
>> bugs like ERESTARTSYS escaping out of a syscall and mishandling of
>> signals.  See bc9540e842eb5639ca59cb133adef211d252843c for example:
>>    https://sourceware.org/ml/gdb-patches/2015-02/msg00654.html
>>
>> How about setting SIGCHLD to SIG_IGN, or making SIGCHLD be SA_RESTART?
> 
> I spent a couple of days trying to find an alternate solution, but
> couldn't find one that was reliable.  Here is a snapshot of what I tried:
> 
> 1) SIG_IGN: results in an ECHILD from waitpid.  The man page for waitpid
>    says "This can happen for one's own child if the action for SIGCHLD is
>    set to SIG_IGN."
> 
> 2) SA_RESTART: While waitpid is listed as a system call that can be
>    restarted by SA_RESTART, fork is not.  Even if I leave the "EINTR loop"
>    in place for fork, using SA_RESTART I still see an interrupted system
>    call for waitpid.  Possibly because the problem is SIGSTOP and not
>    SIGCHLD.
> 
> 3) pthread_sigmask: With SIGCHLD blocked in all the threads, I
>    still saw an interrupted system call.  You can't block SIGSTOP.
> 
> 4) pthread_sigmask with sigwait: using pthread_sigmask to block all the
>    blockable signals, with a signal thread that called sigwait for all
>    the signals in a loop, the signal thread would see a bunch of SIGCHLDs,
>    but there would eventually be an interrupted system call.
> 
> 5) bsd_signal: this function is supposed to automatically restart blocking
>    system calls.  fork is not a blocking system call, but it doesn't help
>    for waitpid either.
> 
> I found this in the ptrace(2) man page: "Note that a suppressed signal
> still causes system calls to return prematurely.  In this case, system
> calls will be restarted: the tracer will observe the tracee to reexecute
> the interrupted system call (or restart_syscall(2) system call for a few
> system calls which use a different mechanism for restarting) if the tracer
> uses PTRACE_SYSCALL.  Even system calls (such as poll(2)) which are not
> restartable after signal are restarted after signal is suppressed; however,
> kernel bugs exist which cause some system calls to fail with EINTR even
> though no observable signal is injected to the tracee."
> 
> The GDB manual mentions something similar about interrupted system calls.
> 
> So, the bottom line is that I haven't changed the fix for the interrupted
> system calls, because I can't find anything that works as well as the
> original fix.  Perhaps this test puts enough stress on the kernel that the
> kernel bugs mentioned above are exposed.
> 
> One change I did make from the previous version was to increase the
> timeout to 90 seconds, which was necessary to get more reliable results
> on the Nios II target.
> 
> Let me know what you think.
> Thanks!
> --Don
> 
> ---
> This patch addresses "fork:Interrupted system call" (or wait:) failures
> in gdb.threads/forking-threads-plus-breakpoint.exp.
> 
> The test program spawns ten threads, each of which does ten fork/waitpid
> sequences.  The cause of the problem was that when one of the fork
> children exited before the corresponding fork parent could initiate its
> waitpid for that child, a SIGCHLD was delivered and interrupted a fork
> or waitpid in another thread.
> 
> The fix was to wrap the system calls in a loop to retry the call if
> it was interrupted, like:
> 
> do
>   {
>     pid = fork ();
>   }
> while (pid == -1 && errno == EINTR);
> 
> Since this is a Linux-only test I figure it is OK to use errno and EINTR.
> I tried a number of alternative fixes using SIG_IGN, SA_RESTART,
> pthread_sigmask, and bsd_signal, but none of these worked as well.
> 
> Tested on Nios II Linux target with x86 Linux host.
> 
> gdb/testsuite/ChangeLog:
> 2016-02-10  Don Breazeal  <donb@codesourcery.com>
> 
> 	* gdb.threads/forking-threads-plus-breakpoint.c (thread_forks):
> 	Retry fork and waitpid on interrupted system call errors.
> 	* gdb.threads/forking-threads-plus-breakpoint.exp (do_test):
> 	Increase timeout to 90.
> 
> ---
>  .../gdb.threads/forking-threads-plus-breakpoint.c          | 14 ++++++++++++--
>  .../gdb.threads/forking-threads-plus-breakpoint.exp        |  3 +++
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c
> index fc64d93..c169e18 100644
> --- a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c
> +++ b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c
> @@ -22,6 +22,7 @@
>  #include <sys/types.h>
>  #include <sys/wait.h>
>  #include <stdlib.h>
> +#include <errno.h>
>  
>  /* Number of threads.  Each thread continuously spawns a fork and wait
>     for it.  If we have another thread continuously start a step over,
> @@ -49,14 +50,23 @@ thread_forks (void *arg)
>      {
>        pid_t pid;
>  
> -      pid = fork ();
> +      do
> +	{
> +	  pid = fork ();
> +	}
> +      while (pid == -1 && errno == EINTR);
>  
>        if (pid > 0)
>  	{
>  	  int status;
>  
>  	  /* Parent.  */
> -	  pid = waitpid (pid, &status, 0);
> +	  do
> +	    {
> +	      pid = waitpid (pid, &status, 0);
> +	    }
> +	  while (pid == -1 && errno == EINTR);
> +
>  	  if (pid == -1)
>  	    {
>  	      perror ("wait");
> diff --git a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp
> index ff3ca9a..6889c2b 100644
> --- a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp
> +++ b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp
> @@ -73,6 +73,9 @@ proc do_test { cond_bp_target detach_on_fork displaced } {
>      global linenum
>      global is_remote_target
>  
> +    global timeout
> +    set timeout 90
> +
>      set saved_gdbflags $GDBFLAGS
>      set GDBFLAGS [concat $GDBFLAGS " -ex \"set non-stop on\""]
>      clean_restart $binfile
> 
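
For readers who want to try the retry idiom outside the testsuite, here is
a minimal standalone sketch of the same "EINTR loop" used in the patch.
The helper names fork_retry, waitpid_retry and fork_and_reap are
hypothetical and do not appear in the patch; they only wrap the
do/while loops shown in the diff so they can be exercised in isolation.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: call fork, retrying while it fails with EINTR.  */
static pid_t
fork_retry (void)
{
  pid_t pid;

  do
    {
      pid = fork ();
    }
  while (pid == -1 && errno == EINTR);

  return pid;
}

/* Hypothetical helper: wait for CHILD, retrying while waitpid fails
   with EINTR.  */
static pid_t
waitpid_retry (pid_t child, int *status)
{
  pid_t pid;

  do
    {
      pid = waitpid (child, status, 0);
    }
  while (pid == -1 && errno == EINTR);

  return pid;
}

/* Fork one child that exits with EXIT_CODE.  Return the exit status
   the parent observes, or -1 if fork or waitpid failed for a reason
   other than EINTR.  */
static int
fork_and_reap (int exit_code)
{
  int status;
  pid_t pid = fork_retry ();

  if (pid == -1)
    return -1;
  if (pid == 0)
    _exit (exit_code);	/* Child.  */

  /* Parent.  */
  if (waitpid_retry (pid, &status) != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Any error other than EINTR still falls out of the loops, so the caller can
report it with perror as the test program does.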

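For comparison, alternative 2) in the list above (SA_RESTART) would look
roughly like the sketch below.  This is not code from the patch, and the
names sigchld_handler, install_restarting_sigchld_handler and
sigchld_has_sa_restart are hypothetical.  It only shows how a SIGCHLD
handler is installed with SA_RESTART; as reported above, this did not
prevent the interrupted system calls when the program ran under GDB.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>

/* Hypothetical no-op handler.  A handler must be installed for
   SA_RESTART to matter; SIG_IGN would instead make waitpid fail
   with ECHILD, as noted in alternative 1).  */
static void
sigchld_handler (int sig)
{
  (void) sig;
}

/* Install the handler for SIGCHLD with SA_RESTART set.  Return 0 on
   success, -1 on error (as sigaction does).  */
static int
install_restarting_sigchld_handler (void)
{
  struct sigaction sa;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = sigchld_handler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_RESTART;

  return sigaction (SIGCHLD, &sa, NULL);
}

/* Return nonzero if the current SIGCHLD disposition has SA_RESTART
   set, zero otherwise (including when the query fails).  */
static int
sigchld_has_sa_restart (void)
{
  struct sigaction cur;

  if (sigaction (SIGCHLD, NULL, &cur) != 0)
    return 0;
  return (cur.sa_flags & SA_RESTART) != 0;
}
```

The flag only affects calls interrupted by the SIGCHLD handler itself.  It
does nothing for a SIGSTOP sent by the debugger, which is consistent with
the suspicion above that SIGSTOP is involved.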