From: Don Breazeal <donb@codesourcery.com>
To: <gdb-patches@sourceware.org>, <palves@redhat.com>
Subject: Re: [PING] Re: [PATCH v2 2/3] PR remote/19496, interrupted syscall in forking-threads-plus-bkpt
Date: Mon, 14 Mar 2016 21:29:00 -0000
Message-ID: <56E72D2F.2010307@codesourcery.com>
In-Reply-To: <56D88042.1050207@codesourcery.com>
Hi Pedro,
Did you have any more suggestions about handling the interrupted system
call, or shall we go with the "loop until we don't get -1 and
errno == EINTR" approach?
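
For reference, here is a minimal standalone sketch of that approach. It is
not the test program itself: the helper names fork_retry, waitpid_retry and
fork_and_wait_once are hypothetical and exist only for this illustration.
Each helper retries its call for as long as it fails with EINTR and passes
any other result straight through to the caller.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: call fork again while it fails with EINTR.  */
static pid_t
fork_retry (void)
{
  pid_t pid;

  do
    pid = fork ();
  while (pid == -1 && errno == EINTR);

  return pid;
}

/* Hypothetical helper: call waitpid again while it fails with EINTR.  */
static pid_t
waitpid_retry (pid_t child, int *status)
{
  pid_t pid;

  do
    pid = waitpid (child, status, 0);
  while (pid == -1 && errno == EINTR);

  return pid;
}

/* Fork one child that exits immediately, then reap it.
   Returns 0 if the child was reaped and exited with status 0.  */
static int
fork_and_wait_once (void)
{
  int status;
  pid_t pid = fork_retry ();

  if (pid == -1)
    return -1;			/* fork failed for a reason other than EINTR.  */

  if (pid == 0)
    _exit (0);			/* Child.  */

  /* Parent.  */
  if (waitpid_retry (pid, &status) != pid)
    return -1;

  return (WIFEXITED (status) && WEXITSTATUS (status) == 0) ? 0 : -1;
}
```

Errors other than EINTR still reach the caller, so a genuine fork or waitpid
failure is still reported. The loop does not distinguish a legitimate EINTR
from one that escaped the kernel by mistake, which is the concern raised in
the review.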
Thanks,
--Don
On 3/3/2016 10:19 AM, Don Breazeal wrote:
> Ping.
> I checked, the patch still applies cleanly to mainline.
> Thanks
> --Don
>
> On 2/25/2016 9:28 AM, Don Breazeal wrote:
>> Ping
>> Thanks,
>> --Don
>>
>> On 2/10/2016 4:28 PM, Don Breazeal wrote:
>>> Hi Pedro,
>>>
>>> On 2/1/2016 11:38 AM, Pedro Alves wrote:
>>>> On 01/28/2016 12:48 AM, Don Breazeal wrote:
>>>>> This patch addresses "fork:Interrupted system call" (or wait:) failures
>>>>> in gdb.threads/forking-threads-plus-breakpoint.exp.
>>>>>
>>>>> The test program spawns ten threads, each of which does ten fork/waitpid
>>>>> sequences. The cause of the problem was that when one of the fork
>>>>> children exited before the corresponding fork parent could initiate its
>>>>> waitpid for that child, a SIGCHLD was delivered and interrupted a fork
>>>>> or waitpid in another thread.
>>>
>>> In fact, I think my diagnosis here was incorrect, or at least incorrect
>>> in some cases. I believe at least some of the interruptions are caused
>>> by SIGSTOP, sent by GDB when stopping all the threads. More below.
>>>
>>>>>
>>>>> The fix was to wrap the system calls in a loop to retry the call if
>>>>> it was interrupted, like:
>>>>>
>>>>>     do
>>>>>       {
>>>>>         pid = fork ();
>>>>>       }
>>>>>     while (pid == -1 && errno == EINTR);
>>>>>
>>>>> Since this is a Linux-only test I figure it is OK to use errno and EINTR.
>>>>>
>>>>> Tested on Nios II Linux target with x86 Linux host.
>>>>
>>>> I'd prefer to avoid this if possible. These loops potentially hide
>>>> bugs like ERESTARTSYS escaping out of a syscall and mishandling of
>>>> signals. See bc9540e842eb5639ca59cb133adef211d252843c for example:
>>>> https://sourceware.org/ml/gdb-patches/2015-02/msg00654.html
>>>>
>>>> How about setting SIGCHLD to SIG_IGN, or making SIGCHLD be SA_RESTART?
>>>
>>> I spent a couple of days trying to find an alternate solution, but
>>> couldn't find one that was reliable. Here is a snapshot of what I tried:
>>>
>>> 1) SIG_IGN: results in an ECHILD from waitpid. The man page for waitpid
>>> says "This can happen for one's own child if the action for SIGCHLD is
>>> set to SIG_IGN."
>>>
>>> 2) SA_RESTART: While waitpid is listed as a system call that can be
>>> restarted by SA_RESTART, fork is not. Even if I leave the "EINTR loop"
>>> in place for fork, using SA_RESTART I still see an interrupted system
>>> call for waitpid. Possibly because the problem is SIGSTOP and not
>>> SIGCHLD.
>>>
>>> 3) pthread_sigblock: With this set for SIGCHLD in all the threads, I
>>> still saw an interrupted system call. You can't block SIGSTOP.
>>>
>>> 4) pthread_sigblock with sigwait: using pthread_sigblock on all the
>>> blockable signals with a signal thread that called sigwait for all
>>> the signals in a loop, the signal thread would see a bunch of SIGCHLDs,
>>> but there would eventually be an interrupted system call.
>>>
>>> 5) bsd_signal: this function is supposed to automatically restart blocking
>>> system calls. fork is not a blocking system call, but it doesn't help
>>> for waitpid either.
>>>
>>> I found this in the ptrace(2) man page: "Note that a suppressed signal
>>> still causes system calls to return prematurely. In this case, system
>>> calls will be restarted: the tracer will observe the tracee to reexecute
>>> the interrupted system call (or restart_syscall(2) system call for a few
>>> system calls which use a different mechanism for restarting) if the tracer
>>> uses PTRACE_SYSCALL. Even system calls (such as poll(2)) which are not
>>> restartable after signal are restarted after signal is suppressed; however,
>>> kernel bugs exist which cause some system calls to fail with EINTR even
>>> though no observable signal is injected to the tracee."
>>>
>>> The GDB manual mentions something similar about interrupted system calls.
>>>
>>> So, the bottom line is that I haven't changed the fix for the interrupted
>>> system calls, because I can't find anything that works as well as the
>>> original fix. Perhaps this test puts enough stress on the kernel that the
>>> kernel bugs mentioned above are exposed.
>>>
>>> One change I did make from the previous version was to increase the
>>> timeout to 90 seconds, which was necessary to get more reliable results
>>> on the Nios II target.
>>>
>>> Let me know what you think.
>>> Thanks!
>>> --Don
>>>
>>> ---
>>> This patch addresses "fork:Interrupted system call" (or wait:) failures
>>> in gdb.threads/forking-threads-plus-breakpoint.exp.
>>>
>>> The test program spawns ten threads, each of which does ten fork/waitpid
>>> sequences. The cause of the problem was that when one of the fork
>>> children exited before the corresponding fork parent could initiate its
>>> waitpid for that child, a SIGCHLD was delivered and interrupted a fork
>>> or waitpid in another thread.
>>>
>>> The fix was to wrap the system calls in a loop to retry the call if
>>> it was interrupted, like:
>>>
>>>     do
>>>       {
>>>         pid = fork ();
>>>       }
>>>     while (pid == -1 && errno == EINTR);
>>>
>>> Since this is a Linux-only test I figure it is OK to use errno and EINTR.
>>> I tried a number of alternative fixes using SIG_IGN, SA_RESTART,
>>> pthread_sigblock, and bsd_signal, but none of these worked as well.
>>>
>>> Tested on Nios II Linux target with x86 Linux host.
>>>
>>> gdb/testsuite/ChangeLog:
>>> 2016-02-10 Don Breazeal <donb@codesourcery.com>
>>>
>>> * gdb.threads/forking-threads-plus-breakpoint.c (thread_forks):
>>> Retry fork and waitpid on interrupted system call errors.
>>> * gdb.threads/forking-threads-plus-breakpoint.exp (do_test):
>>> Increase timeout to 90.
>>>
>>> ---
>>> .../gdb.threads/forking-threads-plus-breakpoint.c | 14 ++++++++++++--
>>> .../gdb.threads/forking-threads-plus-breakpoint.exp | 3 +++
>>> 2 files changed, 15 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c
>>> index fc64d93..c169e18 100644
>>> --- a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c
>>> +++ b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.c
>>> @@ -22,6 +22,7 @@
>>> #include <sys/types.h>
>>> #include <sys/wait.h>
>>> #include <stdlib.h>
>>> +#include <errno.h>
>>>
>>> /* Number of threads. Each thread continuously spawns a fork and wait
>>> for it. If we have another thread continuously start a step over,
>>> @@ -49,14 +50,23 @@ thread_forks (void *arg)
>>> {
>>> pid_t pid;
>>>
>>> - pid = fork ();
>>> + do
>>> + {
>>> + pid = fork ();
>>> + }
>>> + while (pid == -1 && errno == EINTR);
>>>
>>> if (pid > 0)
>>> {
>>> int status;
>>>
>>> /* Parent. */
>>> - pid = waitpid (pid, &status, 0);
>>> + do
>>> + {
>>> + pid = waitpid (pid, &status, 0);
>>> + }
>>> + while (pid == -1 && errno == EINTR);
>>> +
>>> if (pid == -1)
>>> {
>>> perror ("wait");
>>> diff --git a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp
>>> index ff3ca9a..6889c2b 100644
>>> --- a/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp
>>> +++ b/gdb/testsuite/gdb.threads/forking-threads-plus-breakpoint.exp
>>> @@ -73,6 +73,9 @@ proc do_test { cond_bp_target detach_on_fork displaced } {
>>> global linenum
>>> global is_remote_target
>>>
>>> + global timeout
>>> + set timeout 90
>>> +
>>> set saved_gdbflags $GDBFLAGS
>>> set GDBFLAGS [concat $GDBFLAGS " -ex \"set non-stop on\""]
>>> clean_restart $binfile
>>>
>>
>