From: Sandra Loosemore <sandra@codesourcery.com>
To: gdb-patches <gdb-patches@sourceware.org>
Cc: Pedro Alves <palves@redhat.com>, Yao Qi <yao.qi@linaro.org>
Subject: [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts
Date: Fri, 04 Sep 2015 16:55:00 -0000
Message-ID: <55E9CCCD.7060604@codesourcery.com>
[-- Attachment #1: Type: text/plain, Size: 2544 bytes --]
While running GDB tests on nios2-linux-gnu with gdbserver and "target
remote", I've been seeing random failures in
gdb.threads/non-stop-fair-events.exp. E.g. in one test run I got
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 1 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 2 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 3 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=7: thread 1 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 1 broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 2 broke out of loop (timeout)
and in other test runs I got different ones. The pattern seemed to be
that sometimes it took an extra long time for the first thread to break
out of the loop, but once that happened they would all stop correctly
and send the expected replies even though GDB had given up on waiting
for the first few already.
I've come up with the attached patch to factor the timeout for the
failing tests by the number of threads still running, which seems to
take care of the problem. Does this seem reasonable?
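
To make the resulting schedule concrete, here is a minimal standalone Tcl sketch (plain tclsh, no DejaGnu) of the per-iteration timeouts the loop would use. It assumes that with_timeout_factor simply multiplies the base timeout by its argument; the helper name timeout_schedule and the values 10 and 4 are hypothetical and chosen only for illustration.

```tcl
# Sketch only: timeout_schedule is a hypothetical helper, not part of
# the GDB testsuite.  It mirrors the patch's bookkeeping: the factor
# starts at the thread count and drops by one per iteration.
proc timeout_schedule {base_timeout num_threads} {
    set schedule {}
    set max_timeout $num_threads
    for {set i 1} {$i <= $num_threads} {incr i} {
        # Iteration i may wait up to base_timeout times the number of
        # threads that have not yet reported.
        lappend schedule [expr {$base_timeout * $max_timeout}]
        set max_timeout [expr {$max_timeout - 1}]
    }
    return $schedule
}

# Illustrative values: a 10-second base timeout and 4 threads.
puts [timeout_schedule 10 4]   ;# prints: 40 30 20 10
```

Under that assumption the first wait gets the largest allowance, and the last thread is waited for with the unscaled timeout.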
I'm somewhat confused because, in spite of it sometimes taking at least
3 times the normal timeout for the first stop message to appear, the
alarm in the test case (which is tied to the normal timeout) was never
triggering. My best theory on that is that the slowness is not in the
test case, but rather in gdbserver. IOW, all the threads are already
stopped by the time the alarm would expire, but gdb and gdbserver
haven't finished all the notifications and requests to print a stop
message for any of the threads yet. Is that plausible? Should the
timeout for the alarm be factored by the number of threads, too, just to
be safe?
I'm also not entirely sure what this test case is supposed to test.
From the original commit message and comments in the .exp file it seems
like timeouts were supposed to be a sign of a broken kernel with thread
starvation problems, not bugs in gdb or gdbserver. But, don't we
normally just skip tests that the target doesn't support or can't run
properly, rather than report them as FAILs? And, I don't know how to
distinguish timeouts that mean the kernel is broken from timeouts that
mean the target is just slow and you need to set a bigger value in the
test harness.
-Sandra the confused
[-- Attachment #2: fair.log --]
[-- Type: text/x-log, Size: 190 bytes --]
2015-09-04 Sandra Loosemore <sandra@codesourcery.com>
gdb/testsuite/
* gdb.threads/non-stop-fair-events.exp (test): Use factored
timeout when waiting for threads to break out of loop.
[-- Attachment #3: fair.patch --]
[-- Type: text/x-patch, Size: 1315 bytes --]
diff --git a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
index e2d3f7d..1570d3f 100644
--- a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
+++ b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
@@ -135,16 +135,22 @@ proc test {signal_thread} {
     # Wait for all threads to finish their steps, and for the main
     # thread to hit the breakpoint.
+    # Running this many threads may be quite slow on remote targets,
+    # so factor the timeout according to how many threads are running.
+    set max_timeout $NUM_THREADS
     for {set i 1} { $i <= $NUM_THREADS } { incr i } {
         set test "thread $i broke out of loop"
-        gdb_test_multiple "" $test {
-            -re "loop_broke" {
-                # The prompt was already matched in the "continue
-                # &" test above. We're now consuming asynchronous
-                # output that comes after the prompt.
-                pass $test
+        with_timeout_factor $max_timeout {
+            gdb_test_multiple "" $test {
+                -re "loop_broke" {
+                    # The prompt was already matched in the "continue
+                    # &" test above. We're now consuming asynchronous
+                    # output that comes after the prompt.
+                    pass $test
+                }
             }
         }
+        set max_timeout [expr $max_timeout - 1]
     }
     # It's helpful to have this in the log if the test ever