[RFC] fix gdb.threads/non-stop-fair-events.exp timeouts

public inbox for gdb-patches@sourceware.org

* [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts
@ 2015-09-04 16:55 Sandra Loosemore
  2015-09-08 16:30 ` Pedro Alves
  0 siblings, 1 reply; 6+ messages in thread
From: Sandra Loosemore @ 2015-09-04 16:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Pedro Alves, Yao Qi

[-- Attachment #1: Type: text/plain, Size: 2544 bytes --]

While running GDB tests on nios2-linux-gnu with gdbserver and "target 
remote", I've been seeing random failures in 
gdb.threads/non-stop-fair-events.exp.  E.g. in one test run I got

FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 1 
broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 2 
broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 3 
broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=7: thread 1 
broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 1 
broke out of loop (timeout)
FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 2 
broke out of loop (timeout)

and in other test runs I got different ones.  The pattern seemed to be
that sometimes it took an extra long time for the first thread to break 
out of the loop, but once that happened they would all stop correctly 
and send the expected replies even though GDB had given up on waiting 
for the first few already.

I've come up with the attached patch to factor the timeout for the 
failing tests by the number of threads still running, which seems to 
take care of the problem.  Does this seem reasonable?
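
As a rough sketch of the arithmetic behind this scheme (not part of the patch; the function names are hypothetical and the base timeout in the usage note is an assumption), the factor applied while waiting for the i-th thread is NUM_THREADS - i + 1, so the worst-case total wait grows quadratically with the thread count:

```c
#include <assert.h>

/* Hypothetical helper: timeout factor used while waiting for the
   I-th thread (1-based) out of NUM_THREADS.  It mirrors a counter
   that starts at NUM_THREADS and is decremented after each wait.  */
unsigned int
factor_for_thread (unsigned int num_threads, unsigned int i)
{
  assert (i >= 1 && i <= num_threads);
  return num_threads - i + 1;
}

/* Worst-case total wait, in seconds, if every wait runs to its full
   factored timeout.  BASE_TIMEOUT is the unfactored timeout.  */
unsigned int
worst_case_total (unsigned int num_threads, unsigned int base_timeout)
{
  unsigned int total = 0;

  for (unsigned int i = 1; i <= num_threads; i++)
    total += base_timeout * factor_for_thread (num_threads, i);
  return total;
}
```

For example, with 11 threads and an assumed base timeout of 10 seconds, the first wait may last up to 110 seconds and the last up to 10 seconds, for a worst case of 660 seconds per test iteration.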

I'm somewhat confused because, in spite of it sometimes taking at least 
3 times the normal timeout for the first stop message to appear, the 
alarm in the test case (which is tied to the normal timeout) was never 
triggering.  My best theory on that is that the slowness is not in the 
test case, but rather in gdbserver.  IOW, all the threads are already 
stopped by the time the alarm would expire, but gdb and gdbserver 
haven't finished all the notifications and requests to print a stop 
message for any of the threads yet.  Is that plausible?  Should the 
timeout for the alarm be factored by the number of threads, too, just to 
be safe?

I'm also not entirely sure what this test case is supposed to test.
From the original commit message and comments in the .exp file it seems
like timeouts were supposed to be a sign of a broken kernel with thread 
starvation problems, not bugs in gdb or gdbserver.  But, don't we 
normally just skip tests that the target doesn't support or can't run 
properly, rather than report them as FAILs?  And, I don't know how to 
distinguish timeouts that mean the kernel is broken from timeouts that 
mean the target is just slow and you need to set a bigger value in the 
test harness.

-Sandra the confused

[-- Attachment #2: fair.log --]
[-- Type: text/x-log, Size: 190 bytes --]

2015-09-04  Sandra Loosemore  <sandra@codesourcery.com>

	gdb/testsuite/
	* gdb.threads/non-stop-fair-events.exp (test): Use factored
	timeout when waiting for threads to break out of loop.

[-- Attachment #3: fair.patch --]
[-- Type: text/x-patch, Size: 1315 bytes --]

diff --git a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
index e2d3f7d..1570d3f 100644
--- a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
+++ b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
@@ -135,16 +135,22 @@ proc test {signal_thread} {

 	# Wait for all threads to finish their steps, and for the main
 	# thread to hit the breakpoint.
+	# Running this many threads may be quite slow on remote targets,
+	# so factor the timeout according to how many threads are running.
+	set max_timeout $NUM_THREADS
 	for {set i 1} { $i <= $NUM_THREADS } { incr i } {
 	    set test "thread $i broke out of loop"
-	    gdb_test_multiple "" $test {
-		-re "loop_broke" {
-		    # The prompt was already matched in the "continue
-		    # &" test above.  We're now consuming asynchronous
-		    # output that comes after the prompt.
-		    pass $test
+	    with_timeout_factor $max_timeout {
+	        gdb_test_multiple "" $test {
+		    -re "loop_broke" {
+			# The prompt was already matched in the "continue
+			# &" test above.  We're now consuming asynchronous
+			# output that comes after the prompt.
+			pass $test
+		    }
 		}
 	    }
+	    set max_timeout [expr $max_timeout - 1]
 	}

 	# It's helpful to have this in the log if the test ever


* Re: [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts
  2015-09-04 16:55 [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts Sandra Loosemore
@ 2015-09-08 16:30 ` Pedro Alves
  2015-09-09 16:09   ` Sandra Loosemore
  0 siblings, 1 reply; 6+ messages in thread
From: Pedro Alves @ 2015-09-08 16:30 UTC (permalink / raw)
  To: Sandra Loosemore, gdb-patches; +Cc: Yao Qi

[-- Attachment #1: Type: text/plain, Size: 4363 bytes --]

On 09/04/2015 05:54 PM, Sandra Loosemore wrote:
> While running GDB tests on nios2-linux-gnu with gdbserver and "target 
> remote", I've been seeing random failures in 
> gdb.threads/non-stop-fair-events.exp.  E.g. in one test run I got
> 
> FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 1 
> broke out of loop (timeout)
> FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 2 
> broke out of loop (timeout)
> FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=6: thread 3 
> broke out of loop (timeout)
> FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=7: thread 1 
> broke out of loop (timeout)
> FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 1 
> broke out of loop (timeout)
> FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=10: thread 2 
> broke out of loop (timeout)
> 
> and in other test runs I got a different ones.  The pattern seemed to be 
> that sometimes it took an extra long time for the first thread to break 
> out of the loop, but once that happened they would all stop correctly 
> and send the expected replies even though GDB had given up on waiting 
> for the first few already.

Yeah, I've seen this before with a local series I use for debugging
software single-step things that implements software single-stepping
on x86.  I just re-tried it now after rebasing that series to
current mainline, and I still see the timeouts against gdbserver.

AFAICS, nios2 is a software single-step target that does not implement
displaced stepping either.  I had a patch for this that I had
never posted.  See attached.

> 
> I've come up with the attached patch to factor the timeout for the 
> failing tests by the number of threads still running, which seems to 
> take care of the problem.  Does this seem reasonable?

I'd rather not do it unconditionally; it's just 10 threads, and if the
target supports displaced stepping and starvation avoidance in gdb is
working correctly, the test should complete quickly.  It takes a couple of
seconds on my getting-old x86-64 laptop.

> 
> I'm somewhat confused because, in spite of it sometimes taking at least 
> 3 times the normal timeout for the first stop message to appear, the 
> alarm in the test case (which is tied to the normal timeout) was never 
> triggering.  My best theory on that is that the slowness is not in the 
> test case, but rather in gdbserver.  IOW, all the threads are already 
> stopped by the time the alarm would expire, but gdb and gdbserver 
> haven't finished all the notifications and requests to print a stop 
> message for any of the threads yet.  Is that plausible?  Should the 
> timeout for the alarm be factored by the number of threads, too, just to 
> be safe?

Or maybe it was triggering, but the SIGALRM never manages to be processed
by gdb and passed down to the inferior.

> 
> I'm also not entirely sure what this test case is supposed to test. 
>  From the original commit message and comments in the .exp file it seems 
> like timeouts were supposed to be a sign of a broken kernel with thread 
> starvation problems, not bugs in gdb or gdbserver.  

On the kernel side, "waitpid(-1, ...)" just walks the task list linearly
looking for the first that had an event.  Say you have two threads, A and
B which are constantly hitting events/breakpoints.  If A is quick enough,
"waitpid(-1, ...)" returns the event for thread A over and over, and thread B is
starved.  The linux backends in both gdb and gdbserver have code in place that
picks an event LWP at random out of all that have had events.  A similar problem
exists as soon as events are queued out of the target backends into gdb's core
run control -- events can end up pending for processing later in gdb's core
data structures too, and so if gdb just picked those events by walking its own
thread list looking for the first thread that has an event pending, it'd starve
some threads.  So again infrun.c has similar randomization code to avoid
starvation (random_pending_event_thread).
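
To make the difference concrete, here is a minimal sketch of the two selection strategies.  It is a toy model, not the actual code in the linux backends or in infrun.c: the array of pending flags and both function names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model: PENDING[i] != 0 means thread I has an unreported event.  */

/* Linear scan, like walking a task or thread list in order: always
   returns the first thread with a pending event, or -1 if none.  */
int
pick_first_pending (const int *pending, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (pending[i])
      return (int) i;
  return -1;
}

/* Randomized pick: count the threads with pending events, then
   select the K-th one for a random K.  Returns -1 if none.  */
int
pick_random_pending (const int *pending, size_t n)
{
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    if (pending[i])
      count++;
  if (count == 0)
    return -1;

  size_t k = (size_t) rand () % count;

  for (size_t i = 0; i < n; i++)
    {
      if (!pending[i])
	continue;
      if (k == 0)
	return (int) i;
      k--;
    }
  assert (0 && "unreachable: count was non-zero");
  return -1;
}
```

If thread 0 always has an event pending, pick_first_pending returns 0 every time and any later thread is starved, whereas pick_random_pending eventually reports every thread that has an event.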

> But, don't we 
> normally just skip tests that the target doesn't support or can't run 
> properly, rather than report them as FAILs?  And, I don't know how to 
> distinguish timeouts that mean the kernel is broken from timeouts that 
> mean the target is just slow and you need to set a bigger value in the 
> test harness.

Pedro Alves


[-- Attachment #2: 0001-Make-it-easier-to-debug-non-stop-fair-events.exp.patch --]
[-- Type: text/x-patch, Size: 3466 bytes --]

From f84e249f33a8d80f9ffc137f8505f5fd79cd13ab Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Mon, 13 Apr 2015 20:59:37 +0100
Subject: [PATCH 1/2] Make it easier to debug non-stop-fair-events.exp

If we enable infrun debug running this test, it quickly fails with a
full expect buffer.  That can be simply handled with a couple
exp_continues.  As it's annoying to hack this every time we need to
debug the test, this patch adds bits to enable debugging support
easily, with a one-line change.

And then, if any iteration of the test fails, we end up with a long
cascade of timeouts.  Just bail out when we see the first failure.

gdb/testsuite/
2015-09-08  Pedro Alves  <palves@redhat.com>

	* gdb.threads/non-stop-fair-events.exp (gdb_test_no_anchor)
	(enable_debug): New procedures.
	(test): Use them.  Bail out if waiting for threads fails.
	(top level): Bail out if a test fails.
---
 gdb/testsuite/gdb.threads/non-stop-fair-events.exp | 57 ++++++++++++++++++++--
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
index e2d3f7d..37f5bcb 100644
--- a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
+++ b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
@@ -98,6 +98,29 @@ proc restart {} {
     delete_breakpoints
 }
 
+# Run command and wait for the prompt, without end anchor.
+
+proc gdb_test_no_anchor {cmd} {
+    global gdb_prompt
+
+    gdb_test_multiple $cmd $cmd {
+	-re "$gdb_prompt " {
+	    pass $cmd
+	}
+    }
+}
+
+# Enable/disable debugging.
+
+proc enable_debug {enable} {
+
+    # Comment out to debug problems with the test.
+    return
+
+    gdb_test_no_anchor "set debug infrun $enable"
+    gdb_test_no_anchor "set debug displaced $enable"
+}
+
 # The test proper.  SIGNAL_THREAD is the thread that has been elected
 # to receive the SIGUSR1 signal.
 
@@ -126,30 +149,55 @@ proc test {signal_thread} {
 
 	# Let the main thread queue the signal.
 	gdb_breakpoint "loop_broke"
+
+	enable_debug 1
+
+	set saw_continuing 0
 	set test "continue &"
 	gdb_test_multiple $test $test {
-	    -re "Continuing.\r\n$gdb_prompt " {
-		pass $test
+	    -re "Continuing.\r\n" {
+		set saw_continuing 1
+		exp_continue
+	    }
+	    -re "$gdb_prompt " {
+		gdb_assert $saw_continuing $test
+	    }
+	    -re "infrun:" {
+		exp_continue
 	    }
 	}
 
+	set gotit 0
+
 	# Wait for all threads to finish their steps, and for the main
 	# thread to hit the breakpoint.
 	for {set i 1} { $i <= $NUM_THREADS } { incr i } {
 	    set test "thread $i broke out of loop"
+	    set gotit 0
 	    gdb_test_multiple "" $test {
 		-re "loop_broke" {
 		    # The prompt was already matched in the "continue
 		    # &" test above.  We're now consuming asynchronous
 		    # output that comes after the prompt.
+		    set gotit 1
 		    pass $test
 		}
+		-re "infrun:" {
+		    exp_continue
+		}
+	    }
+	    if {!$gotit} {
+		break
 	    }
 	}
 
+	enable_debug 0
+
 	# It's helpful to have this in the log if the test ever
 	# happens to fail.
 	gdb_test "info threads"
+
+	return $gotit
     }
 }
 
@@ -158,5 +206,8 @@ proc test {signal_thread} {
 # with lowest kernel thread ID.  So test once with the signal pending
 # in each thread, except the main thread.
 for {set i 2} { $i <= $NUM_THREADS } { incr i } {
-    test $i
+    if {![test $i]} {
+	# Avoid cascading timeouts, and bail out.
+	return
+    }
 }
-- 
1.9.3


[-- Attachment #3: 0002-non-stop-fair-events.exp-slower-on-software-single-s.patch --]
[-- Type: text/x-patch, Size: 7097 bytes --]

From ccd1e1cca1409ea8eb6f564561280f345f321077 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Tue, 8 Sep 2015 17:26:02 +0100
Subject: [PATCH 2/2] non-stop-fair-events.exp slower on software single-step
 && !displ-step targets

On software single-step targets that don't support displaced stepping,
threads keep hitting each other's single-step breakpoints, and then
GDB needs to pause all threads to step past those.  The end result is
that progress in the main thread will be slower and it may take a bit
longer for the signal to be queued.  This patch bumps the timeout on
such targets.

gdb/testsuite/ChangeLog:
2015-09-08  Pedro Alves  <palves@redhat.com>

	* gdb.threads/non-stop-fair-events.c (timeout): New global.
	(SECONDS): Redefine.
	(main): Call pthread_kill and alarm early.
	* gdb.threads/non-stop-fair-events.exp: Probe displaced stepping
	support.
	(test): If the target can't hardware step and doesn't support
	displaced stepping, increase the timeout.
---
 gdb/testsuite/gdb.threads/non-stop-fair-events.c   |  9 ++-
 gdb/testsuite/gdb.threads/non-stop-fair-events.exp | 91 +++++++++++++++-------
 gdb/testsuite/lib/gdb.exp                          | 20 +++--
 3 files changed, 82 insertions(+), 38 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/non-stop-fair-events.c b/gdb/testsuite/gdb.threads/non-stop-fair-events.c
index f82c366..700676b 100644
--- a/gdb/testsuite/gdb.threads/non-stop-fair-events.c
+++ b/gdb/testsuite/gdb.threads/non-stop-fair-events.c
@@ -24,7 +24,9 @@
 const int num_threads = NUM_THREADS;
 /* Allow for as much timeout as DejaGnu wants, plus a bit of
    slack.  */
-#define SECONDS (TIMEOUT + 20)
+
+volatile unsigned int timeout = TIMEOUT;
+#define SECONDS (timeout + 20)
 
 pthread_t child_thread[NUM_THREADS];
 volatile pthread_t signal_thread;
@@ -69,6 +71,11 @@ main (void)
   int res;
   int i;
 
+  /* Call these early so that we're sure their PLTs are quickly
+     resolved now, instead of in the busy threads.  */
+  pthread_kill (pthread_self (), 0);
+  alarm (0);
+
   signal (SIGUSR1, handler);
 
   for (i = 0; i < NUM_THREADS; i++)
diff --git a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
index 37f5bcb..ba14f6a 100644
--- a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
+++ b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
@@ -62,6 +62,19 @@ set NUM_THREADS [get_value "num_threads" "get num_threads"]
 # Account for the main thread.
 incr NUM_THREADS
 
+# Probe for displaced stepping support.
+set displaced_stepping_enabled 0
+set msg "check displaced-stepping"
+gdb_test_no_output "set debug displaced 1"
+gdb_test_multiple "si" $msg {
+    -re "displaced pc to.*$gdb_prompt $" {
+	set displaced_stepping_enabled 1
+    }
+    -re ".*$gdb_prompt $" {
+    }
+}
+gdb_test_no_output "set debug displaced 0"
+
 # Run threads to their start positions.  This prepares for a new test
 # sequence.
 
@@ -127,6 +140,8 @@ proc enable_debug {enable} {
 proc test {signal_thread} {
     global gdb_prompt
     global NUM_THREADS
+    global timeout
+    global displaced_stepping_enabled
 
     with_test_prefix "signal_thread=$signal_thread" {
 	restart
@@ -152,42 +167,58 @@ proc test {signal_thread} {
 
 	enable_debug 1
 
-	set saw_continuing 0
-	set test "continue &"
-	gdb_test_multiple $test $test {
-	    -re "Continuing.\r\n" {
-		set saw_continuing 1
-		exp_continue
-	    }
-	    -re "$gdb_prompt " {
-		gdb_assert $saw_continuing $test
-	    }
-	    -re "infrun:" {
-		exp_continue
-	    }
+	# On software single-step targets that don't support displaced
+	# stepping, threads keep hitting each others' single-step
+	# breakpoints, and then GDB needs to pause all threads to step
+	# past those.  The end result is that progress in the main
+	# thread will be slower and it may take a bit longer for the
+	# signal to be queued; bump the timeout.
+	if {!$displaced_stepping_enabled && ![can_hardware_single_step]} {
+	    set factor 5
+	} else {
+	    set factor 1
 	}
-
-	set gotit 0
-
-	# Wait for all threads to finish their steps, and for the main
-	# thread to hit the breakpoint.
-	for {set i 1} { $i <= $NUM_THREADS } { incr i } {
-	    set test "thread $i broke out of loop"
-	    set gotit 0
-	    gdb_test_multiple "" $test {
-		-re "loop_broke" {
-		    # The prompt was already matched in the "continue
-		    # &" test above.  We're now consuming asynchronous
-		    # output that comes after the prompt.
-		    set gotit 1
-		    pass $test
+	with_timeout_factor $factor {
+	    gdb_test "print timeout = $timeout" " = $timeout" \
+		"set timeout in the inferior"
+
+	    set saw_continuing 0
+	    set test "continue &"
+	    gdb_test_multiple $test $test {
+		-re "Continuing.\r\n" {
+		    set saw_continuing 1
+		    exp_continue
+		}
+		-re "$gdb_prompt " {
+		    gdb_assert $saw_continuing $test
 		}
 		-re "infrun:" {
 		    exp_continue
 		}
 	    }
-	    if {!$gotit} {
-		break
+
+	    set gotit 0
+
+	    # Wait for all threads to finish their steps, and for the main
+	    # thread to hit the breakpoint.
+	    for {set i 1} { $i <= $NUM_THREADS } { incr i } {
+		set test "thread $i broke out of loop"
+		set gotit 0
+		gdb_test_multiple "" $test {
+		    -re "loop_broke" {
+			# The prompt was already matched in the "continue
+			# &" test above.  We're now consuming asynchronous
+			# output that comes after the prompt.
+			set gotit 1
+			pass $test
+		    }
+		    -re "infrun:" {
+			exp_continue
+		    }
+		}
+		if {!$gotit} {
+		    break
+		}
 	    }
 	}
 
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 56cde7a..9eaf721 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -2150,15 +2150,10 @@ proc supports_get_siginfo_type {} {
     }
 }
 
-# Return 1 if target hardware or OS supports single stepping to signal
-# handler, otherwise, return 0.
+# Return 1 if the target supports hardware single stepping.
 
-proc can_single_step_to_signal_handler {} {
+proc can_hardware_single_step {} {
 
-    # Targets don't have hardware single step.  On these targets, when
-    # a signal is delivered during software single step, gdb is unable
-    # to determine the next instruction addresses, because start of signal
-    # handler is one of them.
     if { [istarget "arm*-*-*"] || [istarget "mips*-*-*"]
 	 || [istarget "tic6x-*-*"] || [istarget "sparc*-*-linux*"]
 	 || [istarget "nios2-*-*"] } {
@@ -2168,6 +2163,17 @@ proc can_single_step_to_signal_handler {} {
     return 1
 }
 
+# Return 1 if target hardware or OS supports single stepping to signal
+# handler, otherwise, return 0.
+
+proc can_single_step_to_signal_handler {} {
+    # Targets don't have hardware single step.  On these targets, when
+    # a signal is delivered during software single step, gdb is unable
+    # to determine the next instruction addresses, because start of signal
+    # handler is one of them.
+    return [can_hardware_single_step]
+}
+
 # Return 1 if target supports process record, otherwise return 0.
 
 proc supports_process_record {} {
-- 
1.9.3



* Re: [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts
  2015-09-08 16:30 ` Pedro Alves
@ 2015-09-09 16:09   ` Sandra Loosemore
  2015-09-09 18:40     ` Pedro Alves
  0 siblings, 1 reply; 6+ messages in thread
From: Sandra Loosemore @ 2015-09-09 16:09 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Yao Qi

On 09/08/2015 10:30 AM, Pedro Alves wrote:
>
> Yeah, I've seen this before with a local series I use for debugging
> software single-step things that implements software single-stepping
> on x86.  I just re-tried it now after rebasing that series to
> current mainline, and I still see the time outs against gdbserver.
>
> AFAICS, nios2 is a software single-step target that does not implement
> displaced stepping either.  I had a patch for this that I had
> never posted.  See attached.
>

Hmmm, these two patches are not working for me.  The trouble is that 
this part:

> +gdb_test_multiple "si" $msg {
> +    -re "displaced pc to.*$gdb_prompt $" {
> +	set displaced_stepping_enabled 1
> +    }
> +    -re ".*$gdb_prompt $" {
> +    }
> +}

is causing the target to step from main to pthread_self, which is in a 
different file.  This causes the subsequent breakpoint commands to fail, 
and things go south from there:

Breakpoint 1, main () at 
/scratch/sandra/nios2-linux-trunk/src/gdb-trunk/gdb/testsuite/gdb.threads/non-stop-fair-events.c:76
76	  pthread_kill (pthread_self (), 0);
(gdb) handle SIGUSR1 print nostop pass
Signal        Stop	Print	Pass to program	Description
SIGUSR1       No	Yes	Yes		User defined signal 1
(gdb) PASS: gdb.threads/non-stop-fair-events.exp: handle SIGUSR1 print 
nostop pass
print num_threads
$1 = 10
(gdb) PASS: gdb.threads/non-stop-fair-events.exp: get num_threads
set debug displaced 1
(gdb) PASS: gdb.threads/non-stop-fair-events.exp: set debug displaced 1
si
0x00002720 in pthread_self () at pthread_self.c:27
27	}
(gdb) set debug displaced 0
(gdb) PASS: gdb.threads/non-stop-fair-events.exp: set debug displaced 0
delete breakpoints
Delete all breakpoints? (y or n) y
(gdb) info breakpoints
No breakpoints or watchpoints.
(gdb) print got_sig = 0
$2 = 0
(gdb) PASS: gdb.threads/non-stop-fair-events.exp: signal_thread=2: print 
got_sig = 0
break 63
No line 63 in the current file.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=2: 
setting breakpoint at 63
break 88
No line 88 in the current file.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) FAIL: gdb.threads/non-stop-fair-events.exp: signal_thread=2: 
setting breakpoint at 88

-Sandra


* Re: [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts
  2015-09-09 16:09   ` Sandra Loosemore
@ 2015-09-09 18:40     ` Pedro Alves
  2015-09-15  2:25       ` Sandra Loosemore
  0 siblings, 1 reply; 6+ messages in thread
From: Pedro Alves @ 2015-09-09 18:40 UTC (permalink / raw)
  To: Sandra Loosemore; +Cc: gdb-patches, Yao Qi

On 09/09/2015 05:08 PM, Sandra Loosemore wrote:
> On 09/08/2015 10:30 AM, Pedro Alves wrote:
>>
>> Yeah, I've seen this before with a local series I use for debugging
>> software single-step things that implements software single-stepping
>> on x86.  I just re-tried it now after rebasing that series to
>> current mainline, and I still see the time outs against gdbserver.
>>
>> AFAICS, nios2 is a software single-step target that does not implement
>> displaced stepping either.  I had a patch for this that I had
>> never posted.  See attached.
>>
> 
> Hmmm, these two patches are not working for me.  The trouble is that 
> this part:
> 
>> +gdb_test_multiple "si" $msg {
>> +    -re "displaced pc to.*$gdb_prompt $" {
>> +	set displaced_stepping_enabled 1
>> +    }
>> +    -re ".*$gdb_prompt $" {
>> +    }
>> +}
> 
> is causing the target to step from main to pthread_self, which is in a 
> different file.  This causes the subsequent breakpoint commands to fail, 
> and things go south from there:

OK, I got "lucky" on x86, where a stepi runs some instruction before the call.
Could you try simply replacing the "si" with "next" ?  It doesn't matter
whether that issues several single-steps or not.  What matters is that gdb
tries to step past the breakpoint that is set at the current PC (from the
earlier runto_main).  We're trying to figure out if gdb uses displaced
stepping for that.

Thanks,
Pedro Alves


* Re: [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts
  2015-09-09 18:40     ` Pedro Alves
@ 2015-09-15  2:25       ` Sandra Loosemore
  2015-09-16 14:47         ` Pedro Alves
  0 siblings, 1 reply; 6+ messages in thread
From: Sandra Loosemore @ 2015-09-15  2:25 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb-patches, Yao Qi

On 09/09/2015 12:40 PM, Pedro Alves wrote:
> On 09/09/2015 05:08 PM, Sandra Loosemore wrote:
>> On 09/08/2015 10:30 AM, Pedro Alves wrote:
>>>
>>> Yeah, I've seen this before with a local series I use for debugging
>>> software single-step things that implements software single-stepping
>>> on x86.  I just re-tried it now after rebasing that series to
>>> current mainline, and I still see the time outs against gdbserver.
>>>
>>> AFAICS, nios2 is a software single-step target that does not implement
>>> displaced stepping either.  I had a patch for this that I had
>>> never posted.  See attached.
>>>
>>
>> Hmmm, these two patches are not working for me.  The trouble is that
>> this part:
>>
>>> +gdb_test_multiple "si" $msg {
>>> +    -re "displaced pc to.*$gdb_prompt $" {
>>> +	set displaced_stepping_enabled 1
>>> +    }
>>> +    -re ".*$gdb_prompt $" {
>>> +    }
>>> +}
>>
>> is causing the target to step from main to pthread_self, which is in a
>> different file.  This causes the subsequent breakpoint commands to fail,
>> and things go south from there:
>
> OK, I got "lucky" on x86 and a stepi runs some instruction before the call.
> Could you try simply replacing the "si" with "next" ?  It doesn't matter
> whether that issues several single-steps or not.  What matters is that gdb
> tries to step past the breakpoint that is set at the current PC (from the
> earlier runto_main).  We're trying to figure out if gdb uses displaced
> stepping for that.

Yes, that fixes the trouble, and the tests run OK now.  I did find that 
it still timed out 1 of the 5 times I ran it, though, so maybe the 
timeout factor really does need to match NUM_THREADS to be safe?

-Sandra


* Re: [RFC] fix gdb.threads/non-stop-fair-events.exp timeouts
  2015-09-15  2:25       ` Sandra Loosemore
@ 2015-09-16 14:47         ` Pedro Alves
  0 siblings, 0 replies; 6+ messages in thread
From: Pedro Alves @ 2015-09-16 14:47 UTC (permalink / raw)
  To: Sandra Loosemore; +Cc: gdb-patches, Yao Qi

On 09/15/2015 03:24 AM, Sandra Loosemore wrote:

> Yes, that fixes the trouble, and the tests run OK now.  I did find that 
> it still timed out 1 of the 5 times I ran it, though, so maybe the 
> timeout factor really does need to match NUM_THREADS to be safe?

Indeed, I just experimented with NUM_THREADS, and the time the test takes
to complete depends on it.
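
A minimal sketch of the resulting timeout arithmetic follows.  It assumes that the factor is NUM_THREADS on targets with neither displaced stepping nor hardware single-step and 1 otherwise, that with_timeout_factor multiplies the base timeout by the factor, and that the inferior's alarm adds 20 seconds of slack as SECONDS does in the test program.  The function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the factor choice: scale by the number of
   threads only when the target can neither displaced-step nor
   hardware single-step.  */
unsigned int
choose_timeout_factor (int displaced_stepping, int hardware_single_step,
		       unsigned int num_threads)
{
  assert (num_threads >= 1);
  if (!displaced_stepping && !hardware_single_step)
    return num_threads;
  return 1;
}

/* Alarm budget in the inferior, in seconds: the factored timeout
   plus 20 seconds of slack, so the alarm fires after DejaGnu would
   already have timed out.  */
unsigned int
inferior_alarm_seconds (unsigned int base_timeout, unsigned int factor)
{
  return base_timeout * factor + 20;
}
```

For example, with 11 threads and an assumed base timeout of 10 seconds, a target with neither feature gets a factor of 11, a 110 second wait per thread, and a 130 second alarm in the inferior.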

Here's what I'm pushing then.

From 7e2a5b76cc4dc2871b44792737ca02950b68c2fb Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Wed, 16 Sep 2015 15:31:48 +0100
Subject: [PATCH] non-stop-fair-events.exp slower on software single-step &&
 !displ-step targets

On software single-step targets that don't support displaced stepping,
threads keep hitting each other's single-step breakpoints, and then
GDB needs to pause all threads to step past those.  The end result is
that progress in the main thread will be slower and it may take a bit
longer for the signal to be queued.  This patch bumps the timeout on
such targets.

gdb/testsuite/ChangeLog:
2015-09-16  Pedro Alves  <palves@redhat.com>
	    Sandra Loosemore  <sandra@codesourcery.com>

	* gdb.threads/non-stop-fair-events.c (timeout): New global.
	(SECONDS): Redefine.
	(main): Call pthread_kill and alarm early.
	* gdb.threads/non-stop-fair-events.exp: Probe displaced stepping
	support.
	(test): If the target can't hardware step and doesn't support
	displaced stepping, increase the timeout.
---
 gdb/testsuite/gdb.threads/non-stop-fair-events.c   |  9 ++-
 gdb/testsuite/gdb.threads/non-stop-fair-events.exp | 94 +++++++++++++++-------
 gdb/testsuite/lib/gdb.exp                          | 20 +++--
 3 files changed, 85 insertions(+), 38 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/non-stop-fair-events.c b/gdb/testsuite/gdb.threads/non-stop-fair-events.c
index f82c366..700676b 100644
--- a/gdb/testsuite/gdb.threads/non-stop-fair-events.c
+++ b/gdb/testsuite/gdb.threads/non-stop-fair-events.c
@@ -24,7 +24,9 @@
 const int num_threads = NUM_THREADS;
 /* Allow for as much timeout as DejaGnu wants, plus a bit of
    slack.  */
-#define SECONDS (TIMEOUT + 20)
+
+volatile unsigned int timeout = TIMEOUT;
+#define SECONDS (timeout + 20)
 
 pthread_t child_thread[NUM_THREADS];
 volatile pthread_t signal_thread;
@@ -69,6 +71,11 @@ main (void)
   int res;
   int i;
 
+  /* Call these early so that we're sure their PLTs are quickly
+     resolved now, instead of in the busy threads.  */
+  pthread_kill (pthread_self (), 0);
+  alarm (0);
+
   signal (SIGUSR1, handler);
 
   for (i = 0; i < NUM_THREADS; i++)
diff --git a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
index 37f5bcb..27b50c5 100644
--- a/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
+++ b/gdb/testsuite/gdb.threads/non-stop-fair-events.exp
@@ -62,6 +62,21 @@ set NUM_THREADS [get_value "num_threads" "get num_threads"]
 # Account for the main thread.
 incr NUM_THREADS
 
+# Probe for displaced stepping support.  We're stopped at the main
+# breakpoint.  If displaced stepping is supported, we should see
+# related debug output.
+set displaced_stepping_enabled 0
+set msg "check displaced-stepping"
+gdb_test_no_output "set debug displaced 1"
+gdb_test_multiple "next" $msg {
+    -re "displaced pc to.*$gdb_prompt $" {
+	set displaced_stepping_enabled 1
+    }
+    -re ".*$gdb_prompt $" {
+    }
+}
+gdb_test_no_output "set debug displaced 0"
+
 # Run threads to their start positions.  This prepares for a new test
 # sequence.
 
@@ -127,6 +142,8 @@ proc enable_debug {enable} {
 proc test {signal_thread} {
     global gdb_prompt
     global NUM_THREADS
+    global timeout
+    global displaced_stepping_enabled
 
     with_test_prefix "signal_thread=$signal_thread" {
 	restart
@@ -152,42 +169,59 @@ proc test {signal_thread} {
 
 	enable_debug 1
 
-	set saw_continuing 0
-	set test "continue &"
-	gdb_test_multiple $test $test {
-	    -re "Continuing.\r\n" {
-		set saw_continuing 1
-		exp_continue
-	    }
-	    -re "$gdb_prompt " {
-		gdb_assert $saw_continuing $test
-	    }
-	    -re "infrun:" {
-		exp_continue
-	    }
+	# On software single-step targets that don't support displaced
+	# stepping, threads keep hitting each others' single-step
+	# breakpoints, and then GDB needs to pause all threads to step
+	# past those.  The end result is that progress in the main
+	# thread will be slower and it may take a bit longer for the
+	# signal to be queued; bump the timeout.
+	if {!$displaced_stepping_enabled && ![can_hardware_single_step]} {
+	    # The more threads we have, the longer it takes.
+	    set factor $NUM_THREADS
+	} else {
+	    set factor 1
 	}
-
-	set gotit 0
-
-	# Wait for all threads to finish their steps, and for the main
-	# thread to hit the breakpoint.
-	for {set i 1} { $i <= $NUM_THREADS } { incr i } {
-	    set test "thread $i broke out of loop"
-	    set gotit 0
-	    gdb_test_multiple "" $test {
-		-re "loop_broke" {
-		    # The prompt was already matched in the "continue
-		    # &" test above.  We're now consuming asynchronous
-		    # output that comes after the prompt.
-		    set gotit 1
-		    pass $test
+	with_timeout_factor $factor {
+	    gdb_test "print timeout = $timeout" " = $timeout" \
+		"set timeout in the inferior"
+
+	    set saw_continuing 0
+	    set test "continue &"
+	    gdb_test_multiple $test $test {
+		-re "Continuing.\r\n" {
+		    set saw_continuing 1
+		    exp_continue
+		}
+		-re "$gdb_prompt " {
+		    gdb_assert $saw_continuing $test
 		}
 		-re "infrun:" {
 		    exp_continue
 		}
 	    }
-	    if {!$gotit} {
-		break
+
+	    set gotit 0
+
+	    # Wait for all threads to finish their steps, and for the main
+	    # thread to hit the breakpoint.
+	    for {set i 1} { $i <= $NUM_THREADS } { incr i } {
+		set test "thread $i broke out of loop"
+		set gotit 0
+		gdb_test_multiple "" $test {
+		    -re "loop_broke" {
+			# The prompt was already matched in the "continue
+			# &" test above.  We're now consuming asynchronous
+			# output that comes after the prompt.
+			set gotit 1
+			pass $test
+		    }
+		    -re "infrun:" {
+			exp_continue
+		    }
+		}
+		if {!$gotit} {
+		    break
+		}
 	    }
 	}
 
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 56cde7a..9eaf721 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -2150,15 +2150,10 @@ proc supports_get_siginfo_type {} {
     }
 }
 
-# Return 1 if target hardware or OS supports single stepping to signal
-# handler, otherwise, return 0.
+# Return 1 if the target supports hardware single stepping.
 
-proc can_single_step_to_signal_handler {} {
+proc can_hardware_single_step {} {
 
-    # Targets don't have hardware single step.  On these targets, when
-    # a signal is delivered during software single step, gdb is unable
-    # to determine the next instruction addresses, because start of signal
-    # handler is one of them.
     if { [istarget "arm*-*-*"] || [istarget "mips*-*-*"]
 	 || [istarget "tic6x-*-*"] || [istarget "sparc*-*-linux*"]
 	 || [istarget "nios2-*-*"] } {
@@ -2168,6 +2163,17 @@ proc can_single_step_to_signal_handler {} {
     return 1
 }
 
+# Return 1 if target hardware or OS supports single stepping to signal
+# handler, otherwise, return 0.
+
+proc can_single_step_to_signal_handler {} {
+    # Targets don't have hardware single step.  On these targets, when
+    # a signal is delivered during software single step, gdb is unable
+    # to determine the next instruction addresses, because start of signal
+    # handler is one of them.
+    return [can_hardware_single_step]
+}
+
 # Return 1 if target supports process record, otherwise return 0.
 
 proc supports_process_record {} {
-- 
1.9.3


