* [PATCH] GDB/testsuite: Extend the time gdbserver is waited for @ 2014-07-28 19:39 Maciej W. Rozycki 2014-07-29 13:41 ` Pedro Alves 0 siblings, 1 reply; 7+ messages in thread From: Maciej W. Rozycki @ 2014-07-28 19:39 UTC (permalink / raw) To: gdb-patches Hi, Gdbserver support code uses the global timeout value to determine when to stop waiting for a gdbserver process being started to respond before continuing anyway. This timeout is usually as low as 10s and may not be enough in this context, for example on the first run where the filesystem cache is cold, even if it is elsewhere. E.g. I observe this reliably with gdbserver started the first time in QEMU running in the system emulation mode: (gdb) file .../gdb.base/advance Reading symbols from .../gdb.base/advance...done. (gdb) delete breakpoints (gdb) info breakpoints No breakpoints or watchpoints. (gdb) break main Breakpoint 1 at 0x87f8: file .../gdb.base/advance.c, line 41. (gdb) set remotetimeout 15 (gdb) kill The program is not being run. (gdb) [...] .../bin/gdbserver --once :6014 advance target remote localhost:6014 Remote debugging using localhost:6014 Remote communication error. Target disconnected.: Connection reset by peer. (gdb) continue The program is not being run. (gdb) Process advance created; pid = 999 Listening on port 6014 FAIL: gdb.base/advance.exp: Can't run to main -- notice how the test harness proceeded with the `target remote ...' command even though gdbserver hasn't completed its startup yet. A while later when it's finally ready it's too late already. I checked the timing here and it takes gdbserver roughly 25 seconds to start in this scenario. Subsequent gdbserver starts in the same test run take less time and usually complete within 10 seconds although occasionally `target remote ...' precedes the corresponding `Listening on port...' message again. Therefore I have fixed this problem by setting an explicit timeout to 120s on the expect call in question. If this turns out too arbitrary sometime, then perhaps a separate `gdbserver_timeout' setting might be due. Tested with arm-linux-gnueabi and mips-linux-gnu. OK to apply? 2014-07-28 Maciej W. Rozycki <macro@codesourcery.com> gdb/testsuite/ * lib/gdbserver-support.exp (gdbserver_start): Set timeout to 120 on waiting for the TCP socket to open. Maciej gdb-test-gdbserver-start-timeout.diff Index: gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp =================================================================== --- gdb-fsf-trunk-quilt.orig/gdb/testsuite/lib/gdbserver-support.exp 2014-05-13 02:52:11.347706187 +0100 +++ gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp 2014-05-30 01:45:51.658977074 +0100 @@ -275,6 +275,7 @@ proc gdbserver_start { options arguments # Wait for the server to open its TCP socket, so that GDB can connect. expect { -i $server_spawn_id + -timeout 120 -notransfer -re "Listening on" { } -re "Can't bind address: Address already in use\\.\r\n" { ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] GDB/testsuite: Extend the time gdbserver is waited for 2014-07-28 19:39 [PATCH] GDB/testsuite: Extend the time gdbserver is waited for Maciej W. Rozycki @ 2014-07-29 13:41 ` Pedro Alves 2014-08-26 17:14 ` [PATCH] gdbserver-support: Handle gdbserver start failures Maciej W. Rozycki 2014-09-09 15:15 ` [PATCH] GDB/testsuite: Extend the time gdbserver is waited for Maciej W. Rozycki 0 siblings, 2 replies; 7+ messages in thread From: Pedro Alves @ 2014-07-29 13:41 UTC (permalink / raw) To: Maciej W. Rozycki, gdb-patches On 07/28/2014 08:34 PM, Maciej W. Rozycki wrote: > gdb-test-gdbserver-start-timeout.diff > Index: gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp > =================================================================== > --- gdb-fsf-trunk-quilt.orig/gdb/testsuite/lib/gdbserver-support.exp 2014-05-13 02:52:11.347706187 +0100 > +++ gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp 2014-05-30 01:45:51.658977074 +0100 > @@ -275,6 +275,7 @@ proc gdbserver_start { options arguments > # Wait for the server to open its TCP socket, so that GDB can connect. > expect { > -i $server_spawn_id > + -timeout 120 > -notransfer > -re "Listening on" { } > -re "Can't bind address: Address already in use\\.\r\n" { > OK. Wouldn't it be good to add a 'timeout {...}' case that emits a warning or some such? Thanks, Pedro Alves ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] gdbserver-support: Handle gdbserver start failures 2014-07-29 13:41 ` Pedro Alves @ 2014-08-26 17:14 ` Maciej W. Rozycki 2014-09-03 12:39 ` [PING][PATCH] " Maciej W. Rozycki 2014-09-04 9:31 ` [PATCH] " Pedro Alves 2014-09-09 15:15 ` [PATCH] GDB/testsuite: Extend the time gdbserver is waited for Maciej W. Rozycki 1 sibling, 2 replies; 7+ messages in thread From: Maciej W. Rozycki @ 2014-08-26 17:14 UTC (permalink / raw) To: Pedro Alves; +Cc: gdb-patches On Tue, 29 Jul 2014, Pedro Alves wrote: > On 07/28/2014 08:34 PM, Maciej W. Rozycki wrote: > > > gdb-test-gdbserver-start-timeout.diff > > Index: gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp > > =================================================================== > > --- gdb-fsf-trunk-quilt.orig/gdb/testsuite/lib/gdbserver-support.exp 2014-05-13 02:52:11.347706187 +0100 > > +++ gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp 2014-05-30 01:45:51.658977074 +0100 > > @@ -275,6 +275,7 @@ proc gdbserver_start { options arguments > > # Wait for the server to open its TCP socket, so that GDB can connect. > > expect { > > -i $server_spawn_id > > + -timeout 120 > > -notransfer > > -re "Listening on" { } > > -re "Can't bind address: Address already in use\\.\r\n" { > > > > OK. > > Wouldn't it be good to add a 'timeout {...}' case that emits a > warning or some such? Good point. I actually went further than that. As it happens we have a board that fails a gdb.base/gcore-relro.exp test case reproducibly and moreover the case appears to trigger a kernel bug making the it less than usable. Specifically the board remains responsive to some extent, however processes do not appear to be able to successfully complete termination anymore and perhaps more importantly further gdbserver processes can be started, but they never reach the stage of listening on the RSP socket. This change handles timeouts in gdbserver start properly, by throwing a TCL error exception when gdbserver does not report listening on the RSP socket in time. This is then caught at the outer level and reported, and 2 rather than 1 is returned so that the caller may tell the failure to start gdbserver and other issues apart and act accordingly (or do nothing). I thought letting the exception unwind further on might be a good idea for any test harnesses out there to break outright where a gdbserver start error is silently ignored right now, however I figured out the calls to gdbserver-support.exp are buried down too deep in the GDB test suite for such a change to be made easily. I think returning a distinct return value is good enough (the API says "non-zero", so 2 is as good as 1) and we can always make the error harder in a later step if required. With config/gdbserver.exp being used this change remains transparent to the target board, the return value is passed up by gdb_reload and the error exception unwinds through gdbserver_gdb_load and is caught and handled by mi_gdb_target_load. A call to perror is still made, reporting the timeout, and in the case of mi_gdb_target_load the procedure returns a value denoting unsuccessful completion. An unsuccessful completion of gdb_reload is already handled elsewhere. An alternative gdbserver board configuration can interpret the return value in its gdb_reload implementation and catch the error in gdbserver_gdb_load in an attempt to recover a target board that has gone astray, for example by rebooting the board somehow. This has proved effective with our failing board, that now completes the remaining test cases with no further hiccups. I pushed it through regression testing with the powerpc-linux-gnu target and a some half a dozen of multilibs (including ones that trip over the faulty kernel on the problematic board) and there were no issues. The full list of the offending test cases is as follows: FAIL: gdb.base/gcore-relro.exp: save a corefile FAIL: gdb.base/print-symbol-loading.exp: save a corefile FAIL: gdb.threads/gcore-thread.exp: save a corefile so essentially the same stuff in different places. OK to apply? 2014-08-26 Maciej W. Rozycki <macro@codesourcery.com> gdb/testsuite/ * lib/gdbserver-support.exp (gdbserver_start): Throw an error exception on timeout. (gdbserver_run): Catch any `gdbserver_spawn' error exceptions. (gdbserver_start_extended): Catch any `gdbserver_start' error exceptions. (gdbserver_start_multi, mi_gdbserver_start_multi): Likewise. * lib/mi-support.exp (mi_gdb_target_load): Catch any `gdbserver_gdb_load' error exceptions. Maciej gdb-test-gdbserver-start-timeout-handle.diff Index: gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp =================================================================== --- gdb-fsf-trunk-quilt.orig/gdb/testsuite/lib/gdbserver-support.exp 2014-08-23 01:40:41.758957007 +0100 +++ gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp 2014-08-23 01:45:06.788973638 +0100 @@ -287,6 +287,9 @@ proc gdbserver_start { options arguments continue } } + timeout { + error "Timeout waiting for gdbserver response." + } } break } @@ -326,7 +329,8 @@ proc gdbserver_spawn { child_args } { } # Start a gdbserver process running HOST_EXEC and pass CHILD_ARGS -# to it. Return 0 on success, or non-zero on failure. +# to it. Return 0 on success, or non-zero on failure: 2 if gdbserver +# failed to start or 1 if we couldn't connect to it. proc gdbserver_run { child_args } { global gdbserver_protocol @@ -347,7 +351,10 @@ proc gdbserver_run { child_args } { } } - set res [gdbserver_spawn $child_args] + if { [catch { gdbserver_spawn $child_args } res] == 1 } { + perror $res + return 2 + } set gdbserver_protocol [lindex $res 0] set gdbserver_gdbport [lindex $res 1] @@ -377,7 +384,10 @@ proc gdbserver_start_extended { } { global gdbserver_gdbport global use_gdb_stub - set res [gdbserver_start "--multi" ""] + if { [catch { gdbserver_start "--multi" "" } res] == 1 } { + perror $res + return 2 + } set gdbserver_protocol [lindex $res 0] if { [string first "extended-" $gdbserver_protocol] != 0} { set gdbserver_protocol "extended-$gdbserver_protocol" @@ -399,7 +409,10 @@ proc gdbserver_start_multi { } { global gdbserver_protocol global gdbserver_gdbport - set res [gdbserver_start "--multi" ""] + if { [catch { gdbserver_start "--multi" "" } res] == 1 } { + perror $res + return 2 + } set gdbserver_protocol [lindex $res 0] set gdbserver_gdbport [lindex $res 1] @@ -414,7 +427,10 @@ proc mi_gdbserver_start_multi { } { global gdbserver_protocol global gdbserver_gdbport - set res [gdbserver_start "--multi" ""] + if { [catch { gdbserver_start "--multi" "" } res] == 1 } { + perror $res + return 2 + } set gdbserver_protocol [lindex $res 0] set gdbserver_gdbport [lindex $res 1] Index: gdb-fsf-trunk-quilt/gdb/testsuite/lib/mi-support.exp =================================================================== --- gdb-fsf-trunk-quilt.orig/gdb/testsuite/lib/mi-support.exp 2014-08-23 01:11:09.000000000 +0100 +++ gdb-fsf-trunk-quilt/gdb/testsuite/lib/mi-support.exp 2014-08-23 01:45:06.788973638 +0100 @@ -477,7 +477,10 @@ proc mi_gdb_target_load { } { if { [info procs gdbserver_gdb_load] != "" } { mi_gdb_test "kill" ".*" "" - set res [gdbserver_gdb_load] + if { [catch gdbserver_gdb_load res] == 1 } { + perror $res + return -1 + } set protocol [lindex $res 0] set gdbport [lindex $res 1] ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PING][PATCH] gdbserver-support: Handle gdbserver start failures 2014-08-26 17:14 ` [PATCH] gdbserver-support: Handle gdbserver start failures Maciej W. Rozycki @ 2014-09-03 12:39 ` Maciej W. Rozycki 2014-09-04 9:31 ` [PATCH] " Pedro Alves 1 sibling, 0 replies; 7+ messages in thread From: Maciej W. Rozycki @ 2014-09-03 12:39 UTC (permalink / raw) To: Pedro Alves; +Cc: gdb-patches On Tue, 26 Aug 2014, Maciej W. Rozycki wrote: > > Wouldn't it be good to add a 'timeout {...}' case that emits a > > warning or some such? > > Good point. I actually went further than that. > > As it happens we have a board that fails a gdb.base/gcore-relro.exp test > case reproducibly and moreover the case appears to trigger a kernel bug > making the it less than usable. Specifically the board remains responsive > to some extent, however processes do not appear to be able to successfully > complete termination anymore and perhaps more importantly further > gdbserver processes can be started, but they never reach the stage of > listening on the RSP socket. > > This change handles timeouts in gdbserver start properly, by throwing a > TCL error exception when gdbserver does not report listening on the RSP > socket in time. This is then caught at the outer level and reported, and > 2 rather than 1 is returned so that the caller may tell the failure to > start gdbserver and other issues apart and act accordingly (or do > nothing). > > I thought letting the exception unwind further on might be a good idea > for any test harnesses out there to break outright where a gdbserver start > error is silently ignored right now, however I figured out the calls to > gdbserver-support.exp are buried down too deep in the GDB test suite for > such a change to be made easily. I think returning a distinct return > value is good enough (the API says "non-zero", so 2 is as good as 1) and > we can always make the error harder in a later step if required. > > With config/gdbserver.exp being used this change remains transparent to > the target board, the return value is passed up by gdb_reload and the > error exception unwinds through gdbserver_gdb_load and is caught and > handled by mi_gdb_target_load. A call to perror is still made, reporting > the timeout, and in the case of mi_gdb_target_load the procedure returns a > value denoting unsuccessful completion. An unsuccessful completion of > gdb_reload is already handled elsewhere. > > An alternative gdbserver board configuration can interpret the return > value in its gdb_reload implementation and catch the error in > gdbserver_gdb_load in an attempt to recover a target board that has gone > astray, for example by rebooting the board somehow. This has proved > effective with our failing board, that now completes the remaining test > cases with no further hiccups. > > I pushed it through regression testing with the powerpc-linux-gnu target > and a some half a dozen of multilibs (including ones that trip over the > faulty kernel on the problematic board) and there were no issues. The > full list of the offending test cases is as follows: > > FAIL: gdb.base/gcore-relro.exp: save a corefile > FAIL: gdb.base/print-symbol-loading.exp: save a corefile > FAIL: gdb.threads/gcore-thread.exp: save a corefile > > so essentially the same stuff in different places. > > OK to apply? > > 2014-08-26 Maciej W. Rozycki <macro@codesourcery.com> > > gdb/testsuite/ > * lib/gdbserver-support.exp (gdbserver_start): Throw an error > exception on timeout. > (gdbserver_run): Catch any `gdbserver_spawn' error exceptions. > (gdbserver_start_extended): Catch any `gdbserver_start' error > exceptions. > (gdbserver_start_multi, mi_gdbserver_start_multi): Likewise. > * lib/mi-support.exp (mi_gdb_target_load): Catch any > `gdbserver_gdb_load' error exceptions. Ping! Maciej ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] gdbserver-support: Handle gdbserver start failures 2014-08-26 17:14 ` [PATCH] gdbserver-support: Handle gdbserver start failures Maciej W. Rozycki 2014-09-03 12:39 ` [PING][PATCH] " Maciej W. Rozycki @ 2014-09-04 9:31 ` Pedro Alves 2014-09-09 15:24 ` Maciej W. Rozycki 1 sibling, 1 reply; 7+ messages in thread From: Pedro Alves @ 2014-09-04 9:31 UTC (permalink / raw) To: Maciej W. Rozycki; +Cc: gdb-patches Hi Maciej, On 08/26/2014 06:14 PM, Maciej W. Rozycki wrote: >>> Index: gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp >>> =================================================================== >>> --- gdb-fsf-trunk-quilt.orig/gdb/testsuite/lib/gdbserver-support.exp 2014-05-13 02:52:11.347706187 +0100 >>> +++ gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp 2014-05-30 01:45:51.658977074 +0100 >>> @@ -275,6 +275,7 @@ proc gdbserver_start { options arguments >>> # Wait for the server to open its TCP socket, so that GDB can connect. >>> expect { >>> -i $server_spawn_id >>> + -timeout 120 >>> -notransfer >>> -re "Listening on" { } >>> -re "Can't bind address: Address already in use\\.\r\n" { >>> >> >> OK. >> >> Wouldn't it be good to add a 'timeout {...}' case that emits a >> warning or some such? > > Good point. I actually went further than that. > > As it happens we have a board that fails a gdb.base/gcore-relro.exp test > case reproducibly and moreover the case appears to trigger a kernel bug > making the it less than usable. Specifically the board remains responsive > to some extent, however processes do not appear to be able to successfully > complete termination anymore and perhaps more importantly further > gdbserver processes can be started, but they never reach the stage of > listening on the RSP socket. > > This change handles timeouts in gdbserver start properly, by throwing a > TCL error exception when gdbserver does not report listening on the RSP > socket in time. This is then caught at the outer level and reported, and > 2 rather than 1 is returned so that the caller may tell the failure to > start gdbserver and other issues apart and act accordingly (or do > nothing). > > I thought letting the exception unwind further on might be a good idea > for any test harnesses out there to break outright where a gdbserver start > error is silently ignored right now, however I figured out the calls to > gdbserver-support.exp are buried down too deep in the GDB test suite for > such a change to be made easily. I think returning a distinct return > value is good enough (the API says "non-zero", so 2 is as good as 1) and > we can always make the error harder in a later step if required. > > With config/gdbserver.exp being used this change remains transparent to > the target board, the return value is passed up by gdb_reload and the > error exception unwinds through gdbserver_gdb_load and is caught and > handled by mi_gdb_target_load. A call to perror is still made, reporting > the timeout, and in the case of mi_gdb_target_load the procedure returns a > value denoting unsuccessful completion. An unsuccessful completion of > gdb_reload is already handled elsewhere. > > An alternative gdbserver board configuration can interpret the return > value in its gdb_reload implementation and catch the error in > gdbserver_gdb_load in an attempt to recover a target board that has gone > astray, for example by rebooting the board somehow. This has proved > effective with our failing board, that now completes the remaining test > cases with no further hiccups. > > I pushed it through regression testing with the powerpc-linux-gnu target > and a some half a dozen of multilibs (including ones that trip over the > faulty kernel on the problematic board) and there were no issues. The > full list of the offending test cases is as follows: > > FAIL: gdb.base/gcore-relro.exp: save a corefile > FAIL: gdb.base/print-symbol-loading.exp: save a corefile > FAIL: gdb.threads/gcore-thread.exp: save a corefile > > so essentially the same stuff in different places. > > OK to apply? OK. Thanks, Pedro Alves ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] gdbserver-support: Handle gdbserver start failures 2014-09-04 9:31 ` [PATCH] " Pedro Alves @ 2014-09-09 15:24 ` Maciej W. Rozycki 0 siblings, 0 replies; 7+ messages in thread From: Maciej W. Rozycki @ 2014-09-09 15:24 UTC (permalink / raw) To: Pedro Alves; +Cc: gdb-patches On Thu, 4 Sep 2014, Pedro Alves wrote: > > I pushed it through regression testing with the powerpc-linux-gnu target > > and a some half a dozen of multilibs (including ones that trip over the > > faulty kernel on the problematic board) and there were no issues. The > > full list of the offending test cases is as follows: > > > > FAIL: gdb.base/gcore-relro.exp: save a corefile > > FAIL: gdb.base/print-symbol-loading.exp: save a corefile > > FAIL: gdb.threads/gcore-thread.exp: save a corefile > > > > so essentially the same stuff in different places. > > > > OK to apply? > > OK. Thanks for your review, I have applied this change now too. Maciej ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] GDB/testsuite: Extend the time gdbserver is waited for 2014-07-29 13:41 ` Pedro Alves 2014-08-26 17:14 ` [PATCH] gdbserver-support: Handle gdbserver start failures Maciej W. Rozycki @ 2014-09-09 15:15 ` Maciej W. Rozycki 1 sibling, 0 replies; 7+ messages in thread From: Maciej W. Rozycki @ 2014-09-09 15:15 UTC (permalink / raw) To: Pedro Alves; +Cc: gdb-patches On Tue, 29 Jul 2014, Pedro Alves wrote: > > gdb-test-gdbserver-start-timeout.diff > > Index: gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp > > =================================================================== > > --- gdb-fsf-trunk-quilt.orig/gdb/testsuite/lib/gdbserver-support.exp 2014-05-13 02:52:11.347706187 +0100 > > +++ gdb-fsf-trunk-quilt/gdb/testsuite/lib/gdbserver-support.exp 2014-05-30 01:45:51.658977074 +0100 > > @@ -275,6 +275,7 @@ proc gdbserver_start { options arguments > > # Wait for the server to open its TCP socket, so that GDB can connect. > > expect { > > -i $server_spawn_id > > + -timeout 120 > > -notransfer > > -re "Listening on" { } > > -re "Can't bind address: Address already in use\\.\r\n" { > > > > OK. I have applied this change now, thanks for your review. Maciej ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-09-09 15:24 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-07-28 19:39 [PATCH] GDB/testsuite: Extend the time gdbserver is waited for Maciej W. Rozycki 2014-07-29 13:41 ` Pedro Alves 2014-08-26 17:14 ` [PATCH] gdbserver-support: Handle gdbserver start failures Maciej W. Rozycki 2014-09-03 12:39 ` [PING][PATCH] " Maciej W. Rozycki 2014-09-04 9:31 ` [PATCH] " Pedro Alves 2014-09-09 15:24 ` Maciej W. Rozycki 2014-09-09 15:15 ` [PATCH] GDB/testsuite: Extend the time gdbserver is waited for Maciej W. Rozycki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).