* [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads @ 2016-12-06 15:55 Markus Metzger 2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger 2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger 0 siblings, 2 replies; 9+ messages in thread From: Markus Metzger @ 2016-12-06 15:55 UTC (permalink / raw) To: gdb-patches This refers to: https://sourceware.org/ml/gdb-patches/2016-11/msg00994.html. For the remote case I ran into an unexpected gdbserver quit, which is addressed in the first of the two patches. I tested this on one 32-bit and one 64-bit Intel Fedora system. On the former (4.4.6-301.fc23.i686+PAE), I ran into several PASS/FAIL changes between different runs in the gdb.threads suite. When I run those tests a few times, I get completely different results for almost every run (already without my changes). Ignoring those. Markus Metzger (2): gdbserver: catch fetch registers error btrace: allow recording to be started for running threads gdb/btrace.c | 24 +++++++++- gdb/gdbserver/linux-low.c | 19 +++++++- gdb/gdbserver/server.c | 18 +++++++- gdb/testsuite/gdb.btrace/enable-running.c | 47 +++++++++++++++++++ gdb/testsuite/gdb.btrace/enable-running.exp | 72 +++++++++++++++++++++++++++++ 5 files changed, 175 insertions(+), 5 deletions(-) create mode 100644 gdb/testsuite/gdb.btrace/enable-running.c create mode 100644 gdb/testsuite/gdb.btrace/enable-running.exp -- 1.8.3.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] btrace: allow recording to be started for running threads 2016-12-06 15:55 [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads Markus Metzger @ 2016-12-06 15:55 ` Markus Metzger 2016-12-07 20:20 ` Luis Machado 2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger 1 sibling, 1 reply; 9+ messages in thread From: Markus Metzger @ 2016-12-06 15:55 UTC (permalink / raw) To: gdb-patches When recording is started for a running thread, GDB was able to start tracing but then failed to read registers to insert the initial entry for the current PC. We don't really need that initial entry if we don't know where exactly we started recording. Silently ignore such errors to allow recording to be started while threads are running. For the BTRACE_FORMAT_PT btrace format, we don't need that initial entry since it will be recorded in the trace. We can omit the call to btrace_add_pc. 2016-12-06 Markus Metzger <markus.t.metzger@intel.com> gdb/ * btrace.c (btrace_enable): Do not call btrace_add_pc for BTRACE_FORMAT_PT. Silently ignore errors from btrace_add_pc. testsuite/ * gdb.btrace/enable-running.c: New. * gdb.btrace/enable-running.exp: New. --- gdb/btrace.c | 24 +++++++++- gdb/testsuite/gdb.btrace/enable-running.c | 47 +++++++++++++++++++ gdb/testsuite/gdb.btrace/enable-running.exp | 72 +++++++++++++++++++++++++++++ 3 files changed, 141 insertions(+), 2 deletions(-) create mode 100644 gdb/testsuite/gdb.btrace/enable-running.c create mode 100644 gdb/testsuite/gdb.btrace/enable-running.exp diff --git a/gdb/btrace.c b/gdb/btrace.c index 39d537c..920b9ab 100644 --- a/gdb/btrace.c +++ b/gdb/btrace.c @@ -1474,8 +1474,28 @@ btrace_enable (struct thread_info *tp, const struct btrace_config *conf) /* Add an entry for the current PC so we start tracing from where we enabled it. */ - if (tp->btrace.target != NULL) - btrace_add_pc (tp); + TRY + { + /* This is not relevant for BTRACE_FORMAT_PT since the trace will already + start at the enable location. */ + if ((tp->btrace.target != NULL) && (conf->format != BTRACE_FORMAT_PT)) + btrace_add_pc (tp); + } + CATCH (exception, RETURN_MASK_ALL) + { + /* We may fail to add the initial entry, for example if TP is currently + running. + + We won't be able to stitch the initial trace chunk to this initial + entry so tracing will start at the next branch target instead of at the + current PC. Since TP is currently running, this shouldn't make a + difference. + + If TP were waiting most of the time and made only a little bit of + progress before it was stopped, we'd lose the instructions until the + first branch. */ + } + END_CATCH } /* See btrace.h. */ diff --git a/gdb/testsuite/gdb.btrace/enable-running.c b/gdb/testsuite/gdb.btrace/enable-running.c new file mode 100644 index 0000000..6380c29 --- /dev/null +++ b/gdb/testsuite/gdb.btrace/enable-running.c @@ -0,0 +1,47 @@ +/* This testcase is part of GDB, the GNU debugger. + + Copyright 2016 Free Software Foundation, Inc. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see <http://www.gnu.org/licenses/>. */ + +#include <pthread.h> + +static int global; + +static void * +test (void *arg) +{ + for (;;) + global += 1; + + return arg; +} + +int +main (void) +{ + int i; + + for (i = 0; i < 3; ++i) + { + pthread_t th; + + pthread_create (&th, NULL, test, NULL); + pthread_detach (th); + } + + test (NULL); /* bp.1 */ + + return 0; +} diff --git a/gdb/testsuite/gdb.btrace/enable-running.exp b/gdb/testsuite/gdb.btrace/enable-running.exp new file mode 100644 index 0000000..ead86af --- /dev/null +++ b/gdb/testsuite/gdb.btrace/enable-running.exp @@ -0,0 +1,72 @@ +# This testcase is part of GDB, the GNU debugger. +# +# Copyright 2016 Free Software Foundation, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +if { [skip_btrace_tests] } { return -1 } + +standard_testfile +if {[gdb_compile_pthreads "$srcdir/$subdir/$srcfile" "$binfile" executable {debug}] != "" } { + return -1 +} +clean_restart $testfile + +# We need to enable non-stop mode for the remote case. +gdb_test_no_output "set non-stop on" + +if ![runto_main] { + return -1 +} + +set bp_1 [gdb_get_line_number "bp.1" $srcfile] + +gdb_breakpoint $bp_1 +gdb_continue_to_breakpoint "cont to $bp_1" ".*$bp_1\r\n.*" +gdb_test "cont&" "Continuing\." + +# All threads are running. Let's start recording. +gdb_test_no_output "record btrace" + +proc check_tracing_enabled { thread } { + global gdb_prompt + + gdb_test "thread $thread" "(running).*" + with_test_prefix "thread $thread" { + # Stop the thread before reading the trace. + gdb_test_multiple "interrupt" "interrupt" { + -re "interrupt\r\n$gdb_prompt " { + pass "interrupt" + } + } + # Wait until the thread actually stopped. + gdb_test_multiple "" "stopped" { + -re "Thread $thread.*stopped\." { + pass "stopped" + } + } + # We will consume the thread's current location as part of the + # "info record" output. + gdb_test "info record" [multi_line \ + "Active record target: record-btrace" \ + "Recording format: .*" \ + "Recorded \[0-9\]+ instructions \[^\\\r\\\n\]*" \ + ] + } +} + +# Check that recording was started on each thread. +foreach thread {1 2 3 4} { + check_tracing_enabled $thread +} -- 1.8.3.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] btrace: allow recording to be started for running threads 2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger @ 2016-12-07 20:20 ` Luis Machado 2016-12-09 12:22 ` Metzger, Markus T 0 siblings, 1 reply; 9+ messages in thread From: Luis Machado @ 2016-12-07 20:20 UTC (permalink / raw) To: Markus Metzger, gdb-patches On 12/06/2016 09:54 AM, Markus Metzger wrote: > When recording is started for a running thread, GDB was able to start tracing > but then failed to read registers to insert the initial entry for the current > PC. We don't really need that initial entry if we don't know where exactly we > started recording. Silently ignore such errors to allow recording to be started > while threads are running. > > For the BTRACE_FORMAT_PT btrace format, we don't need that initial entry since > it will be recorded in the trace. We can omit the call to btrace_add_pc. > > 2016-12-06 Markus Metzger <markus.t.metzger@intel.com> > > gdb/ > * btrace.c (btrace_enable): Do not call btrace_add_pc for > BTRACE_FORMAT_PT. Silently ignore errors from btrace_add_pc. > > testsuite/ > * gdb.btrace/enable-running.c: New. > * gdb.btrace/enable-running.exp: New. > --- > gdb/btrace.c | 24 +++++++++- > gdb/testsuite/gdb.btrace/enable-running.c | 47 +++++++++++++++++++ > gdb/testsuite/gdb.btrace/enable-running.exp | 72 +++++++++++++++++++++++++++++ > 3 files changed, 141 insertions(+), 2 deletions(-) > create mode 100644 gdb/testsuite/gdb.btrace/enable-running.c > create mode 100644 gdb/testsuite/gdb.btrace/enable-running.exp > > diff --git a/gdb/btrace.c b/gdb/btrace.c > index 39d537c..920b9ab 100644 > --- a/gdb/btrace.c > +++ b/gdb/btrace.c > @@ -1474,8 +1474,28 @@ btrace_enable (struct thread_info *tp, const struct btrace_config *conf) > > /* Add an entry for the current PC so we start tracing from where we > enabled it. */ > - if (tp->btrace.target != NULL) > - btrace_add_pc (tp); > + TRY > + { > + /* This is not relevant for BTRACE_FORMAT_PT since the trace will already > + start at the enable location. */ enabled location? > + if ((tp->btrace.target != NULL) && (conf->format != BTRACE_FORMAT_PT)) > + btrace_add_pc (tp); > + } > + CATCH (exception, RETURN_MASK_ALL) > + { > + /* We may fail to add the initial entry, for example if TP is currently > + running. > + > + We won't be able to stitch the initial trace chunk to this initial > + entry so tracing will start at the next branch target instead of at the > + current PC. Since TP is currently running, this shouldn't make a > + difference. > + > + If TP were waiting most of the time and made only a little bit of > + progress before it was stopped, we'd lose the instructions until the > + first branch. */ > + } > + END_CATCH > } > > /* See btrace.h. */ > diff --git a/gdb/testsuite/gdb.btrace/enable-running.c b/gdb/testsuite/gdb.btrace/enable-running.c > new file mode 100644 > index 0000000..6380c29 > --- /dev/null > +++ b/gdb/testsuite/gdb.btrace/enable-running.c > @@ -0,0 +1,47 @@ > +/* This testcase is part of GDB, the GNU debugger. > + > + Copyright 2016 Free Software Foundation, Inc. > + > + This program is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3 of the License, or > + (at your option) any later version. > + > + This program is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with this program. If not, see <http://www.gnu.org/licenses/>. */ > + > +#include <pthread.h> > + > +static int global; > + > +static void * > +test (void *arg) > +{ > + for (;;) > + global += 1; > + > + return arg; > +} > + > +int > +main (void) > +{ > + int i; > + > + for (i = 0; i < 3; ++i) > + { > + pthread_t th; > + > + pthread_create (&th, NULL, test, NULL); > + pthread_detach (th); > + } > + > + test (NULL); /* bp.1 */ > + > + return 0; > +} > diff --git a/gdb/testsuite/gdb.btrace/enable-running.exp b/gdb/testsuite/gdb.btrace/enable-running.exp > new file mode 100644 > index 0000000..ead86af > --- /dev/null > +++ b/gdb/testsuite/gdb.btrace/enable-running.exp > @@ -0,0 +1,72 @@ > +# This testcase is part of GDB, the GNU debugger. > +# > +# Copyright 2016 Free Software Foundation, Inc. > +# > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 3 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +# GNU General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program. If not, see <http://www.gnu.org/licenses/>. > + > +if { [skip_btrace_tests] } { return -1 } unsupported "btrace not supported on this target"? > + > +standard_testfile > +if {[gdb_compile_pthreads "$srcdir/$subdir/$srcfile" "$binfile" executable {debug}] != "" } { untested "failed to compile" > + return -1 > +} > +clean_restart $testfile > + > +# We need to enable non-stop mode for the remote case. > +gdb_test_no_output "set non-stop on" > + > +if ![runto_main] { untested "couldn't run to main" > + return -1 > +} > + > +set bp_1 [gdb_get_line_number "bp.1" $srcfile] > + > +gdb_breakpoint $bp_1 > +gdb_continue_to_breakpoint "cont to $bp_1" ".*$bp_1\r\n.*" > +gdb_test "cont&" "Continuing\." > + > +# All threads are running. Let's start recording. > +gdb_test_no_output "record btrace" > + > +proc check_tracing_enabled { thread } { > + global gdb_prompt > + > + gdb_test "thread $thread" "(running).*" Maybe add a more meaningful test name here? "Check if thread $thread is running"? > + with_test_prefix "thread $thread" { > + # Stop the thread before reading the trace. > + gdb_test_multiple "interrupt" "interrupt" { > + -re "interrupt\r\n$gdb_prompt " { > + pass "interrupt" > + } Any reason why we couldn't use gdb_test here? Also, we a more meaningful test name and make the name unique (by mentioning the thread number, possibly) > + } > + # Wait until the thread actually stopped. > + gdb_test_multiple "" "stopped" { > + -re "Thread $thread.*stopped\." { > + pass "stopped" > + } > + } Same here. Couldn't we use gdb_test? Also a more meaningful and unique test name. > + # We will consume the thread's current location as part of the > + # "info record" output. > + gdb_test "info record" [multi_line \ > + "Active record target: record-btrace" \ > + "Recording format: .*" \ > + "Recorded \[0-9\]+ instructions \[^\\\r\\\n\]*" \ > + ] > + } > +} > + > +# Check that recording was started on each thread. > +foreach thread {1 2 3 4} { > + check_tracing_enabled $thread > +} > ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [PATCH 2/2] btrace: allow recording to be started for running threads 2016-12-07 20:20 ` Luis Machado @ 2016-12-09 12:22 ` Metzger, Markus T 2017-01-19 15:15 ` Pedro Alves 0 siblings, 1 reply; 9+ messages in thread From: Metzger, Markus T @ 2016-12-09 12:22 UTC (permalink / raw) To: Luis Machado, gdb-patches > -----Original Message----- > From: Luis Machado [mailto:lgustavo@codesourcery.com] > Sent: Wednesday, December 7, 2016 9:21 PM > To: Metzger, Markus T <markus.t.metzger@intel.com>; gdb- > patches@sourceware.org > Subject: Re: [PATCH 2/2] btrace: allow recording to be started for running threads Hi Luis, Thanks for your feedback. > > + /* This is not relevant for BTRACE_FORMAT_PT since the trace will already > > + start at the enable location. */ > > enabled location? I rephrased the sentence to: /* This is not relevant for BTRACE_FORMAT_PT since the trace will already start at the PC at which tracing was enabled. */ > > +if { [skip_btrace_tests] } { return -1 } > > unsupported "btrace not supported on this target"? Let me do this for all gdb.btrace tests in a separate patch. > > + gdb_test "thread $thread" "(running).*" > > Maybe add a more meaningful test name here? "Check if thread $thread is > running"? OK. Also moved it below the with_test_prefix. It now looks like this: PASS: gdb.btrace/enable-running.exp: thread 1: is running > > + with_test_prefix "thread $thread" { > > + # Stop the thread before reading the trace. > > + gdb_test_multiple "interrupt" "interrupt" { > > + -re "interrupt\r\n$gdb_prompt " { > > + pass "interrupt" > > + } > > Any reason why we couldn't use gdb_test here? Also, we a more meaningful > test name and make the name unique (by mentioning the thread number, > possibly) The thread number is contained in the test prefix so the test name is unique. PASS: gdb.btrace/enable-running.exp: thread 1: interrupt Regarding the use of gdb_test .... > > + } > > + # Wait until the thread actually stopped. > > + gdb_test_multiple "" "stopped" { > > + -re "Thread $thread.*stopped\." { > > + pass "stopped" > > + } > > + } > > Same here. Couldn't we use gdb_test? Also a more meaningful and unique > test name. ... I may be able to use it for the first part. But not here. This is non-stop mode so we're getting an asynchronous stopped notification. gdb_test expects a prompt at the end but we already got the prompt and now need to consume the stopped notification to ensure that the thread really stopped before we issue the next command. I'm not entirely sure about using gdb_test on the first part, either. If we got the stopped notification fast enough, I'm not sure whether the "...$gdb_prompt $" pattern in gdb_test would still match. According to my understanding of how dejagnu works the test may fail sporadically since there is already more input available after the prompt and the "... $" wouldn't match. The test prefix makes the test name unique: PASS: gdb.btrace/enable-running.exp: thread 1: stopped Regards, Markus. Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] btrace: allow recording to be started for running threads 2016-12-09 12:22 ` Metzger, Markus T @ 2017-01-19 15:15 ` Pedro Alves 0 siblings, 0 replies; 9+ messages in thread From: Pedro Alves @ 2017-01-19 15:15 UTC (permalink / raw) To: Metzger, Markus T, Luis Machado, gdb-patches On 12/09/2016 12:21 PM, Metzger, Markus T wrote: > I'm not entirely sure about using gdb_test on the first part, either. If we got the > stopped notification fast enough, I'm not sure whether the "...$gdb_prompt $" > pattern in gdb_test would still match. According to my understanding of how > dejagnu works the test may fail sporadically since there is already more input > available after the prompt and the "... $" wouldn't match. You are correct. (Haven't looked at the test at all, but in general.) Thanks, Pedro Alves ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] gdbserver: catch fetch registers error 2016-12-06 15:55 [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads Markus Metzger 2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger @ 2016-12-06 15:55 ` Markus Metzger 2017-01-19 13:22 ` Pedro Alves 1 sibling, 1 reply; 9+ messages in thread From: Markus Metzger @ 2016-12-06 15:55 UTC (permalink / raw) To: gdb-patches; +Cc: Daniel Jacobowitz When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer throws an error that is caught in captured_main, where it causes a E01 error packet to be sent and gdbserer to quit (if --once was specified) or the event loop to be re-started (otherwise). We may get such ptrace errors when trying to fetch registers for an exited or running thread. There are checks in GDB that check those conditions and throw meaningful error messages before we could run into the above ptrace error, e.g. thread.c:validate_registers_access. I ran into a new case and, rather than adding another call to validate_registers_access in GDB, I propose to catch the error already when handling the 'g' packet in gdbserver and reply with an error packet - assuming that gdbserver's internal state is still intact. To not replace a meaningful error message with E01, I'm trying to generate a useful error message when the error is detected and the exception is thrown. It would look like this ... gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44 cont& Continuing. (gdb) PASS: gdb.btrace/enable-running.exp: cont& record btrace warning: Remote failure reply: E.Selected thread is running. warning: Remote failure reply: E.Selected thread is running. ... although in this particular case, I'm going to suppress the warning. To make this look a bit nicer, we could consider stripping the "E." or the entire "Remote failure reply: E." when (re-)throwing the error inside GDB in remote.c. CC: Daniel Jacobowitz <drow@false.org> 2016-12-06 Markus Metzger <markus.t.metzger@intel.com> gdbserver/ * server.c (process_serial_event): Add TRY/CATCH. * linux-low.c (fetch_register): Improve error message. --- gdb/gdbserver/linux-low.c | 19 ++++++++++++++++++- gdb/gdbserver/server.c | 18 ++++++++++++++++-- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c index e3e372c..a942b87 100644 --- a/gdb/gdbserver/linux-low.c +++ b/gdb/gdbserver/linux-low.c @@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs, (PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4) 0); regaddr += sizeof (PTRACE_XFER_TYPE); if (errno != 0) - error ("reading register %d: %s", regno, strerror (errno)); + { + /* ESRCH could mean that the thread is not traced, exited, or is not + stopped. */ + if (errno == ESRCH) + { + struct lwp_info *lwp = get_thread_lwp (current_thread); + + if (!lwp_is_stopped (lwp)) + error (_("Selected thread is running.")); + + if (lwp_is_marked_dead (lwp)) + error (_("Selected thread has terminated.")); + } + + /* Report a generic error if we could not determine the exact + reason. */ + error (_("Could not read register %d: %s."), regno, strerror (errno)); + } } if (the_low_target.supply_ptrace_register) diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c index ef8dd03..3064b4f 100644 --- a/gdb/gdbserver/server.c +++ b/gdb/gdbserver/server.c @@ -4132,8 +4132,22 @@ process_serial_event (void) write_enn (own_buf); else { - regcache = get_thread_regcache (current_thread, 1); - registers_to_string (regcache, own_buf); + TRY + { + regcache = get_thread_regcache (current_thread, 1); + registers_to_string (regcache, own_buf); + } + CATCH (exception, RETURN_MASK_ALL) + { + const char *message; + + message = exception.message; + if (message == NULL) + message = _("Reading registers failed."); + + sprintf (own_buf, "E.%s", message); + } + END_CATCH } } break; -- 1.8.3.1 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] gdbserver: catch fetch registers error 2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger @ 2017-01-19 13:22 ` Pedro Alves 2017-01-19 14:48 ` Metzger, Markus T 0 siblings, 1 reply; 9+ messages in thread From: Pedro Alves @ 2017-01-19 13:22 UTC (permalink / raw) To: Markus Metzger, gdb-patches; +Cc: Daniel Jacobowitz On 12/06/2016 03:54 PM, Markus Metzger wrote: > When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer throws > an error that is caught in captured_main, where it causes a E01 error packet to > be sent and gdbserer to quit (if --once was specified) or the event loop to be > re-started (otherwise). > > We may get such ptrace errors when trying to fetch registers for an exited or > running thread. There are checks in GDB that check those conditions and throw > meaningful error messages before we could run into the above ptrace error, > e.g. thread.c:validate_registers_access. > > I ran into a new case and, rather than adding another call to > validate_registers_access in GDB, I propose to catch the error already when > handling the 'g' packet in gdbserver and reply with an error packet - assuming > that gdbserver's internal state is still intact. So there are two changes here: #1 - don't kill the debug session just because of register reading failure. I.e., handle it gracefully. The case of trying to read registers from a thread that is running is always a client bug. GDB should know which threads it resumed, and checks like the validate_registers_access you found should prevent such accesses reaching the server. I don't object to being defensive in gdbserver if that's not too invasive, but if we find such cases in GDB, we should fix them. However, the case of trying to read from an exited thread can well happen, and there's nothing the client can do to prevent it. The only thing it can do it handle it gracefully. This is because threads can transition from stopped -> exited without a "running" state in between. And this can happen if some _other_ thread of the inferior process is resumed and that other thread causes the whole process to exit (e.g., calls "exit()") just while gdb/gdbserver is accessing registers of the other supposedly-stopped thread. It can also happen when some child thread other than the one we're trying to access registers from execs, and the thread the client is trying to access registers from is not the main thread (the exec makes the kernel kill all threads, including the ones that were stopped). These sorts of corner cases are exercised by the process-dies-while-detaching.exp / process-dies-while-handling-bp.exp testcase and maybe others I don't recall their names immediately. See also: https://sourceware.org/bugzilla/show_bug.cgi?id=18749 Note process-dies-while-handling-bp.exp is largely kfailed when running against remote, exactly because we don't handle this all to well currently, in both gdbserver, and in gdb common code. #2 - sending a human readable error message back to gdb. > > To not replace a meaningful error message with E01, I'm trying to generate a > useful error message when the error is detected and the exception is thrown. > > It would look like this ... > > gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44 > cont& > Continuing. > (gdb) PASS: gdb.btrace/enable-running.exp: cont& > record btrace > warning: Remote failure reply: E.Selected thread is running. > warning: Remote failure reply: E.Selected thread is running. > > ... although in this particular case, I'm going to suppress the warning. So it seems to me that btrace shouldn't even be trying to be enabled on running threads. > > To make this look a bit nicer, we could consider stripping the "E." or the > entire "Remote failure reply: E." when (re-)throwing the error inside GDB in > remote.c. Agreed. > @@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs, > (PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4) 0); > regaddr += sizeof (PTRACE_XFER_TYPE); > if (errno != 0) > - error ("reading register %d: %s", regno, strerror (errno)); > + { > + /* ESRCH could mean that the thread is not traced, exited, or is not > + stopped. */ > + if (errno == ESRCH) > + { > + struct lwp_info *lwp = get_thread_lwp (current_thread); > + > + if (!lwp_is_stopped (lwp)) > + error (_("Selected thread is running.")); > + > + if (lwp_is_marked_dead (lwp)) > + error (_("Selected thread has terminated.")); The order of the checks seems backwards -- if the lwp is dead, lwp_is_stopped is meaningless? But, if either of these conditions is true, and the errors are reached, then it looks like an internal gdbserver error to me. lwp_is_stopped, only checks the lwp->stopped flag. If !lwp->stopped, then this looks like a case of "we should not have reached here in the first place". Similarly for lwp_is_marked_dead -- if the lwp is known dead, then why did we try to ptrace it? Something higher up in the call chain should have errored out earlier, IMO. If we want gdbserver to be defensive against GDB mistakes, we can look at the request as coming from server.c, to handle a g/G packet. So in that case it's gdbserver's knowledge of gdb's perspective of whether the thread is running that counts here. I.e., a 'g' packet that tries to read from a thread that gdb resumed before should error out even if the thread happens to be momentarily paused due to some internal gdbserver event handling. That suggests to me that for the "running" case, server.c's g/G packet handling should be erroring out before calling into the backend, if: if (thread->last_resume_kind != resume_stop || thread->last_status.kind == TARGET_WAITKIND_IGNORE) See the linux-low.c:lwp_resumed function, which we could move to common code. Now, even with that in place, we'd still need to handle the unpredictable "stopped -> exited" transitions. And those can only be handled by checking for failure after the fact... But with an upfront "!running" check, then we can assume that ESRCH means the thread is gone, and thus unconditionally throw the error (_("Selected thread has terminated.")); error. Note that what gdbserver thinks of "selected" thread has no 1-1 relation with what the user thinks as selected thread, so that may be a bit confusing. E.g., the user may well have thread 2, selected, while that error triggers for some other thread because gdb decided to temporarily switch to some other thread, internally, e.g., to iterate over threads and read their registers. > + } > + > + /* Report a generic error if we could not determine the exact > + reason. */ > + error (_("Could not read register %d: %s."), regno, strerror (errno)); > + } > } > > if (the_low_target.supply_ptrace_register) > diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c > index ef8dd03..3064b4f 100644 > --- a/gdb/gdbserver/server.c > +++ b/gdb/gdbserver/server.c > @@ -4132,8 +4132,22 @@ process_serial_event (void) > write_enn (own_buf); > else > { > - regcache = get_thread_regcache (current_thread, 1); > - registers_to_string (regcache, own_buf); > + TRY > + { > + regcache = get_thread_regcache (current_thread, 1); > + registers_to_string (regcache, own_buf); > + } > + CATCH (exception, RETURN_MASK_ALL) Why RETURN_MASK_ALL ? > + { > + const char *message; > + > + message = exception.message; > + if (message == NULL) > + message = _("Reading registers failed."); > + > + sprintf (own_buf, "E.%s", message); > + } > + END_CATCH > } > } > break; > Thanks, Pedro Alves ^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [PATCH 1/2] gdbserver: catch fetch registers error 2017-01-19 13:22 ` Pedro Alves @ 2017-01-19 14:48 ` Metzger, Markus T 2017-01-19 15:02 ` Pedro Alves 0 siblings, 1 reply; 9+ messages in thread From: Metzger, Markus T @ 2017-01-19 14:48 UTC (permalink / raw) To: Pedro Alves, gdb-patches; +Cc: Daniel Jacobowitz Hello Pedro, Thanks for the review and the elaborate response. > The case of trying to read registers from a thread that is > running is always a client bug. GDB should know which threads > it resumed, and checks like the validate_registers_access > you found should prevent such accesses reaching the server. That's a general misunderstanding of the design on my part, then. I typically don't check if I can/should do something but instead just do it and handle errors in case it failed. Like reading registers. Let me drop this patch and instead call validate_registers_access in btrace before reading the PC. > So it seems to me that btrace shouldn't even be trying to be enabled > on running threads. It works fine and I didn't see a reason why we would not want to allow it. Thanks, Markus. > -----Original Message----- > From: Pedro Alves [mailto:palves@redhat.com] > Sent: Thursday, January 19, 2017 2:22 PM > To: Metzger, Markus T <markus.t.metzger@intel.com>; gdb- > patches@sourceware.org > Cc: Daniel Jacobowitz <drow@false.org> > Subject: Re: [PATCH 1/2] gdbserver: catch fetch registers error > > On 12/06/2016 03:54 PM, Markus Metzger wrote: > > When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer > throws > > an error that is caught in captured_main, where it causes a E01 error packet to > > be sent and gdbserer to quit (if --once was specified) or the event loop to be > > re-started (otherwise). > > > > We may get such ptrace errors when trying to fetch registers for an exited or > > running thread. There are checks in GDB that check those conditions and throw > > meaningful error messages before we could run into the above ptrace error, > > e.g. thread.c:validate_registers_access. > > > > I ran into a new case and, rather than adding another call to > > validate_registers_access in GDB, I propose to catch the error already when > > handling the 'g' packet in gdbserver and reply with an error packet - assuming > > that gdbserver's internal state is still intact. > > So there are two changes here: > > #1 - don't kill the debug session just because > of register reading failure. I.e., handle it gracefully. > > The case of trying to read registers from a thread that is > running is always a client bug. GDB should know which threads > it resumed, and checks like the validate_registers_access > you found should prevent such accesses reaching the server. > I don't object to being defensive in gdbserver if that's > not too invasive, but if we find such cases in GDB, we should > fix them. > > However, the case of trying to read from an exited thread can > well happen, and there's nothing the client can do to prevent it. > The only thing it can do it handle it gracefully. > This is because threads can transition from stopped -> exited > without a "running" state in between. And this can happen if > some _other_ thread of the inferior process is resumed and that > other thread causes the whole process to exit (e.g., calls "exit()") > just while gdb/gdbserver is accessing registers of the other > supposedly-stopped thread. It can also happen when some child > thread other than the one we're trying to access registers > from execs, and the thread the client is trying to access registers > from is not the main thread (the exec makes the kernel kill all > threads, including the ones that were stopped). > > These sorts of corner cases are exercised by the > process-dies-while-detaching.exp / process-dies-while-handling-bp.exp > testcase and maybe others I don't recall their names immediately. > > See also: https://sourceware.org/bugzilla/show_bug.cgi?id=18749 > > Note process-dies-while-handling-bp.exp is largely kfailed when > running against remote, exactly because we don't handle this all > to well currently, in both gdbserver, and in gdb common code. > > #2 - sending a human readable error message back to gdb. > > > > > To not replace a meaningful error message with E01, I'm trying to generate a > > useful error message when the error is detected and the exception is thrown. > > > > It would look like this ... > > > > gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44 > > cont& > > Continuing. > > (gdb) PASS: gdb.btrace/enable-running.exp: cont& > > record btrace > > warning: Remote failure reply: E.Selected thread is running. > > warning: Remote failure reply: E.Selected thread is running. > > > > ... although in this particular case, I'm going to suppress the warning. > > So it seems to me that btrace shouldn't even be trying to be enabled > on running threads. > > > > > To make this look a bit nicer, we could consider stripping the "E." or the > > entire "Remote failure reply: E." when (re-)throwing the error inside GDB in > > remote.c. > > Agreed. > > > @@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs, > > (PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4) > 0); > > regaddr += sizeof (PTRACE_XFER_TYPE); > > if (errno != 0) > > - error ("reading register %d: %s", regno, strerror (errno)); > > + { > > + /* ESRCH could mean that the thread is not traced, exited, or is not > > + stopped. */ > > + if (errno == ESRCH) > > + { > > + struct lwp_info *lwp = get_thread_lwp (current_thread); > > + > > + if (!lwp_is_stopped (lwp)) > > + error (_("Selected thread is running.")); > > + > > + if (lwp_is_marked_dead (lwp)) > > + error (_("Selected thread has terminated.")); > > The order of the checks seems backwards -- if the lwp is dead, > lwp_is_stopped is meaningless? But, if either of these conditions > is true, and the errors are reached, then it looks like > an internal gdbserver error to me. > > lwp_is_stopped, only checks the lwp->stopped flag. > If !lwp->stopped, then this looks like a case of "we should not > have reached here in the first place". > > Similarly for lwp_is_marked_dead -- if the lwp is known dead, > then why did we try to ptrace it? Something higher up in the > call chain should have errored out earlier, IMO. > > If we want gdbserver to be defensive against GDB mistakes, > we can look at the request as coming from server.c, to handle a > g/G packet. So in that case it's gdbserver's knowledge of > gdb's perspective of whether the thread is running that counts > here. I.e., a 'g' packet that tries to read from a thread > that gdb resumed before should error out even if the > thread happens to be momentarily paused due to some internal > gdbserver event handling. > > That suggests to me that for the "running" case, > server.c's g/G packet handling should be erroring out before > calling into the backend, if: > > if (thread->last_resume_kind != resume_stop > || thread->last_status.kind == TARGET_WAITKIND_IGNORE) > > See the linux-low.c:lwp_resumed function, which we could > move to common code. > > Now, even with that in place, we'd still need to handle > the unpredictable "stopped -> exited" transitions. And those > can only be handled by checking for failure after the fact... > But with an upfront "!running" check, then we can assume that > ESRCH means the thread is gone, and thus unconditionally > throw the > error (_("Selected thread has terminated.")); > error. > > Note that what gdbserver thinks of "selected" thread has > no 1-1 relation with what the user thinks as selected thread, > so that may be a bit confusing. E.g., the user may well have thread 2, > selected, while that error triggers for some other thread because > gdb decided to temporarily switch to some other thread, internally, > e.g., to iterate over threads and read their registers. > > > > + } > > + > > + /* Report a generic error if we could not determine the exact > > + reason. */ > > + error (_("Could not read register %d: %s."), regno, strerror (errno)); > > + } > > } > > > > if (the_low_target.supply_ptrace_register) > > diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c > > index ef8dd03..3064b4f 100644 > > --- a/gdb/gdbserver/server.c > > +++ b/gdb/gdbserver/server.c > > @@ -4132,8 +4132,22 @@ process_serial_event (void) > > write_enn (own_buf); > > else > > { > > - regcache = get_thread_regcache (current_thread, 1); > > - registers_to_string (regcache, own_buf); > > + TRY > > + { > > + regcache = get_thread_regcache (current_thread, 1); > > + registers_to_string (regcache, own_buf); > > + } > > + CATCH (exception, RETURN_MASK_ALL) > > Why RETURN_MASK_ALL ? > > > + { > > + const char *message; > > + > > + message = exception.message; > > + if (message == NULL) > > + message = _("Reading registers failed."); > > + > > + sprintf (own_buf, "E.%s", message); > > + } > > + END_CATCH > > } > > } > > break; > > > > Thanks, > Pedro Alves Intel Deutschland GmbH Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany Tel: +49 89 99 8853-0, www.intel.de Managing Directors: Christin Eisenschmid, Christian Lamprechter Chairperson of the Supervisory Board: Nicole Lau Registered Office: Munich Commercial Register: Amtsgericht Muenchen HRB 186928 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] gdbserver: catch fetch registers error 2017-01-19 14:48 ` Metzger, Markus T @ 2017-01-19 15:02 ` Pedro Alves 0 siblings, 0 replies; 9+ messages in thread From: Pedro Alves @ 2017-01-19 15:02 UTC (permalink / raw) To: Metzger, Markus T, gdb-patches; +Cc: Daniel Jacobowitz On 01/19/2017 02:48 PM, Metzger, Markus T wrote: > Hello Pedro, > > Thanks for the review and the elaborate response. > >> The case of trying to read registers from a thread that is >> running is always a client bug. GDB should know which threads >> it resumed, and checks like the validate_registers_access >> you found should prevent such accesses reaching the server. > > That's a general misunderstanding of the design on my part, then. I typically > don't check if I can/should do something but instead just do it and handle errors > in case it failed. Like reading registers. > > Let me drop this patch and instead call validate_registers_access in btrace before > reading the PC. We have similar checks in other places, like e.g., frame.c:has_stack_frames. Might make sense to factor things out a bit to avoid a throw/catch. >> So it seems to me that btrace shouldn't even be trying to be enabled >> on running threads. > > It works fine and I didn't see a reason why we would not want to allow it. Sorry, that was bogus. I agree -- what it shouldn't be doing is reading registers. Thanks, Pedro Alves ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-01-19 15:15 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-12-06 15:55 [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads Markus Metzger 2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger 2016-12-07 20:20 ` Luis Machado 2016-12-09 12:22 ` Metzger, Markus T 2017-01-19 15:15 ` Pedro Alves 2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger 2017-01-19 13:22 ` Pedro Alves 2017-01-19 14:48 ` Metzger, Markus T 2017-01-19 15:02 ` Pedro Alves
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).