[PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads
@ 2016-12-06 15:55 Markus Metzger
  2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger
  2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger
  0 siblings, 2 replies; 9+ messages in thread
From: Markus Metzger @ 2016-12-06 15:55 UTC (permalink / raw)
  To: gdb-patches

This refers to:  https://sourceware.org/ml/gdb-patches/2016-11/msg00994.html.

For the remote case I ran into an unexpected gdbserver quit, which is addressed
in the first of the two patches.

I tested this on one 32-bit and one 64-bit Intel Fedora system.  On the former
(4.4.6-301.fc23.i686+PAE), I ran into several PASS/FAIL changes between
different runs in the gdb.threads suite.  When I run those tests a few times, I
get completely different results for almost every run (already without my
changes).  Ignoring those.

Markus Metzger (2):
  gdbserver: catch fetch registers error
  btrace: allow recording to be started for running threads

 gdb/btrace.c                                | 24 +++++++++-
 gdb/gdbserver/linux-low.c                   | 19 +++++++-
 gdb/gdbserver/server.c                      | 18 +++++++-
 gdb/testsuite/gdb.btrace/enable-running.c   | 47 +++++++++++++++++++
 gdb/testsuite/gdb.btrace/enable-running.exp | 72 +++++++++++++++++++++++++++++
 5 files changed, 175 insertions(+), 5 deletions(-)
 create mode 100644 gdb/testsuite/gdb.btrace/enable-running.c
 create mode 100644 gdb/testsuite/gdb.btrace/enable-running.exp

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 2/2] btrace: allow recording to be started for running threads
  2016-12-06 15:55 [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads Markus Metzger
@ 2016-12-06 15:55 ` Markus Metzger
  2016-12-07 20:20   ` Luis Machado
  2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger
  1 sibling, 1 reply; 9+ messages in thread
From: Markus Metzger @ 2016-12-06 15:55 UTC (permalink / raw)
  To: gdb-patches

When recording is started for a running thread, GDB was able to start tracing
but then failed to read registers to insert the initial entry for the current
PC.  We don't really need that initial entry if we don't know where exactly we
started recording.  Silently ignore such errors to allow recording to be started
while threads are running.

For the BTRACE_FORMAT_PT btrace format, we don't need that initial entry since
it will be recorded in the trace.  We can omit the call to btrace_add_pc.

2016-12-06  Markus Metzger  <markus.t.metzger@intel.com>

gdb/
	* btrace.c (btrace_enable): Do not call btrace_add_pc for
	BTRACE_FORMAT_PT.  Silently ignore errors from btrace_add_pc.

testsuite/
	* gdb.btrace/enable-running.c: New.
	* gdb.btrace/enable-running.exp: New.
---
 gdb/btrace.c                                | 24 +++++++++-
 gdb/testsuite/gdb.btrace/enable-running.c   | 47 +++++++++++++++++++
 gdb/testsuite/gdb.btrace/enable-running.exp | 72 +++++++++++++++++++++++++++++
 3 files changed, 141 insertions(+), 2 deletions(-)
 create mode 100644 gdb/testsuite/gdb.btrace/enable-running.c
 create mode 100644 gdb/testsuite/gdb.btrace/enable-running.exp

diff --git a/gdb/btrace.c b/gdb/btrace.c
index 39d537c..920b9ab 100644
--- a/gdb/btrace.c
+++ b/gdb/btrace.c
@@ -1474,8 +1474,28 @@ btrace_enable (struct thread_info *tp, const struct btrace_config *conf)
 
   /* Add an entry for the current PC so we start tracing from where we
      enabled it.  */
-  if (tp->btrace.target != NULL)
-    btrace_add_pc (tp);
+  TRY
+    {
+      /* This is not relevant for BTRACE_FORMAT_PT since the trace will already
+	 start at the enable location.  */
+      if ((tp->btrace.target != NULL) && (conf->format != BTRACE_FORMAT_PT))
+	btrace_add_pc (tp);
+    }
+  CATCH (exception, RETURN_MASK_ALL)
+    {
+      /* We may fail to add the initial entry, for example if TP is currently
+	 running.
+
+	 We won't be able to stitch the initial trace chunk to this initial
+	 entry so tracing will start at the next branch target instead of at the
+	 current PC.  Since TP is currently running, this shouldn't make a
+	 difference.
+
+	 If TP were waiting most of the time and made only a little bit of
+	 progress before it was stopped, we'd lose the instructions until the
+	 first branch.  */
+    }
+  END_CATCH
 }
 
 /* See btrace.h.  */
diff --git a/gdb/testsuite/gdb.btrace/enable-running.c b/gdb/testsuite/gdb.btrace/enable-running.c
new file mode 100644
index 0000000..6380c29
--- /dev/null
+++ b/gdb/testsuite/gdb.btrace/enable-running.c
@@ -0,0 +1,47 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2016 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <pthread.h>
+
+static int global;
+
+static void *
+test (void *arg)
+{
+  for (;;)
+    global += 1;
+
+  return arg;
+}
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < 3; ++i)
+    {
+      pthread_t th;
+
+      pthread_create (&th, NULL, test, NULL);
+      pthread_detach (th);
+    }
+
+  test (NULL); /* bp.1 */
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.btrace/enable-running.exp b/gdb/testsuite/gdb.btrace/enable-running.exp
new file mode 100644
index 0000000..ead86af
--- /dev/null
+++ b/gdb/testsuite/gdb.btrace/enable-running.exp
@@ -0,0 +1,72 @@
+# This testcase is part of GDB, the GNU debugger.
+#
+# Copyright 2016 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+if { [skip_btrace_tests] } { return -1 }
+
+standard_testfile
+if {[gdb_compile_pthreads "$srcdir/$subdir/$srcfile" "$binfile" executable {debug}] != "" } {
+    return -1
+}
+clean_restart $testfile
+
+# We need to enable non-stop mode for the remote case.
+gdb_test_no_output "set non-stop on"
+
+if ![runto_main] {
+    return -1
+}
+
+set bp_1 [gdb_get_line_number "bp.1" $srcfile]
+
+gdb_breakpoint $bp_1
+gdb_continue_to_breakpoint "cont to $bp_1" ".*$bp_1\r\n.*"
+gdb_test "cont&" "Continuing\."
+
+# All threads are running.  Let's start recording.
+gdb_test_no_output "record btrace"
+
+proc check_tracing_enabled { thread } {
+    global gdb_prompt
+
+    gdb_test "thread $thread" "(running).*"
+    with_test_prefix "thread $thread" {
+        # Stop the thread before reading the trace.
+        gdb_test_multiple "interrupt" "interrupt" {
+            -re "interrupt\r\n$gdb_prompt " {
+                pass "interrupt"
+            }
+        }
+        # Wait until the thread actually stopped.
+        gdb_test_multiple "" "stopped" {
+            -re "Thread $thread.*stopped\." {
+                pass "stopped"
+            }
+        }
+        # We will consume the thread's current location as part of the
+        # "info record" output.
+        gdb_test "info record" [multi_line \
+            "Active record target: record-btrace" \
+            "Recording format: .*" \
+            "Recorded \[0-9\]+ instructions \[^\\\r\\\n\]*" \
+        ]
+    }
+}
+
+# Check that recording was started on each thread.
+foreach thread {1 2 3 4} {
+    check_tracing_enabled $thread
+}
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] btrace: allow recording to be started for running threads
  2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger
@ 2016-12-07 20:20   ` Luis Machado
  2016-12-09 12:22     ` Metzger, Markus T
  0 siblings, 1 reply; 9+ messages in thread
From: Luis Machado @ 2016-12-07 20:20 UTC (permalink / raw)
  To: Markus Metzger, gdb-patches

On 12/06/2016 09:54 AM, Markus Metzger wrote:
> When recording is started for a running thread, GDB was able to start tracing
> but then failed to read registers to insert the initial entry for the current
> PC.  We don't really need that initial entry if we don't know where exactly we
> started recording.  Silently ignore such errors to allow recording to be started
> while threads are running.
>
> For the BTRACE_FORMAT_PT btrace format, we don't need that initial entry since
> it will be recorded in the trace.  We can omit the call to btrace_add_pc.
>
> 2016-12-06  Markus Metzger  <markus.t.metzger@intel.com>
>
> gdb/
> 	* btrace.c (btrace_enable): Do not call btrace_add_pc for
> 	BTRACE_FORMAT_PT.  Silently ignore errors from btrace_add_pc.
>
> testsuite/
> 	* gdb.btrace/enable-running.c: New.
> 	* gdb.btrace/enable-running.exp: New.
> ---
>  gdb/btrace.c                                | 24 +++++++++-
>  gdb/testsuite/gdb.btrace/enable-running.c   | 47 +++++++++++++++++++
>  gdb/testsuite/gdb.btrace/enable-running.exp | 72 +++++++++++++++++++++++++++++
>  3 files changed, 141 insertions(+), 2 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.btrace/enable-running.c
>  create mode 100644 gdb/testsuite/gdb.btrace/enable-running.exp
>
> diff --git a/gdb/btrace.c b/gdb/btrace.c
> index 39d537c..920b9ab 100644
> --- a/gdb/btrace.c
> +++ b/gdb/btrace.c
> @@ -1474,8 +1474,28 @@ btrace_enable (struct thread_info *tp, const struct btrace_config *conf)
>
>    /* Add an entry for the current PC so we start tracing from where we
>       enabled it.  */
> -  if (tp->btrace.target != NULL)
> -    btrace_add_pc (tp);
> +  TRY
> +    {
> +      /* This is not relevant for BTRACE_FORMAT_PT since the trace will already
> +	 start at the enable location.  */

enabled location?


> +      if ((tp->btrace.target != NULL) && (conf->format != BTRACE_FORMAT_PT))
> +	btrace_add_pc (tp);
> +    }
> +  CATCH (exception, RETURN_MASK_ALL)
> +    {
> +      /* We may fail to add the initial entry, for example if TP is currently
> +	 running.
> +
> +	 We won't be able to stitch the initial trace chunk to this initial
> +	 entry so tracing will start at the next branch target instead of at the
> +	 current PC.  Since TP is currently running, this shouldn't make a
> +	 difference.
> +
> +	 If TP were waiting most of the time and made only a little bit of
> +	 progress before it was stopped, we'd lose the instructions until the
> +	 first branch.  */
> +    }
> +  END_CATCH
>  }
>
>  /* See btrace.h.  */
> diff --git a/gdb/testsuite/gdb.btrace/enable-running.c b/gdb/testsuite/gdb.btrace/enable-running.c
> new file mode 100644
> index 0000000..6380c29
> --- /dev/null
> +++ b/gdb/testsuite/gdb.btrace/enable-running.c
> @@ -0,0 +1,47 @@
> +/* This testcase is part of GDB, the GNU debugger.
> +
> +   Copyright 2016 Free Software Foundation, Inc.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
> +
> +#include <pthread.h>
> +
> +static int global;
> +
> +static void *
> +test (void *arg)
> +{
> +  for (;;)
> +    global += 1;
> +
> +  return arg;
> +}
> +
> +int
> +main (void)
> +{
> +  int i;
> +
> +  for (i = 0; i < 3; ++i)
> +    {
> +      pthread_t th;
> +
> +      pthread_create (&th, NULL, test, NULL);
> +      pthread_detach (th);
> +    }
> +
> +  test (NULL); /* bp.1 */
> +
> +  return 0;
> +}
> diff --git a/gdb/testsuite/gdb.btrace/enable-running.exp b/gdb/testsuite/gdb.btrace/enable-running.exp
> new file mode 100644
> index 0000000..ead86af
> --- /dev/null
> +++ b/gdb/testsuite/gdb.btrace/enable-running.exp
> @@ -0,0 +1,72 @@
> +# This testcase is part of GDB, the GNU debugger.
> +#
> +# Copyright 2016 Free Software Foundation, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +if { [skip_btrace_tests] } { return -1 }

unsupported "btrace not supported on this target"?

> +
> +standard_testfile
> +if {[gdb_compile_pthreads "$srcdir/$subdir/$srcfile" "$binfile" executable {debug}] != "" } {

untested "failed to compile"

> +    return -1
> +}
> +clean_restart $testfile
> +
> +# We need to enable non-stop mode for the remote case.
> +gdb_test_no_output "set non-stop on"
> +
> +if ![runto_main] {

untested "couldn't run to main"

> +    return -1
> +}
> +
> +set bp_1 [gdb_get_line_number "bp.1" $srcfile]
> +
> +gdb_breakpoint $bp_1
> +gdb_continue_to_breakpoint "cont to $bp_1" ".*$bp_1\r\n.*"
> +gdb_test "cont&" "Continuing\."
> +
> +# All threads are running.  Let's start recording.
> +gdb_test_no_output "record btrace"
> +
> +proc check_tracing_enabled { thread } {
> +    global gdb_prompt
> +
> +    gdb_test "thread $thread" "(running).*"

Maybe add a more meaningful test name here? "Check if thread $thread is 
running"?

> +    with_test_prefix "thread $thread" {
> +        # Stop the thread before reading the trace.
> +        gdb_test_multiple "interrupt" "interrupt" {
> +            -re "interrupt\r\n$gdb_prompt " {
> +                pass "interrupt"
> +            }

Any reason why we couldn't use gdb_test here? Also, we a more meaningful 
test name and make the name unique (by mentioning the thread number, 
possibly)

> +        }
> +        # Wait until the thread actually stopped.
> +        gdb_test_multiple "" "stopped" {
> +            -re "Thread $thread.*stopped\." {
> +                pass "stopped"
> +            }
> +        }

Same here. Couldn't we use gdb_test? Also a more meaningful and unique 
test name.

> +        # We will consume the thread's current location as part of the
> +        # "info record" output.
> +        gdb_test "info record" [multi_line \
> +            "Active record target: record-btrace" \
> +            "Recording format: .*" \
> +            "Recorded \[0-9\]+ instructions \[^\\\r\\\n\]*" \
> +        ]
> +    }
> +}
> +
> +# Check that recording was started on each thread.
> +foreach thread {1 2 3 4} {
> +    check_tracing_enabled $thread
> +}
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH 2/2] btrace: allow recording to be started for running threads
  2016-12-07 20:20   ` Luis Machado
@ 2016-12-09 12:22     ` Metzger, Markus T
  2017-01-19 15:15       ` Pedro Alves
  0 siblings, 1 reply; 9+ messages in thread
From: Metzger, Markus T @ 2016-12-09 12:22 UTC (permalink / raw)
  To: Luis Machado, gdb-patches

> -----Original Message-----
> From: Luis Machado [mailto:lgustavo@codesourcery.com]
> Sent: Wednesday, December 7, 2016 9:21 PM
> To: Metzger, Markus T <markus.t.metzger@intel.com>; gdb-
> patches@sourceware.org
> Subject: Re: [PATCH 2/2] btrace: allow recording to be started for running threads

Hi Luis,

Thanks for your feedback.

> > +      /* This is not relevant for BTRACE_FORMAT_PT since the trace will already
> > +	 start at the enable location.  */
> 
> enabled location?

I rephrased the sentence to:

      /* This is not relevant for BTRACE_FORMAT_PT since the trace will already
	 start at the PC at which tracing was enabled.  */

> > +if { [skip_btrace_tests] } { return -1 }
> 
> unsupported "btrace not supported on this target"?

Let me do this for all gdb.btrace tests in a separate patch.

> > +    gdb_test "thread $thread" "(running).*"
> 
> Maybe add a more meaningful test name here? "Check if thread $thread is
> running"?

OK.  Also moved it below the with_test_prefix.  It now looks like this:
PASS: gdb.btrace/enable-running.exp: thread 1: is running

> > +    with_test_prefix "thread $thread" {
> > +        # Stop the thread before reading the trace.
> > +        gdb_test_multiple "interrupt" "interrupt" {
> > +            -re "interrupt\r\n$gdb_prompt " {
> > +                pass "interrupt"
> > +            }
> 
> Any reason why we couldn't use gdb_test here? Also, we a more meaningful
> test name and make the name unique (by mentioning the thread number,
> possibly)

The thread number is contained in the test prefix so the test name is unique.
PASS: gdb.btrace/enable-running.exp: thread 1: interrupt

Regarding the use of gdb_test ....

> > +        }
> > +        # Wait until the thread actually stopped.
> > +        gdb_test_multiple "" "stopped" {
> > +            -re "Thread $thread.*stopped\." {
> > +                pass "stopped"
> > +            }
> > +        }
>
> Same here. Couldn't we use gdb_test? Also a more meaningful and unique
> test name.

... I may be able to use it for the first part.  But not here.  This is non-stop mode
so we're getting an asynchronous stopped notification.  gdb_test expects a prompt
at the end but we already got the prompt and now need to consume the stopped
notification to ensure that the thread really stopped before we issue the next
command.

I'm not entirely sure about using gdb_test on the first part, either.  If we got the
stopped notification fast enough, I'm not sure whether the "...$gdb_prompt $"
pattern in gdb_test would still match.  According to my understanding of how
dejagnu works the test may fail sporadically since there is already more input
available after the prompt and the "... $" wouldn't match.

The test prefix makes the test name unique:
PASS: gdb.btrace/enable-running.exp: thread 1: stopped

Regards,
Markus.

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Christian Lamprechter
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] btrace: allow recording to be started for running threads
  2016-12-09 12:22     ` Metzger, Markus T
@ 2017-01-19 15:15       ` Pedro Alves
  0 siblings, 0 replies; 9+ messages in thread
From: Pedro Alves @ 2017-01-19 15:15 UTC (permalink / raw)
  To: Metzger, Markus T, Luis Machado, gdb-patches

On 12/09/2016 12:21 PM, Metzger, Markus T wrote:
> I'm not entirely sure about using gdb_test on the first part, either.  If we got the
> stopped notification fast enough, I'm not sure whether the "...$gdb_prompt $"
> pattern in gdb_test would still match.  According to my understanding of how
> dejagnu works the test may fail sporadically since there is already more input
> available after the prompt and the "... $" wouldn't match.

You are correct.

(Haven't looked at the test at all, but in general.)

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] gdbserver: catch fetch registers error
  2016-12-06 15:55 [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads Markus Metzger
  2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger
@ 2016-12-06 15:55 ` Markus Metzger
  2017-01-19 13:22   ` Pedro Alves
  1 sibling, 1 reply; 9+ messages in thread
From: Markus Metzger @ 2016-12-06 15:55 UTC (permalink / raw)
  To: gdb-patches; +Cc: Daniel Jacobowitz

When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer throws
an error that is caught in captured_main, where it causes a E01 error packet to
be sent and gdbserer to quit (if --once was specified) or the event loop to be
re-started (otherwise).

We may get such ptrace errors when trying to fetch registers for an exited or
running thread.  There are checks in GDB that check those conditions and throw
meaningful error messages before we could run into the above ptrace error,
e.g. thread.c:validate_registers_access.

I ran into a new case and, rather than adding another call to
validate_registers_access in GDB, I propose to catch the error already when
handling the 'g' packet in gdbserver and reply with an error packet - assuming
that gdbserver's internal state is still intact.

To not replace a meaningful error message with E01, I'm trying to generate a
useful error message when the error is detected and the exception is thrown.

It would look like this ...

gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44
cont&
Continuing.
(gdb) PASS: gdb.btrace/enable-running.exp: cont&
record btrace
warning: Remote failure reply: E.Selected thread is running.
warning: Remote failure reply: E.Selected thread is running.

... although in this particular case, I'm going to suppress the warning.

To make this look a bit nicer, we could consider stripping the "E." or the
entire "Remote failure reply: E." when (re-)throwing the error inside GDB in
remote.c.

CC:  Daniel Jacobowitz	<drow@false.org>

2016-12-06  Markus Metzger  <markus.t.metzger@intel.com>

gdbserver/
	* server.c (process_serial_event): Add TRY/CATCH.
	* linux-low.c (fetch_register): Improve error message.
---
 gdb/gdbserver/linux-low.c | 19 ++++++++++++++++++-
 gdb/gdbserver/server.c    | 18 ++++++++++++++++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index e3e372c..a942b87 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs,
 		(PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4) 0);
       regaddr += sizeof (PTRACE_XFER_TYPE);
       if (errno != 0)
-	error ("reading register %d: %s", regno, strerror (errno));
+	{
+	  /* ESRCH could mean that the thread is not traced, exited, or is not
+	     stopped.  */
+	  if (errno == ESRCH)
+	    {
+	      struct lwp_info *lwp = get_thread_lwp (current_thread);
+
+	      if (!lwp_is_stopped (lwp))
+		error (_("Selected thread is running."));
+
+	      if (lwp_is_marked_dead (lwp))
+		error (_("Selected thread has terminated."));
+	    }
+
+	  /* Report a generic error if we could not determine the exact
+	     reason.  */
+	  error (_("Could not read register %d: %s."), regno, strerror (errno));
+	}
     }

   if (the_low_target.supply_ptrace_register)
diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
index ef8dd03..3064b4f 100644
--- a/gdb/gdbserver/server.c
+++ b/gdb/gdbserver/server.c
@@ -4132,8 +4132,22 @@ process_serial_event (void)
 	    write_enn (own_buf);
 	  else
 	    {
-	      regcache = get_thread_regcache (current_thread, 1);
-	      registers_to_string (regcache, own_buf);
+	      TRY
+		{
+		  regcache = get_thread_regcache (current_thread, 1);
+		  registers_to_string (regcache, own_buf);
+		}
+	      CATCH (exception, RETURN_MASK_ALL)
+		{
+		  const char *message;
+
+		  message = exception.message;
+		  if (message == NULL)
+		    message = _("Reading registers failed.");
+
+		  sprintf (own_buf, "E.%s", message);
+		}
+	      END_CATCH
 	    }
 	}
       break;
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] gdbserver: catch fetch registers error
  2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger
@ 2017-01-19 13:22   ` Pedro Alves
  2017-01-19 14:48     ` Metzger, Markus T
  0 siblings, 1 reply; 9+ messages in thread
From: Pedro Alves @ 2017-01-19 13:22 UTC (permalink / raw)
  To: Markus Metzger, gdb-patches; +Cc: Daniel Jacobowitz

On 12/06/2016 03:54 PM, Markus Metzger wrote:
> When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer throws
> an error that is caught in captured_main, where it causes a E01 error packet to
> be sent and gdbserer to quit (if --once was specified) or the event loop to be
> re-started (otherwise).
> 
> We may get such ptrace errors when trying to fetch registers for an exited or
> running thread.  There are checks in GDB that check those conditions and throw
> meaningful error messages before we could run into the above ptrace error,
> e.g. thread.c:validate_registers_access.
> 
> I ran into a new case and, rather than adding another call to
> validate_registers_access in GDB, I propose to catch the error already when
> handling the 'g' packet in gdbserver and reply with an error packet - assuming
> that gdbserver's internal state is still intact.

So there are two changes here:

#1 - don't kill the debug session just because 
     of register reading failure.  I.e., handle it gracefully.

The case of trying to read registers from a thread that is
running is always a client bug.  GDB should know which threads
it resumed, and checks like the validate_registers_access
you found should prevent such accesses reaching the server.
I don't object to being defensive in gdbserver if that's
not too invasive, but if we find such cases in GDB, we should
fix them.

However, the case of trying to read from an exited thread can
well happen, and there's nothing the client can do to prevent it.
The only thing it can do it handle it gracefully.
This is because threads can transition from stopped -> exited
without a "running" state in between.  And this can happen if
some _other_ thread of the inferior process is resumed and that
other thread causes the whole process to exit (e.g., calls "exit()")
just while gdb/gdbserver is accessing registers of the other
supposedly-stopped thread.  It can also happen when some child
thread other than the one we're trying to access registers
from execs, and the thread the client is trying to access registers
from is not the main thread (the exec makes the kernel kill all
threads, including the ones that were stopped).

These sorts of corner cases are exercised by the
process-dies-while-detaching.exp / process-dies-while-handling-bp.exp
testcase and maybe others I don't recall their names immediately.

See also: https://sourceware.org/bugzilla/show_bug.cgi?id=18749

Note process-dies-while-handling-bp.exp is largely kfailed when
running against remote, exactly because we don't handle this all
to well currently, in both gdbserver, and in gdb common code.

#2 - sending a human readable error message back to gdb.

> 
> To not replace a meaningful error message with E01, I'm trying to generate a
> useful error message when the error is detected and the exception is thrown.
> 
> It would look like this ...
> 
> gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44
> cont&
> Continuing.
> (gdb) PASS: gdb.btrace/enable-running.exp: cont&
> record btrace
> warning: Remote failure reply: E.Selected thread is running.
> warning: Remote failure reply: E.Selected thread is running.
> 
> ... although in this particular case, I'm going to suppress the warning.

So it seems to me that btrace shouldn't even be trying to be enabled
on running threads.

> 
> To make this look a bit nicer, we could consider stripping the "E." or the
> entire "Remote failure reply: E." when (re-)throwing the error inside GDB in
> remote.c.

Agreed.

> @@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs,
>  		(PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4) 0);
>        regaddr += sizeof (PTRACE_XFER_TYPE);
>        if (errno != 0)
> -	error ("reading register %d: %s", regno, strerror (errno));
> +	{
> +	  /* ESRCH could mean that the thread is not traced, exited, or is not
> +	     stopped.  */
> +	  if (errno == ESRCH)
> +	    {
> +	      struct lwp_info *lwp = get_thread_lwp (current_thread);
> +
> +	      if (!lwp_is_stopped (lwp))
> +		error (_("Selected thread is running."));
> +
> +	      if (lwp_is_marked_dead (lwp))
> +		error (_("Selected thread has terminated."));

The order of the checks seems backwards -- if the lwp is dead,
lwp_is_stopped is meaningless?  But, if either of these conditions
is true, and the errors are reached, then it looks like
an internal gdbserver error to me.

lwp_is_stopped, only checks the lwp->stopped flag.
If !lwp->stopped, then this looks like a case of "we should not
have reached here in the first place".

Similarly for lwp_is_marked_dead -- if the lwp is known dead,
then why did we try to ptrace it?  Something higher up in the
call chain should have errored out earlier, IMO.

If we want gdbserver to be defensive against GDB mistakes,
we can look at the request as coming from server.c, to handle a 
g/G packet.  So in that case it's gdbserver's knowledge of 
gdb's perspective of whether the thread is running that counts
here.  I.e., a 'g' packet that tries to read from a thread
that gdb resumed before should error out even if the
thread happens to be momentarily paused due to some internal
gdbserver event handling.

That suggests to me that for the "running" case, 
server.c's g/G packet handling should be erroring out before
calling into the backend, if:

  if (thread->last_resume_kind != resume_stop
      || thread->last_status.kind == TARGET_WAITKIND_IGNORE)

See the linux-low.c:lwp_resumed function, which we could
move to common code.

Now, even with that in place, we'd still need to handle
the unpredictable "stopped -> exited" transitions.  And those
can only be handled by checking for failure after the fact...
But with an upfront "!running" check, then we can assume that
ESRCH means the thread is gone, and thus unconditionally
throw the 
  error (_("Selected thread has terminated."));
error.

Note that what gdbserver thinks of "selected" thread has
no 1-1 relation with what the user thinks as selected thread,
so that may be a bit confusing.  E.g., the user may well have thread 2,
selected, while that error triggers for some other thread because
gdb decided to temporarily switch to some other thread, internally,
e.g., to iterate over threads and read their registers. 

> +	    }
> +
> +	  /* Report a generic error if we could not determine the exact
> +	     reason.  */
> +	  error (_("Could not read register %d: %s."), regno, strerror (errno));
> +	}
>      }
>  
>    if (the_low_target.supply_ptrace_register)
> diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
> index ef8dd03..3064b4f 100644
> --- a/gdb/gdbserver/server.c
> +++ b/gdb/gdbserver/server.c
> @@ -4132,8 +4132,22 @@ process_serial_event (void)
>  	    write_enn (own_buf);
>  	  else
>  	    {
> -	      regcache = get_thread_regcache (current_thread, 1);
> -	      registers_to_string (regcache, own_buf);
> +	      TRY
> +		{
> +		  regcache = get_thread_regcache (current_thread, 1);
> +		  registers_to_string (regcache, own_buf);
> +		}
> +	      CATCH (exception, RETURN_MASK_ALL)

Why RETURN_MASK_ALL ?

> +		{
> +		  const char *message;
> +
> +		  message = exception.message;
> +		  if (message == NULL)
> +		    message = _("Reading registers failed.");
> +
> +		  sprintf (own_buf, "E.%s", message);
> +		}
> +	      END_CATCH
>  	    }
>  	}
>        break;
> 

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH 1/2] gdbserver: catch fetch registers error
  2017-01-19 13:22   ` Pedro Alves
@ 2017-01-19 14:48     ` Metzger, Markus T
  2017-01-19 15:02       ` Pedro Alves
  0 siblings, 1 reply; 9+ messages in thread
From: Metzger, Markus T @ 2017-01-19 14:48 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Daniel Jacobowitz

Hello Pedro,

Thanks for the review and the elaborate response.

> The case of trying to read registers from a thread that is
> running is always a client bug.  GDB should know which threads
> it resumed, and checks like the validate_registers_access
> you found should prevent such accesses reaching the server.

That's a general misunderstanding of the design on my part, then.  I typically
don't check if I can/should do something but instead just do it and handle errors
in case it failed.  Like reading registers.

Let me drop this patch and instead call validate_registers_access in btrace before
reading the PC.


> So it seems to me that btrace shouldn't even be trying to be enabled
> on running threads.

It works fine and I didn't see a reason why we would not want to allow it.

Thanks,
Markus.

> -----Original Message-----
> From: Pedro Alves [mailto:palves@redhat.com]
> Sent: Thursday, January 19, 2017 2:22 PM
> To: Metzger, Markus T <markus.t.metzger@intel.com>; gdb-
> patches@sourceware.org
> Cc: Daniel Jacobowitz <drow@false.org>
> Subject: Re: [PATCH 1/2] gdbserver: catch fetch registers error
> 
> On 12/06/2016 03:54 PM, Markus Metzger wrote:
> > When the PTRACE_PEEKUSER ptrace request to read registers fails, gdbserer
> throws
> > an error that is caught in captured_main, where it causes a E01 error packet to
> > be sent and gdbserer to quit (if --once was specified) or the event loop to be
> > re-started (otherwise).
> >
> > We may get such ptrace errors when trying to fetch registers for an exited or
> > running thread.  There are checks in GDB that check those conditions and throw
> > meaningful error messages before we could run into the above ptrace error,
> > e.g. thread.c:validate_registers_access.
> >
> > I ran into a new case and, rather than adding another call to
> > validate_registers_access in GDB, I propose to catch the error already when
> > handling the 'g' packet in gdbserver and reply with an error packet - assuming
> > that gdbserver's internal state is still intact.
> 
> So there are two changes here:
> 
> #1 - don't kill the debug session just because
>      of register reading failure.  I.e., handle it gracefully.
> 
> The case of trying to read registers from a thread that is
> running is always a client bug.  GDB should know which threads
> it resumed, and checks like the validate_registers_access
> you found should prevent such accesses reaching the server.
> I don't object to being defensive in gdbserver if that's
> not too invasive, but if we find such cases in GDB, we should
> fix them.
> 
> However, the case of trying to read from an exited thread can
> well happen, and there's nothing the client can do to prevent it.
> The only thing it can do it handle it gracefully.
> This is because threads can transition from stopped -> exited
> without a "running" state in between.  And this can happen if
> some _other_ thread of the inferior process is resumed and that
> other thread causes the whole process to exit (e.g., calls "exit()")
> just while gdb/gdbserver is accessing registers of the other
> supposedly-stopped thread.  It can also happen when some child
> thread other than the one we're trying to access registers
> from execs, and the thread the client is trying to access registers
> from is not the main thread (the exec makes the kernel kill all
> threads, including the ones that were stopped).
> 
> These sorts of corner cases are exercised by the
> process-dies-while-detaching.exp / process-dies-while-handling-bp.exp
> testcase and maybe others I don't recall their names immediately.
> 
> See also: https://sourceware.org/bugzilla/show_bug.cgi?id=18749
> 
> Note process-dies-while-handling-bp.exp is largely kfailed when
> running against remote, exactly because we don't handle this all
> to well currently, in both gdbserver, and in gdb common code.
> 
> #2 - sending a human readable error message back to gdb.
> 
> >
> > To not replace a meaningful error message with E01, I'm trying to generate a
> > useful error message when the error is detected and the exception is thrown.
> >
> > It would look like this ...
> >
> > gdb) PASS: gdb.btrace/enable-running.exp: continue to breakpoint: cont to 44
> > cont&
> > Continuing.
> > (gdb) PASS: gdb.btrace/enable-running.exp: cont&
> > record btrace
> > warning: Remote failure reply: E.Selected thread is running.
> > warning: Remote failure reply: E.Selected thread is running.
> >
> > ... although in this particular case, I'm going to suppress the warning.
> 
> So it seems to me that btrace shouldn't even be trying to be enabled
> on running threads.
> 
> >
> > To make this look a bit nicer, we could consider stripping the "E." or the
> > entire "Remote failure reply: E." when (re-)throwing the error inside GDB in
> > remote.c.
> 
> Agreed.
> 
> > @@ -5692,7 +5692,24 @@ fetch_register (const struct usrregs_info *usrregs,
> >  		(PTRACE_TYPE_ARG3) (uintptr_t) regaddr, (PTRACE_TYPE_ARG4)
> 0);
> >        regaddr += sizeof (PTRACE_XFER_TYPE);
> >        if (errno != 0)
> > -	error ("reading register %d: %s", regno, strerror (errno));
> > +	{
> > +	  /* ESRCH could mean that the thread is not traced, exited, or is not
> > +	     stopped.  */
> > +	  if (errno == ESRCH)
> > +	    {
> > +	      struct lwp_info *lwp = get_thread_lwp (current_thread);
> > +
> > +	      if (!lwp_is_stopped (lwp))
> > +		error (_("Selected thread is running."));
> > +
> > +	      if (lwp_is_marked_dead (lwp))
> > +		error (_("Selected thread has terminated."));
> 
> The order of the checks seems backwards -- if the lwp is dead,
> lwp_is_stopped is meaningless?  But, if either of these conditions
> is true, and the errors are reached, then it looks like
> an internal gdbserver error to me.
> 
> lwp_is_stopped, only checks the lwp->stopped flag.
> If !lwp->stopped, then this looks like a case of "we should not
> have reached here in the first place".
> 
> Similarly for lwp_is_marked_dead -- if the lwp is known dead,
> then why did we try to ptrace it?  Something higher up in the
> call chain should have errored out earlier, IMO.
> 
> If we want gdbserver to be defensive against GDB mistakes,
> we can look at the request as coming from server.c, to handle a
> g/G packet.  So in that case it's gdbserver's knowledge of
> gdb's perspective of whether the thread is running that counts
> here.  I.e., a 'g' packet that tries to read from a thread
> that gdb resumed before should error out even if the
> thread happens to be momentarily paused due to some internal
> gdbserver event handling.
> 
> That suggests to me that for the "running" case,
> server.c's g/G packet handling should be erroring out before
> calling into the backend, if:
> 
>   if (thread->last_resume_kind != resume_stop
>       || thread->last_status.kind == TARGET_WAITKIND_IGNORE)
> 
> See the linux-low.c:lwp_resumed function, which we could
> move to common code.
> 
> Now, even with that in place, we'd still need to handle
> the unpredictable "stopped -> exited" transitions.  And those
> can only be handled by checking for failure after the fact...
> But with an upfront "!running" check, then we can assume that
> ESRCH means the thread is gone, and thus unconditionally
> throw the
>   error (_("Selected thread has terminated."));
> error.
> 
> Note that what gdbserver thinks of "selected" thread has
> no 1-1 relation with what the user thinks as selected thread,
> so that may be a bit confusing.  E.g., the user may well have thread 2,
> selected, while that error triggers for some other thread because
> gdb decided to temporarily switch to some other thread, internally,
> e.g., to iterate over threads and read their registers.
> 
> 
> > +	    }
> > +
> > +	  /* Report a generic error if we could not determine the exact
> > +	     reason.  */
> > +	  error (_("Could not read register %d: %s."), regno, strerror (errno));
> > +	}
> >      }
> >
> >    if (the_low_target.supply_ptrace_register)
> > diff --git a/gdb/gdbserver/server.c b/gdb/gdbserver/server.c
> > index ef8dd03..3064b4f 100644
> > --- a/gdb/gdbserver/server.c
> > +++ b/gdb/gdbserver/server.c
> > @@ -4132,8 +4132,22 @@ process_serial_event (void)
> >  	    write_enn (own_buf);
> >  	  else
> >  	    {
> > -	      regcache = get_thread_regcache (current_thread, 1);
> > -	      registers_to_string (regcache, own_buf);
> > +	      TRY
> > +		{
> > +		  regcache = get_thread_regcache (current_thread, 1);
> > +		  registers_to_string (regcache, own_buf);
> > +		}
> > +	      CATCH (exception, RETURN_MASK_ALL)
> 
> Why RETURN_MASK_ALL ?
> 
> > +		{
> > +		  const char *message;
> > +
> > +		  message = exception.message;
> > +		  if (message == NULL)
> > +		    message = _("Reading registers failed.");
> > +
> > +		  sprintf (own_buf, "E.%s", message);
> > +		}
> > +	      END_CATCH
> >  	    }
> >  	}
> >        break;
> >
> 
> Thanks,
> Pedro Alves

Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Christian Lamprechter
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] gdbserver: catch fetch registers error
  2017-01-19 14:48     ` Metzger, Markus T
@ 2017-01-19 15:02       ` Pedro Alves
  0 siblings, 0 replies; 9+ messages in thread
From: Pedro Alves @ 2017-01-19 15:02 UTC (permalink / raw)
  To: Metzger, Markus T, gdb-patches; +Cc: Daniel Jacobowitz

On 01/19/2017 02:48 PM, Metzger, Markus T wrote:
> Hello Pedro,
> 
> Thanks for the review and the elaborate response.
> 
>> The case of trying to read registers from a thread that is
>> running is always a client bug.  GDB should know which threads
>> it resumed, and checks like the validate_registers_access
>> you found should prevent such accesses reaching the server.
> 
> That's a general misunderstanding of the design on my part, then.  I typically
> don't check if I can/should do something but instead just do it and handle errors
> in case it failed.  Like reading registers.
> 
> Let me drop this patch and instead call validate_registers_access in btrace before
> reading the PC.

We have similar checks in other places, like e.g., frame.c:has_stack_frames.
Might make sense to factor things out a bit to avoid a throw/catch.

>> So it seems to me that btrace shouldn't even be trying to be enabled
>> on running threads.
> 
> It works fine and I didn't see a reason why we would not want to allow it.

Sorry, that was bogus.  I agree -- what it shouldn't be doing
is reading registers.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-01-19 15:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-06 15:55 [PATCH 0/2] btrace, gdbserver: allow "record btrace" for running threads Markus Metzger
2016-12-06 15:55 ` [PATCH 2/2] btrace: allow recording to be started " Markus Metzger
2016-12-07 20:20   ` Luis Machado
2016-12-09 12:22     ` Metzger, Markus T
2017-01-19 15:15       ` Pedro Alves
2016-12-06 15:55 ` [PATCH 1/2] gdbserver: catch fetch registers error Markus Metzger
2017-01-19 13:22   ` Pedro Alves
2017-01-19 14:48     ` Metzger, Markus T
2017-01-19 15:02       ` Pedro Alves

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).