[PATCH v2 0/4] GDB fixes for the remote end having gone astray

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [PATCH v2 0/4] GDB fixes for the remote end having gone astray
@ 2019-11-06 20:51 Maciej W. Rozycki
  2019-11-06 20:51 ` [PATCH v2 1/4] Remove stale breakpoint step-over information Maciej W. Rozycki
                   ` (14 more replies)
  0 siblings, 15 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-11-06 20:51 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

Hi Pedro,

> So something like adding a thread_info parameter to
> clear_step_over_info, and then calling clear_step_over_info
> from clear_thread_inferior_resources.  clear_step_over_info
> would only clear the info if the thread matched, or if NULL
> is passed.  Would that work?

 Thank you for your review.

 As issues from software stepping having gone astray seem to have started 
piling up, and with your suggested modifications there is now a dependency 
between the changes, I have decided to make this patch submission a small 
series with 1/4 and 3/4 corresponding to original individual submissions 
and 2/4 and 4/4 being entrely new fixes.

 All these changes were regression-tested with the `x86_64-linux-gnu' 
native target and, now that RISC-V/Linux `gdbserver' I have been working 
on has become usable enough, also the `riscv64-linux-gnu' remote target 
driven by a `x86_64-linux-gnu' host.

 See individual change descriptions and any associated discussions for 
details.

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v2 1/4] Remove stale breakpoint step-over information
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
@ 2019-11-06 20:51 ` Maciej W. Rozycki
  2020-01-21  5:41   ` Simon Marchi
  2020-02-19 11:26   ` Luis Machado
  2019-11-06 20:52 ` [PATCH v2 2/4] Unregister the last inferior from the event loop Maciej W. Rozycki
                   ` (13 subsequent siblings)
  14 siblings, 2 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-11-06 20:51 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

Fix an issue with the `step_over_info' structure introduced with commit 
31e77af205cf ("PR breakpoints/7143 - Watchpoint does not trigger when 
first set"), where information stored in there is not invalidated in a 
remote debug scenario where software stepping in the all-stop mode is 
forcibly interrupted and the target connection then closed.  As a result 
a subsequent target connection triggers an assertion failure on 
`ecs->event_thread->control.trap_expected' and cannot be successfully 
established, making the particular instance of GDB unusable.

This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux 
`gdbserver' being developed and an attempt to step over the `ret' aka 
`c.jr ra' instruction where the value held in the `ra' aka `x1' register 
and therefore the address of a software-stepping breakpoint to insert is 
0, as follows:

(gdb) target remote 1.2.3.4:56789
Remote debugging using 1.2.3.4:56789
warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
warning: Could not load XML target description; ignoring
Reading symbols from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1...
0x0000001555556ef0 in _start ()
   from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1
(gdb) break main
Breakpoint 1 at 0x1643c
(gdb) continue
Continuing.
Cannot access memory at address 0x0
(gdb) x /i $pc
=> 0x15555607b8 <__GI__dl_debug_state>: ret
(gdb) print /x $ra
$1 = 0x0
(gdb) stepi
^C^CInterrupted while waiting for the program.
Give up waiting? (y or n) y
Quit
(gdb) kill
Kill the program being debugged? (y or n) y
[Inferior 1 (process 8964) killed]
(gdb) target remote 1.2.3.4:56789
Remote debugging using 1.2.3.4:56789
warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
warning: Could not load XML target description; ignoring
.../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) y

This is a bug, please report it.  For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.

.../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n) n
Command aborted.
(gdb) 

Correct the issue by making a call to clear global breakpoint step-over 
information from `clear_thread_inferior_resources'.  To do so add an 
argument to `clear_step_over_info' giving the global thread number to 
check against, complementing one given to `set_step_over_info' when the 
information concerned is recorded, so that information is not removed 
for a thread that has stayed alive in a multi-target scenario.

Adjust all the existing `clear_step_over_info' call sites accordingly 
where a step over has completed successfully and the thread that has 
hit the breakpoint is expected to be one for which the breakpoint has 
been inserted.

	gdb/
	* infrun.h (clear_step_over_info): New prototype.
	* infrun.c (thread_is_stepping_over_breakpoint): Move earlier 
	on.
	(clear_step_over_info): Use it.  Make external.
	(resume_1, finish_step_over, handle_signal_stop)
	(keep_going_stepped_thread, keep_going_pass_signal): Adjust 
	accordingly.
	* thread.c (clear_thread_inferior_resources): Call 
	`clear_step_over_info'.
---
Hi Pedro,

On Thu, 31 Oct 2019, Pedro Alves wrote:

> > Correct the issue by making a call to clear global breakpoint step-over 
> > information from `exit_inferior_1', which is where we already do all 
> > kinds of similar clean-ups, e.g. `delete_thread' called from there 
> > clears per-thread step-over information.
> 
> This looks like a fragile place to clear this, considering
> multiprocess.  I.e., what if we're calling exit_inferior for some
> inferior other than the one that was stepping.

 Good point.

> The step over information is associated with a thread.
> I think it'd be better to clear the step over information
> when the corresponding thread is deleted.
> 
> So something like adding a thread_info parameter to
> clear_step_over_info, and then calling clear_step_over_info
> from clear_thread_inferior_resources.  clear_step_over_info
> would only clear the info if the thread matched, or if NULL
> is passed.  Would that work?

 Thanks for your suggestion.  That indeed works except that I have figured 
out we actually always have a thread to match available when making a call 
to `clear_step_over_info', so I have not implemented a special NULL case, 
and I'm not convinced matching thread numbers ahead of the call is worth 
an assertion either.

 OK to apply?

  Maciej

Changes from v1:

- Add and check against thread number argument to `clear_step_over_info'.

- Call the function from `clear_thread_inferior_resources' rather than 
  `exit_inferior_1'.

- Use the thread number argument for `clear_step_over_info' throughout.
---
 gdb/infrun.c |   40 +++++++++++++++++++++-------------------
 gdb/infrun.h |    4 ++++
 gdb/thread.c |    3 +++
 3 files changed, 28 insertions(+), 19 deletions(-)

gdb-clear-step-over-info.diff
Index: binutils-gdb/gdb/infrun.c
===================================================================
--- binutils-gdb.orig/gdb/infrun.c
+++ binutils-gdb/gdb/infrun.c
@@ -1330,12 +1330,23 @@ set_step_over_info (const address_space
   step_over_info.thread = thread;
 }

-/* Called when we're not longer stepping over a breakpoint / an
-   instruction, so all breakpoints are free to be (re)inserted.  */
+/* See infrun.h.  */

-static void
-clear_step_over_info (void)
+int
+thread_is_stepping_over_breakpoint (int thread)
+{
+  return (step_over_info.thread != -1
+	  && thread == step_over_info.thread);
+}
+
+/* See infrun.h.  */
+
+void
+clear_step_over_info (int thread)
 {
+  if (!thread_is_stepping_over_breakpoint (thread))
+    return;
+
   if (debug_infrun)
     fprintf_unfiltered (gdb_stdlog,
 			"infrun: clear_step_over_info\n");
@@ -1360,15 +1371,6 @@ stepping_past_instruction_at (struct add
 /* See infrun.h.  */

 int
-thread_is_stepping_over_breakpoint (int thread)
-{
-  return (step_over_info.thread != -1
-	  && thread == step_over_info.thread);
-}
-
-/* See infrun.h.  */
-
-int
 stepping_past_nonsteppable_watchpoint (void)
 {
   return step_over_info.nonsteppable_watchpoint_p;
@@ -2339,7 +2341,7 @@ resume_1 (enum gdb_signal sig)
 				"infrun: resume: skipping permanent breakpoint, "
 				"deliver signal first\n");

-	  clear_step_over_info ();
+	  clear_step_over_info (tp->global_num);
 	  tp->control.trap_expected = 0;

 	  if (tp->control.step_resume_breakpoint == NULL)
@@ -2496,7 +2498,7 @@ resume_1 (enum gdb_signal sig)

       delete_single_step_breakpoints (tp);

-      clear_step_over_info ();
+      clear_step_over_info (tp->global_num);
       tp->control.trap_expected = 0;

       insert_breakpoints ();
@@ -5298,7 +5300,7 @@ finish_step_over (struct execution_contr
 	 back an event.  */
       gdb_assert (ecs->event_thread->control.trap_expected);

-      clear_step_over_info ();
+      clear_step_over_info (ecs->event_thread->global_num);
     }

   if (!target_is_non_stop_p ())
@@ -5913,7 +5915,7 @@ handle_signal_stop (struct execution_con
                                 "infrun: signal may take us out of "
                                 "single-step range\n");

-	  clear_step_over_info ();
+	  clear_step_over_info (ecs->event_thread->global_num);
 	  insert_hp_step_resume_breakpoint_at_frame (frame);
 	  ecs->event_thread->step_after_step_resume_breakpoint = 1;
 	  /* Reset trap_expected to ensure breakpoints are re-inserted.  */
@@ -7004,7 +7006,7 @@ keep_going_stepped_thread (struct thread
 	 breakpoint, otherwise if we were previously trying to step
 	 over this exact address in another thread, the breakpoint is
 	 skipped.  */
-      clear_step_over_info ();
+      clear_step_over_info (tp->global_num);
       tp->control.trap_expected = 0;

       insert_single_step_breakpoint (get_frame_arch (frame),
@@ -7547,7 +7549,7 @@ keep_going_pass_signal (struct execution
 	{
 	  exception_print (gdb_stderr, e);
 	  stop_waiting (ecs);
-	  clear_step_over_info ();
+	  clear_step_over_info (ecs->event_thread->global_num);
 	  return;
 	}

Index: binutils-gdb/gdb/infrun.h
===================================================================
--- binutils-gdb.orig/gdb/infrun.h
+++ binutils-gdb/gdb/infrun.h
@@ -120,6 +120,10 @@ extern void insert_step_resume_breakpoin
 						  struct symtab_and_line ,
 						  struct frame_id);

+/* Called when we're not longer stepping over a breakpoint / an
+   instruction, so all breakpoints are free to be (re)inserted.  */
+void clear_step_over_info (int thread);
+
 /* Returns true if we're trying to step past the instruction at
    ADDRESS in ASPACE.  */
 extern int stepping_past_instruction_at (struct address_space *aspace,
Index: binutils-gdb/gdb/thread.c
===================================================================
--- binutils-gdb.orig/gdb/thread.c
+++ binutils-gdb/gdb/thread.c
@@ -191,6 +191,9 @@ clear_thread_inferior_resources (struct

   delete_longjmp_breakpoint_at_next_stop (tp->global_num);

+  /* Remove any stale breakpoint step-over information.  */
+  clear_step_over_info (tp->global_num);
+
   bpstat_clear (&tp->control.stop_bpstat);

   btrace_teardown (tp);

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
  2019-11-06 20:51 ` [PATCH v2 1/4] Remove stale breakpoint step-over information Maciej W. Rozycki
  2019-11-06 20:52 ` [PATCH v2 2/4] Unregister the last inferior from the event loop Maciej W. Rozycki
@ 2019-11-06 20:52 ` Maciej W. Rozycki
  2020-01-21  8:29   ` Simon Marchi
  2020-02-19 13:30   ` Luis Machado
  2019-11-06 20:52 ` [PATCH v2 4/4] Unregister the inferior from the event loop " Maciej W. Rozycki
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-11-06 20:52 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

Complement commit 31e77af205cf ("PR breakpoints/7143 - Watchpoint does 
not trigger when first set") and commit 71d378ae60a4 
("gdb.base/breakpoint-in-ro-region.exp regression on sss targets (PR 
gdb/22583)") and also remove breakpoint step-over information if the 
inferior has failed to resume, which may be due to for example a failure 
to insert a software breakpoint at an inaccessible location.  If this 
happens, the state of GDB becomes inconsistent and further attempts to 
start execution hang.

This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux 
`gdbserver' being developed and an attempt to step over the `ret' aka 
`c.jr ra' instruction where the value held in the `ra' aka `x1' register 
and therefore the address of a software-stepping breakpoint to insert is 
0.  Here's a record of a remote debug session leading to the issue:

Sending packet: $vCont?#49...Packet received: vCont;c;C;t
Packet vCont (verbose-resume) is supported
Sending packet: $vCont;c:p23cb.-1#08...Packet received: T05swbreak:;02:20da080000000000;20:b807565515000000;thread:p23cb.23cb;core:3;
Sending packet: $qXfer:libraries-svr4:read::0,1000#20...Packet received: l<library-list-svr4 version="1.0" main-lm="0x155556f160"><library name=".../sysroot/lib/ld-linux-riscv64-lp64d.so.1" lm="0x155556ea18" l_addr="0x1555556000" l_ld="0x155556de90"/><library name="linux-vdso.so.1" lm="0x155556f6f0" l_addr="0x1555570000" l_ld="0x1555570350"/></library-list-svr4>
Sending packet: $X15555607b8,2:\202\200#2e...Packet received: OK
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $g#67...Packet received: 0000000000000000000000000000000020da08000000000000000000000000000000000000000000d0ae040000000000696e74000000000000000000000000000100000000000000020000000000000080d708000000000000000000000000000c0000000000000000000000000000000000000000000000000000000000000000e408000000000020e908000000000000000000000000000000000000000000d0ae04000000000072697363765f646f00000000000000000100000000000000030000000000000000e408000000000020f208000000000000000000000000000000000000000000d0ae04000000000072697363765f646f0000000000000000[552 bytes omitted]
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Cannot access memory at address 0x0
(gdb) stepi
Sending packet: $m15555607b8,4#09...Packet received: 82802a87
Sending packet: $m15555607b4,4#05...Packet received: 01452dbd
^Cremote_pass_ctrlc called
remote_interrupt called
^Cremote_pass_ctrlc called
Interrupted while waiting for the program.
Give up waiting? (y or n) y
Quit
(gdb) 

As observed here the `stepi' command does not even attempt to reinsert 
the breakpoint, let alone resume execution.

Correct the issue by making a call to clear global breakpoint step-over 
from the exception catch clause in `resume'.

With the change applied further `stepi' commands correctly retry 
breakpoint insertion:

(gdb) stepi
Sending packet: $m15555607b8,4#09...Packet received: 82802a87
Sending packet: $m15555607b4,4#05...Packet received: 01452dbd
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Cannot access memory at address 0x0
(gdb)

and changing the value of the `pc' register so as not to point at a 
`ret' instruction allows execution to be actually resumed.

	gdb/
	* infrun.c (resume): Also call `clear_step_over_info' in the 
	`catch' clause.
---
Changes from v1:

- Add a thread number argument to `clear_step_over_info'.
---
 gdb/infrun.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

gdb-resume-clear-step-over-info.diff
Index: binutils-gdb/gdb/infrun.c
===================================================================
--- binutils-gdb.orig/gdb/infrun.c
+++ binutils-gdb/gdb/infrun.c
@@ -2620,7 +2620,12 @@ resume (gdb_signal sig)
 	 single-step breakpoints perturbing other threads, in case
 	 we're running in non-stop mode.  */
       if (inferior_ptid != null_ptid)
-	delete_single_step_breakpoints (inferior_thread ());
+	{
+	  thread_info *tp = inferior_thread ();
+
+	  delete_single_step_breakpoints (tp);
+	  clear_step_over_info (tp->global_num);
+	}
       throw;
     }
 }

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v2 4/4] Unregister the inferior from the event loop if failed to resume
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (2 preceding siblings ...)
  2019-11-06 20:52 ` [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume Maciej W. Rozycki
@ 2019-11-06 20:52 ` Maciej W. Rozycki
  2020-02-19 13:40   ` Luis Machado
  2019-11-18 12:38 ` [PING][PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-11-06 20:52 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

Fix an issue with GDB starting to effectively busy-loop (though still 
accepting and interpreting commands) using the full processing power 
available when a remote stub has gone closing the connection while GDB 
is waiting for user input once the inferior has failed to resume.

This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux
`gdbserver' being developed and an attempt to step over the `ret' aka
`c.jr ra' instruction where the value held in the `ra' aka `x1' register
and therefore the address of a software-stepping breakpoint to insert is
0.  Here's a record of a remote debug session leading to the issue:

Sending packet: $vCont?#49...Packet received: vCont;c;C;t
Packet vCont (verbose-resume) is supported
Sending packet: $vCont;c:p6857.-1#b8...infrun: infrun_async(1)
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = ignore
infrun: handle_inferior_event status->kind = ignore
infrun: prepare_to_wait
Packet received: T05swbreak:;02:f0f8ffff3f000000;20:b807565515000000;thread:p6857.6857;core:1;
infrun: target_wait (-1.0.0, status) =
infrun:   26711.26711.0 [Thread 26711.26711],
infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: handle_inferior_event status->kind = stopped, signal = GDB_SIGNAL_TRAP
infrun: stop_pc = 0x15555607b8
Sending packet: $qXfer:libraries-svr4:read::0,1000#20...Packet received: l<library-list-svr4 version="1.0" main-lm="0x155556f160"><library name=".../sysroot/lib/ld-linux-riscv64-lp64d.so.1" lm="0x155556ea18" l_addr="0x1555556000" l_ld="0x155556de90"/><library name="linux-vdso.so.1" lm="0x155556f6f0" l_addr="0x1555570000" l_ld="0x1555570350"/></library-list-svr4>
infrun: BPSTAT_WHAT_SINGLE
infrun: thread [Thread 26711.26711] still needs step-over
infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
Sending packet: $X15555607b8,2:\202\200#2e...Packet received: OK
infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 26711.26711] at 0x15555607b8
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $g#67...Packet received: 00000000000000000000000000000000f0f8ffff3f000000d0c1b9aa2a00000020f76d55150000000f00000000000000ec6e555515000000ffffff6f00000000f0faffff3f00000090f0565515000000000000000000000052e57464000000000500000000000000887801000000000028f1565515000000010000000000000008f8ffff3f000000722f6c696236342f60f156551500000000000000000000000000000000000000000000000000000008f75655150000000100000000000000f5feffefffffffff80de56551500000050f9ffff3f00000068f9ffff3f000000ae585655150000000b00000000000000fffdff6f00000000fcffffffffffffff[552 bytes omitted]
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
infrun: clear_step_over_info
Cannot access memory at address 0x0
(gdb) 

At this point a command was issued to kill `gdbserver' and GDB responded 
immediately by entering the event loop:

infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = no-resumed
infrun: handle_inferior_event status->kind = no-resumed
infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg)
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = no-resumed
infrun: handle_inferior_event status->kind = no-resumed
infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg)
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = no-resumed
infrun: handle_inferior_event status->kind = no-resumed
infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg)
infrun: prepare_to_wait
[...]

This is because `remote_target::resume' enables the asynchronous event
loop, as indicated by the first `infrun: infrun_async(1)' record above, 
and then the EOF condition on the remote connection is delivered as an 
event, but there are no resumed children to be waited for and no other 
event waiting that would stop the loop.

Correct that then by disabling the asynchronous event as execution 
completion would, where applicable, i.e. in the all-stop mode.  This 
makes the final sequence of the session look like:

Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
Sending packet: $m0,40#2d...Packet received: E01
Sending packet: $m0,1#fa...Packet received: E01
infrun: infrun_async(0)
infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
infrun: clear_step_over_info
Cannot access memory at address 0x0
(gdb)

and GDB does not trigger the loop at this point if `gdbserver' goes away 
as the asynchronous event loop has already been disabled.

	gdb/
	* infrun.c (resume): In the `catch' clause also unregister the 
	inferior from the event loop.
---
Hi,

 It might be that the non-stop mode requires additional clean-up here, 
however I have no way to work with that at the moment, so I'll be leaving 
any investigation to someone who has and will be inclined to look into it.

  Maciej

New change in v2.
---
 gdb/infrun.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

gdb-resume-async.diff
Index: binutils-gdb/gdb/infrun.c
===================================================================
--- binutils-gdb.orig/gdb/infrun.c
+++ binutils-gdb/gdb/infrun.c
@@ -2614,11 +2614,17 @@ resume (gdb_signal sig)
     }
   catch (const gdb_exception &ex)
     {
-      /* If resuming is being aborted for any reason, delete any
-	 single-step breakpoint resume_1 may have created, to avoid
-	 confusing the following resumption, and to avoid leaving
-	 single-step breakpoints perturbing other threads, in case
-	 we're running in non-stop mode.  */
+      /* If resuming is being aborted for any reason, and we are in
+	 the all-stop mode, then unregister the inferior from the event
+	 loop.  This is done so that when the inferior is not running
+	 we don't get distracted by spurious inferior output.  */
+      if (!non_stop && target_has_execution && target_can_async_p ())
+	target_async (0);
+
+      /* Also delete any single-step breakpoint resume_1 may have
+	 created, to avoid confusing the following resumption, and to
+	 avoid leaving single-step breakpoints perturbing other threads,
+	 in case we're running in non-stop mode.  */
       if (inferior_ptid != null_ptid)
 	{
 	  thread_info *tp = inferior_thread ();

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v2 2/4] Unregister the last inferior from the event loop
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
  2019-11-06 20:51 ` [PATCH v2 1/4] Remove stale breakpoint step-over information Maciej W. Rozycki
@ 2019-11-06 20:52 ` Maciej W. Rozycki
  2020-01-21  5:47   ` Simon Marchi
  2019-11-06 20:52 ` [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume Maciej W. Rozycki
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-11-06 20:52 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

Fix an issue with GDB starting to effectively busy-loop (though still 
accepting and interpreting commands) using the full processing power 
available when a remote connection is forcibly terminated while the 
target is running.

This was observed with RISC-V/Linux `gdbserver' modified to hang after a 
successful acceptance of a `vCont' packet, as follows:

(gdb) set debug infrun 2
(gdb) set debug remote 2
(gdb) continue
Continuing.
infrun: clear_proceed_status_thread (Thread 22854.22854)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
Sending packet: $Z0,10430,2#0c...Packet received:
Packet Z0 (software-breakpoint) is NOT supported
Sending packet: $m10430,2#c3...Packet received: 8147
Sending packet: $X10430,0:#e6...Packet received: OK
binary downloading supported by target
Sending packet: $X10430,2:\002\220#7a...Packet received: OK
Sending packet: $m15555607b8,2#07...Packet received: 8280
Sending packet: $X15555607b8,2:\002\220#be...Packet received: OK
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 22854.22854] at 0x1555556ef0
Sending packet: $QPassSignals:e;10;14;17;1a;1b;1c;21;24;25;2c;4c;97;#0a...Packet received: OK
Sending packet: $vCont?#49...Packet received: vCont;c;C;t
Packet vCont (verbose-resume) is supported
Sending packet: $vCont;c:p5946.-1#b6...infrun: infrun_async(1)
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = ignore
infrun: handle_inferior_event status->kind = ignore
infrun: prepare_to_wait
^Cremote_pass_ctrlc called
remote_interrupt called
^Cremote_pass_ctrlc called
infrun: infrun_async(0)
The target is not responding to interrupt requests.
Stop debugging it? (y or n) y
infrun: infrun_async(1)
Disconnected from target.
(gdb) infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = ignore
infrun: handle_inferior_event status->kind = ignore
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = ignore
infrun: handle_inferior_event status->kind = ignore
infrun: prepare_to_wait
infrun: target_wait (-1.0.0, status) =
infrun:   -1.0.0 [process -1],
infrun:   status->kind = ignore
infrun: handle_inferior_event status->kind = ignore
infrun: prepare_to_wait
[...]

This is because `remote_target::resume' enables the asynchronous event 
loop, as indicated by the first `infrun: infrun_async(1)' record above, 
and then the confirmation dialogue temporarily disables it and then 
reenables, as indicated by the second `infrun: infrun_async(1)' record 
above.  The problem with that approach is that the reenabling also marks 
the handler for the `infrun_async_inferior_event_token' event ready, 
even though it was not before the temporary disabling, by calling 
`mark_async_event_handler' on it, and that triggers the infinite loop as 
there's no inferior anymore and consequently no event waiting that would 
stop it.

Unregister the `infrun_async_inferior_event_token' event from the loop 
then when the last inferior has gone.  The final part of the session now 
looks like:

^Cremote_pass_ctrlc called
remote_interrupt called
^Cremote_pass_ctrlc called
infrun: infrun_async(0)
The target is not responding to interrupt requests.
Stop debugging it? (y or n) y
infrun: infrun_async(1)
infrun: infrun_async(0)
Disconnected from target.
(gdb) 

and the looping does not happen anymore.

	gdb/
	* target.c (target_stack::unpush): Unregister the last inferior 
	from the event loop.
---
New change in v2.
---
 gdb/target.c |    6 ++++++
 1 file changed, 6 insertions(+)

gdb-unpush-target-async.diff
Index: binutils-gdb/gdb/target.c
===================================================================
--- binutils-gdb.orig/gdb/target.c
+++ binutils-gdb/gdb/target.c
@@ -634,6 +634,12 @@ target_stack::unpush (target_ops *t)
      implementation don't end up in T anymore.  */
   target_close (t);

+  /* If this was the last inferior, then make sure it's been unregistered
+     from the event loop so that we don't get distracted by spurious
+     inferior output.  */
+  if (!have_live_inferiors ())
+    infrun_async (0);
+
   return true;
 }

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (3 preceding siblings ...)
  2019-11-06 20:52 ` [PATCH v2 4/4] Unregister the inferior from the event loop " Maciej W. Rozycki
@ 2019-11-18 12:38 ` Maciej W. Rozycki
  2019-11-26 15:49 ` [PING^2][PATCH " Maciej W. Rozycki
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-11-18 12:38 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^2][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (4 preceding siblings ...)
  2019-11-18 12:38 ` [PING][PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
@ 2019-11-26 15:49 ` Maciej W. Rozycki
  2019-12-02 14:50 ` [PING^3][PATCH " Maciej W. Rozycki
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-11-26 15:49 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^3][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (5 preceding siblings ...)
  2019-11-26 15:49 ` [PING^2][PATCH " Maciej W. Rozycki
@ 2019-12-02 14:50 ` Maciej W. Rozycki
  2019-12-05 20:59   ` Palmer Dabbelt via gdb-patches
  2019-12-09 21:29 ` [PING^4][PATCH " Maciej W. Rozycki
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-12-02 14:50 UTC (permalink / raw)
  To: Pedro Alves, gdb-patches; +Cc: Jim Wilson

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PING^3][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-12-02 14:50 ` [PING^3][PATCH " Maciej W. Rozycki
@ 2019-12-05 20:59   ` Palmer Dabbelt via gdb-patches
  2019-12-05 21:21     ` Maciej W. Rozycki
  0 siblings, 1 reply; 32+ messages in thread
From: Palmer Dabbelt via gdb-patches @ 2019-12-05 20:59 UTC (permalink / raw)
  To: macro; +Cc: palves, gdb-patches, Jim Wilson

On Mon, 02 Dec 2019 06:50:11 PST (-0800), macro@wdc.com wrote:
> On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:
>
>>  As issues from software stepping having gone astray seem to have started
>> piling up, and with your suggested modifications there is now a dependency
>> between the changes, I have decided to make this patch submission a small
>> series with 1/4 and 3/4 corresponding to original individual submissions
>> and 2/4 and 4/4 being entrely new fixes.
>
>  Ping for:
>
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

Sorry, but these are way over my head.  I've had the patches sitting in my
inbox for a while now in the hope that I could review them, but I'm afraid I'm
going to have to drop them -- I certainly have no opposition to the patches, I
just don't know enough to review them.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PING^3][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-12-05 20:59   ` Palmer Dabbelt via gdb-patches
@ 2019-12-05 21:21     ` Maciej W. Rozycki
  0 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-12-05 21:21 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: Pedro Alves, gdb-patches, Jim Wilson

Hi Palmer,

> >  Ping for:
> >
> > <https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
> > <https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
> > <https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
> > <https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>
> 
> Sorry, but these are way over my head.  I've had the patches sitting in my
> inbox for a while now in the hope that I could review them, but I'm afraid I'm
> going to have to drop them -- I certainly have no opposition to the patches, I
> just don't know enough to review them.

 Thanks for your input; these however are generic changes so I wouldn't 
expect you to chime in and I've been waiting for a general maintainer to 
pick them.  I guess Pedro, who already commented on an earlier version of 
one of them, has been too busy with other stuff recently.  No worries, 
I'll keep pinging.

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^4][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (6 preceding siblings ...)
  2019-12-02 14:50 ` [PING^3][PATCH " Maciej W. Rozycki
@ 2019-12-09 21:29 ` Maciej W. Rozycki
  2019-12-17  0:06 ` [PING^5][PATCH " Maciej W. Rozycki
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-12-09 21:29 UTC (permalink / raw)
  To: gdb-patches

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^5][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (7 preceding siblings ...)
  2019-12-09 21:29 ` [PING^4][PATCH " Maciej W. Rozycki
@ 2019-12-17  0:06 ` Maciej W. Rozycki
  2020-01-06 15:40 ` [PING^6][PATCH " Maciej W. Rozycki
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2019-12-17  0:06 UTC (permalink / raw)
  To: gdb-patches

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^6][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (8 preceding siblings ...)
  2019-12-17  0:06 ` [PING^5][PATCH " Maciej W. Rozycki
@ 2020-01-06 15:40 ` Maciej W. Rozycki
  2020-01-13 20:46 ` [PING^7][PATCH " Maciej W. Rozycki
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-06 15:40 UTC (permalink / raw)
  To: gdb-patches

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^7][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (9 preceding siblings ...)
  2020-01-06 15:40 ` [PING^6][PATCH " Maciej W. Rozycki
@ 2020-01-13 20:46 ` Maciej W. Rozycki
  2020-01-21  4:21 ` [PING^8][PATCH " Maciej W. Rozycki
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-13 20:46 UTC (permalink / raw)
  To: gdb-patches

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^8][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (10 preceding siblings ...)
  2020-01-13 20:46 ` [PING^7][PATCH " Maciej W. Rozycki
@ 2020-01-21  4:21 ` Maciej W. Rozycki
  2020-01-31 14:23 ` [PING^9][PATCH " Maciej W. Rozycki
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-21  4:21 UTC (permalink / raw)
  To: gdb-patches

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping for:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 1/4] Remove stale breakpoint step-over information
  2019-11-06 20:51 ` [PATCH v2 1/4] Remove stale breakpoint step-over information Maciej W. Rozycki
@ 2020-01-21  5:41   ` Simon Marchi
  2020-02-19 11:26   ` Luis Machado
  1 sibling, 0 replies; 32+ messages in thread
From: Simon Marchi @ 2020-01-21  5:41 UTC (permalink / raw)
  To: Maciej W. Rozycki, Pedro Alves, gdb-patches; +Cc: Jim Wilson

On 2019-11-06 3:51 p.m., Maciej W. Rozycki wrote:
> Fix an issue with the `step_over_info' structure introduced with commit 
> 31e77af205cf ("PR breakpoints/7143 - Watchpoint does not trigger when 
> first set"), where information stored in there is not invalidated in a 
> remote debug scenario where software stepping in the all-stop mode is 
> forcibly interrupted and the target connection then closed.  As a result 
> a subsequent target connection triggers an assertion failure on 
> `ecs->event_thread->control.trap_expected' and cannot be successfully 
> established, making the particular instance of GDB unusable.
> 
> This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux 
> `gdbserver' being developed and an attempt to step over the `ret' aka 
> `c.jr ra' instruction where the value held in the `ra' aka `x1' register 
> and therefore the address of a software-stepping breakpoint to insert is 
> 0, as follows:
> 
> (gdb) target remote 1.2.3.4:56789
> Remote debugging using 1.2.3.4:56789
> warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
> warning: Could not load XML target description; ignoring
> Reading symbols from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1...
> 0x0000001555556ef0 in _start ()
>    from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1
> (gdb) break main
> Breakpoint 1 at 0x1643c
> (gdb) continue
> Continuing.
> Cannot access memory at address 0x0
> (gdb) x /i $pc
> => 0x15555607b8 <__GI__dl_debug_state>: ret
> (gdb) print /x $ra
> $1 = 0x0
> (gdb) stepi
> ^C^CInterrupted while waiting for the program.
> Give up waiting? (y or n) y
> Quit
> (gdb) kill
> Kill the program being debugged? (y or n) y
> [Inferior 1 (process 8964) killed]
> (gdb) target remote 1.2.3.4:56789
> Remote debugging using 1.2.3.4:56789
> warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
> warning: Could not load XML target description; ignoring
> .../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) y
> 
> This is a bug, please report it.  For instructions, see:
> <http://www.gnu.org/software/gdb/bugs/>.
> 
> .../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) n
> Command aborted.
> (gdb) 
> 
> Correct the issue by making a call to clear global breakpoint step-over 
> information from `clear_thread_inferior_resources'.  To do so add an 
> argument to `clear_step_over_info' giving the global thread number to 
> check against, complementing one given to `set_step_over_info' when the 
> information concerned is recorded, so that information is not removed 
> for a thread that has stayed alive in a multi-target scenario.
> 
> Adjust all the existing `clear_step_over_info' call sites accordingly 
> where a step over has completed successfully and the thread that has 
> hit the breakpoint is expected to be one for which the breakpoint has 
> been inserted.
> 
> 	gdb/
> 	* infrun.h (clear_step_over_info): New prototype.
> 	* infrun.c (thread_is_stepping_over_breakpoint): Move earlier 
> 	on.
> 	(clear_step_over_info): Use it.  Make external.
> 	(resume_1, finish_step_over, handle_signal_stop)
> 	(keep_going_stepped_thread, keep_going_pass_signal): Adjust 
> 	accordingly.
> 	* thread.c (clear_thread_inferior_resources): Call 
> 	`clear_step_over_info'.

This patch looks good to me, but I'd really prefer if Pedro could look at it,
since he's much more adept in this area.

Simon

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/4] Unregister the last inferior from the event loop
  2019-11-06 20:52 ` [PATCH v2 2/4] Unregister the last inferior from the event loop Maciej W. Rozycki
@ 2020-01-21  5:47   ` Simon Marchi
  2020-01-21 11:21     ` Maciej W. Rozycki
  0 siblings, 1 reply; 32+ messages in thread
From: Simon Marchi @ 2020-01-21  5:47 UTC (permalink / raw)
  To: Maciej W. Rozycki, Pedro Alves, gdb-patches; +Cc: Jim Wilson

On 2019-11-06 3:51 p.m., Maciej W. Rozycki wrote:
> Fix an issue with GDB starting to effectively busy-loop (though still 
> accepting and interpreting commands) using the full processing power 
> available when a remote connection is forcibly terminated while the 
> target is running.
> 
> This was observed with RISC-V/Linux `gdbserver' modified to hang after a 
> successful acceptance of a `vCont' packet, as follows:
> 
> (gdb) set debug infrun 2
> (gdb) set debug remote 2
> (gdb) continue
> Continuing.
> infrun: clear_proceed_status_thread (Thread 22854.22854)
> infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> Sending packet: $Z0,10430,2#0c...Packet received:
> Packet Z0 (software-breakpoint) is NOT supported
> Sending packet: $m10430,2#c3...Packet received: 8147
> Sending packet: $X10430,0:#e6...Packet received: OK
> binary downloading supported by target
> Sending packet: $X10430,2:\002\220#7a...Packet received: OK
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $X15555607b8,2:\002\220#be...Packet received: OK
> infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current thread [Thread 22854.22854] at 0x1555556ef0
> Sending packet: $QPassSignals:e;10;14;17;1a;1b;1c;21;24;25;2c;4c;97;#0a...Packet received: OK
> Sending packet: $vCont?#49...Packet received: vCont;c;C;t
> Packet vCont (verbose-resume) is supported
> Sending packet: $vCont;c:p5946.-1#b6...infrun: infrun_async(1)
> infrun: prepare_to_wait
> infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = ignore
> infrun: handle_inferior_event status->kind = ignore
> infrun: prepare_to_wait
> ^Cremote_pass_ctrlc called
> remote_interrupt called
> ^Cremote_pass_ctrlc called
> infrun: infrun_async(0)
> The target is not responding to interrupt requests.
> Stop debugging it? (y or n) y
> infrun: infrun_async(1)
> Disconnected from target.
> (gdb) infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = ignore
> infrun: handle_inferior_event status->kind = ignore
> infrun: prepare_to_wait
> infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = ignore
> infrun: handle_inferior_event status->kind = ignore
> infrun: prepare_to_wait
> infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = ignore
> infrun: handle_inferior_event status->kind = ignore
> infrun: prepare_to_wait
> [...]
> 
> This is because `remote_target::resume' enables the asynchronous event 
> loop, as indicated by the first `infrun: infrun_async(1)' record above, 
> and then the confirmation dialogue temporarily disables it and then 
> reenables, as indicated by the second `infrun: infrun_async(1)' record 
> above.  The problem with that approach is that the reenabling also marks 
> the handler for the `infrun_async_inferior_event_token' event ready, 
> even though it was not before the temporary disabling, by calling 
> `mark_async_event_handler' on it, and that triggers the infinite loop as 
> there's no inferior anymore and consequently no event waiting that would 
> stop it.

I don't understand this description.  Let's assume that the second call to
infrun_async indeed left infrun_async_inferior_event_token.ready to true.
Then I would expect the event loop to call it once, setting the ready flag
to false.  After that, the ready being false, I don't see why the callback
of infrun_async_inferior_event_token would get called in a loop.

Simon

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume
  2019-11-06 20:52 ` [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume Maciej W. Rozycki
@ 2020-01-21  8:29   ` Simon Marchi
  2020-02-19 13:30   ` Luis Machado
  1 sibling, 0 replies; 32+ messages in thread
From: Simon Marchi @ 2020-01-21  8:29 UTC (permalink / raw)
  To: Maciej W. Rozycki, Pedro Alves, gdb-patches; +Cc: Jim Wilson

On 2019-11-06 3:52 p.m., Maciej W. Rozycki wrote:
> Complement commit 31e77af205cf ("PR breakpoints/7143 - Watchpoint does 
> not trigger when first set") and commit 71d378ae60a4 
> ("gdb.base/breakpoint-in-ro-region.exp regression on sss targets (PR 
> gdb/22583)") and also remove breakpoint step-over information if the 
> inferior has failed to resume, which may be due to for example a failure 
> to insert a software breakpoint at an inaccessible location.  If this 
> happens, the state of GDB becomes inconsistent and further attempts to 
> start execution hang.
> 
> This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux 
> `gdbserver' being developed and an attempt to step over the `ret' aka 
> `c.jr ra' instruction where the value held in the `ra' aka `x1' register 
> and therefore the address of a software-stepping breakpoint to insert is 
> 0.  Here's a record of a remote debug session leading to the issue:
> 
> Sending packet: $vCont?#49...Packet received: vCont;c;C;t
> Packet vCont (verbose-resume) is supported
> Sending packet: $vCont;c:p23cb.-1#08...Packet received: T05swbreak:;02:20da080000000000;20:b807565515000000;thread:p23cb.23cb;core:3;
> Sending packet: $qXfer:libraries-svr4:read::0,1000#20...Packet received: l<library-list-svr4 version="1.0" main-lm="0x155556f160"><library name=".../sysroot/lib/ld-linux-riscv64-lp64d.so.1" lm="0x155556ea18" l_addr="0x1555556000" l_ld="0x155556de90"/><library name="linux-vdso.so.1" lm="0x155556f6f0" l_addr="0x1555570000" l_ld="0x1555570350"/></library-list-svr4>
> Sending packet: $X15555607b8,2:\202\200#2e...Packet received: OK
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $g#67...Packet received: 0000000000000000000000000000000020da08000000000000000000000000000000000000000000d0ae040000000000696e74000000000000000000000000000100000000000000020000000000000080d708000000000000000000000000000c0000000000000000000000000000000000000000000000000000000000000000e408000000000020e908000000000000000000000000000000000000000000d0ae04000000000072697363765f646f00000000000000000100000000000000030000000000000000e408000000000020f208000000000000000000000000000000000000000000d0ae04000000000072697363765f646f0000000000000000[552 bytes omitted]
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Cannot access memory at address 0x0
> (gdb) stepi
> Sending packet: $m15555607b8,4#09...Packet received: 82802a87
> Sending packet: $m15555607b4,4#05...Packet received: 01452dbd
> ^Cremote_pass_ctrlc called
> remote_interrupt called
> ^Cremote_pass_ctrlc called
> Interrupted while waiting for the program.
> Give up waiting? (y or n) y
> Quit
> (gdb) 
> 
> As observed here the `stepi' command does not even attempt to reinsert 
> the breakpoint, let alone resume execution.
> 
> Correct the issue by making a call to clear global breakpoint step-over 
> from the exception catch clause in `resume'.
> 
> With the change applied further `stepi' commands correctly retry 
> breakpoint insertion:
> 
> (gdb) stepi
> Sending packet: $m15555607b8,4#09...Packet received: 82802a87
> Sending packet: $m15555607b4,4#05...Packet received: 01452dbd
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Cannot access memory at address 0x0
> (gdb)
> 
> and changing the value of the `pc' register so as not to point at a 
> `ret' instruction allows execution to be actually resumed.
> 
> 	gdb/
> 	* infrun.c (resume): Also call `clear_step_over_info' in the 
> 	`catch' clause.

That looks ok to me, but again, I'd prefer if Pedro could take a look.

Simon

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/4] Unregister the last inferior from the event loop
  2020-01-21  5:47   ` Simon Marchi
@ 2020-01-21 11:21     ` Maciej W. Rozycki
  2020-01-21 17:34       ` Simon Marchi
  0 siblings, 1 reply; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-21 11:21 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Pedro Alves, gdb-patches, Jim Wilson

Hi Simon,

> > This is because `remote_target::resume' enables the asynchronous event 
> > loop, as indicated by the first `infrun: infrun_async(1)' record above, 
> > and then the confirmation dialogue temporarily disables it and then 
> > reenables, as indicated by the second `infrun: infrun_async(1)' record 
> > above.  The problem with that approach is that the reenabling also marks 
> > the handler for the `infrun_async_inferior_event_token' event ready, 
> > even though it was not before the temporary disabling, by calling 
> > `mark_async_event_handler' on it, and that triggers the infinite loop as 
> > there's no inferior anymore and consequently no event waiting that would 
> > stop it.
> 
> I don't understand this description.  Let's assume that the second call to
> infrun_async indeed left infrun_async_inferior_event_token.ready to true.
> Then I would expect the event loop to call it once, setting the ready flag
> to false.  After that, the ready being false, I don't see why the callback
> of infrun_async_inferior_event_token would get called in a loop.

 Thanks for looking into it.  It's been a while however, so my memory has 
become fuzzy on this.

 I have therefore gone reading through the code again and what I can see 
is that in the async mode the `ready' (readiness) flag is never cleared: 
see `infrun_async' and `remote_target::async', which are the only places 
to call `clear_async_event_handler' and then only when the async mode is 
being disabled.

 On the other hand waiting for an inferior event does get disabled in 
`handle_inferior_event' regardless of the readiness flag, by calling 
`stop_waiting', for certain events, but not for TARGET_WAITKIND_IGNORE.  
Instead for that event `prepare_to_wait' is called, which makes sense to 
me because such an event does not indicate whether waiting should or 
should not be disabled, and with an asynchronous target you can normally 
(i.e. if not indicated by a specific event received otherwise, e.g. 
TARGET_WAITKIND_EXITED) expect a further event to be received anytime.

 Does this clarify the problematic scenario to you?

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/4] Unregister the last inferior from the event loop
  2020-01-21 11:21     ` Maciej W. Rozycki
@ 2020-01-21 17:34       ` Simon Marchi
  2020-01-22 17:35         ` Maciej W. Rozycki
  0 siblings, 1 reply; 32+ messages in thread
From: Simon Marchi @ 2020-01-21 17:34 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Pedro Alves, gdb-patches, Jim Wilson

On 2020-01-21 3:29 a.m., Maciej W. Rozycki wrote:
> Hi Simon,
> 
>>> This is because `remote_target::resume' enables the asynchronous event 
>>> loop, as indicated by the first `infrun: infrun_async(1)' record above, 
>>> and then the confirmation dialogue temporarily disables it and then 
>>> reenables, as indicated by the second `infrun: infrun_async(1)' record 
>>> above.  The problem with that approach is that the reenabling also marks 
>>> the handler for the `infrun_async_inferior_event_token' event ready, 
>>> even though it was not before the temporary disabling, by calling 
>>> `mark_async_event_handler' on it, and that triggers the infinite loop as 
>>> there's no inferior anymore and consequently no event waiting that would 
>>> stop it.
>>
>> I don't understand this description.  Let's assume that the second call to
>> infrun_async indeed left infrun_async_inferior_event_token.ready to true.
>> Then I would expect the event loop to call it once, setting the ready flag
>> to false.  After that, the ready being false, I don't see why the callback
>> of infrun_async_inferior_event_token would get called in a loop.
> 
>  Thanks for looking into it.  It's been a while however, so my memory has 
> become fuzzy on this.
> 
>  I have therefore gone reading through the code again and what I can see 
> is that in the async mode the `ready' (readiness) flag is never cleared: 
> see `infrun_async' and `remote_target::async', which are the only places 
> to call `clear_async_event_handler' and then only when the async mode is 
> being disabled.

The ready flag is also cleared in `check_async_event_handlers`, just before
invoking the callback.  So I was thinking, why would the callback get called
repeatedly if the ready flag is cleared just before it gets called, and
there's nothing setting it back to true.  The answer is probably that the
busy loop is within that callback, as seen below?

>  On the other hand waiting for an inferior event does get disabled in 
> `handle_inferior_event' regardless of the readiness flag, by calling 
> `stop_waiting', for certain events, but not for TARGET_WAITKIND_IGNORE.  
> Instead for that event `prepare_to_wait' is called, which makes sense to 
> me because such an event does not indicate whether waiting should or 
> should not be disabled, and with an asynchronous target you can normally 
> (i.e. if not indicated by a specific event received otherwise, e.g. 
> TARGET_WAITKIND_EXITED) expect a further event to be received anytime.
> 
>  Does this clarify the problematic scenario to you?

Ok, so if I understand, the infinite loop is this one, inside wait_for_inferior?

  while (1)
    {
      ...

      /* Now figure out what to do with the result of the result.  */
      handle_inferior_event (ecs);

      if (!ecs->wait_some_more)
	break;
    }

After the remote target has been unpushed, the remaining target is probably
just the "exec file" target, which does not provide a ::wait implementation,
and therefore inherits default_target_wait:

ptid_t
default_target_wait (struct target_ops *ops,
		     ptid_t ptid, struct target_waitstatus *status,
		     int options)
{
  status->kind = TARGET_WAITKIND_IGNORE;
  return minus_one_ptid;
}


And because that returns TARGET_WAITKIND_IGNORE, which results in
ecs->wait_some_more getting set by handle_inferior_event/prepare_to_wait,
it results in the infinite loop in wait_for_inferior.

Does that look accurate?

Simon

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/4] Unregister the last inferior from the event loop
  2020-01-21 17:34       ` Simon Marchi
@ 2020-01-22 17:35         ` Maciej W. Rozycki
  2020-01-23  1:19           ` Maciej W. Rozycki
  0 siblings, 1 reply; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-22 17:35 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Pedro Alves, gdb-patches, Jim Wilson

On Tue, 21 Jan 2020, Simon Marchi wrote:

> >  I have therefore gone reading through the code again and what I can see 
> > is that in the async mode the `ready' (readiness) flag is never cleared: 
> > see `infrun_async' and `remote_target::async', which are the only places 
> > to call `clear_async_event_handler' and then only when the async mode is 
> > being disabled.
> 
> The ready flag is also cleared in `check_async_event_handlers`, just before
> invoking the callback.  So I was thinking, why would the callback get called
> repeatedly if the ready flag is cleared just before it gets called, and
> there's nothing setting it back to true.  The answer is probably that the
> busy loop is within that callback, as seen below?

 Indeed it was cleared in `check_async_event_handlers' and then set again 
via `mark_infrun_async_event_handler' in `prepare_to_wait' called from 
`handle_inferior_event'.

> >  On the other hand waiting for an inferior event does get disabled in 
> > `handle_inferior_event' regardless of the readiness flag, by calling 
> > `stop_waiting', for certain events, but not for TARGET_WAITKIND_IGNORE.  
> > Instead for that event `prepare_to_wait' is called, which makes sense to 
> > me because such an event does not indicate whether waiting should or 
> > should not be disabled, and with an asynchronous target you can normally 
> > (i.e. if not indicated by a specific event received otherwise, e.g. 
> > TARGET_WAITKIND_EXITED) expect a further event to be received anytime.
> > 
> >  Does this clarify the problematic scenario to you?
> 
> Ok, so if I understand, the infinite loop is this one, inside 
> wait_for_inferior?
> 
>   while (1)
>     {
>       ...
> 
>       /* Now figure out what to do with the result of the result.  */
>       handle_inferior_event (ecs);
> 
>       if (!ecs->wait_some_more)
> 	break;
>     }

 Nope, it was in `start_event_loop':

  while (1)
    {
      ...
	  result = gdb_do_one_event ();
      ...
      if (result < 0)
	break;
    }

calling `check_async_event_handlers', 
`infrun_async_inferior_event_handler', `inferior_event_handler' and 
ultimately `fetch_inferior_event':

    /* Now figure out what to do with the result of the result.  */
    handle_inferior_event (ecs);

    if (!ecs->wait_some_more)
      ...

> After the remote target has been unpushed, the remaining target is probably
> just the "exec file" target, which does not provide a ::wait implementation,
> and therefore inherits default_target_wait:
> 
> ptid_t
> default_target_wait (struct target_ops *ops,
> 		     ptid_t ptid, struct target_waitstatus *status,
> 		     int options)
> {
>   status->kind = TARGET_WAITKIND_IGNORE;
>   return minus_one_ptid;
> }

 And it was `dummy_target::wait' indeed invoking `default_target_wait'.

> And because that returns TARGET_WAITKIND_IGNORE, which results in
> ecs->wait_some_more getting set by handle_inferior_event/prepare_to_wait,
> it results in the infinite loop in wait_for_inferior.

 Right.

> Does that look accurate?

 Almost, except that code in question has since been heavily refactored 
and the call to `target_wait' does not happen anymore.  Now I need to 
bisect the tree, find the offending commit and figure out whether it has 
actually provided an alternative fix for the issue I have observed or just 
papered it over by chance somehow.

 Thanks for your input, I wish it happened earlier, before the code has 
been rearranged, sigh...

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/4] Unregister the last inferior from the event loop
  2020-01-22 17:35         ` Maciej W. Rozycki
@ 2020-01-23  1:19           ` Maciej W. Rozycki
  2020-01-23  5:39             ` Maciej W. Rozycki
  0 siblings, 1 reply; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-23  1:19 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Pedro Alves, gdb-patches, Jim Wilson

On Wed, 22 Jan 2020, Maciej W. Rozycki wrote:

> > And because that returns TARGET_WAITKIND_IGNORE, which results in
> > ecs->wait_some_more getting set by handle_inferior_event/prepare_to_wait,
> > it results in the infinite loop in wait_for_inferior.
> 
>  Right.
> 
> > Does that look accurate?
> 
>  Almost, except that code in question has since been heavily refactored 
> and the call to `target_wait' does not happen anymore.  Now I need to 
> bisect the tree, find the offending commit and figure out whether it has 
> actually provided an alternative fix for the issue I have observed or just 
> papered it over by chance somehow.

 OK, I have tracked it down now.  The semantics has changed with commit 
5b6d1e4fa4fc ("Multi-target support"), and in particular this change:

@@ -3719,17 +3910,28 @@ fetch_inferior_event (void *client_data)
       = make_scoped_restore (&execution_direction,
 			     target_execution_direction ());

-    ecs->ptid = do_target_wait (waiton_ptid, &ecs->ws,
-				target_can_async_p () ? TARGET_WNOHANG : 0);
+    if (!do_target_wait (minus_one_ptid, ecs, TARGET_WNOHANG))
+      return;
+
+    gdb_assert (ecs->ws.kind != TARGET_WAITKIND_IGNORE);
+
[...]

combined with the fact that the new `do_target_wait' wrapper (original 
`do_target_wait' has been renamed to `do_target_wait_1') retuns false if 
(ecs->ws.kind == TARGET_WAITKIND_IGNORE) means that the looping no longer 
happens and my problematic scenario has been deliberately taken care of, 
as a side effect if not as an intended one.

 This change (2/4) can therefore be dropped as no longer required.  Again, 
thank you, Simon, for your assistance!

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/4] Unregister the last inferior from the event loop
  2020-01-23  1:19           ` Maciej W. Rozycki
@ 2020-01-23  5:39             ` Maciej W. Rozycki
  2020-01-23 16:59               ` Simon Marchi
  0 siblings, 1 reply; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-23  5:39 UTC (permalink / raw)
  To: Simon Marchi; +Cc: Pedro Alves, gdb-patches, Jim Wilson

On Thu, 23 Jan 2020, Maciej W. Rozycki wrote:

>  This change (2/4) can therefore be dropped as no longer required.  Again, 
> thank you, Simon, for your assistance!

 On second thoughts and having experimented a bit I take my conclusion 
back.

 I think we do want to include my proposed change so as not to have the 
value of 1 linger in `infrun_is_async' after an abrupt termination of the 
target connection.  This is because it is not the case if a remote stub 
behaves in an orderly manner, where `infrun_is_async' is reset to 0 every 
time execution stops in the debuggee (in the scenario concerned, that is).

 Therefore I think we want the state of GDB to be consistently the same 
regardless of whether the remote connection has been ended with `detach', 
`disconnect', `kill' or abruptly, even though we currently do not know 
(anymore) of any observable difference in behaviour resulting from this 
discrepancy.

 OK to go ahead with this change then?

 NB the current session log looks like:

(gdb) continue
Continuing.
infrun: clear_proceed_status_thread (Thread 2061.2061)
infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
Sending packet: $Z0,1555560ad8,2#7c...Packet received:
Packet Z0 (software-breakpoint) is NOT supported
Sending packet: $m1555560ad8,2#33...Packet received: 8280
Sending packet: $X1555560ad8,0:#56...Packet received: OK
binary downloading supported by target
Sending packet: $X1555560ad8,2:\002\220#ea...Packet received: OK
infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current 
thread [Thread 2061.2061] at 0x1555556ef0
Sending packet: $QPassSignals:e;10;14;17;1a;1b;1c;21;24;25;2c;4c;97;#0a...Packet received: OK
Sending packet: $vCont?#49...Packet received: vCont;c;C;t
Packet vCont (verbose-resume) is supported
Sending packet: $vCont;c:p80d.-1#aa...infrun: infrun_async(1)
infrun: prepare_to_wait
^Cremote_pass_ctrlc called
remote_interrupt called
^Cremote_pass_ctrlc called
infrun: infrun_async(0)
The target is not responding to interrupt requests.
Stop debugging it? (y or n) y
infrun: infrun_async(1)
Disconnected from target.
(gdb) 

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 2/4] Unregister the last inferior from the event loop
  2020-01-23  5:39             ` Maciej W. Rozycki
@ 2020-01-23 16:59               ` Simon Marchi
  0 siblings, 0 replies; 32+ messages in thread
From: Simon Marchi @ 2020-01-23 16:59 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: Pedro Alves, gdb-patches, Jim Wilson

On 2020-01-22 9:33 p.m., Maciej W. Rozycki wrote:
> On Thu, 23 Jan 2020, Maciej W. Rozycki wrote:
> 
>>  This change (2/4) can therefore be dropped as no longer required.  Again, 
>> thank you, Simon, for your assistance!
> 
>  On second thoughts and having experimented a bit I take my conclusion 
> back.
> 
>  I think we do want to include my proposed change so as not to have the 
> value of 1 linger in `infrun_is_async' after an abrupt termination of the 
> target connection.  This is because it is not the case if a remote stub 
> behaves in an orderly manner, where `infrun_is_async' is reset to 0 every 
> time execution stops in the debuggee (in the scenario concerned, that is).
> 
>  Therefore I think we want the state of GDB to be consistently the same 
> regardless of whether the remote connection has been ended with `detach', 
> `disconnect', `kill' or abruptly, even though we currently do not know 
> (anymore) of any observable difference in behaviour resulting from this 
> discrepancy.
> 
>  OK to go ahead with this change then?
> 
>  NB the current session log looks like:
> 
> (gdb) continue
> Continuing.
> infrun: clear_proceed_status_thread (Thread 2061.2061)
> infrun: proceed (addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT)
> Sending packet: $Z0,1555560ad8,2#7c...Packet received:
> Packet Z0 (software-breakpoint) is NOT supported
> Sending packet: $m1555560ad8,2#33...Packet received: 8280
> Sending packet: $X1555560ad8,0:#56...Packet received: OK
> binary downloading supported by target
> Sending packet: $X1555560ad8,2:\002\220#ea...Packet received: OK
> infrun: resume (step=0, signal=GDB_SIGNAL_0), trap_expected=0, current 
> thread [Thread 2061.2061] at 0x1555556ef0
> Sending packet: $QPassSignals:e;10;14;17;1a;1b;1c;21;24;25;2c;4c;97;#0a...Packet received: OK
> Sending packet: $vCont?#49...Packet received: vCont;c;C;t
> Packet vCont (verbose-resume) is supported
> Sending packet: $vCont;c:p80d.-1#aa...infrun: infrun_async(1)
> infrun: prepare_to_wait
> ^Cremote_pass_ctrlc called
> remote_interrupt called
> ^Cremote_pass_ctrlc called
> infrun: infrun_async(0)
> The target is not responding to interrupt requests.
> Stop debugging it? (y or n) y
> infrun: infrun_async(1)
> Disconnected from target.
> (gdb) 
> 
>   Maciej
> 

I think it would make sense.  I was wondering about how your change would
behave with multiple targets, but it only triggers when there are no more
live inferiors across all targets, so I think it would be fine.

But in all honesty, I don't really understand what the infrun_is_async
variable is for.  Clearly, it makes it so this:

      if (enable)
	mark_async_event_handler (infrun_async_inferior_event_token);
      else
	clear_async_event_handler (infrun_async_inferior_event_token);

is only called when there is a transition (async becomes enabled or
becomes disabled), but I don't see why that is particularly important.

Simon

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^9][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (11 preceding siblings ...)
  2020-01-21  4:21 ` [PING^8][PATCH " Maciej W. Rozycki
@ 2020-01-31 14:23 ` Maciej W. Rozycki
  2020-02-10  9:01 ` [PING^10][PATCH " Maciej W. Rozycki
  2020-02-17 14:07 ` [PATCH " Maciej W. Rozycki
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-01-31 14:23 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Pedro Alves

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Where are we with this series:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

?  Simon was kind enough (thank you!) to partially review 1-3/4, but I 
need a final word on these, and also input on 4/4.  I will appreciate 
assistance with moving this forward.

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PING^10][PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (12 preceding siblings ...)
  2020-01-31 14:23 ` [PING^9][PATCH " Maciej W. Rozycki
@ 2020-02-10  9:01 ` Maciej W. Rozycki
  2020-02-17 14:07 ` [PATCH " Maciej W. Rozycki
  14 siblings, 0 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-02-10  9:01 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Pedro Alves

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping, where are we with this series:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

?  Simon was kind enough (thank you!) to partially review 1-3/4, but I
need a final word on these, and also input on 4/4.  I will appreciate
assistance with moving this forward.

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
                   ` (13 preceding siblings ...)
  2020-02-10  9:01 ` [PING^10][PATCH " Maciej W. Rozycki
@ 2020-02-17 14:07 ` Maciej W. Rozycki
  2020-02-18 10:38   ` Luis Machado
  2020-02-19 21:11   ` Tom Tromey
  14 siblings, 2 replies; 32+ messages in thread
From: Maciej W. Rozycki @ 2020-02-17 14:07 UTC (permalink / raw)
  To: gdb-patches; +Cc: Simon Marchi, Pedro Alves

On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:

>  As issues from software stepping having gone astray seem to have started 
> piling up, and with your suggested modifications there is now a dependency 
> between the changes, I have decided to make this patch submission a small 
> series with 1/4 and 3/4 corresponding to original individual submissions 
> and 2/4 and 4/4 being entrely new fixes.

 Ping, where are we with this series:

<https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
<https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

?  Simon was kind enough (thank you!) to partially review 1-3/4, but I
need a final word on these, and also input on 4/4.

 NB this is my final ping as I need to head to do something else and limit 
distractions.  Patches have been posted, feel free to apply anytime.

  Maciej

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2020-02-17 14:07 ` [PATCH " Maciej W. Rozycki
@ 2020-02-18 10:38   ` Luis Machado
  2020-02-19 21:11   ` Tom Tromey
  1 sibling, 0 replies; 32+ messages in thread
From: Luis Machado @ 2020-02-18 10:38 UTC (permalink / raw)
  To: Maciej W. Rozycki, gdb-patches; +Cc: Simon Marchi, Pedro Alves

On 2/17/20 11:07 AM, Maciej W. Rozycki wrote:
> On Wed, 6 Nov 2019, Maciej W. Rozycki wrote:
> 
>>   As issues from software stepping having gone astray seem to have started
>> piling up, and with your suggested modifications there is now a dependency
>> between the changes, I have decided to make this patch submission a small
>> series with 1/4 and 3/4 corresponding to original individual submissions
>> and 2/4 and 4/4 being entrely new fixes.
> 
>   Ping, where are we with this series:
> 
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
> <https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>
> 
> ?  Simon was kind enough (thank you!) to partially review 1-3/4, but I
> need a final word on these, and also input on 4/4.
> 
>   NB this is my final ping as I need to head to do something else and limit
> distractions.  Patches have been posted, feel free to apply anytime.
> 
>    Maciej
> 

I gave this a look and, unfortunately, i don't think i know enough of 
the async* functions purpose to make sure things are correct for all the 
combinations of all-stop/non-stop and async.

I'll give this another look to see if anything comes out next time.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 1/4] Remove stale breakpoint step-over information
  2019-11-06 20:51 ` [PATCH v2 1/4] Remove stale breakpoint step-over information Maciej W. Rozycki
  2020-01-21  5:41   ` Simon Marchi
@ 2020-02-19 11:26   ` Luis Machado
  1 sibling, 0 replies; 32+ messages in thread
From: Luis Machado @ 2020-02-19 11:26 UTC (permalink / raw)
  To: Maciej W. Rozycki, Pedro Alves, gdb-patches; +Cc: Jim Wilson

On 11/6/19 5:51 PM, Maciej W. Rozycki wrote:
> Fix an issue with the `step_over_info' structure introduced with commit
> 31e77af205cf ("PR breakpoints/7143 - Watchpoint does not trigger when
> first set"), where information stored in there is not invalidated in a
> remote debug scenario where software stepping in the all-stop mode is
> forcibly interrupted and the target connection then closed.  As a result
> a subsequent target connection triggers an assertion failure on
> `ecs->event_thread->control.trap_expected' and cannot be successfully
> established, making the particular instance of GDB unusable.
> 
> This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux
> `gdbserver' being developed and an attempt to step over the `ret' aka
> `c.jr ra' instruction where the value held in the `ra' aka `x1' register
> and therefore the address of a software-stepping breakpoint to insert is
> 0, as follows:
> 
> (gdb) target remote 1.2.3.4:56789
> Remote debugging using 1.2.3.4:56789
> warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
> warning: Could not load XML target description; ignoring
> Reading symbols from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1...
> 0x0000001555556ef0 in _start ()
>     from .../sysroot/lib/ld-linux-riscv64-lp64d.so.1
> (gdb) break main
> Breakpoint 1 at 0x1643c
> (gdb) continue
> Continuing.
> Cannot access memory at address 0x0
> (gdb) x /i $pc
> => 0x15555607b8 <__GI__dl_debug_state>: ret
> (gdb) print /x $ra
> $1 = 0x0
> (gdb) stepi
> ^C^CInterrupted while waiting for the program.
> Give up waiting? (y or n) y
> Quit
> (gdb) kill
> Kill the program being debugged? (y or n) y
> [Inferior 1 (process 8964) killed]
> (gdb) target remote 1.2.3.4:56789
> Remote debugging using 1.2.3.4:56789
> warning: while parsing target description (at line 4): Target description specified unknown architecture "riscv:rv64id"
> warning: Could not load XML target description; ignoring
> .../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) y
> 
> This is a bug, please report it.  For instructions, see:
> <http://www.gnu.org/software/gdb/bugs/>.
> 
> .../gdb/infrun.c:5299: internal-error: int finish_step_over(execution_control_state*): Assertion `ecs->event_thread->control.trap_expected' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Create a core file of GDB? (y or n) n
> Command aborted.
> (gdb)
> 
> Correct the issue by making a call to clear global breakpoint step-over
> information from `clear_thread_inferior_resources'.  To do so add an
> argument to `clear_step_over_info' giving the global thread number to
> check against, complementing one given to `set_step_over_info' when the
> information concerned is recorded, so that information is not removed
> for a thread that has stayed alive in a multi-target scenario.
> 
> Adjust all the existing `clear_step_over_info' call sites accordingly
> where a step over has completed successfully and the thread that has
> hit the breakpoint is expected to be one for which the breakpoint has
> been inserted.
> 
> 	gdb/
> 	* infrun.h (clear_step_over_info): New prototype.
> 	* infrun.c (thread_is_stepping_over_breakpoint): Move earlier
> 	on.
> 	(clear_step_over_info): Use it.  Make external.
> 	(resume_1, finish_step_over, handle_signal_stop)
> 	(keep_going_stepped_thread, keep_going_pass_signal): Adjust
> 	accordingly.
> 	* thread.c (clear_thread_inferior_resources): Call
> 	`clear_step_over_info'.
> ---
> Hi Pedro,
> 
> On Thu, 31 Oct 2019, Pedro Alves wrote:
> 
>>> Correct the issue by making a call to clear global breakpoint step-over
>>> information from `exit_inferior_1', which is where we already do all
>>> kinds of similar clean-ups, e.g. `delete_thread' called from there
>>> clears per-thread step-over information.
>>
>> This looks like a fragile place to clear this, considering
>> multiprocess.  I.e., what if we're calling exit_inferior for some
>> inferior other than the one that was stepping.
> 
>   Good point.
> 
>> The step over information is associated with a thread.
>> I think it'd be better to clear the step over information
>> when the corresponding thread is deleted.
>>
>> So something like adding a thread_info parameter to
>> clear_step_over_info, and then calling clear_step_over_info
>> from clear_thread_inferior_resources.  clear_step_over_info
>> would only clear the info if the thread matched, or if NULL
>> is passed.  Would that work?
> 
>   Thanks for your suggestion.  That indeed works except that I have figured
> out we actually always have a thread to match available when making a call
> to `clear_step_over_info', so I have not implemented a special NULL case,
> and I'm not convinced matching thread numbers ahead of the call is worth
> an assertion either.
> 
>   OK to apply?
> 
>    Maciej
> 
> Changes from v1:
> 
> - Add and check against thread number argument to `clear_step_over_info'.
> 
> - Call the function from `clear_thread_inferior_resources' rather than
>    `exit_inferior_1'.
> 
> - Use the thread number argument for `clear_step_over_info' throughout.
> ---
>   gdb/infrun.c |   40 +++++++++++++++++++++-------------------
>   gdb/infrun.h |    4 ++++
>   gdb/thread.c |    3 +++
>   3 files changed, 28 insertions(+), 19 deletions(-)
> 
> gdb-clear-step-over-info.diff
> Index: binutils-gdb/gdb/infrun.c
> ===================================================================
> --- binutils-gdb.orig/gdb/infrun.c
> +++ binutils-gdb/gdb/infrun.c
> @@ -1330,12 +1330,23 @@ set_step_over_info (const address_space
>     step_over_info.thread = thread;
>   }
>   
> -/* Called when we're not longer stepping over a breakpoint / an
> -   instruction, so all breakpoints are free to be (re)inserted.  */
> +/* See infrun.h.  */
>   
> -static void
> -clear_step_over_info (void)
> +int
> +thread_is_stepping_over_breakpoint (int thread)
> +{
> +  return (step_over_info.thread != -1
> +	  && thread == step_over_info.thread);
> +}
> +
> +/* See infrun.h.  */
> +
> +void
> +clear_step_over_info (int thread)
>   {
> +  if (!thread_is_stepping_over_breakpoint (thread))
> +    return;
> +

Since the thread number is only used to check if a particular thread is 
stepping over a breakpoint, wouldn't it be better to move the check to 
thread.c as opposed to embedding it in clear_step_over_info and having 
to change all of its callers?

>     if (debug_infrun)
>       fprintf_unfiltered (gdb_stdlog,
>   			"infrun: clear_step_over_info\n");
> @@ -1360,15 +1371,6 @@ stepping_past_instruction_at (struct add
>   /* See infrun.h.  */
>   
>   int
> -thread_is_stepping_over_breakpoint (int thread)
> -{
> -  return (step_over_info.thread != -1
> -	  && thread == step_over_info.thread);
> -}
> -
> -/* See infrun.h.  */
> -
> -int
>   stepping_past_nonsteppable_watchpoint (void)
>   {
>     return step_over_info.nonsteppable_watchpoint_p;
> @@ -2339,7 +2341,7 @@ resume_1 (enum gdb_signal sig)
>   				"infrun: resume: skipping permanent breakpoint, "
>   				"deliver signal first\n");
>   
> -	  clear_step_over_info ();
> +	  clear_step_over_info (tp->global_num);
>   	  tp->control.trap_expected = 0;
>   
>   	  if (tp->control.step_resume_breakpoint == NULL)
> @@ -2496,7 +2498,7 @@ resume_1 (enum gdb_signal sig)
>   
>         delete_single_step_breakpoints (tp);
>   
> -      clear_step_over_info ();
> +      clear_step_over_info (tp->global_num);
>         tp->control.trap_expected = 0;
>   
>         insert_breakpoints ();
> @@ -5298,7 +5300,7 @@ finish_step_over (struct execution_contr
>   	 back an event.  */
>         gdb_assert (ecs->event_thread->control.trap_expected);
>   
> -      clear_step_over_info ();
> +      clear_step_over_info (ecs->event_thread->global_num);
>       }
>   
>     if (!target_is_non_stop_p ())
> @@ -5913,7 +5915,7 @@ handle_signal_stop (struct execution_con
>                                   "infrun: signal may take us out of "
>                                   "single-step range\n");
>   
> -	  clear_step_over_info ();
> +	  clear_step_over_info (ecs->event_thread->global_num);
>   	  insert_hp_step_resume_breakpoint_at_frame (frame);
>   	  ecs->event_thread->step_after_step_resume_breakpoint = 1;
>   	  /* Reset trap_expected to ensure breakpoints are re-inserted.  */
> @@ -7004,7 +7006,7 @@ keep_going_stepped_thread (struct thread
>   	 breakpoint, otherwise if we were previously trying to step
>   	 over this exact address in another thread, the breakpoint is
>   	 skipped.  */
> -      clear_step_over_info ();
> +      clear_step_over_info (tp->global_num);
>         tp->control.trap_expected = 0;
>   
>         insert_single_step_breakpoint (get_frame_arch (frame),
> @@ -7547,7 +7549,7 @@ keep_going_pass_signal (struct execution
>   	{
>   	  exception_print (gdb_stderr, e);
>   	  stop_waiting (ecs);
> -	  clear_step_over_info ();
> +	  clear_step_over_info (ecs->event_thread->global_num);
>   	  return;
>   	}
>   
> Index: binutils-gdb/gdb/infrun.h
> ===================================================================
> --- binutils-gdb.orig/gdb/infrun.h
> +++ binutils-gdb/gdb/infrun.h
> @@ -120,6 +120,10 @@ extern void insert_step_resume_breakpoin
>   						  struct symtab_and_line ,
>   						  struct frame_id);
>   
> +/* Called when we're not longer stepping over a breakpoint / an
> +   instruction, so all breakpoints are free to be (re)inserted.  */
> +void clear_step_over_info (int thread);
> +
>   /* Returns true if we're trying to step past the instruction at
>      ADDRESS in ASPACE.  */
>   extern int stepping_past_instruction_at (struct address_space *aspace,
> Index: binutils-gdb/gdb/thread.c
> ===================================================================
> --- binutils-gdb.orig/gdb/thread.c
> +++ binutils-gdb/gdb/thread.c
> @@ -191,6 +191,9 @@ clear_thread_inferior_resources (struct
>   
>     delete_longjmp_breakpoint_at_next_stop (tp->global_num);
>   
> +  /* Remove any stale breakpoint step-over information.  */
> +  clear_step_over_info (tp->global_num);
> +

So something like...

/* If this thread has a pending step-over request, clear it now.  */
if (thread_is_stepping_over_breakpoint (tp->global_num))
   clear_step_over_info ();

>     bpstat_clear (&tp->control.stop_bpstat);
>   
>     btrace_teardown (tp);
> 

Otherwise it looks OK to me, assuming the testsuite has executed 
properly against the thread tests.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume
  2019-11-06 20:52 ` [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume Maciej W. Rozycki
  2020-01-21  8:29   ` Simon Marchi
@ 2020-02-19 13:30   ` Luis Machado
  1 sibling, 0 replies; 32+ messages in thread
From: Luis Machado @ 2020-02-19 13:30 UTC (permalink / raw)
  To: Maciej W. Rozycki, Pedro Alves, gdb-patches; +Cc: Jim Wilson

On 11/6/19 5:52 PM, Maciej W. Rozycki wrote:
> Complement commit 31e77af205cf ("PR breakpoints/7143 - Watchpoint does
> not trigger when first set") and commit 71d378ae60a4
> ("gdb.base/breakpoint-in-ro-region.exp regression on sss targets (PR
> gdb/22583)") and also remove breakpoint step-over information if the
> inferior has failed to resume, which may be due to for example a failure
> to insert a software breakpoint at an inaccessible location.  If this
> happens, the state of GDB becomes inconsistent and further attempts to
> start execution hang.
> 
> This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux
> `gdbserver' being developed and an attempt to step over the `ret' aka
> `c.jr ra' instruction where the value held in the `ra' aka `x1' register
> and therefore the address of a software-stepping breakpoint to insert is
> 0.  Here's a record of a remote debug session leading to the issue:
> 
> Sending packet: $vCont?#49...Packet received: vCont;c;C;t
> Packet vCont (verbose-resume) is supported
> Sending packet: $vCont;c:p23cb.-1#08...Packet received: T05swbreak:;02:20da080000000000;20:b807565515000000;thread:p23cb.23cb;core:3;
> Sending packet: $qXfer:libraries-svr4:read::0,1000#20...Packet received: l<library-list-svr4 version="1.0" main-lm="0x155556f160"><library name=".../sysroot/lib/ld-linux-riscv64-lp64d.so.1" lm="0x155556ea18" l_addr="0x1555556000" l_ld="0x155556de90"/><library name="linux-vdso.so.1" lm="0x155556f6f0" l_addr="0x1555570000" l_ld="0x1555570350"/></library-list-svr4>
> Sending packet: $X15555607b8,2:\202\200#2e...Packet received: OK
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $g#67...Packet received: 0000000000000000000000000000000020da08000000000000000000000000000000000000000000d0ae040000000000696e74000000000000000000000000000100000000000000020000000000000080d708000000000000000000000000000c0000000000000000000000000000000000000000000000000000000000000000e408000000000020e908000000000000000000000000000000000000000000d0ae04000000000072697363765f646f00000000000000000100000000000000030000000000000000e408000000000020f208000000000000000000000000000000000000000000d0ae04000000000072697363765f646f0000000000000000[552 bytes omitted]
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Cannot access memory at address 0x0
> (gdb) stepi
> Sending packet: $m15555607b8,4#09...Packet received: 82802a87
> Sending packet: $m15555607b4,4#05...Packet received: 01452dbd
> ^Cremote_pass_ctrlc called
> remote_interrupt called
> ^Cremote_pass_ctrlc called
> Interrupted while waiting for the program.
> Give up waiting? (y or n) y
> Quit
> (gdb)
> 
> As observed here the `stepi' command does not even attempt to reinsert
> the breakpoint, let alone resume execution.
> 
> Correct the issue by making a call to clear global breakpoint step-over
> from the exception catch clause in `resume'.
> 
> With the change applied further `stepi' commands correctly retry
> breakpoint insertion:
> 
> (gdb) stepi
> Sending packet: $m15555607b8,4#09...Packet received: 82802a87
> Sending packet: $m15555607b4,4#05...Packet received: 01452dbd
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Cannot access memory at address 0x0
> (gdb)
> 
> and changing the value of the `pc' register so as not to point at a
> `ret' instruction allows execution to be actually resumed.
> 
> 	gdb/
> 	* infrun.c (resume): Also call `clear_step_over_info' in the
> 	`catch' clause.
> ---
> Changes from v1:
> 
> - Add a thread number argument to `clear_step_over_info'.
> ---
>   gdb/infrun.c |    7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> gdb-resume-clear-step-over-info.diff
> Index: binutils-gdb/gdb/infrun.c
> ===================================================================
> --- binutils-gdb.orig/gdb/infrun.c
> +++ binutils-gdb/gdb/infrun.c
> @@ -2620,7 +2620,12 @@ resume (gdb_signal sig)
>   	 single-step breakpoints perturbing other threads, in case
>   	 we're running in non-stop mode.  */
>         if (inferior_ptid != null_ptid)
> -	delete_single_step_breakpoints (inferior_thread ());
> +	{
> +	  thread_info *tp = inferior_thread ();
> +
> +	  delete_single_step_breakpoints (tp);
> +	  clear_step_over_info (tp->global_num);
> +	}
>         throw;
>       }
>   }
> 

I'd suggest the same approach of checking if the thread has step-over 
information and then clearing the state without the need to add a global 
thread number parameter to clear_step_over_info.

Otherwise this looks reasonable to me.

It would've been nice to have more robust rollback logic to handle such 
cases of failure.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 4/4] Unregister the inferior from the event loop if failed to resume
  2019-11-06 20:52 ` [PATCH v2 4/4] Unregister the inferior from the event loop " Maciej W. Rozycki
@ 2020-02-19 13:40   ` Luis Machado
  0 siblings, 0 replies; 32+ messages in thread
From: Luis Machado @ 2020-02-19 13:40 UTC (permalink / raw)
  To: Maciej W. Rozycki, Pedro Alves, gdb-patches; +Cc: Jim Wilson

On 11/6/19 5:52 PM, Maciej W. Rozycki wrote:
> Fix an issue with GDB starting to effectively busy-loop (though still
> accepting and interpreting commands) using the full processing power
> available when a remote stub has gone closing the connection while GDB
> is waiting for user input once the inferior has failed to resume.
> 
> This was observed with a `riscv64-linux-gnu' cross-GDB and RISC-V/Linux
> `gdbserver' being developed and an attempt to step over the `ret' aka
> `c.jr ra' instruction where the value held in the `ra' aka `x1' register
> and therefore the address of a software-stepping breakpoint to insert is
> 0.  Here's a record of a remote debug session leading to the issue:
> 
> Sending packet: $vCont?#49...Packet received: vCont;c;C;t
> Packet vCont (verbose-resume) is supported
> Sending packet: $vCont;c:p6857.-1#b8...infrun: infrun_async(1)
> infrun: prepare_to_wait
> infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = ignore
> infrun: handle_inferior_event status->kind = ignore
> infrun: prepare_to_wait
> Packet received: T05swbreak:;02:f0f8ffff3f000000;20:b807565515000000;thread:p6857.6857;core:1;
> infrun: target_wait (-1.0.0, status) =
> infrun:   26711.26711.0 [Thread 26711.26711],
> infrun:   status->kind = stopped, signal = GDB_SIGNAL_TRAP
> infrun: handle_inferior_event status->kind = stopped, signal = GDB_SIGNAL_TRAP
> infrun: stop_pc = 0x15555607b8
> Sending packet: $qXfer:libraries-svr4:read::0,1000#20...Packet received: l<library-list-svr4 version="1.0" main-lm="0x155556f160"><library name=".../sysroot/lib/ld-linux-riscv64-lp64d.so.1" lm="0x155556ea18" l_addr="0x1555556000" l_ld="0x155556de90"/><library name="linux-vdso.so.1" lm="0x155556f6f0" l_addr="0x1555570000" l_ld="0x1555570350"/></library-list-svr4>
> infrun: BPSTAT_WHAT_SINGLE
> infrun: thread [Thread 26711.26711] still needs step-over
> infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
> Sending packet: $X15555607b8,2:\202\200#2e...Packet received: OK
> infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
> infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
> infrun: resume (step=1, signal=GDB_SIGNAL_0), trap_expected=1, current thread [Thread 26711.26711] at 0x15555607b8
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $m15555607b8,2#07...Packet received: 8280
> Sending packet: $g#67...Packet received: 00000000000000000000000000000000f0f8ffff3f000000d0c1b9aa2a00000020f76d55150000000f00000000000000ec6e555515000000ffffff6f00000000f0faffff3f00000090f0565515000000000000000000000052e57464000000000500000000000000887801000000000028f1565515000000010000000000000008f8ffff3f000000722f6c696236342f60f156551500000000000000000000000000000000000000000000000000000008f75655150000000100000000000000f5feffefffffffff80de56551500000050f9ffff3f00000068f9ffff3f000000ae585655150000000b00000000000000fffdff6f00000000fcffffffffffffff[552 bytes omitted]
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
> infrun: clear_step_over_info
> Cannot access memory at address 0x0
> (gdb)
> 
> At this point a command was issued to kill `gdbserver' and GDB responded
> immediately by entering the event loop:
> 
> infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = no-resumed
> infrun: handle_inferior_event status->kind = no-resumed
> infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg)
> infrun: prepare_to_wait
> infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = no-resumed
> infrun: handle_inferior_event status->kind = no-resumed
> infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg)
> infrun: prepare_to_wait
> infrun: target_wait (-1.0.0, status) =
> infrun:   -1.0.0 [process -1],
> infrun:   status->kind = no-resumed
> infrun: handle_inferior_event status->kind = no-resumed
> infrun: TARGET_WAITKIND_NO_RESUMED (ignoring: bg)
> infrun: prepare_to_wait
> [...]
> 
> This is because `remote_target::resume' enables the asynchronous event
> loop, as indicated by the first `infrun: infrun_async(1)' record above,
> and then the EOF condition on the remote connection is delivered as an
> event, but there are no resumed children to be waited for and no other
> event waiting that would stop the loop.
> 
> Correct that then by disabling the asynchronous event as execution
> completion would, where applicable, i.e. in the all-stop mode.  This
> makes the final sequence of the session look like:
> 
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> Sending packet: $m0,40#2d...Packet received: E01
> Sending packet: $m0,1#fa...Packet received: E01
> infrun: infrun_async(0)
> infrun: skipping breakpoint: stepping past insn at: 0x15555607b8
> infrun: clear_step_over_info
> Cannot access memory at address 0x0
> (gdb)
> 
> and GDB does not trigger the loop at this point if `gdbserver' goes away
> as the asynchronous event loop has already been disabled.
> 
> 	gdb/
> 	* infrun.c (resume): In the `catch' clause also unregister the
> 	inferior from the event loop.
> ---
> Hi,
> 
>   It might be that the non-stop mode requires additional clean-up here,
> however I have no way to work with that at the moment, so I'll be leaving
> any investigation to someone who has and will be inclined to look into it.
> 
>    Maciej
> 
> New change in v2.
> ---
>   gdb/infrun.c |   16 +++++++++++-----
>   1 file changed, 11 insertions(+), 5 deletions(-)
> 
> gdb-resume-async.diff
> Index: binutils-gdb/gdb/infrun.c
> ===================================================================
> --- binutils-gdb.orig/gdb/infrun.c
> +++ binutils-gdb/gdb/infrun.c
> @@ -2614,11 +2614,17 @@ resume (gdb_signal sig)
>       }
>     catch (const gdb_exception &ex)
>       {
> -      /* If resuming is being aborted for any reason, delete any
> -	 single-step breakpoint resume_1 may have created, to avoid
> -	 confusing the following resumption, and to avoid leaving
> -	 single-step breakpoints perturbing other threads, in case
> -	 we're running in non-stop mode.  */
> +      /* If resuming is being aborted for any reason, and we are in
> +	 the all-stop mode, then unregister the inferior from the event
> +	 loop.  This is done so that when the inferior is not running
> +	 we don't get distracted by spurious inferior output.  */
> +      if (!non_stop && target_has_execution && target_can_async_p ())
> +	target_async (0);
> +
> +      /* Also delete any single-step breakpoint resume_1 may have
> +	 created, to avoid confusing the following resumption, and to
> +	 avoid leaving single-step breakpoints perturbing other threads,
> +	 in case we're running in non-stop mode.  */
>         if (inferior_ptid != null_ptid)
>   	{
>   	  thread_info *tp = inferior_thread ();
> 

This looks reasonable, though the call to target_async (0) is always a 
bit cryptic to me. Makes me wonder if we're missing such invocations in 
other places to make GDB more recoverable on failures.

Does this fix also address 02/04? Since 02/04 is also a failure during 
resume? Not sure if it goes through the catch block.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v2 0/4] GDB fixes for the remote end having gone astray
  2020-02-17 14:07 ` [PATCH " Maciej W. Rozycki
  2020-02-18 10:38   ` Luis Machado
@ 2020-02-19 21:11   ` Tom Tromey
  1 sibling, 0 replies; 32+ messages in thread
From: Tom Tromey @ 2020-02-19 21:11 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: gdb-patches, Simon Marchi, Pedro Alves

>>>>> "Maciej" == Maciej W Rozycki <macro@wdc.com> writes:

Maciej>  Ping, where are we with this series:
Maciej> <https://sourceware.org/ml/gdb-patches/2019-11/msg00191.html>
Maciej> <https://sourceware.org/ml/gdb-patches/2019-11/msg00192.html>
Maciej> <https://sourceware.org/ml/gdb-patches/2019-11/msg00193.html>
Maciej> <https://sourceware.org/ml/gdb-patches/2019-11/msg00194.html>

Pedro, could you possibly review these?  They touch on infrun and, IIUC,
on some recent changes related to multi-target.

Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2020-02-19 21:11 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-06 20:51 [PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
2019-11-06 20:51 ` [PATCH v2 1/4] Remove stale breakpoint step-over information Maciej W. Rozycki
2020-01-21  5:41   ` Simon Marchi
2020-02-19 11:26   ` Luis Machado
2019-11-06 20:52 ` [PATCH v2 2/4] Unregister the last inferior from the event loop Maciej W. Rozycki
2020-01-21  5:47   ` Simon Marchi
2020-01-21 11:21     ` Maciej W. Rozycki
2020-01-21 17:34       ` Simon Marchi
2020-01-22 17:35         ` Maciej W. Rozycki
2020-01-23  1:19           ` Maciej W. Rozycki
2020-01-23  5:39             ` Maciej W. Rozycki
2020-01-23 16:59               ` Simon Marchi
2019-11-06 20:52 ` [PATCH v2 3/4] Remove breakpoint step-over information if failed to resume Maciej W. Rozycki
2020-01-21  8:29   ` Simon Marchi
2020-02-19 13:30   ` Luis Machado
2019-11-06 20:52 ` [PATCH v2 4/4] Unregister the inferior from the event loop " Maciej W. Rozycki
2020-02-19 13:40   ` Luis Machado
2019-11-18 12:38 ` [PING][PATCH v2 0/4] GDB fixes for the remote end having gone astray Maciej W. Rozycki
2019-11-26 15:49 ` [PING^2][PATCH " Maciej W. Rozycki
2019-12-02 14:50 ` [PING^3][PATCH " Maciej W. Rozycki
2019-12-05 20:59   ` Palmer Dabbelt via gdb-patches
2019-12-05 21:21     ` Maciej W. Rozycki
2019-12-09 21:29 ` [PING^4][PATCH " Maciej W. Rozycki
2019-12-17  0:06 ` [PING^5][PATCH " Maciej W. Rozycki
2020-01-06 15:40 ` [PING^6][PATCH " Maciej W. Rozycki
2020-01-13 20:46 ` [PING^7][PATCH " Maciej W. Rozycki
2020-01-21  4:21 ` [PING^8][PATCH " Maciej W. Rozycki
2020-01-31 14:23 ` [PING^9][PATCH " Maciej W. Rozycki
2020-02-10  9:01 ` [PING^10][PATCH " Maciej W. Rozycki
2020-02-17 14:07 ` [PATCH " Maciej W. Rozycki
2020-02-18 10:38   ` Luis Machado
2020-02-19 21:11   ` Tom Tromey

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).