public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp
@ 2021-10-26 23:37 Tom de Vries
  2021-10-26 23:37 ` [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp Tom de Vries
  2021-10-27 14:30 ` [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
  0 siblings, 2 replies; 10+ messages in thread
From: Tom de Vries @ 2021-10-26 23:37 UTC (permalink / raw)
  To: gdb-patches

As reported in PR26272, when running test-case
gdb.threads/fork-plus-threads.exp on a VM with openSUSE Tumbleweed, with the VM
bound to 1 cpu with 75% execution cap, I get:
...
[Inferior 1 (process 21928) exited normally]PASS: \
  gdb.threads/fork-plus-threads.exp: detach-on-fork=off: inferior 1 exited
PASS: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  no failure to remove breakpoints
PASS: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  no spurious thread stop
^M
info threads^M
  Id   Target Id                         Frame ^M
  11.11 Thread 0x7ffff3470700 (LWP 22041) (running)^M
^M
No selected thread.  See `help thread'.^M
(gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  no threads left
[Inferior 11 (process 21990) exited normally]^M
info inferiors^M
  Num  Description       Connection           Executable        ^M
* 1    <null>                                 fork-plus-threads ^M
  11   <null>                                 fork-plus-threads ^M
(gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  only inferior 1 left (the program exited)
...

The initial process (inferior 1) creates 10 child processes, and then waits on
the child processes.  Consequently, the initial process is also the last
process to exit.
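
In outline, the test program does roughly the following (a simplified
sketch, not the actual fork-plus-threads.c source, which also spawns
threads in each child):
...
#include <sys/wait.h>
#include <unistd.h>

int
main (void)
{
  /* Inferior 1 forks 10 children ...  */
  for (int i = 0; i < 10; i++)
    {
      pid_t pid = fork ();
      if (pid == 0)
        /* Child; the real test also starts a few threads here.  */
        _exit (0);
    }

  /* ... and then reaps them all before exiting, so it exits last.  */
  while (wait (NULL) > 0)
    ;
  return 0;
}
...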

However, in the log above we see:
...
[Inferior 1 (process 21928) exited normally]
  ...
[Inferior 11 (process 21990) exited normally]
...
This seems counter-intuitive: if inferior 1 is the last to exit, shouldn't we
see it last?

However, looking at the debug infrun log:
...
[infrun] fetch_inferior_event: enter
  [infrun] scoped_disable_commit_resumed: reason=handling event
  [infrun] do_target_wait: Found 2 inferiors, starting at #0
  [infrun] random_pending_event_thread: None found.
  [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
  [infrun] print_target_wait_results:   17202.17202.0 [Thread 0x7ffff7c79740 (LWP 17202)],
  [infrun] print_target_wait_results:   status->kind = exited, status = 0
  [infrun] handle_inferior_event: status->kind = exited, status = 0
[Inferior 1 (process 17202) exited normally]
  [infrun] stop_waiting: stop_waiting
  [infrun] reset: reason=handling event
  [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target native
  [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target native
[infrun] fetch_inferior_event: exit
[infrun] fetch_inferior_event: enter
  [infrun] scoped_disable_commit_resumed: reason=handling event
  [infrun] random_pending_event_thread: None found.
  [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
  [infrun] print_target_wait_results:   17215.17215.0 [Thread 0x7ffff7c79740 (LWP 17215)],
  [infrun] print_target_wait_results:   status->kind = exited, status = 0
  [infrun] handle_inferior_event: status->kind = exited, status = 0
[Inferior 11 (process 17215) exited normally]
...
this seems plausible.

Doing a waitpid with a pid of -1 will wait for any child process, and if
both inferior 1 and 11 have exited and not yet been reaped, waitpid may
return either of the two.
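
As a minimal standalone illustration of that waitpid behaviour (a sketch,
not GDB code), the following program reaps two already-exited children
with a pid of -1; which child is reported first is up to the kernel:
...
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

int
main (void)
{
  pid_t c1 = fork ();
  if (c1 == 0)
    _exit (1);
  pid_t c2 = fork ();
  if (c2 == 0)
    _exit (2);

  /* Let both children exit before reaping, so that two exit events are
     pending at the same time.  */
  sleep (1);

  for (int i = 0; i < 2; i++)
    {
      int status;
      pid_t reaped = waitpid (-1, &status, 0);
      printf ("reaped %ld, exit status %d\n", (long) reaped,
              WEXITSTATUS (status));
    }
  return 0;
}
...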

Fix the first FAIL by waiting for all inferiors to exit, rather than waiting
for inferior 1 to exit, assuming it's the last.

Tested on x86_64-linux.
---
 .../gdb.threads/fork-plus-threads.exp         | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/fork-plus-threads.exp b/gdb/testsuite/gdb.threads/fork-plus-threads.exp
index 2300279254e..3e3b84114b1 100644
--- a/gdb/testsuite/gdb.threads/fork-plus-threads.exp
+++ b/gdb/testsuite/gdb.threads/fork-plus-threads.exp
@@ -33,6 +33,7 @@ proc do_test { detach-on-fork } {
     global GDBFLAGS
     global srcfile testfile
     global gdb_prompt
+    global decimal
 
     set saved_gdbflags $GDBFLAGS
     set GDBFLAGS [concat $GDBFLAGS " -ex \"set non-stop on\""]
@@ -50,6 +51,12 @@ proc do_test { detach-on-fork } {
     }
 
     gdb_test_no_output "set detach-on-fork ${detach-on-fork}"
+    if { ${detach-on-fork} == "on" } {
+	set expected_exits 1
+    } else {
+	set expected_exits 11
+    }
+
     set test "continue &"
     gdb_test_multiple $test $test {
 	-re "$gdb_prompt " {
@@ -76,8 +83,8 @@ proc do_test { detach-on-fork } {
     set saw_cannot_remove_breakpoints 0
     set saw_thread_stopped 0
 
-    set test "inferior 1 exited"
-    gdb_test_multiple "" $test {
+    set nr_exits 0
+    gdb_test_multiple "" "last inferior exited" {
 	-re "Cannot remove breakpoints" {
 	    set saw_cannot_remove_breakpoints 1
 	    exp_continue
@@ -94,8 +101,12 @@ proc do_test { detach-on-fork } {
 	    # Avoid timeout with check-read1
 	    exp_continue
 	}
-	-re "Inferior 1 \(\[^\r\n\]+\) exited normally" {
-	    pass $test
+	-re "Inferior $decimal \(\[^\r\n\]+\) exited normally" {
+	    incr nr_exits
+	    if { $nr_exits < $expected_exits } {
+		exp_continue
+	    }
+	    pass $gdb_test_name
 	}
     }
 

base-commit: f85dcfc3af9cf7c2859241253f1d37b1133abea2
-- 
2.26.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp
  2021-10-26 23:37 [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
@ 2021-10-26 23:37 ` Tom de Vries
  2022-01-18 21:10   ` Simon Marchi
  2022-01-18 22:05   ` Simon Marchi
  2021-10-27 14:30 ` [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
  1 sibling, 2 replies; 10+ messages in thread
From: Tom de Vries @ 2021-10-26 23:37 UTC (permalink / raw)
  To: gdb-patches

When running test-case gdb.threads/fork-plus-threads.exp on a VM with openSUSE
Tumbleweed, with the VM bound to 1 cpu with 75% execution cap, I get:
...
(gdb) info inferiors^M
  Num  Description       Connection           Executable        ^M
* 1    <null>                                 fork-plus-threads ^M
  11   <null>                                 fork-plus-threads ^M
(gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  only inferior 1 left
...

The test checks that all removable inferiors are indeed removed from the
inferior list after exit, and evidently this didn't happen for inferior 11
(which is added by fork rather than a user command, and therefore removable).

I've investigated why that is the case, and it's because its refcount didn't
drop to 0.

This seems like a bug to me, so add a KFAIL for this.

Tested on x86_64-linux.
---
 .../gdb.threads/fork-plus-threads.exp          | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/fork-plus-threads.exp b/gdb/testsuite/gdb.threads/fork-plus-threads.exp
index 3e3b84114b1..7808f9f48ab 100644
--- a/gdb/testsuite/gdb.threads/fork-plus-threads.exp
+++ b/gdb/testsuite/gdb.threads/fork-plus-threads.exp
@@ -118,9 +118,21 @@ proc do_test { detach-on-fork } {
     gdb_test "info threads" "No threads\." \
 	"no threads left"
 
-    gdb_test "info inferiors" \
-	"Num\[ \t\]+Description\[ \t\]+Connection\[ \t\]+Executable\[ \t\]+\r\n\\* 1 \[^\r\n\]+" \
-	"only inferior 1 left"
+    set re \
+	[multi_line \
+	     "Num\[ \t\]+Description\[ \t\]+Connection\[ \t\]+Executable\[ \t\]+" \
+	     "\\* 1 \[^\r\n\]+"]
+    gdb_test_multiple "info inferiors" "only inferior 1 left" {
+	-re -wrap $re {
+	    pass $gdb_test_name
+	}
+	-re -wrap $re.* {
+	    if { ${detach-on-fork} == "off" } {
+		setup_kfail "threads/26272" *-*-*
+	    }
+	    fail $gdb_test_name
+	}
+    }
 }
 
 foreach_with_prefix detach-on-fork {"on" "off"} {
-- 
2.26.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp
  2021-10-26 23:37 [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
  2021-10-26 23:37 ` [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp Tom de Vries
@ 2021-10-27 14:30 ` Tom de Vries
  2022-01-18 21:04   ` Simon Marchi
  2022-01-18 22:04   ` Simon Marchi
  1 sibling, 2 replies; 10+ messages in thread
From: Tom de Vries @ 2021-10-27 14:30 UTC (permalink / raw)
  To: gdb-patches

[-- Attachment #1: Type: text/plain, Size: 308 bytes --]

On 10/27/21 1:37 AM, Tom de Vries via Gdb-patches wrote:

I ran into this FAIL:
...
FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: last
inferior exited (timeout)
...
which I then managed to reproduce using check-readmore.

Fixed by rewriting the gdb_test_multiple in -lbl style.

Thanks,
- Tom

[-- Attachment #2: 0001-gdb-testsuite-Fix-FAIL-in-gdb.threads-fork-and-threads.exp.patch --]
[-- Type: text/x-patch, Size: 5391 bytes --]

[gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp

As reported in PR26272, when running test-case
gdb.threads/fork-plus-threads.exp on a VM with openSUSE Tumbleweed, with the VM
bound to 1 cpu with 75% execution cap, I get:
...
[Inferior 1 (process 21928) exited normally]PASS: \
  gdb.threads/fork-plus-threads.exp: detach-on-fork=off: inferior 1 exited
PASS: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  no failure to remove breakpoints
PASS: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  no spurious thread stop
^M
info threads^M
  Id   Target Id                         Frame ^M
  11.11 Thread 0x7ffff3470700 (LWP 22041) (running)^M
^M
No selected thread.  See `help thread'.^M
(gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  no threads left
[Inferior 11 (process 21990) exited normally]^M
info inferiors^M
  Num  Description       Connection           Executable        ^M
* 1    <null>                                 fork-plus-threads ^M
  11   <null>                                 fork-plus-threads ^M
(gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
  only inferior 1 left (the program exited)
...

The initial process (inferior 1) creates 10 child processes, and then waits on
the child processes.  Consequently, the initial process is also the last
process to exit.

However, in the log above we see:
...
[Inferior 1 (process 21928) exited normally]
  ...
[Inferior 11 (process 21990) exited normally]
...
This seems counter-intuitive: if inferior 1 is the last to exit, shouldn't we
see it last?

However, looking at the debug infrun log:
...
[infrun] fetch_inferior_event: enter
  [infrun] scoped_disable_commit_resumed: reason=handling event
  [infrun] do_target_wait: Found 2 inferiors, starting at #0
  [infrun] random_pending_event_thread: None found.
  [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
  [infrun] print_target_wait_results:   17202.17202.0 [Thread 0x7ffff7c79740 (LWP 17202)],
  [infrun] print_target_wait_results:   status->kind = exited, status = 0
  [infrun] handle_inferior_event: status->kind = exited, status = 0
[Inferior 1 (process 17202) exited normally]
  [infrun] stop_waiting: stop_waiting
  [infrun] reset: reason=handling event
  [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target native
  [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target native
[infrun] fetch_inferior_event: exit
[infrun] fetch_inferior_event: enter
  [infrun] scoped_disable_commit_resumed: reason=handling event
  [infrun] random_pending_event_thread: None found.
  [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
  [infrun] print_target_wait_results:   17215.17215.0 [Thread 0x7ffff7c79740 (LWP 17215)],
  [infrun] print_target_wait_results:   status->kind = exited, status = 0
  [infrun] handle_inferior_event: status->kind = exited, status = 0
[Inferior 11 (process 17215) exited normally]
...
this seems plausible.

Doing a waitpid with a pid of -1 will wait for any child process, and if
both inferior 1 and 11 have exited and not yet been reaped, waitpid may
return either of the two.

Fix the first FAIL by waiting for all inferiors to exit, rather than waiting
for inferior 1 to exit, assuming it's the last.

Tested on x86_64-linux.

---
 gdb/testsuite/gdb.threads/fork-plus-threads.exp | 31 ++++++++++++++-----------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/gdb/testsuite/gdb.threads/fork-plus-threads.exp b/gdb/testsuite/gdb.threads/fork-plus-threads.exp
index 2300279254e..fe73c40f1d7 100644
--- a/gdb/testsuite/gdb.threads/fork-plus-threads.exp
+++ b/gdb/testsuite/gdb.threads/fork-plus-threads.exp
@@ -33,6 +33,7 @@ proc do_test { detach-on-fork } {
     global GDBFLAGS
     global srcfile testfile
     global gdb_prompt
+    global decimal
 
     set saved_gdbflags $GDBFLAGS
     set GDBFLAGS [concat $GDBFLAGS " -ex \"set non-stop on\""]
@@ -50,6 +51,12 @@ proc do_test { detach-on-fork } {
     }
 
     gdb_test_no_output "set detach-on-fork ${detach-on-fork}"
+    if { ${detach-on-fork} == "on" } {
+	set expected_exits 1
+    } else {
+	set expected_exits 11
+    }
+
     set test "continue &"
     gdb_test_multiple $test $test {
 	-re "$gdb_prompt " {
@@ -76,26 +83,22 @@ proc do_test { detach-on-fork } {
     set saw_cannot_remove_breakpoints 0
     set saw_thread_stopped 0
 
-    set test "inferior 1 exited"
-    gdb_test_multiple "" $test {
-	-re "Cannot remove breakpoints" {
+    set nr_exits 0
+    gdb_test_multiple "" "last inferior exited" -lbl {
+	-re "^\r\nCannot remove breakpoints \[^\r\n\]+\\.(?=\r\n)" {
 	    set saw_cannot_remove_breakpoints 1
 	    exp_continue
 	}
-	-re "Thread \[^\r\n\]+ stopped\\." {
+	-re "^\r\n\\\[Thread \[^\r\n\]+ stopped\\.(?=\r\n)" {
 	    set saw_thread_stopped 1
 	    exp_continue
 	}
-	-re "(Thread|LWP) \[^\r\n\]+ exited" {
-	    # Avoid timeout with check-read1
-	    exp_continue
-	}
-	-re "New (Thread|LWP) \[^\r\n\]+" {
-	    # Avoid timeout with check-read1
-	    exp_continue
-	}
-	-re "Inferior 1 \(\[^\r\n\]+\) exited normally" {
-	    pass $test
+	-re "^\r\n\\\[Inferior $decimal \(\[^\r\n\]+\) exited normally\\\](?=\r\n)" {
+	    incr nr_exits
+	    if { $nr_exits < $expected_exits } {
+		exp_continue
+	    }
+	    pass $gdb_test_name
 	}
     }
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp
  2021-10-27 14:30 ` [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
@ 2022-01-18 21:04   ` Simon Marchi
  2022-01-19 19:36     ` Simon Marchi
  2022-01-18 22:04   ` Simon Marchi
  1 sibling, 1 reply; 10+ messages in thread
From: Simon Marchi @ 2022-01-18 21:04 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 2021-10-27 10:30, Tom de Vries via Gdb-patches wrote:
> [...]
> Fix the first FAIL by waiting for all inferiors to exit, rather than waiting
> for inferior 1 to exit, assuming it's the last.

I randomly started to look at this failure and saw that patch of yours.
I reached the same conclusion about the why.

I was about to reply that I also agreed with how to fix it, but then got
a last-minute idea.  Instead of fixing the test, it would be really nice
from a user point of view if the causality could be respected somehow.
If you know that inferior 1 waits for inferior 11 to exit, it's just
weird to see inferior 1 exit before inferior 11.

What I'm thinking is that in this case, the kernel will probably always
return inferior 11's exit event before inferior 1's exit event.  And
that serialization is lost when we save the events as pending.

What would you think of making the linux-nat target remember the order
of the events it pulled from the kernel, and return them in that order,
when multiple events are available?  That would ensure, in this case,
that we show inferior 11's exit before inferior 1's exit.  I haven't
tried, I'm just guessing that's what would happen.

I could imagine some cases where you could still see inferior 1's exit
before.  Imagine both exit events are saved as pending events and
everything is stopped.  You then only resume inferior 1.  Then the
target will have no choice but to return inferior 1's exit event and GDB
will show it.  But still, the more standard use cases would be handled
better.

Simon


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp
  2021-10-26 23:37 ` [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp Tom de Vries
@ 2022-01-18 21:10   ` Simon Marchi
  2022-01-18 22:05   ` Simon Marchi
  1 sibling, 0 replies; 10+ messages in thread
From: Simon Marchi @ 2022-01-18 21:10 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 2021-10-26 19:37, Tom de Vries wrote:
> [...]
> I've investigated why that is the case, and it's because its refcount didn't
> drop to 0.
> 
> This seems like a bug to me, so add a KFAIL for this.

Hmm, I attached to GDB after this failure, and inferior 11's refcount
was 0.

My guess is that in the working case, all inferiors are pruned when
prune_inferiors is called, which is called by normal_stop, which is
called by fetch_inferior_event when the thread_fsm (for the continue
command) completes.

And in the failing case, the thread_fsm completes when inferior 1
exits, that is, before inferior 11 exits.  So prune_inferiors leaves
inferior 11 there.  When inferior 11 finally exits (immediately after,
when we process the next event), no thread_fsm completes, so normal_stop
isn't called, so prune_inferiors isn't called.

This particular problem could maybe be solved by having some call to
prune_inferiors somewhere else, so that it's called after inferior 11
exits.

If we implement the suggestion I sent on patch 1 (about the order of
events), that additional call to prune_inferiors probably wouldn't be
necessary in this case.  But it might still be needed in other corner
cases.

Simon


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp
  2021-10-27 14:30 ` [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
  2022-01-18 21:04   ` Simon Marchi
@ 2022-01-18 22:04   ` Simon Marchi
  1 sibling, 0 replies; 10+ messages in thread
From: Simon Marchi @ 2022-01-18 22:04 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

> Fix the first FAIL by waiting for all inferiors to exit, rather than waiting
> for inferior 1 to exit, assuming it's the last.

I randomly started to look at this failure and saw that patch of yours.
I reached the same conclusion about the why.

I was about to reply that I also agreed with how to fix it, but then got
a last-minute idea.  Instead of fixing the test, it would be really nice
from a user point of view if the causality could be respected somehow.
If you know that inferior 1 waits for inferior 11 to exit, it's just
weird to see inferior 1 exit before inferior 11.

What I'm thinking is that in this case, the kernel will probably always
return inferior 11's exit event before inferior 1's exit event.  And
that serialization is lost when we save the events as pending.

What would you think of making the linux-nat target remember the order
of the events it pulled from the kernel, and return them in that order,
when multiple events are available?  That would ensure, in this case,
that we show inferior 11's exit before inferior 1's exit.  I haven't
tried, I'm just guessing that's what would happen.
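
Roughly, the idea would be something like the following sketch (purely
hypothetical names, not the actual linux-nat code): pull events into a
FIFO as they come from the kernel, and report them back oldest-first.

#include <deque>

/* One event pulled from the kernel but not yet reported to the core.  */

struct pending_event
{
  int lwp;          /* LWP the event came from.  */
  int wait_status;  /* Raw status from waitpid.  */
};

static std::deque<pending_event> pending_events;

/* Called whenever an event is pulled from the kernel.  */

static void
remember_event (int lwp, int wait_status)
{
  pending_events.push_back ({ lwp, wait_status });
}

/* Called when the core asks for an event: hand back the oldest one, so
   the kernel's ordering (e.g. inferior 11's exit before inferior 1's)
   is preserved.  */

static bool
take_oldest_event (pending_event *ev)
{
  if (pending_events.empty ())
    return false;
  *ev = pending_events.front ();
  pending_events.pop_front ();
  return true;
}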

I could imagine some cases where you could still see inferior 1's exit
before.  Imagine both exit events are saved as pending events and
everything is stopped.  You then only resume inferior 1.  Then the
target will have no choice but to return inferior 1's exit event and GDB
will show it.  But still, the more standard use cases would be handled
better.

Simon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp
  2021-10-26 23:37 ` [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp Tom de Vries
  2022-01-18 21:10   ` Simon Marchi
@ 2022-01-18 22:05   ` Simon Marchi
  1 sibling, 0 replies; 10+ messages in thread
From: Simon Marchi @ 2022-01-18 22:05 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

On 2021-10-26 19:37, Tom de Vries wrote:
> When running test-case gdb.threads/fork-plus-threads.exp on a VM with openSUSE
> Tumbleweed, with the VM bound to 1 cpu with 75% execution cap, I get:
> ...
> (gdb) info inferiors^M
>   Num  Description       Connection           Executable        ^M
> * 1    <null>                                 fork-plus-threads ^M
>   11   <null>                                 fork-plus-threads ^M
> (gdb) FAIL: gdb.threads/fork-plus-threads.exp: detach-on-fork=off: \
>   only inferior 1 left
> ...
> 
> The test checks that all removable inferiors are indeed removed from the
> inferior list after exit, and evidently this didn't happen for inferior 11
> (which is added by fork rather than a user command, and therefore removable).
> 
> I've investigated why that is the case, and it's because its refcount didn't
> drop to 0.
> 
> This seems like a bug to me, so add a KFAIL for this.

Hmm, I attached to GDB after this failure, and inferior 11's refcount
was 0.

My guess is that in the working case, all inferiors are pruned when
prune_inferiors is called, which is called by normal_stop, which is
called by fetch_inferior_event when the thread_fsm (for the continue
command) completes.

And in the failing case, the thread_fsm completes when inferior 1
exits, that is, before inferior 11 exits.  So prune_inferiors leaves
inferior 11 there.  When inferior 11 finally exits (immediately after,
when we process the next event), no thread_fsm completes, so normal_stop
isn't called, so prune_inferiors isn't called.
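
To make the gap concrete, here is a toy model of that sequence
(hypothetical code, not GDB internals); after the second event both
entries are still in the list, just like in the "info inferiors" output
above:

#include <algorithm>
#include <cstdio>
#include <vector>

struct toy_inferior
{
  int num;
  bool removable;   /* Added by fork, so prunable once exited.  */
  bool exited;
};

static std::vector<toy_inferior> inferiors
  = { { 1, false, false }, { 11, true, false } };

/* Stand-in for prune_inferiors: drop exited, removable inferiors.  */

static void
toy_prune ()
{
  inferiors.erase (std::remove_if (inferiors.begin (), inferiors.end (),
                                   [] (const toy_inferior &inf)
                                   { return inf.removable && inf.exited; }),
                   inferiors.end ());
}

int
main ()
{
  /* Event 1: inferior 1 exits; this completes the "continue" thread_fsm,
     so the prune step (via normal_stop) runs -- but inferior 11 has not
     exited yet, so nothing is pruned.  */
  inferiors[0].exited = true;
  toy_prune ();

  /* Event 2: inferior 11 exits; no thread_fsm completes, so the prune
     step is skipped.  */
  inferiors[1].exited = true;

  for (const toy_inferior &inf : inferiors)
    printf ("inferior %d still listed\n", inf.num);
  return 0;
}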

This particular problem could maybe be solved by having some call to
prune_inferiors somewhere else, so that it's called after inferior 11
exits.

If we implement the suggestion I sent on patch 1 (about the order of
events), that additional call to prune_inferiors probably wouldn't be
necessary in this case.  But it might still be needed in other corner
cases.

Simon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp
  2022-01-18 21:04   ` Simon Marchi
@ 2022-01-19 19:36     ` Simon Marchi
  2022-04-05  9:58       ` Tom de Vries
  0 siblings, 1 reply; 10+ messages in thread
From: Simon Marchi @ 2022-01-19 19:36 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches

> What would you think of making the linux-nat target remember the order
> of the events it pulled from the kernel, and return them in that order,
> when multiple events are available?  That would ensure, in this case,
> that we show inferior 11's exit before inferior 1's exit.  I haven't
> tried, I'm just guessing that's what would happen.

So, I think we can scratch that, as there doesn't seem to be anything in
the kernel that really guarantees that it will give us inferior 11's
exit before inferior 1's exit.

Simon

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp
  2022-01-19 19:36     ` Simon Marchi
@ 2022-04-05  9:58       ` Tom de Vries
  2022-04-05 12:48         ` Simon Marchi
  0 siblings, 1 reply; 10+ messages in thread
From: Tom de Vries @ 2022-04-05  9:58 UTC (permalink / raw)
  To: Simon Marchi, gdb-patches

On 1/19/22 20:36, Simon Marchi wrote:
>> What would you think of making the linux-nat target remember the order
>> of the events it pulled from the kernel, and return them in that order,
>> when multiple events are available?  That would ensure, in this case,
>> that we show inferior 11's exit before inferior 1's exit.  I haven't
>> tried, I'm just guessing that's what would happen.
> 
> So, I think we can scratch that, as there doesn't seem to be anything in
> the kernel that really guarantees that it will give us inferior 11's
> exit before inferior 1's exit.

Sorry for the late reaction.

Then, are you back at the position "agreed with how to fix it"?

Thanks,
- Tom


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp
  2022-04-05  9:58       ` Tom de Vries
@ 2022-04-05 12:48         ` Simon Marchi
  0 siblings, 0 replies; 10+ messages in thread
From: Simon Marchi @ 2022-04-05 12:48 UTC (permalink / raw)
  To: Tom de Vries, gdb-patches



On 2022-04-05 05:58, Tom de Vries wrote:
> On 1/19/22 20:36, Simon Marchi wrote:
>>> What would you think of making the linux-nat target remember the order
>>> of the events it pulled from the kernel, and return them in that order,
>>> when multiple events are available?  That would ensure, in this case,
>>> that we show inferior 11's exit before inferior 1's exit.  I haven't
>>> tried, I'm just guessing that's what would happen.
>>
>> So, I think we can scratch that, as there doesn't seem to be anything in
>> the kernel that really guarantees that it will give us inferior 11's
>> exit before inferior 1's exit.
> 
> Sorry for the late reaction.
> 
> Then, are you back at the position "agreed with how to fix it" ?
> 
> Thanks,
> - Tom
> 

Yeah, in fact it looks very much like the fix in fork-plus-threads :)

Simon

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-04-05 12:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-26 23:37 [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
2021-10-26 23:37 ` [PATCH 2/2] [gdb/testsuite] Add KFAIL in gdb.threads/fork-plus-threads.exp Tom de Vries
2022-01-18 21:10   ` Simon Marchi
2022-01-18 22:05   ` Simon Marchi
2021-10-27 14:30 ` [PATCH 1/2] [gdb/testsuite] Fix FAIL in gdb.threads/fork-and-threads.exp Tom de Vries
2022-01-18 21:04   ` Simon Marchi
2022-01-19 19:36     ` Simon Marchi
2022-04-05  9:58       ` Tom de Vries
2022-04-05 12:48         ` Simon Marchi
2022-01-18 22:04   ` Simon Marchi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).