public inbox for gdb-prs@sourceware.org
* [Bug testsuite/31632] New: [gdb/testsuite] Connecting to wrong gdbserver during parallel testing
@ 2024-04-11 13:36 vries at gcc dot gnu.org
  2024-04-13  9:48 ` [Bug testsuite/31632] " vries at gcc dot gnu.org
  2024-04-15 15:59 ` vries at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: vries at gcc dot gnu.org @ 2024-04-11 13:36 UTC
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31632

            Bug ID: 31632
           Summary: [gdb/testsuite] Connecting to wrong gdbserver during
                    parallel testing
           Product: gdb
           Version: HEAD
            Status: NEW
          Severity: normal
          Priority: P2
         Component: testsuite
          Assignee: unassigned at sourceware dot org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

The opensuse gdb package runs the testsuite using:
...
$ make -j 16 \
  check//unix/-m64 \
  check//unix/-m64/-fPIE/-pie \
  check//unix/-m32 \
  check//unix/-m32/-fPIE/-pie
...
or alternatively with -fno-PIE/-no-pie, depending on the default.

So, AFAIU, four target boards are tested in parallel, and then each target
board is also tested in parallel.

I'm seeing (with a gdb 13.2 based package) these FAILs for one target board:
...
FAIL: gdb.server/connect-without-multi-process.exp: multiprocess=off: target remote (got interactive prompt)
FAIL: gdb.server/connect-without-multi-process.exp: multiprocess=off: continue to main
FAIL: gdb.server/connect-without-multi-process.exp: multiprocess=off: continue until exit
...
and these for another:
...
FAIL: gdb.server/reconnect-ctrl-c.exp: second: continue for ctrl-c (the program is no longer running)
FAIL: gdb.server/reconnect-ctrl-c.exp: second: stop with control-c
...

Curiously, when investigating the first we see:
...
(gdb) PASS: gdb.server/connect-without-multi-process.exp: multiprocess=off: break -q main
target remote localhost:2346^M
Remote debugging using localhost:2346^M
warning: Build ID mismatch between current exec-file /home/abuild/rpmbuild/BUILD/gdb-13.2/build-x86_64-suse-linux/gdb/testsuite.unix.-m32.-fno-PIE.-no-pie/outputs/gdb.server/connect-without-multi-process/connect-without-multi-process^M
and automatically determined exec-file /home/abuild/rpmbuild/BUILD/gdb-13.2/build-x86_64-suse-linux/gdb/testsuite.unix.-m32/outputs/gdb.server/reconnect-ctrl-c/reconnect-ctrl-c^M
exec-file-mismatch handling is currently "ask"^M
Load new symbol table from "/home/abuild/rpmbuild/BUILD/gdb-13.2/build-x86_64-suse-linux/gdb/testsuite.unix.-m32/outputs/gdb.server/reconnect-ctrl-c/reconnect-ctrl-c"? (y or n) n^M
warning: loading /home/abuild/rpmbuild/BUILD/gdb-13.2/build-x86_64-suse-linux/gdb/testsuite.unix.-m32/outputs/gdb.server/reconnect-ctrl-c/reconnect-ctrl-c Not confirmed.^M
warning: Could not load shared library symbols for linux-gate.so.1.^M
Do you need "set solib-search-path" or "set sysroot"?^M
Reading symbols from /lib/libc.so.6...^M
(No debugging symbols found in /lib/libc.so.6)^M
Reading symbols from /lib/ld-linux.so.2...^M
(No debugging symbols found in /lib/ld-linux.so.2)^M
0xf7fc7579 in ?? ()^M
(gdb) FAIL: gdb.server/connect-without-multi-process.exp: multiprocess=off: target remote (got interactive prompt)
...

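Looking at the other test-case (reconnect-ctrl-c, on the unix/-m32 board), the
same portnum, 2346, is used.

It seems we're connecting to the wrong gdbserver: the exec-file reported over
the remote connection is reconnect-ctrl-c, not connect-without-multi-process.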

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug testsuite/31632] [gdb/testsuite] Connecting to wrong gdbserver during parallel testing
From: vries at gcc dot gnu.org @ 2024-04-13  9:48 UTC
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31632

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
I'm not sure exactly how this failure is caused, but I think making sure
test-cases run in parallel get unique portnums is a way to fix this.

The current approach is that we have:
...
global portnum
set portnum "2345"
...
and in gdbserver_start we do:
...
        # Bump the port number to avoid conflicts with hung ports.              
        incr portnum
...
which works ok for serial testing.
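To illustrate why a per-run counter is only enough for serial testing, here is a minimal shell sketch. It is not the testsuite's Tcl code: the function name first_port_of_run is made up, and it only models "start at 2345, bump once before use". Two runs that share no state end up with the same first port:

```shell
#!/bin/sh
# Hypothetical stand-in for one test run: a private counter that starts
# at 2345 and is bumped once before the first gdbserver is started.
first_port_of_run() {
    portnum=2345
    portnum=$((portnum + 1))
    echo "$portnum"
}

# Two runs that share no state, as with two boards tested in parallel.
board_a=$(first_port_of_run)
board_b=$(first_port_of_run)
echo "board A tries port $board_a, board B tries port $board_b"
```

Both runs try 2346 first, which matches the port seen in the log of the original report. Whether that actually leads to a clash depends on how the bind failure is handled afterwards.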

Then there's this bit:
...
            -re "Can't (bind address|listen on socket): Address already in use\\.\r\n" {
                verbose -log "Port $portnum is already in use."
                if ![target_info exists gdb,socketport] {
                    # Bump the port number to avoid the conflict.
                    wait -i $expect_out(spawn_id)
                    incr portnum
                    continue
                }
            }
...
which should avoid clashes with other uses.

We can't avoid clashes completely, given that things may happen outside of the
scope of gdb testing, but we should avoid running into clashes due to parallel
gdb testing.

So, we're going to need a unique dir that is created/emptied when making the
make target check//%, and passed down using a variable, say GDB_LOCK_DIR.

Each make invocation can then check GDB_LOCK_DIR: if it doesn't exist (because
make check// is not used), create its own and use it, and if it does, use
that one.
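
As a rough illustration of that reuse-or-create step, here is a shell sketch. It assumes the check is done in shell rather than in the Makefile itself; the function name setup_lock_dir and the gdb-lock.XXXXXX template are made up for the example:

```shell
#!/bin/sh
# Hypothetical helper: reuse GDB_LOCK_DIR if an enclosing run exported
# it, otherwise create a private directory for this invocation.
setup_lock_dir() {
    if [ -n "${GDB_LOCK_DIR:-}" ] && [ -d "$GDB_LOCK_DIR" ]; then
        # Inherited from an enclosing "make check//..." run: share it.
        :
    else
        # No usable directory was passed down: create our own.
        GDB_LOCK_DIR=$(mktemp -d "${TMPDIR:-/tmp}/gdb-lock.XXXXXX") || return 1
    fi
    export GDB_LOCK_DIR
}
```

Calling setup_lock_dir twice in the same environment leaves GDB_LOCK_DIR unchanged the second time, which is the property the nested make invocations would rely on.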

[ I then wonder what happens when using check// and plain check next to each
other.  But perhaps this is already unsupported/broken today. ]

Then we can track portnum in $GDB_LOCK_DIR/portnum and use
$GDB_LOCK_DIR/portnum.lock to serialize parallel access.
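
A minimal sketch of what that could look like, in shell rather than the testsuite's Tcl. The function name next_portnum is made up, and using mkdir on portnum.lock as the lock primitive is an assumption of this example, not something decided above:

```shell
#!/bin/sh
# Hypothetical helper: hand out the next port number under a lock.
# Assumes GDB_LOCK_DIR names an existing directory shared by all runs.
next_portnum() {
    lock="$GDB_LOCK_DIR/portnum.lock"
    file="$GDB_LOCK_DIR/portnum"

    # mkdir is atomic, so only one process at a time gets past this loop.
    until mkdir "$lock" 2>/dev/null; do
        sleep 1
    done

    # Start from 2345, the current initial value, if nothing is recorded yet.
    port=2345
    if [ -f "$file" ]; then
        read -r port < "$file"
    fi
    port=$((port + 1))
    echo "$port" > "$file"

    # Release the lock before reporting the port to the caller.
    rmdir "$lock"
    echo "$port"
}
```

With an empty lock dir, successive calls such as port=$(next_portnum) yield 2346, then 2347, and so on, across all processes sharing the directory. A process killed while holding the lock would leave portnum.lock behind, so a real implementation would need some way to deal with stale locks.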



* [Bug testsuite/31632] [gdb/testsuite] Connecting to wrong gdbserver during parallel testing
From: vries at gcc dot gnu.org @ 2024-04-15 15:59 UTC
  To: gdb-prs

https://sourceware.org/bugzilla/show_bug.cgi?id=31632

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
https://sourceware.org/pipermail/gdb-patches/2024-April/208117.html


