* [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 @ 2021-07-07 14:09 Tom de Vries 2021-07-13 4:49 ` Simon Marchi 0 siblings, 1 reply; 5+ messages in thread From: Tom de Vries @ 2021-07-07 14:09 UTC (permalink / raw) To: gdb-patches Hi, When running test-case gdb.threads/check-libthread-db.exp on openSUSE Tumbleweed with glibc 2.33, I get: ... (gdb) maint check libthread-db^M Running libthread_db integrity checks:^M Got thread 0x7ffff7c79b80 => 9354 => 0x7ffff7c79b80; errno = 0 ... OK^M libthread_db integrity checks passed.^M (gdb) FAIL: gdb.threads/check-libthread-db.exp: user-initiated check: libpthread.so not initialized (pattern 2) ... The test-case expects instead: ... Got thread 0x0 => 9354 => 0x0 ... OK^M ... which is what I get on openSUSE Leap 15.2 with glibc 2.26, and what is described in the test-case like this: ... # libthread_db should fake a single thread with th_unique == NULL. ... Using a breakpoint on check_thread_db_callback we can compare the two scenarios, and find that in the latter case we hit this code in glibc function iterate_thread_list in nptl_db/td_ta_thr_iter.c: ... if (next == 0 && fake_empty) { /* __pthread_initialize_minimal has not run. There is just the main thread to return. We cannot rely on its thread register. They sometimes contain garbage that would confuse us, left by the kernel at exec. So if it looks like initialization is incomplete, we only fake a special descriptor for the initial thread. */ td_thrhandle_t th = { ta, 0 }; return callback (&th, cbdata_p) != 0 ? TD_DBERR : TD_OK; } ... while in the former case we don't because this preceding statement doesn't result in next == 0: ... err = DB_GET_FIELD (next, ta, head, list_t, next, 0); ... Note that the comment mentions __pthread_initialize_minimal, but in both cases it has already run before we hit the callback, so it's possible the comment is no longer accurate. Anyway, the results do not look wrong, so fix this by updating the regexp patterns to agree with what libthread-db is telling us. Tested on x86_64-linux, both with glibc 2.33 and 2.26. Any comments? Thanks, - Tom [gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 gdb/testsuite/ChangeLog: 2021-07-07 Tom de Vries <tdevries@suse.de> PR testsuite/27690 * gdb.threads/check-libthread-db.exp: Update patterns for glibc 2.33. --- gdb/testsuite/gdb.threads/check-libthread-db.exp | 37 ++++++++++++++---------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/gdb/testsuite/gdb.threads/check-libthread-db.exp b/gdb/testsuite/gdb.threads/check-libthread-db.exp index dfb3bad3712..6ecf40db6a9 100644 --- a/gdb/testsuite/gdb.threads/check-libthread-db.exp +++ b/gdb/testsuite/gdb.threads/check-libthread-db.exp @@ -32,6 +32,13 @@ if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" \ return -1 } +# Case 1: libthread_db fakes a single thread with th_unique == NULL. +set thread_re1 "0 => $decimal => 0" +# Case 2: libthread_db already initialized. +set thread_re2 "$hex => $decimal => ${hex}(; errno = 0)?" +# Match either of the two cases. +set initial_thread_re "($thread_re1|$thread_re2)" + with_test_prefix "user-initiated check" { # User-initiated check with libthread_db not loaded. @@ -49,18 +56,16 @@ with_test_prefix "user-initiated check" { "No libthread_db loaded" \ "no libpthread.so loaded" - - # User-initiated check with NPTL uninitialized. - # libthread_db should fake a single thread with th_unique == NULL. + # User-initiated check with NPTL possibly uninitialized. gdb_test "continue" \ ".*Stopped due to shared library event.*Inferior loaded .*libpthread.*" gdb_test_sequence "maint check libthread-db" \ - "libpthread.so not initialized" { - "\[\r\n\]+Running libthread_db integrity checks:" - "\[\r\n\]+\[ \]+Got thread 0x0 => \[0-9\]+ => 0x0 ... OK" - "\[\r\n\]+libthread_db integrity checks passed." - } + "libpthread.so possibly not initialized" \ + [list \ + "\[\r\n\]+Running libthread_db integrity checks:" \ + "\[\r\n\]+\[ \]+Got thread $initial_thread_re ... OK" \ + "\[\r\n\]+libthread_db integrity checks passed."] # User-initiated check with NPTL fully operational. gdb_test_no_output "set stop-on-solib-events 0" @@ -78,8 +83,8 @@ with_test_prefix "user-initiated check" { with_test_prefix "automated load-time check" { - # Automated load-time check with NPTL uninitialized. - with_test_prefix "libpthread.so not initialized" { + # Automated load-time check with NPTL possibly uninitialized. + with_test_prefix "libpthread.so possibly not initialized" { clean_restart ${binfile} gdb_test_no_output "maint set check-libthread-db 1" @@ -88,12 +93,12 @@ with_test_prefix "automated load-time check" { gdb_run_cmd gdb_test_sequence "" \ - "check debug libthread-db output" { - "\[\r\n\]+Running libthread_db integrity checks:" - "\[\r\n\]+\[ \]+Got thread 0x0 => \[0-9\]+ => 0x0 ... OK" - "\[\r\n\]+libthread_db integrity checks passed." - "\[\r\n\]+[Thread debugging using libthread_db enabled]" - } + "check debug libthread-db output" \ + [list \ + "\[\r\n\]+Running libthread_db integrity checks:" \ + "\[\r\n\]+\[ \]+Got thread $initial_thread_re ... OK" \ + "\[\r\n\]+libthread_db integrity checks passed." \ + "\[\r\n\]+\\\[Thread debugging using libthread_db enabled\\\]"] } # Automated load-time check with NPTL fully operational. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 2021-07-07 14:09 [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 Tom de Vries @ 2021-07-13 4:49 ` Simon Marchi 2021-07-13 12:59 ` Tom de Vries 0 siblings, 1 reply; 5+ messages in thread From: Simon Marchi @ 2021-07-13 4:49 UTC (permalink / raw) To: Tom de Vries, gdb-patches On 2021-07-07 10:09 a.m., Tom de Vries wrote: > Hi, > > When running test-case gdb.threads/check-libthread-db.exp on openSUSE > Tumbleweed with glibc 2.33, I get: > ... > (gdb) maint check libthread-db^M > Running libthread_db integrity checks:^M > Got thread 0x7ffff7c79b80 => 9354 => 0x7ffff7c79b80; errno = 0 ... OK^M > libthread_db integrity checks passed.^M > (gdb) FAIL: gdb.threads/check-libthread-db.exp: user-initiated check: libpthread.so not initialized (pattern 2) > ... > > The test-case expects instead: > ... > Got thread 0x0 => 9354 => 0x0 ... OK^M > ... > which is what I get on openSUSE Leap 15.2 with glibc 2.26, and what is > described in the test-case like this: > ... > # libthread_db should fake a single thread with th_unique == NULL. > ... > > Using a breakpoint on check_thread_db_callback we can compare the two > scenarios, and find that in the latter case we hit this code in glibc function > iterate_thread_list in nptl_db/td_ta_thr_iter.c: > ... > if (next == 0 && fake_empty) > { > /* __pthread_initialize_minimal has not run. There is just the main > thread to return. We cannot rely on its thread register. They > sometimes contain garbage that would confuse us, left by the > kernel at exec. So if it looks like initialization is incomplete, > we only fake a special descriptor for the initial thread. */ > td_thrhandle_t th = { ta, 0 }; > return callback (&th, cbdata_p) != 0 ? TD_DBERR : TD_OK; > } > ... > while in the former case we don't because this preceding statement doesn't > result in next == 0: > ... > err = DB_GET_FIELD (next, ta, head, list_t, next, 0); > ... > > Note that the comment mentions __pthread_initialize_minimal, but in both cases > it has already run before we hit the callback, so it's possible the comment is > no longer accurate. > > Anyway, the results do not look wrong, so fix this by updating the regexp > patterns to agree with what libthread-db is telling us. > > Tested on x86_64-linux, both with glibc 2.33 and 2.26. > > Any comments? I got a bit lost in the glibc code, but I am wondering if this change in behavior might have been caused by this glibc change: https://gitlab.com/gnutools/glibc/-/commit/1daccf403b1bd86370eb94edca794dc106d02039 Before this, the stack_user list was global variables. This is one of the two lists that get walked to enumerate the threads. Since it isn't initialized statically, it got placed in .bss and is zeroed-out by default, hence why thread-db expected to read next == 0, I guess. Or, it may be that the moment that this nptl minimal initialization is done has changed, at least relative to where we do our checks. It looks like __pthread_initialize_minimal_internal is called as a constructor. Has it always been this way? By the time we stop on the solib load event, have the constructors of the lib ran? In any case, I agree that this looks benign, the regexp checks we have exactly one thread. But I would still be curious to know the real reason for the change in behavior. Simon ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 2021-07-13 4:49 ` Simon Marchi @ 2021-07-13 12:59 ` Tom de Vries 2021-07-13 13:15 ` Tom de Vries 0 siblings, 1 reply; 5+ messages in thread From: Tom de Vries @ 2021-07-13 12:59 UTC (permalink / raw) To: Simon Marchi, gdb-patches On 7/13/21 6:49 AM, Simon Marchi wrote: > > > On 2021-07-07 10:09 a.m., Tom de Vries wrote: >> Hi, >> >> When running test-case gdb.threads/check-libthread-db.exp on openSUSE >> Tumbleweed with glibc 2.33, I get: >> ... >> (gdb) maint check libthread-db^M >> Running libthread_db integrity checks:^M >> Got thread 0x7ffff7c79b80 => 9354 => 0x7ffff7c79b80; errno = 0 ... OK^M >> libthread_db integrity checks passed.^M >> (gdb) FAIL: gdb.threads/check-libthread-db.exp: user-initiated check: libpthread.so not initialized (pattern 2) >> ... >> >> The test-case expects instead: >> ... >> Got thread 0x0 => 9354 => 0x0 ... OK^M >> ... >> which is what I get on openSUSE Leap 15.2 with glibc 2.26, and what is >> described in the test-case like this: >> ... >> # libthread_db should fake a single thread with th_unique == NULL. >> ... >> >> Using a breakpoint on check_thread_db_callback we can compare the two >> scenarios, and find that in the latter case we hit this code in glibc function >> iterate_thread_list in nptl_db/td_ta_thr_iter.c: >> ... >> if (next == 0 && fake_empty) >> { >> /* __pthread_initialize_minimal has not run. There is just the main >> thread to return. We cannot rely on its thread register. They >> sometimes contain garbage that would confuse us, left by the >> kernel at exec. So if it looks like initialization is incomplete, >> we only fake a special descriptor for the initial thread. */ >> td_thrhandle_t th = { ta, 0 }; >> return callback (&th, cbdata_p) != 0 ? TD_DBERR : TD_OK; >> } >> ... >> while in the former case we don't because this preceding statement doesn't >> result in next == 0: >> ... >> err = DB_GET_FIELD (next, ta, head, list_t, next, 0); >> ... >> >> Note that the comment mentions __pthread_initialize_minimal, but in both cases >> it has already run before we hit the callback, so it's possible the comment is >> no longer accurate. >> >> Anyway, the results do not look wrong, so fix this by updating the regexp >> patterns to agree with what libthread-db is telling us. >> >> Tested on x86_64-linux, both with glibc 2.33 and 2.26. >> >> Any comments? > > I got a bit lost in the glibc code, but I am wondering if this change in > behavior might have been caused by this glibc change: > > https://gitlab.com/gnutools/glibc/-/commit/1daccf403b1bd86370eb94edca794dc106d02039 > It is, I've bisect the change in behavior to precisely that commit. > Before this, the stack_user list was global variables. This is one of > the two lists that get walked to enumerate the threads. Since it isn't > initialized statically, it got placed in .bss and is zeroed-out by > default, hence why thread-db expected to read next == 0, I guess. > > Or, it may be that the moment that this nptl minimal initialization is > done has changed, at least relative to where we do our checks. It looks > like __pthread_initialize_minimal_internal is called as a constructor. > Has it always been this way? By the time we stop on the solib load > event, have the constructors of the lib ran? > Before the commit, we have: 1. Initial event, hitting _dl_debug_state: ... Stopped due to shared library event (no libraries added or removed) ... 2. Loading some libs: ... [Thread debugging using libthread_db enabled] Using host libthread_db library "/home/vries/glibc/build/nptl_db/libthread_db.so.1". Stopped due to shared library event: Inferior loaded /home/vries/glibc/build/nptl/libpthread.so.0 /home/vries/glibc/build/math/libm.so.6 /home/vries/glibc/build/libc.so.6 ... 3. __pthread_initialize_minimal_internal entry, at which point we still have null __stack_user: ... Continuing. Breakpoint 1, __pthread_initialize_minimal_internal () at nptl-init.c:227 227 { (gdb) p __stack_user $2 = {next = 0x0, prev = 0x0} ... 4. __pthread_initialize_minimal_internal exit, at which point we have initialized stack_user: ... (gdb) fin Run till exit from #0 __pthread_initialize_minimal_internal () at nptl-init.c:227 _init () at ../sysdeps/x86_64/crtn.S:40 40 addq $8, %rsp (gdb) p __stack_user $3 = {next = 0x7ffff7ff2e40, prev = 0x7ffff7ff2e40} ... After the commit: 1. Initial event, hitting _dl_debug_state: ... Stopped due to shared library event (no libraries added or removed) ... 2. We hit this list_add which adds the main thread to dl_stack_user: ... (gdb) up #1 init_tls (naudit=naudit@entry=0) at rtld.c:804 804 list_add (&THREAD_SELF->list, &GL (dl_stack_user)); ... 3. Loading some libs: ... (gdb) c Continuing. [Thread debugging using libthread_db enabled] Using host libthread_db library "/home/vries/glibc/build/nptl_db/libthread_db.so.1". Stopped due to shared library event: Inferior loaded /home/vries/glibc/build/nptl/libpthread.so.0 /home/vries/glibc/build/math/libm.so.6 /home/vries/glibc/build/libc.so.6 ... 4. __pthread_initialize_minimal_internal entry: ... Continuing. Breakpoint 2, __pthread_initialize_minimal_internal () at nptl-init.c:227 227 { ... So, it doesn't look like the moment when __pthread_initialize_minimal_internal is called changed, just the location of where stack_user is initialized. I'll try to update the log message a bit and resubmit. Thanks, - Tom > In any case, I agree that this looks benign, the regexp checks we have > exactly one thread. But I would still be curious to know the real > reason for the change in behavior. > > Simon > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 2021-07-13 12:59 ` Tom de Vries @ 2021-07-13 13:15 ` Tom de Vries 2021-07-13 13:31 ` Simon Marchi 0 siblings, 1 reply; 5+ messages in thread From: Tom de Vries @ 2021-07-13 13:15 UTC (permalink / raw) To: Simon Marchi, gdb-patches [-- Attachment #1: Type: text/plain, Size: 136 bytes --] On 7/13/21 2:59 PM, Tom de Vries wrote: > I'll try to update the log message a bit and resubmit. Any further comments? Thanks, - Tom [-- Attachment #2: 0001-gdb-testsuite-Fix-check-libthread-db.exp-FAILs-with-glibc-2.33.patch --] [-- Type: text/x-patch, Size: 5506 bytes --] [gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 When running test-case gdb.threads/check-libthread-db.exp on openSUSE Tumbleweed with glibc 2.33, I get: ... (gdb) maint check libthread-db^M Running libthread_db integrity checks:^M Got thread 0x7ffff7c79b80 => 9354 => 0x7ffff7c79b80; errno = 0 ... OK^M libthread_db integrity checks passed.^M (gdb) FAIL: gdb.threads/check-libthread-db.exp: user-initiated check: \ libpthread.so not initialized (pattern 2) ... The test-case expects instead: ... Got thread 0x0 => 9354 => 0x0 ... OK^M ... which is what I get on openSUSE Leap 15.2 with glibc 2.26, and what is described in the test-case like this: ... # libthread_db should fake a single thread with th_unique == NULL. ... Using a breakpoint on check_thread_db_callback we can compare the two scenarios, and find that in the latter case we hit this code in glibc function iterate_thread_list in nptl_db/td_ta_thr_iter.c: ... if (next == 0 && fake_empty) { /* __pthread_initialize_minimal has not run. There is just the main thread to return. We cannot rely on its thread register. They sometimes contain garbage that would confuse us, left by the kernel at exec. So if it looks like initialization is incomplete, we only fake a special descriptor for the initial thread. */ td_thrhandle_t th = { ta, 0 }; return callback (&th, cbdata_p) != 0 ? TD_DBERR : TD_OK; } ... while in the former case we don't because this preceding statement doesn't result in next == 0: ... err = DB_GET_FIELD (next, ta, head, list_t, next, 0); ... Note that the comment mentions __pthread_initialize_minimal, but in both cases it has already run before we hit the callback, so it's possible the comment is no longer accurate. The change in behaviour bisect to glibc commit 1daccf403b "nptl: Move stack list variables into _rtld_global", which moves the initialization of stack list variables such as __stack_user to an earlier moment, which explains well enough the observed difference. Fix this by updating the regexp patterns to agree with what libthread-db is telling us. Tested on x86_64-linux, both with glibc 2.33 and 2.26. gdb/testsuite/ChangeLog: 2021-07-07 Tom de Vries <tdevries@suse.de> PR testsuite/27690 * gdb.threads/check-libthread-db.exp: Update patterns for glibc 2.33. --- gdb/testsuite/gdb.threads/check-libthread-db.exp | 37 ++++++++++++++---------- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/gdb/testsuite/gdb.threads/check-libthread-db.exp b/gdb/testsuite/gdb.threads/check-libthread-db.exp index dfb3bad3712..6ecf40db6a9 100644 --- a/gdb/testsuite/gdb.threads/check-libthread-db.exp +++ b/gdb/testsuite/gdb.threads/check-libthread-db.exp @@ -32,6 +32,13 @@ if {[gdb_compile_pthreads "${srcdir}/${subdir}/${srcfile}" "${binfile}" \ return -1 } +# Case 1: libthread_db fakes a single thread with th_unique == NULL. +set thread_re1 "0 => $decimal => 0" +# Case 2: libthread_db already initialized. +set thread_re2 "$hex => $decimal => ${hex}(; errno = 0)?" +# Match either of the two cases. +set initial_thread_re "($thread_re1|$thread_re2)" + with_test_prefix "user-initiated check" { # User-initiated check with libthread_db not loaded. @@ -49,18 +56,16 @@ with_test_prefix "user-initiated check" { "No libthread_db loaded" \ "no libpthread.so loaded" - - # User-initiated check with NPTL uninitialized. - # libthread_db should fake a single thread with th_unique == NULL. + # User-initiated check with NPTL possibly uninitialized. gdb_test "continue" \ ".*Stopped due to shared library event.*Inferior loaded .*libpthread.*" gdb_test_sequence "maint check libthread-db" \ - "libpthread.so not initialized" { - "\[\r\n\]+Running libthread_db integrity checks:" - "\[\r\n\]+\[ \]+Got thread 0x0 => \[0-9\]+ => 0x0 ... OK" - "\[\r\n\]+libthread_db integrity checks passed." - } + "libpthread.so possibly not initialized" \ + [list \ + "\[\r\n\]+Running libthread_db integrity checks:" \ + "\[\r\n\]+\[ \]+Got thread $initial_thread_re ... OK" \ + "\[\r\n\]+libthread_db integrity checks passed."] # User-initiated check with NPTL fully operational. gdb_test_no_output "set stop-on-solib-events 0" @@ -78,8 +83,8 @@ with_test_prefix "user-initiated check" { with_test_prefix "automated load-time check" { - # Automated load-time check with NPTL uninitialized. - with_test_prefix "libpthread.so not initialized" { + # Automated load-time check with NPTL possibly uninitialized. + with_test_prefix "libpthread.so possibly not initialized" { clean_restart ${binfile} gdb_test_no_output "maint set check-libthread-db 1" @@ -88,12 +93,12 @@ with_test_prefix "automated load-time check" { gdb_run_cmd gdb_test_sequence "" \ - "check debug libthread-db output" { - "\[\r\n\]+Running libthread_db integrity checks:" - "\[\r\n\]+\[ \]+Got thread 0x0 => \[0-9\]+ => 0x0 ... OK" - "\[\r\n\]+libthread_db integrity checks passed." - "\[\r\n\]+[Thread debugging using libthread_db enabled]" - } + "check debug libthread-db output" \ + [list \ + "\[\r\n\]+Running libthread_db integrity checks:" \ + "\[\r\n\]+\[ \]+Got thread $initial_thread_re ... OK" \ + "\[\r\n\]+libthread_db integrity checks passed." \ + "\[\r\n\]+\\\[Thread debugging using libthread_db enabled\\\]"] } # Automated load-time check with NPTL fully operational. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 2021-07-13 13:15 ` Tom de Vries @ 2021-07-13 13:31 ` Simon Marchi 0 siblings, 0 replies; 5+ messages in thread From: Simon Marchi @ 2021-07-13 13:31 UTC (permalink / raw) To: Tom de Vries, gdb-patches On 2021-07-13 9:15 a.m., Tom de Vries wrote: > On 7/13/21 2:59 PM, Tom de Vries wrote: >> I'll try to update the log message a bit and resubmit. > > Any further comments? > > Thanks, > - Tom You explanation of the sequence of events was very clear, thanks. This LGTM. Simon ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-07-13 13:31 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-07-07 14:09 [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33 Tom de Vries 2021-07-13 4:49 ` Simon Marchi 2021-07-13 12:59 ` Tom de Vries 2021-07-13 13:15 ` Tom de Vries 2021-07-13 13:31 ` Simon Marchi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).