Subject: Re: [PATCH][gdb/testsuite] Fix check-libthread-db.exp FAILs with glibc 2.33
To: Simon Marchi, gdb-patches@sourceware.org
References: <20210707140950.GA2241@delia> <2421785c-a5b9-64b7-371c-0abf35a4cb63@polymtl.ca>
From: Tom de Vries
Date: Tue, 13 Jul 2021 14:59:09 +0200
In-Reply-To: <2421785c-a5b9-64b7-371c-0abf35a4cb63@polymtl.ca>

On 7/13/21 6:49 AM, Simon Marchi wrote:
>
>
> On 2021-07-07 10:09 a.m., Tom de Vries wrote:
>> Hi,
>>
>> When running test-case gdb.threads/check-libthread-db.exp on openSUSE
>> Tumbleweed with glibc 2.33, I get:
>> ...
>> (gdb) maint check libthread-db^M
>> Running libthread_db integrity checks:^M
>>   Got thread 0x7ffff7c79b80 => 9354 => 0x7ffff7c79b80; errno = 0 ... OK^M
>> libthread_db integrity checks passed.^M
>> (gdb) FAIL: gdb.threads/check-libthread-db.exp: user-initiated check: libpthread.so not initialized (pattern 2)
>> ...
>>
>> The test-case expects instead:
>> ...
>>   Got thread 0x0 => 9354 => 0x0 ... OK^M
>> ...
>> which is what I get on openSUSE Leap 15.2 with glibc 2.26, and what is
>> described in the test-case like this:
>> ...
>>     # libthread_db should fake a single thread with th_unique == NULL.
>> ...
>>
>> Using a breakpoint on check_thread_db_callback we can compare the two
>> scenarios, and find that in the latter case we hit this code in glibc function
>> iterate_thread_list in nptl_db/td_ta_thr_iter.c:
>> ...
>>   if (next == 0 && fake_empty)
>>     {
>>       /* __pthread_initialize_minimal has not run. There is just the main
>>          thread to return. We cannot rely on its thread register. They
>>          sometimes contain garbage that would confuse us, left by the
>>          kernel at exec. So if it looks like initialization is incomplete,
>>          we only fake a special descriptor for the initial thread. */
>>       td_thrhandle_t th = { ta, 0 };
>>       return callback (&th, cbdata_p) != 0 ? TD_DBERR : TD_OK;
>>     }
>> ...
>> while in the former case we don't because this preceding statement doesn't
>> result in next == 0:
>> ...
>>   err = DB_GET_FIELD (next, ta, head, list_t, next, 0);
>> ...
>>
>> Note that the comment mentions __pthread_initialize_minimal, but in both cases
>> it has already run before we hit the callback, so it's possible the comment is
>> no longer accurate.
>>
>> Anyway, the results do not look wrong, so fix this by updating the regexp
>> patterns to agree with what libthread-db is telling us.
>>
>> Tested on x86_64-linux, both with glibc 2.33 and 2.26.
>>
>> Any comments?
>
> I got a bit lost in the glibc code, but I am wondering if this change in
> behavior might have been caused by this glibc change:
>
>   https://gitlab.com/gnutools/glibc/-/commit/1daccf403b1bd86370eb94edca794dc106d02039
>

It is, I've bisected the change in behavior to precisely that commit.

> Before this, the stack_user list was global variables. This is one of
> the two lists that get walked to enumerate the threads.
> Since it isn't initialized statically, it got placed in .bss and is
> zeroed-out by default, hence why thread-db expected to read next == 0,
> I guess.
>
> Or, it may be that the moment that this nptl minimal initialization is
> done has changed, at least relative to where we do our checks. It looks
> like __pthread_initialize_minimal_internal is called as a constructor.
> Has it always been this way? By the time we stop on the solib load
> event, have the constructors of the lib ran?
>

Before the commit, we have:

1. Initial event, hitting _dl_debug_state:
...
Stopped due to shared library event (no libraries added or removed)
...

2. Loading some libs:
...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/home/vries/glibc/build/nptl_db/libthread_db.so.1".
Stopped due to shared library event:
  Inferior loaded /home/vries/glibc/build/nptl/libpthread.so.0
    /home/vries/glibc/build/math/libm.so.6
    /home/vries/glibc/build/libc.so.6
...

3. __pthread_initialize_minimal_internal entry, at which point we still
have null __stack_user:
...
Continuing.

Breakpoint 1, __pthread_initialize_minimal_internal () at nptl-init.c:227
227     {
(gdb) p __stack_user
$2 = {next = 0x0, prev = 0x0}
...

4. __pthread_initialize_minimal_internal exit, at which point we have
initialized stack_user:
...
(gdb) fin
Run till exit from #0 __pthread_initialize_minimal_internal () at nptl-init.c:227
_init () at ../sysdeps/x86_64/crtn.S:40
40      addq $8, %rsp
(gdb) p __stack_user
$3 = {next = 0x7ffff7ff2e40, prev = 0x7ffff7ff2e40}
...

After the commit:

1. Initial event, hitting _dl_debug_state:
...
Stopped due to shared library event (no libraries added or removed)
...

2. We hit this list_add which adds the main thread to dl_stack_user:
...
(gdb) up
#1 init_tls (naudit=naudit@entry=0) at rtld.c:804
804       list_add (&THREAD_SELF->list, &GL (dl_stack_user));
...

3. Loading some libs:
...
(gdb) c
Continuing.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/home/vries/glibc/build/nptl_db/libthread_db.so.1".
Stopped due to shared library event:
  Inferior loaded /home/vries/glibc/build/nptl/libpthread.so.0
    /home/vries/glibc/build/math/libm.so.6
    /home/vries/glibc/build/libc.so.6
...

4. __pthread_initialize_minimal_internal entry:
...
Continuing.

Breakpoint 2, __pthread_initialize_minimal_internal () at nptl-init.c:227
227     {
...

So, it doesn't look like the moment when
__pthread_initialize_minimal_internal is called has changed; only the
place where stack_user is initialized has.

I'll try to update the log message a bit and resubmit.

Thanks,
- Tom

> In any case, I agree that this looks benign, the regexp checks we have
> exactly one thread. But I would still be curious to know the real
> reason for the change in behavior.
>
> Simon
>
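
For illustration, the difference described above can be reduced to a
small standalone C sketch. This is not glibc code: list_t,
init_list_head and list_add only mimic the shape of glibc's list
helpers, while struct thread_desc and enumerate_threads are
hypothetical names standing in for struct pthread and
iterate_thread_list. The sketch assumes that a zeroed list head (like
the old .bss __stack_user before initialization) makes the walker fake
one thread with th_unique == NULL, whereas a head that already holds
the main thread (as after the list_add in init_tls) makes it report the
real descriptor address.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a circular doubly linked list node.  */
typedef struct list_head
{
  struct list_head *next;
  struct list_head *prev;
} list_t;

/* Hypothetical thread descriptor with an embedded list node.  */
struct thread_desc
{
  int tid;
  list_t list;
};

/* Make HEAD an empty circular list (both links point at HEAD).  */
static void
init_list_head (list_t *head)
{
  head->next = head;
  head->prev = head;
}

/* Insert NEWP right after HEAD.  HEAD must already be initialized.  */
static void
list_add (list_t *newp, list_t *head)
{
  newp->next = head->next;
  newp->prev = head;
  head->next->prev = newp;
  head->next = newp;
}

/* Hypothetical, simplified walker.  Returns the number of threads
   reported and stores the th_unique value of the first one in *UNIQUE.
   A head whose next pointer is still zero is treated as "not yet
   initialized": one fake thread with th_unique == NULL is reported.  */
static int
enumerate_threads (const list_t *head, const void **unique)
{
  assert (head != NULL && unique != NULL);
  *unique = NULL;

  if (head->next == NULL)
    return 1;			/* Fake descriptor for the initial thread.  */

  int count = 0;
  for (const list_t *p = head->next; p != head; p = p->next)
    {
      /* Recover the descriptor address from the embedded node.  */
      const void *desc
	= (const char *) p - offsetof (struct thread_desc, list);
      if (count == 0)
	*unique = desc;
      count++;
    }
  return count;
}
```

Calling enumerate_threads on a head of { NULL, NULL } reports one
thread with a NULL th_unique, matching the "0x0 => 9354 => 0x0" output.
Calling it on a head that went through init_list_head plus one list_add
also reports one thread, but with the descriptor's address, matching
the non-zero output seen with glibc 2.33.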