From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-f50.google.com (mail-wr1-f50.google.com [209.85.221.50]) by sourceware.org (Postfix) with ESMTPS id 4A1F13858D1E for ; Wed, 13 Jul 2022 21:35:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4A1F13858D1E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wr1-f50.google.com with SMTP id b26so17303152wrc.2 for ; Wed, 13 Jul 2022 14:35:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:from:to:references:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=xbqhw/NXOyNpGLxwH3Y7rAhoz0kCgNppfz6k3JHYZXs=; b=OyBilGNMtHRakHqjkavxJFVqpQb1YkhcWFMv5E23OXR8m5OeMfeP7J7JK8q+7ZhkCO VnY8XceprXif0JxrvtDcbO5xEFzxeRJRooyorzuc5oJQrRwm1fzRcm5eLX2Ibty9GL9Z T29bfChXuKxdtcjQxlERsN4mManzQ5mqjOEauJFSKKhH9fx8HIda0hfMhmAkoc8u2MtT 01cwIbU3cxa9CIn9yNT8mcMwx6tPF7q6zL5XlK/tA9/hOnJFP7TZdY2XLshKzewbGU83 Kyn7GkABxHxqzlH4htrzrSOOUb4dNxYS/10tCawRcwHjY88rODsoHE+XAl0tP2aJ3FoF k0sg== X-Gm-Message-State: AJIora96knOgwaFTeBndKzeEuJIXP8nttYciNdGupOK2EnnAXeGi55Xg Jg1+5F6EV+JbygANqTTXI+5Q0wP+bgs= X-Google-Smtp-Source: AGRyM1v8AqyijjtVQ/bYFzoZo88rpygjojdA0mWlTGJeOaEDmg9w6TxZPvYGaRPz961HMek+j5MKZA== X-Received: by 2002:a05:6000:1ac9:b0:21d:a85c:298b with SMTP id i9-20020a0560001ac900b0021da85c298bmr5132722wry.185.1657748111235; Wed, 13 Jul 2022 14:35:11 -0700 (PDT) Received: from ?IPv6:2001:8a0:f924:2600:209d:85e2:409e:8726? ([2001:8a0:f924:2600:209d:85e2:409e:8726]) by smtp.gmail.com with ESMTPSA id t1-20020adfeb81000000b0021d6e49e4ebsm11648792wrn.10.2022.07.13.14.35.09 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 13 Jul 2022 14:35:10 -0700 (PDT) Subject: Re: [PATCH 01/25] Don't use pthread_mutex_t in gdb.base/step-over-clone.c From: Pedro Alves To: gdb-patches@sourceware.org References: <20220620225419.382221-1-pedro@palves.net> <20220620225419.382221-2-pedro@palves.net> Message-ID: <55a0ce6e-1a6e-bdba-543d-9b3cd807e810@palves.net> Date: Wed, 13 Jul 2022 22:35:09 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: <20220620225419.382221-2-pedro@palves.net> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 Jul 2022 21:35:15 -0000 I'm dropping this patch from a v2 of this series, because I've since remembered that current glibc is eliminating separte libpthread, merging it into the main library, and gdb will load libthread_db.so even with non-threaded programs, so that printing errno always works... This all means that with new enough glibc, I think we'd get back the warnings... Pedro Alves On 2022-06-20 11:53 p.m., Pedro Alves wrote: > I noticed this in gdb.log after running gdb.base/step-over-clone.exp: > > ... > gdbserver: PID mismatch! Expected 1790818, got 1790817 > gdbserver: Cannot find thread after clone. > gdbserver: PID mismatch! Expected 1790819, got 1790817 > gdbserver: Cannot find thread after clone. > gdbserver: PID mismatch! Expected 1790820, got 1790817 > gdbserver: Cannot find thread after clone. > ... > > Those "PID mismatch" come from gdbserver/thread_db.c. The problem is > that the testcase program is testing raw clone, which bypasses > libpthread entirely and leaves libthread_db confused. The testcase is > linking with pthreads because it wants to use pthread_mutex_t for > synchronization between the clones. Mixing pthreads and raw clone is > just something we shouldn't do, however. > > My first thought was to fix this by using an atomic decrement > (__atomic_fetch_sub) instead of a mutex, for synchronization. > However, on some archs, that may require linking with -latomic, which > can itself pull in libpthread. > > My next idea, is to make each thread write to its own "I'm ready" > variable, such that we can't actually have read-modify-write races. > This is what this patch does. > > Change-Id: Id418978ac86bfa6d51d0af1e1625a86cdd039a20 > --- > gdb/testsuite/gdb.base/step-over-clone.c | 69 +++++++++++++------- > gdb/testsuite/gdb.base/step-over-syscall.exp | 7 +- > 2 files changed, 45 insertions(+), 31 deletions(-) > > diff --git a/gdb/testsuite/gdb.base/step-over-clone.c b/gdb/testsuite/gdb.base/step-over-clone.c > index c0f67af188b..8a56b492e5e 100644 > --- a/gdb/testsuite/gdb.base/step-over-clone.c > +++ b/gdb/testsuite/gdb.base/step-over-clone.c > @@ -19,7 +19,7 @@ > #include > #include > #include > -#include > +#include > > static void > marker () > @@ -27,30 +27,55 @@ marker () > > #define STACK_SIZE 0x1000 > > -/* These are used to signal that the threads have started correctly. The > - GLOBAL_THREAD_COUNT is set to the number of threads in main, then > - decremented (under a lock) in each new thread. */ > -pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER; > -int global_thread_count = 0; > +#define NUM_THREADS 6 > + > +/* This is used to signal that the threads have started correctly. We > + can't use a single global updated by all thread guarded by a > + pthread mutex, or anything pthread related for the matter, since we > + are using raw clone. A single global updated with atomics > + (__atomic_fetch* etc.) instead of a pthread mutex would sound > + appealing, but we avoid that too because for some architectures, > + we'd have to link with -latomic, which itself links with > + -lpthread... So instead have one array with one element per > + thread, and each thread only ever writes to its own array element. > + We make the array have sig_atomic_t elements so that the elements > + are portably naturally aligned and free from data races on all > + archs, when different threads write to different elements. In > + practice, "int" would work too, as accesses to int are pretty much > + garanteed to be atomic on all Linux systems, but sig_atomic_t is > + explicit. */ > +volatile sig_atomic_t thread_started[NUM_THREADS]; > > static int > -clone_fn (void *unused) > +clone_fn (void *started) > { > /* Signal that this thread has started correctly. */ > - if (pthread_mutex_lock (&global_lock) != 0) > - abort (); > - global_thread_count--; > - if (pthread_mutex_unlock (&global_lock) != 0) > - abort (); > + *(volatile sig_atomic_t *) started = 1; > > return 0; > } > > +/* Return true if all threads have started. */ > + > +static int > +all_threads_started (void) > +{ > + int i; > + > + /* Force full memory barrier so that caches are flushed and > + THREAD_STARTED is refetched. */ > + __sync_synchronize (); > + for (i = 0; i < NUM_THREADS; i++) > + if (thread_started[i] == 0) > + return 0; > + return 1; > +} > + > int > main (void) > { > int i, pid; > - unsigned char *stack[6]; > + unsigned char *stack[NUM_THREADS]; > > /* Due to bug gdb/19675 the cloned thread _might_ try to reenter main > (this depends on where the displaced instruction is placed for > @@ -62,18 +87,16 @@ main (void) > else > abort (); > > - for (i = 0; i < (sizeof (stack) / sizeof (stack[0])); i++) > + for (i = 0; i < NUM_THREADS; i++) > stack[i] = malloc (STACK_SIZE); > > - global_thread_count = (sizeof (stack) / sizeof (stack[0])); > - > - for (i = 0; i < (sizeof (stack) / sizeof (stack[0])); i++) > + for (i = 0; i < NUM_THREADS; i++) > { > pid = clone (clone_fn, stack[i] + STACK_SIZE, CLONE_FILES | CLONE_VM, > - NULL); > + (void *) &thread_started[i]); > } > > - for (i = 0; i < (sizeof (stack) / sizeof (stack[0])); i++) > + for (i = 0; i < NUM_THREADS; i++) > free (stack[i]); > > /* Set an alarm so we don't end up stuck waiting for threads that might > @@ -81,12 +104,8 @@ main (void) > alarm (120); > > /* Now wait for all the threads to start up. */ > - while (global_thread_count != 0) > - { > - /* Force memory barrier so GLOBAL_THREAD_COUNT will be refetched. */ > - asm volatile ("" ::: "memory"); > - sleep (1); > - } > + while (!all_threads_started ()) > + sleep (1); > > /* Call marker, this is what GDB is waiting for. */ > marker (); > diff --git a/gdb/testsuite/gdb.base/step-over-syscall.exp b/gdb/testsuite/gdb.base/step-over-syscall.exp > index 788f6e3f5d0..e87d391cd5f 100644 > --- a/gdb/testsuite/gdb.base/step-over-syscall.exp > +++ b/gdb/testsuite/gdb.base/step-over-syscall.exp > @@ -241,12 +241,7 @@ proc step_over_syscall { syscall } { > > set testfile "step-over-$syscall" > > - set options [list debug] > - if { $syscall == "clone" } { > - lappend options "pthreads" > - } > - > - if [build_executable ${testfile}.exp ${testfile} ${testfile}.c $options] { > + if [build_executable ${testfile}.exp ${testfile} ${testfile}.c {debug}] { > untested "failed to compile" > return -1 > } >