From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [216.205.24.124]) by sourceware.org (Postfix) with ESMTP id 67240394843A for ; Wed, 26 May 2021 17:58:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 67240394843A Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-391-pPGgT-8FM66-p-7UHuur0Q-1; Wed, 26 May 2021 13:58:16 -0400 X-MC-Unique: pPGgT-8FM66-p-7UHuur0Q-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 636A3501E1; Wed, 26 May 2021 17:58:15 +0000 (UTC) Received: from oldenburg.str.redhat.com (ovpn-113-228.ams2.redhat.com [10.36.113.228]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 701325C241; Wed, 26 May 2021 17:58:14 +0000 (UTC) From: Florian Weimer To: Adhemerval Zanella Cc: Adhemerval Zanella via Libc-alpha Subject: Re: [PATCH 05/11] nptl: Deallocate the thread stack on setup failure (BZ #19511) References: <20210526165728.1772546-1-adhemerval.zanella@linaro.org> <20210526165728.1772546-6-adhemerval.zanella@linaro.org> <87fsy9qu1p.fsf@oldenburg.str.redhat.com> Date: Wed, 26 May 2021 19:58:12 +0200 In-Reply-To: (Adhemerval Zanella's message of "Wed, 26 May 2021 14:51:41 -0300") Message-ID: <877djlqswb.fsf@oldenburg.str.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-6.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 May 2021 17:58:23 -0000 * Adhemerval Zanella: > On 26/05/2021 14:33, Florian Weimer wrote: >> * Adhemerval Zanella via Libc-alpha: >> >>> @@ -819,9 +809,13 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr, >>> CONCURRENCY NOTES above). We can assert that STOPPED_START >>> must have been true because thread creation didn't fail, but >>> thread attribute setting did. */ >>> - /* See bug 19511 which explains why doing nothing here is a >>> - resource leak for a joinable thread. */ >>> - assert (stopped_start); >>> + { >>> + assert (stopped_start); >>> + if (stopped_start) >>> + /* State (a), we own PD. The thread blocked on this lock will >>> + finish due setup_failed being set. */ >>> + lll_unlock (pd->lock, LLL_PRIVATE); >>> + } >>> else >>> { >>> /* State (e) and we have ownership of PD (see CONCURRENCY >> >> The assert followed by the if doesn't look right. One should check >> setup_failed, I assume. Is there any way to trigger the broken assert >> or the hang from the missing wakeup? > > This just check what is described in the CONCURRENCY NOTES: > > 204 Axioms: > 205 > 206 * The create_thread function can never set stopped_start to false. > 207 * The created thread can read stopped_start but *never write to it*. I meant this: + assert (stopped_start); + if (stopped_start) It doesn't look right to me. >> I happend to have looked at this today for the sigaltstack stuff. >> >> I felt that it would be simpler to perform the kernel-side setup >> (affinity and scheduling policy) on the new thread. Essentially use >> stopped_start mode unconditionally. And use a separate lock word for >> the barrier/latch, instead of reusing the pd->lock. >> >> The advantage is that we can pass an arbitrary temporary object into the >> internal start function, such as the signal mask. The creating thread >> blocks until the initialization has happened and the immediate fate of >> the thread has been decided. After that point the creating thread >> deallocates the temporary object, and if the thread couldn't start, any >> thread-related data. Otherwise the new thread takes ownership of that >> data. >> >> I think it will be much easier to explain what is going on in the >> concurrency notes. We have an additional synchronization step on all >> created threads, but I doubt that this matters in practice. >> >> What do you think? I'm not sure if the change should happen as part of >> this series. > > I see two potential issues on this approach: > > 1. The cpuset information is tied to the pthread_attr_t and can be > potentially large, it means that if we want to make the created > thread to consume we will need to either duplicate, add something > like a reference count, or add more synchronization for the > consumer. None is really simpler to implement and I don't think > it is really an improvement. No, I think that's covered by the temporary object thing. The creating thread will block until it's guaranteed that the new thread won't need this data anymore. > 2. For both schedule parameters and affinity ideally we want to inform > right away to user if the operation has failed. If we move it to > thread itself, it still would require the synchronization to inform > pthread_create the operation has failed. I don't see much improvement, > the pthread_create latency will be essentially the same. Correct. But it will be much simpler because we can treat all kernel-side initialization the same. For example, sigaltstack might fail with invalid flags, too. And we'd definitely have to run that on the new thread. Thanks, Florian