From: André Almeida
Date: Wed, 1 May 2024 20:44:36 -0300
Subject: Re: [RFC PATCH 0/1] Add FUTEX_SPIN operation
To: Christian Brauner
Cc: Mathieu Desnoyers, Peter Zijlstra, Thomas Gleixner,
    linux-kernel@vger.kernel.org, Paul E. McKenney, Boqun Feng,
    H. Peter Anvin, Paul Turner, linux-api@vger.kernel.org, Florian Weimer,
    David.Laight@aculab.com, carlos@redhat.com, Peter Oskolkov,
    Alexander Mikhalitsyn, Chris Kennelly, Ingo Molnar, Darren Hart,
    Davidlohr Bueso, libc-alpha@sourceware.org, Steven Rostedt,
    Jonathan Corbet, Noah Goldstein, Daniel Colascione, longman@redhat.com,
    kernel-dev@igalia.com
References: <20240425204332.221162-1-andrealmeid@igalia.com>
    <20240426-gaumen-zweibeinig-3490b06e86c2@brauner>
In-Reply-To: <20240426-gaumen-zweibeinig-3490b06e86c2@brauner>

Hi Christian,

On 26/04/2024 07:26, Christian Brauner wrote:
> On Thu, Apr 25, 2024 at 05:43:31PM -0300, André Almeida wrote:
>> Hi,
>>
>> In the last LPC, Mathieu Desnoyers and I presented[0] a proposal to extend the
>> rseq interface to be able to implement spin locks in userspace
>> correctly. Thomas
>> Gleixner agreed that this is something that Linux could improve, but asked for
>> an alternative proposal first: a futex operation that allows spinning on a user
>> lock inside the kernel. This patchset implements a prototype of this idea for
>> further discussion.
>>
>> With the FUTEX2_SPIN flag set during a futex_wait(), the futex value is expected
>> to be the PID of the lock owner. Then, the kernel gets the task_struct of the
>> corresponding PID, and checks if it's running. It spins until the futex
>> is woken, the task is scheduled out or a timeout happens. If the lock owner
>> is scheduled out at any time, then the syscall follows the normal path of
>> sleeping as usual.
>>
>> If the futex is woken while we are spinning, we can return to userspace quickly,
>> avoiding being scheduled out and in again to wake from a futex_wait(), thus
>> speeding up the wait operation.
>>
>> I didn't manage to find a good mechanism to prevent race conditions between
>> setting *futex = PID in userspace and doing find_get_task_by_vpid(PID) in kernel
>> space, given that there's enough room for the original PID owner to exit and
>> such PID to be reallocated to another unrelated task in the system. I didn't perform
>
> One option would be to also allow pidfds. Starting with v6.9 they can be
> used to reference individual threads.
>
> So for the really fast case where you have multiple threads and you
> somehow may really do care about the impact of the atomic_long_inc() on
> pidfd_file->f_count during fdget() (for the single-threaded case the
> increment is elided), callers can pass the TID. But in cases where the
> inc and put aren't performance sensitive, you can use pidfds.
>

Thank you very much for making the effort here, much appreciated :)

While I agree that pidfds would fix the PID race conditions, I will move
this interface to support TIDs instead, as noted by Florian and Peter.
With TIDs the race conditions are diminished, I reckon?
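
To make the TID-based idea a bit more concrete, here is a rough userspace-only
sketch in C of what I have in mind. It is not code from the patchset: the names
spin_futex, spin_futex_trylock(), spin_futex_unlock() and
spin_futex_wait_step() are hypothetical, the "0 means unlocked" convention is
an assumption, and the kernel side is only modelled by a pure function that
picks the next action for a waiter. The ordering of the checks in that
function (wake first, then timeout, then owner state) is also an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: 0 when free, otherwise the owner's TID. */
struct spin_futex {
	_Atomic uint32_t word;
};

/* Try to take the lock by publishing our TID in the futex word. */
static bool spin_futex_trylock(struct spin_futex *f, uint32_t tid)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong_explicit(&f->word, &expected, tid,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Release the lock; fails if the caller's TID is not the stored owner. */
static bool spin_futex_unlock(struct spin_futex *f, uint32_t tid)
{
	uint32_t expected = tid;

	return atomic_compare_exchange_strong_explicit(&f->word, &expected, 0,
						       memory_order_release,
						       memory_order_relaxed);
}

enum wait_action { WAIT_RETURN, WAIT_TIMEOUT, WAIT_SLEEP, WAIT_SPIN };

/*
 * Model of one iteration of the wait described in the cover letter:
 * keep spinning only while the owner is running, fall back to the
 * normal sleeping path once the owner is scheduled out.
 */
static enum wait_action spin_futex_wait_step(bool woken, bool timed_out,
					     bool owner_running)
{
	if (woken)
		return WAIT_RETURN;
	if (timed_out)
		return WAIT_TIMEOUT;
	if (!owner_running)
		return WAIT_SLEEP;
	return WAIT_SPIN;
}

/* Small usage example with made-up TIDs. */
static void spin_futex_demo(void)
{
	struct spin_futex f = { 0 };

	assert(spin_futex_trylock(&f, 1234));   /* first taker wins */
	assert(!spin_futex_trylock(&f, 5678));  /* contended: word holds 1234 */
	assert(spin_futex_wait_step(false, false, true) == WAIT_SPIN);
	assert(spin_futex_wait_step(false, false, false) == WAIT_SLEEP);
	assert(spin_futex_unlock(&f, 1234));
}
```

Note that the sketch does nothing about the reuse race: a waiter can still
read a TID from the word after the owner has exited, which is the part that
needs to be handled on the kernel side.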