From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x1036.google.com (mail-pj1-x1036.google.com [IPv6:2607:f8b0:4864:20::1036]) by sourceware.org (Postfix) with ESMTPS id 07E30385E83F for ; Thu, 19 Aug 2021 16:49:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 07E30385E83F Received: by mail-pj1-x1036.google.com with SMTP id gz13-20020a17090b0ecdb0290178c0e0ce8bso7569964pjb.1 for ; Thu, 19 Aug 2021 09:49:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=JqWMXCusWbXPBxQKSwqXq4KlTMpqFQg/Nhkp8URPdlA=; b=XT72DEtx7pAmq2UYyM3XCNYHwJSFtR2+SwUHJVetbrXCtyvv6e5zU4ju7MS/Lw/S/8 llaeXOX2M2Q1di2hCFYaGDBxghxMllA3/B3GQzzvnGaq1cbeZCrSeRFyrYKsoMetmv/5 97FB1Lp6h8edY3nEtqNiYSo/WNN9Ivd02H28AKFJLJN+nsQpCSSJCkjDJX4y2DcV7ocv O/zOCkiRUf95nkzXkUrts2CMrDdeh+Dx3ovMvdbDV1KSqH23xASi8EuBS2OyY5x3UJiG foYqrpzkacbL0am8MCSvCcYtkRFVRE6TA3T1I4yQ1NdWUlET828JQXQlQHuRaUNLrrxe Dn8w== X-Gm-Message-State: AOAM531C+qraeSm8pZrtCa7hWQVwIFVsG5daVKZdkgMqox++rgYb7zs6 h3HOlBi1s5kGzlie4BRR9OD7Yhvj2sutyQ== X-Google-Smtp-Source: ABdhPJwcHzCbIDz+AsXGtu6hIL6cxvuhPt7uusytgxU5UsUe5TF0sjuPHGyoYQWuCLTuIO2uiyvjJw== X-Received: by 2002:a17:90a:d181:: with SMTP id fu1mr15910520pjb.137.1629391772927; Thu, 19 Aug 2021 09:49:32 -0700 (PDT) Received: from ?IPv6:2804:431:c7ca:cd83:aa1a:7bd:9935:9bba? ([2804:431:c7ca:cd83:aa1a:7bd:9935:9bba]) by smtp.gmail.com with ESMTPSA id z11sm4021839pfn.69.2021.08.19.09.49.31 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 19 Aug 2021 09:49:32 -0700 (PDT) Subject: Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) To: Florian Weimer , Adhemerval Zanella via Libc-alpha References: <811aff39780f8d997421dcda8c6afaf01ea1ee6a.1629208174.git.fweimer@redhat.com> <88bdbb1e-2338-4fd2-0c03-89f2196ba239@linaro.org> <87r1epz8ym.fsf@oldenburg.str.redhat.com> From: Adhemerval Zanella Message-ID: <84b846a9-5304-3bca-bfc4-f1d15c26204f@linaro.org> Date: Thu, 19 Aug 2021 13:49:30 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <87r1epz8ym.fsf@oldenburg.str.redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 Aug 2021 16:49:44 -0000 On 19/08/2021 13:37, Florian Weimer wrote: > * Adhemerval Zanella via Libc-alpha: > >> On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote: >>> Thread exit is delayed until all pending pthread_kill operations >>> on the thread have completed. This avoids sending signals to the >>> wrong thread or a spurious ESRCH error. >>> >>> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived >>> from a downstream test originally written by Marek Polacek. >> >> And I really hope we could first sort out the BZ#19951 and move away >> of using pthread::tid as the synchronization member to check thread >> termination. I sent a possible fix [1], but it does not fully handle >> the pthread::tid access, it would require to a proper lock to >> synchronize between thread exit and pthread_kill (as Rich has suggested >> on BZ#12889). >> >> I will need to send an updated version of my pthread fixes patchset, >> since clone3 changes broke it and INVALID_TD_P needs fixing as well. >> >> [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/ > > I think my patch is independent of the other fixes. Its advantage over > other proposals is that it does not add waiting to pthread_kill, so > there is no signal-safety issue. I am not sure, the way I think we should fix to also fix BZ#19951 would to add a lock to access 'tid' and uses not only on pthread_kill, but also on pthread_getcpuclockid, pthread_getschedparam, pthread_setschedparam, and pthread_setschedprio. The lock also simplifies the code, there is no need to special handling of futex internal state. I also think pthread_kill would also be simplified. The lock is just to simplify the code, if we really need to add some scalability we might use a more clever code as you are proposing. > > (But I probably should test if cancellation from a signal handler that > is invoked synchronously via pthread_kill still works.) > > Thanks, > Florian >