From: Jakub Jelinek
To: 김규래
Cc: gcc@gcc.gnu.org
Subject: Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
Date: Mon, 05 Aug 2019 10:32:00 -0000
Message-ID: <20190805103216.GQ2726@tucnak>
References: <2ec486a9ba251a2ffc757ed3b06192@cweb004.nm.nfra.io> <20190713062848.GE2125@tucnak> <20190722185413.GA13123@laptop.zalov.cz>
X-SW-Source: 2019-08/txt/msg00013.txt.bz2

On Sat, Aug 03, 2019 at 06:11:58PM +0900, 김규래 wrote:
> I'm currently having trouble implementing the thread sleeping mechanism when the queue is out of tasks.
> The problem is that it's hard to maintain consistency between the thread sleeping routine and the queues.
> See the pseudocode below:
>
> 1. check queue is empty
> 2. go to sleep
>
> If we go lock-free, the consistency between 1 and 2 cannot be maintained.

I thought we don't want to go lock-free; the queue operations aren't easily implementable lock-free.  Instead, use a lock for each of the queues, so in the multi-queue setting the locks sit on the implicit tasks that hold those queues.  What can and should be done without a lock is perhaps a preliminary check whether a queue is empty, which can be done through an __atomic_load.

And, generally, going to sleep is done outside of the critical section; inside the critical section we only decide whether to go to sleep.  The sleep itself is then done either (on Linux) using futexes, or otherwise using semaphores.  Both have the property that one thread can post to them before another thread sleeps on them, and in that case the other thread doesn't actually go to sleep.

The wake-up (a post on the semaphore, or updating the memory plus a later futex wake) is sometimes done inside the critical section: the memory update when it is not an atomic increase/decrease, while the futex wake can be deferred until after the critical section if we remember from the atomic operation whether a wake-up is needed.  Given say:

	++team->task_count;
	++team->task_queued_count;
	gomp_team_barrier_set_task_pending (&team->barrier);
	do_wake = team->task_running_count + !parent->in_tied_task
		  < team->nthreads;
	gomp_mutex_unlock (&team->task_lock);
	if (do_wake)
	  gomp_team_barrier_wake (&team->barrier, 1);

you can see the wake-up is done outside of the critical section.
If team->task_lock isn't used, there will of course be problems: team->task_count and team->task_queued_count need to be bumped atomically, and ditto for the operations on team->barrier.  The question is what to do with the team->task_running_count check.  If that counter is updated atomically too, maybe an __atomic_load is good enough, though in the worst case it might mean we sometimes don't wake anybody, so some threads will idle instead of doing useful work; but at least one thread should probably handle the queued tasks later.

	Jakub