From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ot1-x32b.google.com (mail-ot1-x32b.google.com [IPv6:2607:f8b0:4864:20::32b]) by sourceware.org (Postfix) with ESMTPS id EF9403857C74 for ; Fri, 12 Feb 2021 14:37:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org EF9403857C74 Received: by mail-ot1-x32b.google.com with SMTP id o12so8497941ote.12 for ; Fri, 12 Feb 2021 06:37:19 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Gz73LHF/1UXw0otym8sYqRUFcJKcQ/NiHAqGlJ3cxKQ=; b=hqrXFZj5Lj6aRgWq+Aq9jtKOOCdVZmhZ7xU3DaQNjmQVar8aNZFdvmwcQ2AOAHFin5 asv8C6D+kF6aDcoX3hoNdkHJM+McTpLW5Hin9s4isvwYG/2kpA192sGn2MC+6fRbZfvM 768V3Bv53LKeOuMnAuXPrr+WVtKYF9OEJfs7aPbEYm0M/SyPARR0D+bngV2bg3R8z9jf xl4pOVWSSU434TtSEH7EQxnTP4qnvDjCIJpMM3suv+e9snaIcXT6GBVkbP9cKMLgDtKz dR+oGQ0EytX50mH/wj+1dGCnpUtKN2mNaseRKpVdkLiV56nnXq0axAwfIWb3bGIROh/w ZIwA== X-Gm-Message-State: AOAM531hl/ndUN9OT7NaP12OqfScSlnzTr9qH+ihORmLJY4NuS88S+Jm Q/yXq83eeFoliXcaGcZXDuasjDPL52ZPyrS2cmM= X-Google-Smtp-Source: ABdhPJyQdcpl4SRdiVMe2bUCZCVx/nPiJFIfTAB23xf/0fR9AcVQK2cU5ooJheB9g7AdXO4o8uW5Oexgea4zxXW+kVU= X-Received: by 2002:a9d:1717:: with SMTP id i23mr2186114ota.179.1613140639277; Fri, 12 Feb 2021 06:37:19 -0800 (PST) MIME-Version: 1.0 References: <20210129150317.GN4020736@tucnak> In-Reply-To: <20210129150317.GN4020736@tucnak> From: "H.J. Lu" Date: Fri, 12 Feb 2021 06:36:43 -0800 Message-ID: Subject: Re: [PATCH] openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] To: Jakub Jelinek Cc: Kwok Cheung Yeung , GCC Patches Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3029.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, RCVD_IN_DNSWL_NONE, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Feb 2021 14:37:22 -0000 On Fri, Jan 29, 2021 at 7:53 AM Jakub Jelinek via Gcc-patches wrote: > > On Thu, Jan 21, 2021 at 07:33:34PM +0000, Kwok Cheung Yeung wrote: > > The detach support clearly needs more work, but is this particular patch > > okay for trunk? > > Sorry for the delay. > > I'm afraid it is far from being ready. > > > @@ -2402,17 +2437,41 @@ ialias (omp_in_final) > > void > > omp_fulfill_event (omp_event_handle_t event) > > { > > - gomp_sem_t *sem = (gomp_sem_t *) event; > > + struct gomp_task *task = (struct gomp_task *) event; > > + struct gomp_task *parent = task->parent; > > struct gomp_thread *thr = gomp_thread (); > > struct gomp_team *team = thr ? thr->ts.team : NULL; > > > > - if (gomp_sem_getcount (sem) > 0) > > - gomp_fatal ("omp_fulfill_event: %p event already fulfilled!\n", sem); > > + if (gomp_sem_getcount (&task->completion_sem) > 0) > > + gomp_fatal ("omp_fulfill_event: %p event already fulfilled!\n", task); > > As written earlier, the intent of omp_fulfill_event is that it should be > callable from anywhere, not necessarily one of the threads in the team. > The application could have other threads (often called unshackeled threads) > from which it would call it, or just some other parallel or whatever else, > as long as it is not racy to pass in the omp_event_handle_t to there. > So, > struct gomp_thread *thr = gomp_thread (); > struct gomp_team *team = thr ? thr->ts.team : NULL; > is incorrect, it will give you the team of the current thread, rather than > the team of the task to be fulfilled. > > It can also crash if team is NULL, which will happen any time > this is called outside of a parallel. Just try (should go into testsuite > too): > #include > > int > main () > { > omp_event_handle_t ev; > #pragma omp task detach (ev) > omp_fulfill_event (ev); > return 0; > } > > Additionally, there is an important difference between fulfill for > included tasks and for non-included tasks, for the former there is no team > or anything to care about, for the latter there is a team and one needs to > take the task_lock, but at that point it can do pretty much everything in > omp_fulfill_event rather than handling it elsewhere. > > So, what I'm suggesting is: > > Replace > bool detach; > gomp_sem_t completion_sem; > with > struct gomp_task_detach *detach; > and add struct gomp_task_detach that would contain everything that will be > needed (indirect so that we don't waste space for it in every task, but only > for those that have detach clause). > We need: > 1) some way to tell if it is an included task or not > 2) for included tasks the gomp_sem_t completion_sem > (and nothing but 1) and 2) for those), > 3) struct gomp_team * for non-included tasks > 4) some way to find out if the task has finished and is just waiting for > fulfill event (perhaps your GOMP_TASK_DETACHED is ok for that) > 5) some way to find out if the task has been fulfilled already > (gomp_sem_t for that seems an overkill though) > > 1) could be done through the struct gomp_team *team; member, > set it to NULL in included tasks (no matter if they are in some team or not) > and to non-NULL team of the task (non-included tasks must have a team). > > And I don't see the point of task_detach_queue if we can handle the > dependers etc. all in omp_fulfill_event, which I think we can if we take the > task_lock. > > So, I think omp_fulfill_event should look at the task->detach it got, > if task->detach->team is NULL, it is included task, GOMP_task should have > initialized task->detach->completion_sem and omp_fulfill_event should just > gomp_sem_post it and that is all, GOMP_task for included task needs to > gomp_sem_wait after it finishes before it returns. > > Otherwise, take the team's task_lock, and look at whether the task is still > running, in that case just set the bool that it has been fulfilled (or > whatever way of signalling 5), perhaps it can be say clearing task->detach > pointer). When creating non-included tasks in GOMP_task with detach clause > through gomp_malloc, it would add the size needed for struct > gomp_task_detach. > But if the task is already in GOMP_TASK_DETACHED state, instead we need > while holding the task_lock do everything that would have been done normally > on task finish, but we've skipped it because it hasn't been fulfilled. > Including the waking/sem_posts when something could be waiting on that task. > > Do you agree with this, or see some reason why this can't work? > > And testsuite should include also cases where we wait for the tasks with > detach clause to be fulfilled at the end of taskgroup (i.e. need to cover > all of taskwait, taskgroup end and barrier). > task-detach-6.f90 should be disabled for now. It has been blocking my testers for weeks. -- H.J.