From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 21998 invoked by alias); 11 Dec 2015 18:05:42 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 21989 invoked by uid 89); 11 Dec 2015 18:05:41 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS,T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Fri, 11 Dec 2015 18:05:35 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id 14C55C0AEE21 for ; Fri, 11 Dec 2015 18:05:34 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-113-142.phx2.redhat.com [10.3.113.142]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id tBBI5W9o028426 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Fri, 11 Dec 2015 13:05:33 -0500 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.15.2/8.15.2) with ESMTP id tBBI5UoH011434 for ; Fri, 11 Dec 2015 19:05:31 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.15.2/8.15.2/Submit) id tBBI5T1r009874 for gcc-patches@gcc.gnu.org; Fri, 11 Dec 2015 19:05:29 +0100 Date: Fri, 11 Dec 2015 18:05:00 -0000 From: Jakub Jelinek To: GCC Patches Subject: Re: [hsa 2/10] Modifications to libgomp proper Message-ID: <20151211180529.GH5675@tucnak.redhat.com> Reply-To: Jakub Jelinek References: <20151207111758.GA24234@virgil.suse.cz> <20151207111957.GC24234@virgil.suse.cz> <20151209113536.GP5675@tucnak.redhat.com> <20151210175223.GH3534@virgil.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20151210175223.GH3534@virgil.suse.cz> User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes X-SW-Source: 2015-12/txt/msg01288.txt.bz2 On Thu, Dec 10, 2015 at 06:52:23PM +0100, Martin Jambor wrote: > I see, I prefer the clean approach, even if it is more work, this > interface looks like it is going to be extended in the future. But I > am wondering whether embedding the value into the identifier element > is actually worth it. The passed array is going to be a small local > variable and I wonder whether there is going to be any benefit in it > having two elements instead of four (or four instead of six for > gridified kernels), especially if it means introducing control flow on > the part of the caller. But if you really want it that way, I will > implement that. I'm fine with implementing the two (num_threads and thread_limit) always as separate argument, or perhaps what you could do is if the argument is constant and fits into the signed 16 bits on 32-bit arches (or any constant, that fits into 48 bits), use embedded argument (then there is no extra runtime cost), and if it is variable and you'd need to shift, put it as separate argument. > OK, so if I understand this correctly, I should not be re-queing the > task in gomp_target_task_completion. I have left the check to be > NULLness of ttask->tgt but can test the capability if you prefer (but > I at least hope the two options are equivalent). > > > --- a/libgomp/task.c > > > +++ b/libgomp/task.c > > > @@ -581,6 +581,7 @@ GOMP_PLUGIN_target_task_completion (void *data) > > > gomp_mutex_unlock (&team->task_lock); > > > } > > > ttask->state = GOMP_TARGET_TASK_FINISHED; > > > + free (ttask->firstprivate_copies); > > > gomp_target_task_completion (team, task); > > > gomp_mutex_unlock (&team->task_lock); > > > } > > > > So, this function should have a special case for the SHARED_MEM case, handle > > it closely to say how GOMP_taskgroup_end handles the finish_cancelled: > > case. Just note that the target task is missing from certain queues at that > > point. > > I'm afraid I need some help here. I do not quite understand how is > finish_cancelled in GOMP_taskgroup_end similar, it seems to be doing > much more than freeing one pointer. What is exactly the issue with > the above? > > Nevertheless, after reading through bits of task.c again, I wonder > whether any copying (for both shared memory target and the host) in > gomp_target_task_fn is actually necessary because it seems to be also > done in gomp_create_target_task. Does that not apply somehow? The target task is scheduled for the first action as normal task, and the scheduling of it already removes it from some of the queues (each task is put into 1-3 queues), i.e. actions performed mostly by gomp_task_run_pre. Then the team task lock is unlocked and the task is run. Finally, for normal tasks, gomp_task_run_post_handle_depend, gomp_task_run_post_remove_parent, etc. is run. Now, for async target tasks that have something running on some other device at that point, we don't do that, but instead make it GOMP_TASK_ASYNC_RUNNING. And continue with other stuff, until gomp_target_task_completion is run. For non-shared mem that needs to readd the task again into the queues, so that it will be scheduled again. But you don't need that for shared mem target tasks, they can just free the firstprivate_copies and finalize the task. At the time gomp_target_task_completion is called, the task is pretty much in the same state as it is around the finish_cancelled:; label. So instead of what the gomp_target_task_completion function does, you would for SHARED_MEM do something like: size_t new_tasks = gomp_task_run_post_handle_depend (task, team); gomp_task_run_post_remove_parent (task); gomp_clear_parent (&task->children_queue); gomp_task_run_post_remove_taskgroup (task); team->task_count--; do_wake = 0; if (new_tasks > 1) { do_wake = team->nthreads - team->task_running_count - !task->in_tied_task; if (do_wake > new_tasks) do_wake = new_tasks; } // Unlike other places, the following will be also run with the // task_lock held, but I'm afraid there is nothing to do about it. // See the comment in gomp_target_task_completion. gomp_finish_task (task); free (task); if (do_wake) gomp_team_barrier_wake (&team->barrier, do_wake); Jakub