From: Jakub Jelinek
To: gcc-patches@gcc.gnu.org
Date: Thu, 08 Nov 2018 17:16:00 -0000
Subject: [committed 0/4] (Partial) OpenMP 5.0 support for GCC 9
Message-ID: <20181108171611.GK11625@tucnak>

Hi!

The OpenMP 5.0 specification, https://www.openmp.org/specifications/ ,
was released just a few minutes ago and to celebrate that, I've merged
gomp-5_0-branch into trunk after bootstrapping/regtesting it on
x86_64-linux and i686-linux.

Because the number of changes in OpenMP 5.0 is much bigger than in any of
the earlier releases of the standard, unfortunately the whole spec isn't
implemented at this point, not even for C/C++. So, let me start by
listing the features that are implemented. Unless otherwise stated, the
implementation is for now for C/C++ only; Fortran is to follow after
C/C++ is fully done.

New OpenMP 5.0 features in this patchset:
- task reductions, including task modifier on parallel/worksharing
  construct reduction
- != conditions in OpenMP loops
- C++1[147] range for loops in worksharing loop, taskloop and distribute
  (and combined/composite constructs)
- allow private or lastprivate clauses for iterator variable(s) on simd
  construct or combined/composite constructs including simd
- iterators in depend clause
- support for lvalue expressions in depend clauses (note, some
  expressions in depend clauses that got allowed very recently are still
  unsupported)
- mutexinoutset dependence kind (right now this is implemented in the
  runtime library as a less efficient inout, but can be improved later
  solely in the runtime library)
- depobj construct, depobj dependence kind and omp_depend_t
- depend clause on taskwait construct
- host teams construct (the library implementation still needs work, so
  that it is actually beneficial on NUMA setups, see below)
- cancel if clause modifier
- if and nontemporal clauses on simd (both are only parsed right now;
  unless I figure out quickly how to propagate them through the IL to the
  vectorizer, if will either force no simd at all or cause duplication of
  the loop, and nontemporal will either use nontemporal stores or, for
  GCC 9, do nothing)
- defaultmap clause extensions
- hint and memory order clauses on atomic (hint is parsed and ignored by
  the implementation for now, memory order is fully implemented)
- memory order clauses on flush
- support for new combined #pragma omp parallel master and
  #pragma omp {,parallel }master taskloop{, simd} constructs
- affinity display support,
  omp_{[sg]et_affinity_format,{display,capture}_affinity},
  OMP_DISPLAY_AFFINITY and OMP_AFFINITY_FORMAT env vars (this is
  implemented also for Fortran)
- omp_pause_resource{,_all} support, omp_pause_resource_t type (for now
  the runtime library is able to free resources on the host only; this is
  implemented also for Fortran)
- worksharing loop schedules now default to nonmonotonic with the
  exception of static schedules, nonmonotonic allowed on static, runtime
  and auto, omp_sched_monotonic modifier and OMP_SCHEDULE env var parsing
  changes (note, the runtime library is told whether a monotonic or
  nonmonotonic schedule is used in a backwards compatible way, but the
  runtime library ATM doesn't take advantage of nonmonotonic schedules,
  everything is still effectively monotonic)
- allow target data construct with only use_device_ptr clause(s)
- change data sharing for readonly variables without mutable members;
  they are no longer predetermined shared (this actually changed in
  earlier OpenMP standard releases, but was considered a mistake; for 5.0
  it was decided it isn't going to be reverted; this makes a difference
  mainly when using default(none))
- allow atomic constructs in simd regions (note, for now this causes the
  vectorizer to fail to vectorize such regions)
- allow comma in between (name) and hint clause on critical construct
- device routine prototype changes (void * to const void * arguments)
- omp_lock_hint* to omp_sync_hint* changes
- partial requires construct support, only atomic_default_mem_order fully
  implemented

Features I'll still try to implement for GCC 9:
- make sure all expressions in the OpenMP grammar within clauses are
  assignment-expression, with the exception of array section expressions
- verify taskloop construct cancellation works
- stop diagnosing threadprivate directive after first use, to allow
  definitions of variables with constructors followed by threadprivate,
  rather than the currently required workaround of an extern declaration
  of the variable, followed by the threadprivate directive, followed by
  the definition

New OpenMP 5.0 features that won't be available in GCC 9 and are planned
for GCC 10 or later versions as time permits:
- requires directive other than atomic_default_mem_order (parsing is
  implemented, but the rest is not)
- inscan reduction clause modifier and scan directive
- lastprivate clause with conditional modifier
- OMP_TARGET_OFFLOAD env var and target-offload-var ICV
- max-active-levels-var / nested-var ICV, omp_[sg]et_nested and
  OMP_NESTED changes
- omp_get_supported_active_levels API addition
- array shaping support, array sections with non-unit strides in
  to/from/depend clauses
- metadirective support
- declare variant support
- non-rectangular loop support, support for not perfectly nested loops
- order(concurrent) clause support
- loop construct support
- affinity clause support
- detach clause support, omp_fulfill_event
- OpenMP allocator support, omp_alloc/omp_init_allocator etc., allocate
  directive and clause
- use_device_addr clause support
- ancestor modifier on device clause, reverse offloading
- implicit declare target support
- lvalue expressions in map/to/from clauses
- nested declare target support
- various target variable mapping changes
- declare mapper directive support
- omp_get_device_num API addition
- OMPT support
- OMPD support
- OpenMP 5.0 Fortran support (and finish up the remaining missing Fortran
  4.5 features)

My short term todo list:
- requires directive other than *atomic* (currently parsed and then
  ignored); remove or sorry for GCC 9?
- inscan modifier (currently parsed in the clause only and then ignored);
  remove for GCC 9?
- add testsuite coverage for ordered and doacross loop with task
  reductions
- finish up cancelled parallel handling of worksharing reductions
- simd if (perhaps force simdlen 1 for GCC 9 or duplicate loop)
- simd nontemporal (try to actually use nontemporal stores)
- host teams runtime (either implement for real for NUMA, or always use
  1 team for GCC 9)
- lastprivate conditional (currently parsed and then ignored); remove or
  sorry for GCC 9?
- check what clauses are not handled in tree-nested.c, add testsuite
  coverage for those
- check what the auto schedule defaults to; it should default to
  nonmonotonic
- check omp_init_*lock_with_hint state
- testsuite coverage for taskloop construct cancellation
- implement mutexinoutset better than inout in the runtime (GCC 10?)

	Jakub
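
For illustration, here is a minimal sketch of two of the implemented
features listed above: a != condition in a worksharing loop, and a task
reduction on a taskgroup. The function names sum_ne_loop and
sum_task_reduction are made up for this example and are not part of the
patchset or its testsuite. The sketch assumes a compiler with these
OpenMP 5.0 features (e.g. trunk with -fopenmp); without -fopenmp the
pragmas are ignored and the code runs serially with the same result.

```c
/* Hypothetical example, not taken from the patchset.  */

/* Sum 0 .. n-1 using a worksharing loop whose condition is written
   with !=, which OpenMP 5.0 permits when the increment is +1 or -1.
   The caller must pass n >= 0, otherwise the loop does not terminate
   properly.  */
long
sum_ne_loop (int n)
{
  long sum = 0;
  #pragma omp parallel for reduction(+:sum)
  for (int i = 0; i != n; i++)
    sum += i;
  return sum;
}

/* Sum the n elements of a using one explicit task per element.
   The taskgroup declares the reduction with task_reduction, and each
   task joins it with in_reduction, so the tasks accumulate into
   private copies that are combined at the end of the taskgroup.  */
long
sum_task_reduction (const int *a, int n)
{
  long sum = 0;
  #pragma omp parallel
  #pragma omp single
  #pragma omp taskgroup task_reduction(+:sum)
  {
    for (int i = 0; i < n; i++)
      {
        /* i is firstprivate in the task by default here.  */
        #pragma omp task in_reduction(+:sum)
        sum += a[i];
      }
  }
  return sum;
}
```

Calling sum_ne_loop (10) should yield 45, and sum_task_reduction on the
array {1, 2, 3, 4} should yield 10, independent of the number of threads.
The taskgroup form was picked for the sketch because it keeps sum shared
in the parallel region; the task modifier on a parallel or worksharing
reduction clause is the other way to start a task reduction.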