Date: Wed, 02 Dec 2015 11:48:00 -0000
From: Jakub Jelinek
To: Alexander Monakov
Cc: gcc-patches@gcc.gnu.org, Bernd Schmidt, Dmitry Melnik
Subject: Re: [gomp-nvptx 9/9] adjust SIMD loop lowering for SIMT targets
Message-ID: <20151202114803.GJ5675@tucnak.redhat.com>
References: <1448983707-18854-1-git-send-email-amonakov@ispras.ru> <1448983707-18854-10-git-send-email-amonakov@ispras.ru>
In-Reply-To: <1448983707-18854-10-git-send-email-amonakov@ispras.ru>

On Tue, Dec 01, 2015 at 06:28:27PM +0300, Alexander Monakov wrote:
> @@ -10218,12 +10218,37 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
>
>    n1 = fd->loop.n1;
>    n2 = fd->loop.n2;
> +  step = fd->loop.step;
> +  bool do_simt_transform
> +    = (cgraph_node::get (current_function_decl)->offloadable
> +       && !broken_loop
> +       && !safelen
> +       && !simduid
> +       && !(fd->collapse > 1));

expand_omp is depth-first expansion, so for the case where the simd
region is lexically (directly or indirectly) nested inside of a
target region, the above will not trigger.  You'd need to use
cgraph_node::get (current_function_decl)->offloadable or
just walk up through the outer fields of region and see if this isn't in
a target region.

Also, please consider privatized variables in the simd loops.

int
foo (int *p)
{
  int r = 0, i;
  #pragma omp simd reduction(+:r)
  for (i = 0; i < 32; i++)
    {
      p[i] += i;
      r += i;
    }
  return r;
}
#pragma omp declare target to (foo)

int
main ()
{
  int p[32], err, i;
  for (i = 0; i < 32; i++)
    p[i] = i;
  #pragma omp target map(tofrom:p) map(from:err)
  {
    int r = 0;
    #pragma omp simd reduction(+:r)
    for (i = 0; i < 32; i++)
      {
        p[i] += i;
        r += i;
      }
    err = r != 31 * 32 / 2;
    err |= foo (p) != 31 * 32 / 2;
  }
  if (err)
    __builtin_abort ();
  for (i = 0; i < 32; i++)
    if (p[i] != 3 * i)
      __builtin_abort ();
  return 0;
}

Here, it would be nice to extend omp_max_vf in the host compiler,
such that if PTX offloading is enabled, and optimize && !optimize_debug
(and vectorizer on the host not disabled, otherwise it won't be cleaned up
on the host), it returns MIN (32, whatever it would return otherwise).
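To make the suggested clamp concrete, here is a rough sketch of the rule as
stated above.  The struct, its fields and sketch_max_vf are invented for
illustration only; they are not GCC's actual omp_max_vf, its option
variables, or its offload-target query.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the compiler state the rule depends on.
   None of these fields correspond to real GCC variables.  */
struct vf_config
{
  bool ptx_offload_enabled;     /* assumed: PTX offloading is configured */
  bool optimize;                /* assumed: models "optimize" */
  bool optimize_debug;          /* assumed: models "optimize_debug" */
  bool host_vectorizer_enabled; /* assumed: host vectorizer not disabled */
  int host_max_vf;              /* what the host would return otherwise */
};

/* Return MIN (32, host value) when every condition holds, otherwise the
   host value unchanged.  */
static int
sketch_max_vf (const struct vf_config *c)
{
  int vf = c->host_max_vf;
  if (c->ptx_offload_enabled
      && c->optimize
      && !c->optimize_debug
      && c->host_vectorizer_enabled
      && vf > 32)
    vf = 32;
  return vf;
}
```

With all conditions set and a host value of 64 this gives 32; with a host
value of 16, or with offloading disabled, the host value is returned as is.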
And then arrange for the stores to and other operations on the "omp simd array"
attributed arrays before/after the simd loop to be handled specially for
SIMT.  Basically you want those to be .local (and, if non-addressable, handled
as any other scalars), the loops up to GOMP_SIMD_LANES to run exactly once,
and the various reductions or lastprivate selection to reduce the SIMT way
or pick the value from the thread in the warp that had the last SIMT lane, etc.

> +  if (do_simt_transform)
> +    {
> +      tree simt_lane
> +        = build_call_expr_internal_loc (UNKNOWN_LOCATION, IFN_GOMP_SIMT_LANE,
> +                                        integer_type_node, 0);
> +      simt_lane = fold_convert (TREE_TYPE (step), simt_lane);
> +      simt_lane = fold_build2 (MULT_EXPR, TREE_TYPE (step), step, simt_lane);
> +      cfun->curr_properties &= ~PROP_gimple_lomp_dev;

How does this even compile?  simt_lane is a local var in the if
(do_simt_transform) body.

> +    }
> +
>    if (gimple_omp_for_combined_into_p (fd->for_stmt))
>      {
>        tree innerc = find_omp_clause (gimple_omp_for_clauses (fd->for_stmt),
>                                       OMP_CLAUSE__LOOPTEMP_);
>        gcc_assert (innerc);
>        n1 = OMP_CLAUSE_DECL (innerc);
> +      if (do_simt_transform)
> +        {
> +          n1 = fold_convert (type, n1);
> +          if (POINTER_TYPE_P (type))
> +            n1 = fold_build_pointer_plus (n1, simt_lane);

And then you use it here, outside of its scope.

BTW, again, it would help if you post a simple *.ompexp dump showing what
exactly you want it to look like.

        Jakub
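
For illustration, the lane distribution and the SIMT-style reduction discussed
above can be modelled on the host roughly as below.  This is only a sequential
model under stated assumptions, not what the patch generates: SIMT_LANES = 32,
the function name simt_sum_model, a positive step, and the step being scaled by
the lane count are all assumptions (the quoted hunks only show the n1 offset).
On real hardware the combine step would presumably use warp shuffles rather
than an array.

```c
#define SIMT_LANES 32 /* assumed warp width */

/* Hypothetical host-side model: each "lane" starts at n1 + lane * step and
   advances by step * SIMT_LANES, keeping a private copy of the reduction
   variable.  Assumes step > 0.  */
static int
simt_sum_model (int *p, int n1, int n2, int step)
{
  int r_lane[SIMT_LANES] = { 0 };

  for (int lane = 0; lane < SIMT_LANES; lane++)
    for (int i = n1 + lane * step; i < n2; i += step * SIMT_LANES)
      {
        p[i] += i;
        r_lane[lane] += i; /* private per-lane partial sum */
      }

  /* Tree-style combine: halve the distance each round until lane 0 holds
     the total.  */
  for (int dist = SIMT_LANES / 2; dist > 0; dist /= 2)
    for (int lane = 0; lane < dist; lane++)
      r_lane[lane] += r_lane[lane + dist];

  return r_lane[0];
}
```

Called as simt_sum_model (p, 0, 32, 1) on the array from the testcase, every
lane runs exactly one iteration and the result matches the 31 * 32 / 2 that
the testcase checks for.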