From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Date: Tue, 27 Aug 2013 16:22:00 -0000
From: Jakub Jelinek
To: "Michael V. Zolotukhin"
Cc: Kirill Yukhin, Richard Henderson, gcc@gcc.gnu.org, triegel@redhat.com
Subject: Re: [RFC] Offloading Support in libgomp
Message-ID: <20130827113956.GH21876@tucnak.zalov.cz>
References: <20130822142814.GB1814@tucnak.redhat.com>
 <20130823092810.GA36483@msticlxl57.ims.intel.com>
 <20130823095250.GJ1814@tucnak.redhat.com>
 <20130823153052.GA2974@msticlxl57.ims.intel.com>
 <20130823161631.GO1814@tucnak.redhat.com>
 <20130826115911.GA40923@msticlxl57.ims.intel.com>
 <20130826125116.GE21876@tucnak.zalov.cz>
 <20130826132936.GB40923@msticlxl57.ims.intel.com>
 <20130826141117.GF21876@tucnak.zalov.cz>
 <20130827112609.GA4093@msticlxl57.ims.intel.com>
In-Reply-To: <20130827112609.GA4093@msticlxl57.ims.intel.com>

On Tue, Aug 27, 2013 at 03:26:09PM +0400, Michael V. Zolotukhin wrote:
> > Anyway, the GOMP_target_data implementation and part of GOMP_target would
> > be something along the lines of following pseudocode:
> >
> > device_data = lookup_device_id (device_id);
> > ...
> Thanks, I've seen that similarly. But the problem with passing
> arguments to the target is still open. I'll try to explain, what is the
> problem.
>
> Remember what we did for 'pragma parallel':
>   struct .omp_data_s.0 .omp_data_o.2;
>   .omp_data_o.2.s = 0.0;
>   .omp_data_o.2.b = &b;
>   .omp_data_o.2.c = &c;
>   .omp_data_o.2.y = y_7(D);
>   .omp_data_o.2.j = j_9(D);
>   __builtin_GOMP_parallel (bar._omp_fn.0, &.omp_data_o.2, 0, 0);
>   s_12 = .omp_data_o.2.s;
>   y_13 = .omp_data_o.2.y;
>   j_14 = .omp_data_o.2.j;
>
> I.e. compiler prepares a structure with all arguments and pass it to the
> runtime. Runtime passes this structure as-is to callee (i.e. to
> bar._omp_fn.0).
>
> In bar._omp_fn.0 the compiler just emits code that extracts
> corresponding fields from the given struct and thus initialize all
> needed local vars:
>   bar._omp_fn.0 (struct .omp_data_s.0 * .omp_data_i)
>   {
>     int _12;
>     int _13;
>     ...
>     _12 = .omp_data_i_11(D)->y;
>     _13 = .omp_data_i_11(D)->j;
>     ...
>   }
>
> That scheme would work perfectly for implementing host fallback, but as
> I see it, can't be applied as is for target offloading. The reason is
> the following:
> *) Compiler doesn't know runtime info, i.e. it doesn't know target
> addresses so it can't fill the structure for passing to target version
> of the routine.
> *) Runtime doesn't know the structure layout - runtime should firstly
> translate addresses and only then pass it to the callee, but it don't
> know which addresses to translate, because it doesn't know which
> variables are used by the callee.
>
> Currently, I see two possible solutions for this:
> 1) add to the structure with arguments fields, describing size of each
> field. Then GOMP_target parses this struct and replace every found
> address with the corresponding target address, and only then call
> target_call.
> 2) Lift mapping/allocation stuff from runtime to compile time, i.e.
> allow the compiler to generate calls like this:
>   .omp_data_o.2.s = 0.0;
>   .omp_data_o.2.b = &b;
>   .omp_data_o.2.c = &c;
>   .omp_data_o.2.y = y_7(D);
>   .omp_data_o.2.j = j_9(D);
>   .omp_data_o.target.2.s = GOMP_translate_target_address (0.0);
>   .omp_data_o.target.2.b = GOMP_translate_target_address (&b);
>   .omp_data_o.target.2.c = GOMP_translate_target_address (&c);
>   .omp_data_o.target.2.y = GOMP_translate_target_address (y_7(D));
>   .omp_data_o.target.2.j = GOMP_translate_target_address (j_9(D));
>   GOMP_target (bar._omp_fn.0, &.omp_data_o.2, &.omp_data_o.target.2, 0, 0, );
> Thus runtime would have two versions of structure with arguments and
> will be able to pass it as-is to target callee.
> But probably we'll need
> a version of that struct for each target and that would look very ugly.
>
> What do you think on that? Maybe I'm missing or overcomplicating
> something, but for now I can't get how all this stuff could work
> together without answers to these questions.

What I meant was just that if you call GOMP_target with num_descs N,
then the structure will look like:
struct .omp_target_data
{
  sometype0 *var0;
  sometype1 *var1;
  ...
  sometypeNminus1 *varNminus1;
};
so pretty much the runtime will call the target routine with the address
of an array of N pointers, and the compiler-generated target routine will
just use a struct to access it, to make it more debuggable. As there won't
be any padding in the structure, I'd hope the structure layout will be
exactly the same as the array.

	Jakub