From: Nathan Sidwell
To: Jakub Jelinek, Alexander Monakov
Cc: gcc-patches@gcc.gnu.org, Bernd Schmidt, Dmitry Melnik, Thomas Schwinge
Subject: Re: [gomp-nvptx 2/9] nvptx backend: new "uniform SIMT" codegen variant
Date: Wed, 02 Dec 2015 15:18:00 -0000
Message-ID: <565F0BB3.5020608@acm.org>
In-Reply-To: <20151202151205.GS5675@tucnak.redhat.com>

On 12/02/15 10:12, Jakub Jelinek wrote:
> If we have a reasonable IPA pass to discover which addressable variables can
> be shared by multiple threads and which can't, then we could use soft-stack
> for those that can be shared by multiple PTX threads (different warps, or
> same warp, different threads in it), then we shouldn't need to copy any
> stack, just broadcast the scalar vars.

Note that the current scalar (.reg) broadcasting uses the whole live register
set, not just the subset of it that is actually read within the partitioned
region. Restricting it to that subset would be a relatively straightforward
optimization, I think.

nathan
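
To illustrate the optimization mentioned above, here is a minimal sketch in Python. It is not GCC code: the function name, the register names, and the representation of liveness as plain sets are all assumptions made for illustration. It only shows that the set to broadcast could be narrowed from "everything live at region entry" to "live at entry and actually read inside the partitioned region".

```python
def registers_to_broadcast(live_in, read_in_region):
    """Hypothetical helper: keep only registers that are live at region
    entry AND read somewhere inside the partitioned region."""
    return set(live_in) & set(read_in_region)


# Assumed example data, not taken from any real compilation.
live_in = {"%r1", "%r2", "%r3", "%r4"}    # live at entry to the region
read_in_region = {"%r2", "%r4", "%r9"}    # %r9 is defined inside the region

current = set(live_in)                    # behaviour described: whole live set
narrowed = registers_to_broadcast(live_in, read_in_region)

print(sorted(current))   # ['%r1', '%r2', '%r3', '%r4']
print(sorted(narrowed))  # ['%r2', '%r4']
```

In this made-up case the broadcast shrinks from four registers to two. A register such as %r9, which is read in the region but not live at entry, drops out of the intersection on its own.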