From: Sebastian Pop
Date: Sun, 01 Aug 2010 15:21:00 -0000
Subject: Re: [PATCH 0/2] Loop distribution for memset zero
To: Richard Guenther
Cc: gcc-patches@gcc.gnu.org, matz@suse.de
References: <1280522440-27919-1-git-send-email-sebpop@gmail.com>

On Sun, Aug 1, 2010 at 07:19, Richard Guenther wrote:
> Hm.  I don't like inflation of command-line arguments too much, but

Ok, so for now let's have only -ftree-loop-distribute-patterns.

> See other responses.  Can we detect for example daxpy?
>
> for (i=0; i<n; i++)
>   dy[i] = dy[i] + da * dx[i];
>

Yes, we could detect these patterns.

> ?
> In principle all the BLAS routines have one destination, so we'd need
> to distribute all stores, like with regular loop distribution but then
> after analyzing the partitions and detecting which ones we recognize
> we need to fuse the unhandled parts together again.  Can we do
> that from inside the loop distribution machinery?
>

Loop distribution does not currently fuse partitions back together
when, because of other dependences, we have to pull more data
references into a partition than the analysis part expected, so that
the kernel to be code generated is no longer exactly the one for which
we have the library function.  I think it would not be difficult to
implement fusing the partitions that do not match any pattern into the
default partition.

> Ok, so we'd need to do the pattern recognition before distributing
> the loop?

We have to detect the pattern twice: once before we create the
partitions, since the match provides the initial root of the
partition, and again after the partition has been built by aggregating
other data references, to make sure the pattern still matches.

> But we need to make sure that the partition only contains
> side-effects the replacement function has.  Consider
>
>  for (i = 0; i < n ; ++i)
>    {
>      dx[i] = i;
>      dy[i] = dy[i] + da * dx[i];
>    }
>
> will loop distribution include the assignment to dx[i] in the partition
> if the worklist contains dy[i]?

I think that in this case you would get two partitions, because there
is no SCC in the data dependence graph that would require the write to
dx to be in the same partition as the write to dy.  The distribution
would lead to:

for (i = 0; i < n ; ++i)
  dx[i] = i;

for (i = 0; i < n ; ++i)
  dy[i] = dy[i] + da * dx[i];

Sebastian
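
P.S. As a quick sanity check of the distributed form, here is a minimal
C sketch that runs the fused loop and the two distributed loops on the
same input and compares the results.  This is not GCC code: the names
fused, distributed and check_forms_agree are made up for illustration,
and the sketch assumes that dx and dy do not overlap, which is what
allows the write to dx to live in its own partition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The original loop: both stores in a single loop body.  */
void fused(double *dx, double *dy, double da, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      dx[i] = (double) i;
      dy[i] = dy[i] + da * dx[i];
    }
}

/* Hand-distributed form: one loop per store.  The first loop runs
   before the second, so every dx[i] is written before it is read.  */
void distributed(double *dx, double *dy, double da, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    dx[i] = (double) i;

  for (size_t i = 0; i < n; ++i)
    dy[i] = dy[i] + da * dx[i];
}

/* Run both forms on identical inputs and check that dx and dy end up
   bit-identical.  The inputs are small integers, so every intermediate
   value is exactly representable in a double.  */
void check_forms_agree(void)
{
  enum { N = 8 };
  double dx1[N] = { 0 }, dy1[N], dx2[N] = { 0 }, dy2[N];

  for (size_t i = 0; i < N; ++i)
    dy1[i] = dy2[i] = 1.0 + (double) i;

  fused(dx1, dy1, 2.0, N);
  distributed(dx2, dy2, 2.0, N);

  assert(memcmp(dx1, dx2, sizeof dx1) == 0);
  assert(memcmp(dy1, dy2, sizeof dy1) == 0);
}
```

The second loop in distributed is the plain daxpy kernel, which is the
part a pattern matcher could hand to a library routine while the first
loop stays as ordinary code.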