From mboxrd@z Thu Jan  1 00:00:00 1970
From: Richard Guenther
To: Andreas Schäfer
Cc: gcc@gcc.gnu.org
Subject: Re: Strange Performance Hit on 2D-Loop
Date: Thu, 09 Jul 2009 14:37:00 -0000
Message-ID: <84fc9c000907090737i77340d8br14fd4f06d82c5b6a@mail.gmail.com>
In-Reply-To: <20090709141953.GA4672@rei>
References: <20090709141953.GA4672@rei>
X-SW-Source: 2009-07/txt/msg00172.txt.bz2

On Thu, Jul 9, 2009 at 4:19 PM, Andreas Schäfer wrote:
> Hey guys,
>
> I noticed a strange performance hit in one of our stencil codes,
> causing it to run twice as long.
>
> To nail down the error, I reduced our code to the two attached demo
> programs. Basically they take two matrices and average each matrix
> element with its four direct neighbors. Depending on how these
> matrices are allocated, the performance hit occurs -- or does not.
>
> Here is the diff of the two files:
> @@ -17,8 +17,7 @@
>
>  void test(double (*grid)[GRID_WIDTH])
>  {
> -    double (*gridOld)[GRID_WIDTH] =
> -        malloc(GRID_WIDTH * GRID_HEIGHT * sizeof(double));
> +    double (*gridOld)[GRID_WIDTH] = gridOldArray;
>      double (*gridNew)[GRID_WIDTH] = gridNewArray;
>      printAddress(&gridNew[0][0]);
>      printAddress(&gridOld[0][0]);
>
> where gridOldArray is a statically allocated array. Depending on the
> machine's processor the performance hit varies from negligible to
> dramatic:
>
>
> Processor          GCC Version Time(slow) Time(fast) Performance Hit
> ------------------ ----------- ---------- ---------- ---------------
> Core 2 Quad Q9550  4.3.3       12.19s      5.11s     138%
> Athlon 64 X2 3800+ 4.3.3        7.34s      6.61s      11%
> Opteron 2378       4.3.2        6.13s      5.60s       9%
> Opteron 2352       4.3.3        8.16s      7.96s       2%
> Xeon 3.00GHz       4.3.3       18.98s     14.67s      29%
>
> Apparently Intel systems are more susceptible to this effect.
>
> Can anyone reproduce these results?
> And could anyone explain why this happens?

Depends on the GCC version used.  First of all

    printAddress(&gridNew[0][0]);
    printAddress(&gridOld[0][0]);

makes the addresses escape, and GCC versions other than the current
development trunk think that the malloced address can alias the
global variables.

Richard.

> Thanks in advance
> -Andreas
>
>
> --
> ============================================
> Andreas Schäfer
> Cluster and Metacomputing Working Group
> Friedrich-Schiller-Universität Jena, Germany
> 0049/3641-9-46376
> PGP/GPG key via keyserver
> I'm a bright...
> http://www.the-brights.net
> ============================================
>
> (\___/)
> (+'.'+)
> (")_(")
> This is Bunny. Copy and paste Bunny into your
> signature to help him gain world domination!
>
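The aliasing point in the reply can be made concrete with a small sketch. It assumes C99 or later. The names average_step and run_demo are hypothetical and are not taken from the attached demo programs; only GRID_WIDTH, GRID_HEIGHT, gridNewArray, and the malloc expression follow the posted diff, and the grid size is an arbitrary choice. The restrict qualifiers are one way to tell the compiler that the two grids never overlap, so it does not have to assume that a store to gridNew may change a value later loaded from gridOld. Whether this changes the timings in the table above has not been measured here.

```c
#include <assert.h>
#include <stdlib.h>

#define GRID_WIDTH  16   /* assumed size, chosen only for this sketch */
#define GRID_HEIGHT 16

/* Statically allocated grid, as in the posted diff. */
static double gridNewArray[GRID_HEIGHT][GRID_WIDTH];

/* Hypothetical helper: one averaging sweep over the interior cells.
 * Each interior cell of gridNew becomes the mean of the matching
 * gridOld cell and its four direct neighbors.  The restrict
 * qualifiers are the caller's promise that the two grids do not
 * overlap. */
static void average_step(double (*restrict gridNew)[GRID_WIDTH],
                         double (*restrict gridOld)[GRID_WIDTH])
{
    for (int y = 1; y < GRID_HEIGHT - 1; ++y) {
        for (int x = 1; x < GRID_WIDTH - 1; ++x) {
            gridNew[y][x] = (gridOld[y][x] +
                             gridOld[y - 1][x] + gridOld[y + 1][x] +
                             gridOld[y][x - 1] + gridOld[y][x + 1]) / 5.0;
        }
    }
}

/* Hypothetical driver: runs one sweep from a malloc'ed source grid
 * into the static destination grid and returns the centre cell,
 * or -1.0 if the allocation fails. */
double run_demo(void)
{
    double (*gridOld)[GRID_WIDTH] =
        malloc(GRID_WIDTH * GRID_HEIGHT * sizeof(double));
    if (gridOld == NULL)
        return -1.0;

    /* Fill the source grid with a constant so the average is known. */
    for (int y = 0; y < GRID_HEIGHT; ++y)
        for (int x = 0; x < GRID_WIDTH; ++x)
            gridOld[y][x] = 1.0;

    average_step(gridNewArray, gridOld);

    /* The sweep only writes interior cells; the border stays zero. */
    assert(gridNewArray[0][0] == 0.0);

    double centre = gridNewArray[GRID_HEIGHT / 2][GRID_WIDTH / 2];
    free(gridOld);
    return centre;
}
```

Calling run_demo() from a main function exercises the same combination as the malloc variant of the diff: one static grid and one heap grid. Comparing the generated assembly with and without the restrict qualifiers (for example with gcc -O2 -S) is one way to see whether a given GCC version treats the two grids as possibly overlapping.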