public inbox for gcc@gcc.gnu.org
From: "Andreas Schäfer" <gentryx@gmx.de>
To: gcc@gcc.gnu.org
Subject: Strange Performance Hit on 2D-Loop
Date: Thu, 09 Jul 2009 14:19:00 -0000
Message-ID: <20090709141953.GA4672@rei>


[-- Attachment #1.1: Type: text/plain, Size: 1950 bytes --]

Hey guys,

I noticed a strange performance hit in one of our stencil codes: it
takes about twice as long to run as expected.

To narrow down the problem, I reduced our code to the two attached demo
programs. Basically they take two matrices and average each matrix
element with its four direct neighbors. Depending on how these
matrices are allocated, the performance hit occurs -- or does not.

Here is the diff of the two files:
@@ -17,8 +17,7 @@

 void test(double (*grid)[GRID_WIDTH])
 {
-    double (*gridOld)[GRID_WIDTH] =
-        malloc(GRID_WIDTH * GRID_HEIGHT * sizeof(double));
+    double (*gridOld)[GRID_WIDTH] = gridOldArray;
     double (*gridNew)[GRID_WIDTH] = gridNewArray;
     printAddress(&gridNew[0][0]);
     printAddress(&gridOld[0][0]);

where gridOldArray is a statically allocated array. Depending on the
machine's processor, the performance hit varies from negligible to
dramatic:


Processor          GCC Version Time(slow) Time(fast) Performance Hit
------------------ ----------- ---------- ---------- ---------------
Core 2 Quad Q9550  4.3.3       12.19s      5.11s     138%
Athlon 64 X2 3800+ 4.3.3        7.34s      6.61s      11%
Opteron 2378       4.3.2        6.13s      5.60s       9%
Opteron 2352       4.3.3        8.16s      7.96s       2%
Xeon 3.00GHz       4.3.3       18.98s     14.67s      29%

Apparently Intel systems are more susceptible to this effect. 

Can anyone reproduce these results?
And could anyone explain why this happens?

Thanks in advance
-Andreas


-- 
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
0049/3641-9-46376
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your 
signature to help him gain world domination!

[-- Attachment #1.2: slowdown.slow.c --]
[-- Type: text/x-csrc, Size: 1880 bytes --]

#define GRID_WIDTH  1024
#define GRID_HEIGHT 1024
#define MAX_STEPS 1024

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

double grid[GRID_HEIGHT][GRID_WIDTH];
double gridNewArray[GRID_HEIGHT][GRID_WIDTH];
double gridOldArray[GRID_HEIGHT][GRID_WIDTH];

void printAddress(void *p)
{
    printf("address %p\n", p);
}

void test(double (*grid)[GRID_WIDTH])
{
    double (*gridOld)[GRID_WIDTH] = gridOldArray;
    double (*gridNew)[GRID_WIDTH] = gridNewArray;
    printAddress(&gridNew[0][0]);
    printAddress(&gridOld[0][0]);

    // copy initial state
    for (int y = 0; y < GRID_HEIGHT; ++y) {
        memcpy(&gridOld[y][0], &grid[y][0], GRID_WIDTH * sizeof(double));
        memset(&gridNew[y][0], 0, GRID_WIDTH * sizeof(double));
    }

    // update matrices
    for (int step = 0; step < MAX_STEPS; ++step) {
        for (int y = 1; y < GRID_HEIGHT-1; ++y) 
            for (int x = 1; x < GRID_WIDTH-1; ++x)
                gridNew[y][x] = 
                    (gridOld[y-1][x  ] + 
                     gridOld[y  ][x-1] + 
                     gridOld[y  ][x  ] + 
                     gridOld[y  ][x+1] + 
                     gridOld[y+1][x  ]) * 0.2;
        double (*tmp)[GRID_WIDTH] = gridOld;
        gridOld = gridNew;
        gridNew = tmp;
    }

    // copy result back
    for (int y = 0; y < GRID_HEIGHT; ++y)
        memcpy(&grid[y][0], &gridOld[y][0], GRID_WIDTH * sizeof(double));
}

void setupGrid(void)
{
    for (int y = 0; y < GRID_HEIGHT; ++y)
        for (int x = 0; x < GRID_WIDTH; ++x)
            grid[y][x] = 0;

    for (int y = 10; y < 20; ++y)
        for (int x = 10; x < 20; ++x)
            grid[y][x] = 1;
}

int main(int argc, char** argv)
{
    setupGrid();
    test(grid);
    printf("res: %f\n", grid[10][10]); // prevent dead code elimination
    return 0;
}

[-- Attachment #1.3: slowdown.fast --]
[-- Type: application/octet-stream, Size: 8392 bytes --]

[-- Attachment #1.4: test.sh --]
[-- Type: application/x-sh, Size: 233 bytes --]

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

Thread overview: 4+ messages
2009-07-09 14:19 Andreas Schäfer [this message]
2009-07-09 14:37 ` Richard Guenther
2009-07-09 14:48   ` Andreas Schäfer
2011-01-13 16:28 ` Andreas Schäfer
