public inbox for gcc@gcc.gnu.org
From: "Andreas Schäfer" <gentryx@gmx.de>
To: gcc@gcc.gnu.org
Subject: Re: Strange Performance Hit on 2D-Loop
Date: Thu, 13 Jan 2011 16:28:00 -0000
Message-ID: <20110113162838.GA5571@rei.informatik.uni-erlangen.de>
In-Reply-To: <20090709141953.GA4672@rei>

Just for the record: I finally found the issue here. It's a problem
of both alignment and cache thrashing. When using aligned memory
(e.g. via posix_memalign()) and a suitable offset within that
memory, the effect goes away. So it's a processor effect, not a
compiler issue. :-)
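
A minimal sketch of what such an allocation could look like, assuming a page-sized alignment and a hypothetical 128-byte offset for the second grid. Neither value comes from the message above, and allocGrid(), GridRow, ALIGNMENT and PAD_BYTES are illustrative names only; a suitable offset has to be found by measuring on the target machine.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define GRID_WIDTH  1024
#define GRID_HEIGHT 1024

/* Assumed values for illustration; not taken from the original message. */
#define ALIGNMENT 4096  /* align each block to a 4 KiB boundary */
#define PAD_BYTES 128   /* hypothetical offset, a multiple of sizeof(double) */

typedef double GridRow[GRID_WIDTH];

/* Allocates an aligned block large enough for one grid plus `offset`
 * bytes and returns a grid pointer shifted `offset` bytes into it.
 * The start of the block is stored in *base so that the caller can
 * free() it later. Returns NULL if the allocation fails. */
GridRow *allocGrid(size_t offset, void **base)
{
    size_t bytes = (size_t)GRID_WIDTH * GRID_HEIGHT * sizeof(double);

    if (posix_memalign(base, ALIGNMENT, bytes + offset) != 0)
        return NULL;

    return (GridRow *)((char *)*base + offset);
}
```

In test(), gridOld could then come from allocGrid(0, &baseOld) and gridNew from allocGrid(PAD_BYTES, &baseNew), so that the two grids no longer start at the same position relative to an alignment boundary. Each block is released with free() on the stored base pointer, not on the shifted grid pointer.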

Best
-Andreas


On 16:19 Thu 09 Jul     , Andreas Schäfer wrote:
> Hey guys,
> 
> I noticed a strange performance hit in one of our stencil codes,
> causing it to run twice as long. 
> 
> To nail down the error, I reduced our code to the two attached demo
> programs. Basically they take two matrices and average each matrix
> element with its four direct neighbors. Depending on how these
> matrices are allocated, the performance hit occurs -- or does not.
> 
> Here is the diff of the two files:
> @@ -17,8 +17,7 @@
> 
>  void test(double (*grid)[GRID_WIDTH])
>  {
> -    double (*gridOld)[GRID_WIDTH] =
> -        malloc(GRID_WIDTH * GRID_HEIGHT * sizeof(double));
> +    double (*gridOld)[GRID_WIDTH] = gridOldArray;
>      double (*gridNew)[GRID_WIDTH] = gridNewArray;
>      printAddress(&gridNew[0][0]);
>      printAddress(&gridOld[0][0]);
> 
> where gridOldArray is a statically allocated array. Depending on the
> machine's processor, the performance hit varies from negligible to
> dramatic:
> 
> 
> Processor          GCC Version Time(slow) Time(fast) Performance Hit
> ------------------ ----------- ---------- ---------- ---------------
> Core 2 Quad Q9550  4.3.3       12.19s      5.11s     138%
> Athlon 64 X2 3800+ 4.3.3        7.34s      6.61s      11%
> Opteron 2378       4.3.2        6.13s      5.60s       9%
> Opteron 2352       4.3.3        8.16s      7.96s       2%
> Xeon 3.00GHz       4.3.3       18.98s     14.67s      29%
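
The message does not say how the "Performance Hit" column was derived. The figures are consistent with the relative slowdown (slow / fast - 1) expressed in whole percent and truncated toward zero; the sketch below assumes that reading, and performanceHit() is a hypothetical helper name.

```c
#include <assert.h>

/* Relative slowdown of the slow run compared to the fast run,
 * in whole percent, truncated toward zero. */
int performanceHit(double slowSeconds, double fastSeconds)
{
    return (int)((slowSeconds / fastSeconds - 1.0) * 100.0);
}
```

For the Core 2 Quad row, performanceHit(12.19, 5.11) yields 138, meaning the slow variant takes roughly 2.4 times as long as the fast one.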
> 
> Apparently Intel systems are more susceptible to this effect. 
> 
> Can anyone reproduce these results?
> And could anyone explain why this happens?
> 
> Thanks in advance
> -Andreas
> 
> 
> -- 
> ============================================
> Andreas Schäfer
> Cluster and Metacomputing Working Group
> Friedrich-Schiller-Universität Jena, Germany
> 0049/3641-9-46376
> PGP/GPG key via keyserver
> I'm a bright... http://www.the-brights.net
> ============================================
> 
> (\___/)
> (+'.'+)
> (")_(")
> This is Bunny. Copy and paste Bunny into your 
> signature to help him gain world domination!

> #define GRID_WIDTH  1024
> #define GRID_HEIGHT 1024
> #define MAX_STEPS 1024
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> 
> double grid[GRID_HEIGHT][GRID_WIDTH];
> double gridNewArray[GRID_HEIGHT][GRID_WIDTH];
> double gridOldArray[GRID_HEIGHT][GRID_WIDTH];
> 
> void printAddress(void *p)
> {
>     printf("address %p\n", p);
> }
> 
> void test(double (*grid)[GRID_WIDTH])
> {
>     double (*gridOld)[GRID_WIDTH] = gridOldArray;
>     double (*gridNew)[GRID_WIDTH] = gridNewArray;
>     printAddress(&gridNew[0][0]);
>     printAddress(&gridOld[0][0]);
> 
>     // copy initial state
>     for (int y = 0; y < GRID_HEIGHT; ++y) {
>         memcpy(&gridOld[y][0], &grid[y][0], GRID_WIDTH * sizeof(double));
>         memset(&gridNew[y][0], 0, GRID_WIDTH * sizeof(double));
>     }
> 
>     // update matrices
>     for (int step = 0; step < MAX_STEPS; ++step) {
>         for (int y = 1; y < GRID_HEIGHT-1; ++y) 
>             for (int x = 1; x < GRID_WIDTH-1; ++x)
>                 gridNew[y][x] = 
>                     (gridOld[y-1][x  ] + 
>                      gridOld[y  ][x-1] + 
>                      gridOld[y  ][x  ] + 
>                      gridOld[y  ][x+1] + 
>                      gridOld[y+1][x  ]) * 0.2;
>         double (*tmp)[GRID_WIDTH] = gridOld;
>         gridOld = gridNew;
>         gridNew = tmp;
>     }
> 
>     // copy result back
>     for (int y = 0; y < GRID_HEIGHT; ++y)
>         memcpy(&grid[y][0], &gridOld[y][0], GRID_WIDTH * sizeof(double));
> }
> 
> void setupGrid()
> {
>     for (int y = 0; y < GRID_HEIGHT; ++y)
>         for (int x = 0; x < GRID_WIDTH; ++x)
>             grid[y][x] = 0;
> 
>     for (int y = 10; y < 20; ++y)
>         for (int x = 10; x < 20; ++x)
>             grid[y][x] = 1;
> }
> 
> int main(int argc, char** argv)
> {
>     setupGrid();
>     test(grid);
>     printf("res: %f\n", grid[10][10]); // prevent dead code elimination
>     return 0;
> }






-- 
==========================================================
Andreas Schäfer
HPC and Grid Computing
Chair of Computer Science 3
Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
+49 9131 85-27910
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
==========================================================

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your 
signature to help him gain world domination!


Thread overview: 4+ messages
2009-07-09 14:19 Andreas Schäfer
2009-07-09 14:37 ` Richard Guenther
2009-07-09 14:48   ` Andreas Schäfer
2011-01-13 16:28 ` Andreas Schäfer [this message]
