public inbox for gcc-bugs@sourceware.org
* [Bug c++/42194] New: performance degradation with STL complex convolution operation
@ 2009-11-27 12:02 jagjeet dot nain at gmail dot com
2009-11-27 15:02 ` [Bug c++/42194] " rguenth at gcc dot gnu dot org
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: jagjeet dot nain at gmail dot com @ 2009-11-27 12:02 UTC (permalink / raw)
To: gcc-bugs
I have a very simple program that performs a complex matrix convolution.
It runs about three times slower when compiled with GCC 4.3.2 than when
compiled with 4.0.2. I compile it with -O3 and no additional optimization
flags. Interestingly, the slowdown appears only with the complex data type;
with the plain float data type, timings are better with 4.3.2.
Please help me.
#include <complex>
#include <iostream>
#include <stdio.h>
#include <time.h>
float procTimeInSeconds()
{
    return clock()/static_cast<float>(CLOCKS_PER_SEC);
}
using namespace std;
int main(int argc , char** arg )
{
    const int Nc = 32; // total matrix
    const int Nx = 512; // columns
    const int Nn = 16; //typical value
    const int Ns = 10;
    const int Nw = Nc * Nn;
    complex<float>* all_weights = new complex<float>[Nx*Nw*Nc];
    complex<float>* input = (complex<float>*)new complex<float>[Nx*Nw*Ns];
    complex<float>* output = (complex<float>*)new complex<float>[Nx*Nc*Ns];
    int weights_stride_c = Nx * Nw;
    int weights_stride_w = Nx;
    int weights_stride_x = 1;
    int input_stride_s = Nx * Nw;
    int input_stride_w = Nx;
    int input_stride_x = 1;
    int output_stride_s = Nx * Nc;
    int output_stride_c = Nx;
    int output_stride_x = 1;
    // ================================================================
    // Round 1
    // Do array reductions as we descend into the loop nesting,
    // keeping temporary pointers for each result.
    // Results: Faster for unoptimized compilation, but slower for
    // compiler optimization on.
    // ================================================================
    int count = 0;
    float startTime = procTimeInSeconds();
    complex<float>* input_s;
    complex<float>* output_s ;
    complex<float>* curr_weight_c;
    complex<float>* output_sc;
    complex<float>* curr_weight_cw;
    complex<float>* input_sw;
    for(int is = 0; is < Ns; ++is )
    {
        input_s = &input[is*input_stride_s];
        output_s = &output[is*output_stride_s];
        for (int ic=0; ic<Nc; ++ic)
        {
            curr_weight_c = &all_weights[ic * weights_stride_c];
            output_sc = &output_s[ic*output_stride_c];
            // for that matrix, loop through w
            for (int iw=0; iw<Nw; ++iw)
            {
                curr_weight_cw = &curr_weight_c[weights_stride_w * iw];
                input_sw = &input_s[iw*input_stride_w];
                for (int ix=0; ix<Nx; ++ix)
                {
                    output_sc[ix*output_stride_x] +=
                        curr_weight_cw[ix*weights_stride_x] * input_sw[ix*input_stride_x];
                    ++count;
                }
            }
        }
    }
    //delete [] all_weights;
    float netTime = procTimeInSeconds() - startTime;
    cout << count << " in " << netTime << " seconds, round 1" << std::endl;
    return 0;
}
--
Summary: performance degradation with STL complex convolution
operation
Product: gcc
Version: 4.3.3
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jagjeet dot nain at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42194
* [Bug c++/42194] performance degradation with STL complex convolution operation
From: rguenth at gcc dot gnu dot org @ 2009-11-27 15:02 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from rguenth at gcc dot gnu dot org 2009-11-27 15:02 -------
This is because with GCC 4.3 we properly implement complex arithmetic.
Use -fcx-fortran-rules or -fcx-limited-range for speed.
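As an illustration of what the stricter arithmetic guards against, here is a
minimal sketch. The names naive_mul and is_all_nan are hypothetical helpers
written for this example; they are not GCC or libstdc++ functions. The
textbook four-multiply product can yield NaN + i*NaN when an operand has
infinite parts. The GCC documentation describes the default mode as checking
for that result and trying to rescue it, a check that -fcx-limited-range and
-fcx-fortran-rules skip.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper: textbook complex product with no NaN recovery,
// roughly the arithmetic one gets once the result check is skipped.
std::complex<float> naive_mul(std::complex<float> x, std::complex<float> y)
{
    const float a = x.real(), b = x.imag();
    const float c = y.real(), d = y.imag();
    return std::complex<float>(a * c - b * d, a * d + b * c);
}

// Hypothetical helper: true when both parts are NaN, which is the
// condition the default (checked) multiplication looks for.
bool is_all_nan(std::complex<float> z)
{
    return std::isnan(z.real()) && std::isnan(z.imag());
}
```

For finite inputs such as (1 + 2i) * (3 + 4i), naive_mul returns the usual
product. For (inf + i*inf) * (1 + 0i) it returns NaN in both parts, because
inf * 0 is NaN; that case is where checked and unchecked modes can disagree.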
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |WORKSFORME
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42194
* [Bug c++/42194] performance degradation with STL complex convolution operation
From: jagjeet dot nain at gmail dot com @ 2009-11-30 9:57 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from jagjeet dot nain at gmail dot com 2009-11-30 09:57 -------
Will -fcx-limited-range or -fcx-fortran-rules change the results compared to
code compiled with 4.0.2 without these flags?
In other words, will a complex division program compiled with gcc 4.3.3 give
different results with and without the -fcx-limited-range flag?
with regards
J. S. Nain
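One way to explore this question is to compare division with and without a
range-reduction step, which the GCC documentation says -fcx-limited-range
declares unnecessary. The sketch below is only an illustration: naive_div and
smith_div are hypothetical helpers, and Smith's method is one common
range-reduction scheme, not necessarily what GCC emits. It assumes float
expressions are evaluated in float precision (as on x86-64 with SSE).

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper: textbook division with no range reduction.
// The denominator c*c + d*d overflows when |c| or |d| is large.
std::complex<float> naive_div(std::complex<float> x, std::complex<float> y)
{
    const float a = x.real(), b = x.imag();
    const float c = y.real(), d = y.imag();
    const float den = c * c + d * d;
    return std::complex<float>((a * c + b * d) / den, (b * c - a * d) / den);
}

// Hypothetical helper: Smith's method. It divides through by the larger
// of |c| and |d| first, so the intermediate values stay in range.
std::complex<float> smith_div(std::complex<float> x, std::complex<float> y)
{
    const float a = x.real(), b = x.imag();
    const float c = y.real(), d = y.imag();
    if (std::fabs(c) >= std::fabs(d)) {
        const float r = d / c;
        const float den = c + d * r;
        return std::complex<float>((a + b * r) / den, (b - a * r) / den);
    }
    const float r = c / d;
    const float den = c * r + d;
    return std::complex<float>((a * r + b) / den, (b * r - a) / den);
}
```

For moderate values both functions agree up to rounding. For operands near
1e30f the naive form overflows and produces NaN, while the range-reduced form
still returns a finite quotient, so results can differ for extreme inputs.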
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42194
* [Bug c++/42194] performance degradation with STL complex convolution operation
From: jagjeet dot nain at gmail dot com @ 2010-01-10 5:54 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from jagjeet dot nain at gmail dot com 2010-01-10 05:54 -------
I got the speedup with the -fcx-limited-range flag, but ran into another
problem.
When this code is run on a Sun X4100 server (with OpenSuse 10.1), it shows
unusual behavior: the runtime varies from run to run, even though I made sure
there was no other load on the server.
About one run in three takes twice as long as a normal run.
The processor is a dual-core AMD Opteron.
Any help in this matter is appreciated.
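A possible way to quantify the variation is to repeat the measurement inside
one process and compare the fastest and slowest runs. This is a minimal
sketch that does not diagnose the cause. time_runs is a hypothetical helper,
and it uses the wall-clock std::chrono::steady_clock (C++11) rather than the
processor-time clock() call in the original program, which is an assumption
about what should be measured.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: call `work` `repeats` times and return the
// wall-clock duration of each call in seconds, in call order.
template <typename Work>
std::vector<double> time_runs(Work work, std::size_t repeats)
{
    std::vector<double> seconds;
    seconds.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

Passing the convolution loop as a lambda and comparing std::min_element with
std::max_element over the returned vector would show whether the doubling
also occurs between runs within a single process.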
--
jagjeet dot nain at gmail dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |UNCONFIRMED
Resolution|WORKSFORME |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42194
* [Bug c++/42194] performance degradation with STL complex convolution operation
From: paolo.carlini at oracle dot com @ 2011-09-23 21:54 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42194
Paolo Carlini <paolo.carlini at oracle dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |INVALID
Severity|major |normal
--- Comment #4 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-09-23 21:47:26 UTC ---
Closing.