* restricted pointers
From: Carlos Alvarez @ 2008-03-01 14:47 UTC
To: gcc-help
Good morning,
I am currently trying to use the "restrict" ("__restrict__") keyword
with the gcc-4.2 (-O3 -std=c99) and g++-4.2 (-O3) compilers to declare
restricted pointers. To test this I am using the following code:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

void vecmult(int n, int * restrict a, int * restrict b, int * restrict c)
{
    int i;
    for (i=0; i<n; ++i) {
        a[i] = b[i] * c[i];
    }
}

int main(){
    int Nsteps = 100000;
    int n = 1000;
    int* a=NULL;
    int* b=NULL;
    int* c=NULL;

    //allocate memory
    a = malloc(n*sizeof(int));
    b = malloc(n*sizeof(int));
    c = malloc(n*sizeof(int));

    //initialize arrays
    for(int i = 0; i < n; ++i){
        a[i] = i;
        b[i] = 1;
        c[i] = 0;
    }

    //initialize time (wall clock and CPU clock)
    struct timeval tim;
    gettimeofday(&tim, NULL);
    double t1 = tim.tv_sec + tim.tv_usec / 1000000.0;
    clock_t start = clock();

    for(int i = 0; i < Nsteps; ++i){
        vecmult(n, a, b, c);
    }

    //time difference evaluation
    gettimeofday(&tim, NULL);
    double t2 = tim.tv_sec + tim.tv_usec / 1000000.0;
    clock_t stop = clock();
    double t_elap = (t2 - t1);
    //clock() counts in units of 1/CLOCKS_PER_SEC seconds
    double t_cpu = (double)(stop - start) / CLOCKS_PER_SEC;

    //print
    printf("%f %f\n",t_elap, t_cpu);

    //deallocate memory
    free(a);
    free(b);
    free(c);
    return 0;
}
but the times with or without restrict are the same. Why isn't it
improving performance, am I doing something wrong?
Thank you in advance for your help.
Sincerely,
Carlos Alvarez
* Re: restricted pointers
From: Segher Boessenkool @ 2008-03-02 21:39 UTC
To: Carlos Alvarez; +Cc: gcc-help
> void vecmult(int n, int * restrict a, int * restrict b, int * restrict c)
> {
>     int i;
>     for (i=0; i<n; ++i) {
>         a[i] = b[i] * c[i];
>     }
> }
> but the times with or without restrict are the same. Why isn't it
> improving performance, am I doing something wrong?
Well, no matter what, the program will have to perform n reads of
elements of b[] and c[], and perform n writes to elements of a[].
You could try adding -O3 and/or some of the vectorisation options.
Also, you could test something like
void vecmult(int n, int * restrict a, int * restrict b, int * restrict c)
{
    int i;
    for (i=0; i<n; ++i) {
        a[i] = b[0] * c[i];
    }
}
which has a much bigger opportunity for optimisation (b[0] needs to be
loaded only once, but the compiler can hoist that load out of the loop
only if there is a restrict on b[]).
Segher
* Re: restricted pointers
From: Carlos Alvarez @ 2008-03-03 22:22 UTC
To: Segher Boessenkool; +Cc: gcc-help
Thank you for your response. I am already compiling with the -O3
option, plus the -std=c99 option in the case of C. I tried the version
with the fixed b[0] element but I obtained the same result. This is
important for me because I am writing scientific simulation software in
C/C++ and I need it to be as fast as possible. When I learned about the
aliasing problem I wanted to know how to overcome it, and for that I
have to understand when the problem arises and how to solve it.
Would you happen to know of an example in which the pointer aliasing
problem is present and can then be overcome with restricted pointers,
in a form that I can test?
Thank you again for your time,
Best regards,
Carlos Alvarez