public inbox for gcc-help@gcc.gnu.org
* restricted pointers
@ 2008-03-01 14:47 Carlos Alvarez
  2008-03-02 21:39 ` Segher Boessenkool
  0 siblings, 1 reply; 3+ messages in thread
From: Carlos Alvarez @ 2008-03-01 14:47 UTC (permalink / raw)
  To: gcc-help

Good morning,

I am currently trying to use the "restrict" ("__restrict__")
keyword with the gcc-4.2 (-O3 -std=c99) and g++-4.2 (-O3) compilers to
declare restricted pointers. To test this I am using the following code:


#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <sys/time.h>

void vecmult(int n, int * restrict a, int * restrict b, int * restrict c)
{
  int i;
  for (i=0; i<n; ++i) {
    a[i] = b[i] * c[i];
  }
}

int main(){
  int Nsteps = 100000;
  int n = 1000;
  int* a=NULL;
  int* b=NULL;
  int* c=NULL;

  //allocate memory
  a = malloc(n*sizeof(int));
  b = malloc(n*sizeof(int));
  c = malloc(n*sizeof(int));

  //initialize arrays
  for(int i = 0; i < n; ++i){
    a[i] = i;
    b[i] = 1;
    c[i] = 0;
  }

  //initialize time
  struct timeval tim;
  gettimeofday(&tim, NULL);
  long tcpu = clock();

  for(int i = 0; i < Nsteps; ++i){
    vecmult(n, a, b, c);
  }

  //time difference evaluation
  double t1 = tim.tv_sec + tim.tv_usec / 1000000.0;
  double start = (double)(tcpu);
  gettimeofday(&tim, NULL);
  double t2 = tim.tv_sec + tim.tv_usec / 1000000.0;
  tcpu = clock();
  double stop = (double)(tcpu);
  double t_elap = (t2 - t1);
  double t_cpu = (stop - start) / CLOCKS_PER_SEC; /* clock() ticks to seconds */
  //print
  printf("%f %f\n",t_elap, t_cpu);

  //deallocate memory
  free(a);
  free(b);
  free(c);

  return 0;
}


but the times with or without restrict are the same. Why isn't it
improving performance, am I doing something wrong?

Thank you in advance for your help.

Sincerely,

Carlos Alvarez

* Re: restricted pointers
  2008-03-01 14:47 restricted pointers Carlos Alvarez
@ 2008-03-02 21:39 ` Segher Boessenkool
  2008-03-03 22:22   ` Carlos Alvarez
  0 siblings, 1 reply; 3+ messages in thread
From: Segher Boessenkool @ 2008-03-02 21:39 UTC (permalink / raw)
  To: Carlos Alvarez; +Cc: gcc-help

> void vecmult(int n, int * restrict a, int * restrict b, int * restrict c)
> {
>   int i;
>   for (i=0; i<n; ++i) {
>     a[i] = b[i] * c[i];
>   }
> }

> but the times with or without restrict are the same. Why isn't it
> improving performance, am I doing something wrong?

Well, no matter what, the program will have to perform n reads of
elements of b[] and c[], and perform n writes to elements of a[].

You could try adding -O3 and/or some of the vectorisation options.

Also, you could test something like

void vecmult(int n, int * restrict a, int * restrict b, int * restrict c)
{
   int i;
   for (i=0; i<n; ++i) {
     a[i] = b[0] * c[i];
   }
}

which has a much bigger opportunity for optimisation: with a restrict
on b[], there is no need to load from b[0] more than once. Without it,
a store to a[i] might change b[0], so it has to be reloaded each time.
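
To make the difference testable, here is a minimal sketch (not from the
original messages; the names scale_reload, scale_hoisted and alias_demo
are made up for illustration). It writes out by hand the "load b[0]
once" transformation, deliberately without restrict so that calling it
with overlapping arrays is still well defined. When a[] and b[] are the
same array, the two versions give different results, which is why the
compiler may only do this hoisting itself when restrict promises that
there is no overlap.

```c
#include <assert.h>

/* No restrict: a[] may overlap b[], so b[0] has to be read again
   after every store to a[i]. */
void scale_reload(int n, int *a, const int *b, const int *c)
{
  for (int i = 0; i < n; ++i)
    a[i] = b[0] * c[i];
}

/* The hoisting done by hand: b[0] is read once, before the loop.
   This matches scale_reload only when a[] and b[] do not overlap. */
void scale_hoisted(int n, int *a, const int *b, const int *c)
{
  const int b0 = b[0];
  for (int i = 0; i < n; ++i)
    a[i] = b0 * c[i];
}

/* Call both versions with a == b to show that the results differ. */
void alias_demo(void)
{
  int c[4] = {3, 3, 3, 3};
  int x[4] = {2, 0, 0, 0};
  int y[4] = {2, 0, 0, 0};

  scale_reload(4, x, x, c);   /* b[0] becomes 6 after the first store */
  scale_hoisted(4, y, y, c);  /* keeps using the original b[0] == 2 */

  assert(x[0] == 6 && x[1] == 18 && x[2] == 18 && x[3] == 18);
  assert(y[0] == 6 && y[1] == 6 && y[2] == 6 && y[3] == 6);
}
```

Whether a given compiler version actually performs the hoisting for the
restrict-qualified vecmult is best checked in the generated assembly
(for example with gcc -S) rather than inferred from timings alone.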


Segher

* Re: restricted pointers
  2008-03-02 21:39 ` Segher Boessenkool
@ 2008-03-03 22:22   ` Carlos Alvarez
  0 siblings, 0 replies; 3+ messages in thread
From: Carlos Alvarez @ 2008-03-03 22:22 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-help

Thank you for your response. I am already compiling with the -O3
option, plus the -std=c99 option in the case of C. I tried fixing the
b[0] element but I obtained the same result. This is important for me
because I am writing scientific simulation software in C/C++ and I need
it to be as fast as possible. When I learned about the aliasing
problem I wanted to know how to overcome it, and for that I have to
understand when the problem arises and how to solve it.

Would you happen to know of any example in which the pointer aliasing
problem is present and can then be overcome by restricted pointers, in
a way that I can test?

Thank you again for your time,

Best regards,

Carlos Alvarez

