Re: Performance measurements (thanks and conclusion)

public inbox for gcc@gcc.gnu.org

* Re: Performance measurements (thanks and conclusion)
From: N8TM @ 1998-06-26  6:07 UTC (permalink / raw)
  To: martin.kahlert; +Cc: egcs

In a message dated 6/25/98 8:25:38 AM Pacific Daylight Time,
martin.kahlert@mchp.siemens.de writes:

> -Why is it so difficult for gcc to transform the code
>   for(i=0;i<n;i++)
>      result[i]=a[i]+2*a[i+1]+3*a[i+2];
>  
>   into something like
>  
>   _tmp0=a[0];_tmp1=a[1];_tmp2=a[2];
>   for(i=0;i<n;i++)
>      {
>       result[i]=_tmp0+2*_tmp1+3*_tmp2;
>       _tmp0=_tmp1;
>       _tmp1=_tmp2;
>       _tmp2=a[i+3];
>      }

C code is often written with pointers that cannot be "disambiguated" to
rule out potential aliases (here, the possibility that result[] overlaps a[])
except possibly with a global analysis across all functions.  If the arrays do
overlap, a store to result[i] can change a value of a[] that a later iteration
reads, so keeping those values in temporaries would change the result.  Maybe
this is why many C compilers don't attempt such analysis even in the simple
cases.  I've seen that even the best compilers don't do the analysis when the
data read and the data written come from different cross-sections of the same
array.
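
To make the aliasing concern concrete, here is a minimal sketch in C. The
function names stencil_naive and stencil_rotating are made up for
illustration and do not come from the thread; both assume that a[] holds at
least n+2 elements. The second function is the hand-transformed loop with
the rotating temporaries.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: every a[] element is re-read from memory. */
void stencil_naive(double *result, const double *a, size_t n)
{
    assert(result != NULL && a != NULL);
    for (size_t i = 0; i < n; i++)
        result[i] = a[i] + 2 * a[i + 1] + 3 * a[i + 2];
}

/* Transformed loop: values of a[] are carried in local temporaries,
   so each element is loaded only once. */
void stencil_rotating(double *result, const double *a, size_t n)
{
    assert(result != NULL && a != NULL);
    if (n == 0)
        return;
    double t0 = a[0], t1 = a[1], t2 = a[2];
    for (size_t i = 0; i < n; i++) {
        result[i] = t0 + 2 * t1 + 3 * t2;
        t0 = t1;
        t1 = t2;
        if (i + 1 < n)
            t2 = a[i + 3];  /* a[(i+1)+2]; skipped on the last pass */
    }
}
```

With distinct arrays the two functions produce the same values. If the
caller passes result = a + 1, the naive loop sees each freshly stored value
on the next iteration while the transformed loop still uses the old values
held in t0..t2, so the outputs differ. A compiler that cannot prove the
arrays are distinct therefore cannot make this transformation on its own.
The C99 restrict qualifier is one way for the programmer to promise that no
such overlap exists.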

Next, gcc has to deal with architectures where such a transformation can't be
implemented efficiently. A partial solution is to eliminate duplicate memory
references in an unrolled loop body, and I'd certainly like to see that happen
more reliably. 
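
As a rough sketch of what eliminating duplicate memory references in an
unrolled loop body could look like when done by hand (the function name
stencil_unrolled2 and the unroll factor of 2 are my own choices for
illustration; a[] is assumed to hold at least n+2 elements and not to
overlap result[]):

```c
#include <assert.h>
#include <stddef.h>

/* Loop unrolled by two. The two results share a[i+1] and a[i+2],
   so four loads per pass replace the six of the rolled loop. */
void stencil_unrolled2(double *result, const double *a, size_t n)
{
    assert(result != NULL && a != NULL);
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        double x0 = a[i], x1 = a[i + 1], x2 = a[i + 2], x3 = a[i + 3];
        result[i]     = x0 + 2 * x1 + 3 * x2;
        result[i + 1] = x1 + 2 * x2 + 3 * x3;
    }
    if (i < n)  /* leftover element when n is odd */
        result[i] = a[i] + 2 * a[i + 1] + 3 * a[i + 2];
}
```

This needs no values carried from one pass to the next, which may suit
machines where the register rotation is expensive, but it still relies on
the loads being safe to hoist above the store to result[i].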


* Re: Performance measurements (thanks and conclusion)
From: Martin Kahlert @ 1998-06-25  3:09 UTC (permalink / raw)
  To: axp-list; +Cc: scr, robert, egcs

Quoting Robert Wilhelm (robert@physiol.med.tu-muenchen.de):
> > [Robert: could you please compile my code on your Alpha using
> >          egcs and report your results to me]
> 
> I get about 275 MFLOPS for my 533MHz 21164a for both egcs 1.0 and
> egcs-current with haifa enabled.
> 
> If I use different local variables lfA*, egcs seems to schedule a bit
> better and I get 290 MFLOPS.
> 
> Robert

I was really overwhelmed by the response to this thread on
axp-list. Thanks a lot to everyone who tried my source
and even tried to get more out of the compilers.

I tried both versions on my PPro 200 (results in MFLOPS):

                           Stefan Schroepfer's   Robert Wilhelm's
                           version               version
pgcc                       85.98                 81.62
gcc-2.7.2.1                97.10                 98.81
gcc-without double align   95.46                 98.81
egcs-2.91.42               84.06                 83.44
tcc                        17.20                 16.44

It seems that tcc is neither the fastest nor the most reliable
under the sun...

Can I conclude that it's a good idea to introduce as many local
variables as possible to get good results from compilers?

Now I have two questions:

-Why is it so difficult for gcc to transform the code
 for(i=0;i<n;i++)
    result[i]=a[i]+2*a[i+1]+3*a[i+2];

 into something like

 _tmp0=a[0];_tmp1=a[1];_tmp2=a[2];
 for(i=0;i<n;i++)
    {
     result[i]=_tmp0+2*_tmp1+3*_tmp2;
     _tmp0=_tmp1;
     _tmp1=_tmp2;
     _tmp2=a[i+3];
    }
 by itself? I think such things are a common task, especially
 in Fortran.
-What's the reason for the performance loss between gcc-2.7.2.1
 and egcs-2.91.42? gcc-2.7.2.1 is nearly 20% faster.

Thanks a lot,
Martin.


