public inbox for gcc@gcc.gnu.org
* altivec support in gcc
@ 2003-04-08 10:10 Michel LESPINASSE
  2003-04-09 16:57 ` Aldy Hernandez
  0 siblings, 1 reply; 14+ messages in thread
From: Michel LESPINASSE @ 2003-04-08 10:10 UTC (permalink / raw)
  To: gcc

Hi,

I have a few questions about the altivec support in gcc...

First, is there any reason why altivec.h defines the second argument
of vec_ld as a vector pointer instead of a const vector pointer?
This is causing me no end of trouble.

Second, I have not tried gcc 3.3 yet, but gcc 3.2 has a lot of trouble
compiling the following construct... it does compile it eventually,
but it requires hundreds of megabytes of virtual memory for doing
it...

(This code is meant as a shortcut for calculating (A+B+C+D+2)>>2, for
a vector of 16 unsigned char values. This is used in the motion
compensation loop of an mpeg2 decoder.)

    ones = vec_splat_u8 (1);
    avg0 = vec_avg (A, B);
    xor0 = vec_xor (A, B);
    avg1 = vec_avg (C, D);
    xor1 = vec_xor (C, D);
    tmp = vec_and (vec_and (ones, vec_or (xor0, xor1)),
                   vec_xor (avg0, avg1));
    out = vec_sub (vec_avg (avg0, avg1), tmp);

Initially I had only one out= assignment, i.e. I had put the tmp
expression in place of the tmp variable in the current out
assignment. I could not even get that code to compile, it made gcc
inflate to over 700 MB. After splitting it as shown above, the code
does compile fine, but GCC still inflates to over 300 MB compiling it.
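
For readers without an AltiVec machine, the identity behind the sequence
above can be checked one byte lane at a time in plain C. This is only a
scalar sketch: avg_u8 models a single lane of vec_avg as (x + y + 1) >> 1,
and the names avg4_trick, avg4_reference and check_avg4_samples are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one unsigned-byte lane of vec_avg: rounds up. */
static uint8_t avg_u8(uint8_t x, uint8_t y)
{
    return (uint8_t)((x + y + 1) >> 1);
}

/* One lane of the posted sequence (hypothetical helper name). */
static uint8_t avg4_trick(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    uint8_t avg0 = avg_u8(a, b);
    uint8_t xor0 = (uint8_t)(a ^ b);
    uint8_t avg1 = avg_u8(c, d);
    uint8_t xor1 = (uint8_t)(c ^ d);
    /* Correction bit: set when the two-stage rounding overshoots by one. */
    uint8_t tmp = (uint8_t)(1 & (xor0 | xor1) & (avg0 ^ avg1));
    return (uint8_t)(avg_u8(avg0, avg1) - tmp);
}

/* The value the sequence is meant to compute, done in wider arithmetic. */
static uint8_t avg4_reference(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint8_t)((a + b + c + d + 2) >> 2);
}

/* Compares both versions for every a and b and a few sampled c and d.
   Returns the number of combinations checked. */
static long check_avg4_samples(void)
{
    static const uint8_t samples[] = {0, 1, 2, 3, 127, 128, 254, 255};
    const int n = (int)(sizeof samples / sizeof samples[0]);
    long checked = 0;

    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++) {
                    uint8_t c = samples[i], d = samples[j];
                    assert(avg4_trick((uint8_t)a, (uint8_t)b, c, d) ==
                           avg4_reference((uint8_t)a, (uint8_t)b, c, d));
                    checked++;
                }
    return checked;
}
```

Calling check_avg4_samples() from a test program exercises the comparison;
widening the sample list to all 256 values makes the check exhaustive at the
cost of a longer run.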

This is on a debian/sid system, the gcc -v version indicates:
gcc -v
Reading specs from /usr/lib/gcc-lib/powerpc-linux/3.2.3/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,proto,objc,ada --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.2 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-java-gc=boehm --enable-objc-gc powerpc-linux
Thread model: posix
gcc version 3.2.3 20030316 (Debian prerelease)

It's probably not a critical bug as it can be worked around by
splitting expressions into smaller pieces, but I thought it should be
reported as it makes some code extremely slow to compile. For
information, the same code used to compile just fine with Apple's old
altivec-patched gcc 2.95.x compiler.

Hope this helps,

-- 
Michel "Walken" LESPINASSE
"In this time of war against Osama bin Laden and the oppressive
Taliban regime, we are thankful that OUR leader isn't the spoiled son
of a powerful politician from a wealthy oil family who is supported by
religious fundamentalists, operates through clandestine organizations,
has no respect for the democratic electoral process, bombs innocents,
and uses war to deny people their civil liberties." --The Boondocks


* Re: altivec support in gcc
  2003-04-08 10:10 altivec support in gcc Michel LESPINASSE
@ 2003-04-09 16:57 ` Aldy Hernandez
  2003-04-09 22:55   ` Michel LESPINASSE
  0 siblings, 1 reply; 14+ messages in thread
From: Aldy Hernandez @ 2003-04-09 16:57 UTC (permalink / raw)
  To: Michel LESPINASSE; +Cc: gcc

>>>>> "Michel" == Michel LESPINASSE <walken@zoy.org> writes:

 > Hi,
 > I have a few questions about the altivec support in gcc...

 > First, is there any reason why altivec.h defines the second argument
 > in vec_ld as being a vector pointer instead of a const vector pointer ?
 > This is causing me no ends of trouble.

Why is this causing you trouble?

 > Second, I have not tried gcc 3.3 yet, but gcc 3.2 has a lot of trouble
 > compiling the following construct... it does compile it eventually,
 > but it requires hundreds of megabytes of virtual memory for doing
 > it...

 > (This code is meant as a shortcut for calculating (A+B+C+D+2)>>2, for
 > a vector of 16 unsigned char values. This is used in the motion
 > compensation loop of an mpeg2 decoder.)

 >     ones = vec_splat_u8 (1);
 >     avg0 = vec_avg (A, B);
 >     xor0 = vec_xor (A, B);
 >     avg1 = vec_avg (C, D);
 >     xor1 = vec_xor (C, D);
 >     tmp = vec_and (vec_and (ones, vec_or (xor0, xor1)),
 >                    vec_xor (avg0, avg1));
 >     out = vec_sub (vec_avg (avg0, avg1), tmp);

You need to split the last two assignments as you have discovered.  If
you want to see why, compile with -save-temps and look at the
preprocessed output (the .i file).

All the altivec functions in C get expanded into a disgusting set of
macros.  These macros expand exponentially when you use them to call
themselves.  This is not likely to change until the C front end has
overloaded functions, and we have no need for the macros.
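
The growth is easy to reproduce with a toy macro. The sketch below is not
the real altivec.h: DISPATCH, narrow_op and wide_op are hypothetical names,
and the only property borrowed from the header is that the argument appears
more than once in the expansion. The expansion is stringified and the leaf
operand is counted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type-dispatch macro: its argument appears three times.
   narrow_op and wide_op are never called, the text is only stringified. */
#define DISPATCH(a) (sizeof (a) == 1 ? narrow_op (a) : wide_op (a))

#define STR(x) #x
#define XSTR(x) STR(x)   /* expand x completely, then turn it into a string */

/* Counts non-overlapping occurrences of word in text. */
static size_t count_occurrences(const char *text, const char *word)
{
    size_t n = 0;
    size_t len = strlen(word);

    assert(len > 0);
    for (const char *p = strstr(text, word); p != NULL;
         p = strstr(p + len, word))
        n++;
    return n;
}

/* Preprocessed text of the macro nested one, two and three levels deep. */
static const char depth1[] = XSTR(DISPATCH(leaf));
static const char depth2[] = XSTR(DISPATCH(DISPATCH(leaf)));
static const char depth3[] = XSTR(DISPATCH(DISPATCH(DISPATCH(leaf))));
```

With three uses of the argument per level, count_occurrences(depth3, "leaf")
comes out as 3 * 3 * 3; a macro that mentions each operand more often grows
correspondingly faster.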

If you want something that compiles in less than infinity minus 1 for
these constructs, I recommend you use C++.

Aldy


* Re: altivec support in gcc
  2003-04-09 16:57 ` Aldy Hernandez
@ 2003-04-09 22:55   ` Michel LESPINASSE
  2003-04-21 13:30     ` altivec support in gcc - bug with vec_mergel Michel LESPINASSE
  0 siblings, 1 reply; 14+ messages in thread
From: Michel LESPINASSE @ 2003-04-09 22:55 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc

Hi Aldy,

On Wed, Apr 09, 2003 at 09:15:42AM -0700, Aldy Hernandez wrote:
>  > First, is there any reason why altivec.h defines the second argument
>  > in vec_ld as being a vector pointer instead of a const vector pointer ?
>  > This is causing me no ends of trouble.
> 
> Why is this causing you trouble?

Well, this is nothing that I could not work around, but if I passed a
const vector pointer gcc would complain that vec_ld discards the const
modifier (I don't remember the exact wording, but it's the message you get
when you pass a const pointer to a function that just takes a regular
pointer, and gcc warns that the function might write into your supposedly
const data). In the context of vec_ld, the warning is stupid since
vec_ld does NOT write into the data, and gcc should know it I think.

Anyway - I had to add an explicit cast for the second argument of each
vec_ld, just to keep gcc happy. In that regard, GNU gcc is not
compatible with the old Apple altivec-patched gcc. In the end, because
I did not want to change all my vec_ld calls, I defined a my_vec_ld
function with the cast, and then I used the preprocessor to replace
all the vec_ld's with my_vec_ld. Like this:

static inline vector_u8_t my_vec_ld (int const A, const uint8_t * const B)
{
    return vec_ld (A, (uint8_t *)B);
}
#undef vec_ld
#define vec_ld my_vec_ld

This is ugly but it solved my problem :) I understand you can not do
this in altivec.h though, because my version is limited by the fact it
will only work on unsigned char vectors.
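
For comparison, here is what a load routine looks like when its pointer
parameter is declared pointer-to-const, so that const and non-const buffers
are both accepted without a cast. This is a portable sketch only: block16,
load16 and byte_at are hypothetical names, the copy is a plain memcpy, and
the alignment behaviour of the real vec_ld is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-in for a 16-byte vector. */
typedef struct { uint8_t b[16]; } block16;

/* The pointer-to-const parameter promises that src is only read. */
static block16 load16(int offset, const uint8_t *src)
{
    block16 out;

    assert(src != NULL);
    memcpy(out.b, src + offset, sizeof out.b);
    return out;
}

/* Small accessor so callers do not index into a temporary. */
static uint8_t byte_at(block16 v, int i)
{
    return v.b[i];
}

static const uint8_t read_only_pixels[32] = { 1, 2, 3 };
static uint8_t writable_pixels[32] = { 9, 8, 7 };

/* Both calls compile without a cast and without a discarded-qualifier
   warning, because adding const on the way in is always allowed. */
static int first_byte_sum(void)
{
    block16 x = load16(0, read_only_pixels);
    block16 y = load16(1, writable_pixels);
    return x.b[0] + y.b[0];
}
```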

> You need to split the last two assignments as you have discovered.  If
> you want to see why, compile with -save-temps and look at the
> preprocessed output (the .i file).

Yes. I looked at altivec.h and I understood what the issue is - it has
to do all these weird convolutions to emulate function overloading,
and as each argument gets expanded more than once this leads to
exponential explosion.

Still, from a user's point of view, this does look a bit silly - I can
write expressions like out = a & b & c & d; and this works with any
mix of integer types for a, b, c and d, and thank god I don't have to
split up a small expression like that - the compiler just figures it
out. As a user, I would expect to be able to do the same things with
vector types. Trying to understand what the difference is from gcc's
point of view, I see that '&' is an operator, while vec_and internally
relies on a builtin function, and that builtin functions are more
limited as they don't support overloading, while C operators do. I don't
know if it would be possible to somehow make gcc know about new
operators for altivec? That way vec_and(a,b) could be defined as
something like ((a) __altivec_operator_and (b)) and
__altivec_operator_and would be some operator, similar to &, which can
take various types as input. I'm probably talking way out of my league
here though.

Once again, I used functions and preprocessor tricks to get rid of the
exponential explosion issue, by taking advantage of the fact I only
needed unsigned char vector versions of vec_and and vec_avg:

#ifndef COFFEE_BREAK    /* Workarounds for gcc suckage */

static inline vector_u8_t my_vec_and (vector_u8_t const A, vector_u8_t const B)
{
    return vec_and (A, B);
}
#undef vec_and
#define vec_and my_vec_and

static inline vector_u8_t my_vec_avg (vector_u8_t const A, vector_u8_t const B)
{
    return vec_avg (A, B);
}
#undef vec_avg
#define vec_avg my_vec_avg

#endif

This is ugly but it does the trick for me - and frankly I did not want
to break up all my expressions using temporaries if that makes them
unreadable.

I'm not sure what a good solution would be here. For starters, if
altivec.h exported some non-overloaded versions of these functions,
that might help a little - for example vec_and would be the overloaded
version and vec_and_u8, vec_and_u16, vec_and_float, ... would be the
non-overloaded versions... I'm not sure if it's doable as Apple's
altivec spec does not define these non-overloaded versions though.

> All the altivec functions in C get expanded into a disgusting set of
> macros.  These macros expand exponentially when you use them to call
> themselves.  This is not likely to change until the C front end has
> overloaded functions, and we have no need for the macros.

Hmm, is there actually any plan for implementing overloaded functions
in the C front end?

Not that I'd push for this thing (I don't think we want to make C look
too much like C++) but it might be useful, at least for builtins, so
gcc can better support the altivec intrinsics and stuff.

Thanks,

-- 
Michel "Walken" LESPINASSE

* altivec support in gcc - bug with vec_mergel
  2003-04-09 22:55   ` Michel LESPINASSE
@ 2003-04-21 13:30     ` Michel LESPINASSE
  2003-04-22 15:11       ` Daniel Egger
  0 siblings, 1 reply; 14+ messages in thread
From: Michel LESPINASSE @ 2003-04-21 13:30 UTC (permalink / raw)
  To: Aldy Hernandez; +Cc: gcc

Hi,

I have some code that compiles and works fine in apple's version of
gcc 3.1 (as used in darwin) but fails to work when compiled with FSF
gcc 3.2. Looking at the issue, I think it's due to vec_mergel being
miscompiled into vmrghh instead of vmrglh. Basically gcc miscompiles
vec_mergel to do what vec_mergeh should be doing !

The following allows me to work around the issue by using vec_perm to
do the same work, but I think you'll agree that this should not be
necessary:

#if 1	/* work around gcc vec_mergel bug */
static inline vector_s16_t my_vec_mergel (vector_s16_t const A,
					  vector_s16_t const B)
{
    static const vector_u8_t mergel = {
	0x08, 0x09, 0x18, 0x19, 0x0a, 0x0b, 0x1a, 0x1b,
	0x0c, 0x0d, 0x1c, 0x1d, 0x0e, 0x0f, 0x1e, 0x1f
    };
    return vec_perm (A, B, mergel);
}
#undef vec_mergel
#define vec_mergel my_vec_mergel
#endif

Can you double check the issue and see if you can reproduce it locally?
I'm guessing it's probably a cut and paste error in gcc, but I couldn't
be sure... I did look at the altivec.h file though, and I think the
error is not there.
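
The permute mask in the workaround can be sanity-checked without AltiVec
hardware by modelling both operations on plain byte arrays. The sketch
below assumes big-endian element numbering and uses made-up names
(vec_bytes, model_perm, model_merge_halfwords, ramp); it is a scalar model,
not the compiler's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-in for a 128-bit vector, bytes in element order. */
typedef struct { uint8_t b[16]; } vec_bytes;

/* Model of vec_perm: the low 5 bits of each control byte select one byte
   from the 32-byte concatenation of a and b. */
static vec_bytes model_perm(vec_bytes a, vec_bytes b, vec_bytes ctl)
{
    vec_bytes r;

    for (int i = 0; i < 16; i++) {
        unsigned idx = ctl.b[i] & 0x1f;
        r.b[i] = idx < 16 ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* Model of a halfword merge: interleave four halfwords of a and b,
   starting at element 0 (merge high) or element 4 (merge low). */
static vec_bytes model_merge_halfwords(vec_bytes a, vec_bytes b, int first)
{
    vec_bytes r;

    assert(first == 0 || first == 4);
    for (int i = 0; i < 4; i++) {
        memcpy(&r.b[4 * i],     &a.b[2 * (first + i)], 2);
        memcpy(&r.b[4 * i + 2], &b.b[2 * (first + i)], 2);
    }
    return r;
}

/* The control vector from the workaround above. */
static const vec_bytes mergel_ctl = {{
    0x08, 0x09, 0x18, 0x19, 0x0a, 0x0b, 0x1a, 0x1b,
    0x0c, 0x0d, 0x1c, 0x1d, 0x0e, 0x0f, 0x1e, 0x1f
}};

/* Test data: sixteen consecutive byte values starting at start. */
static vec_bytes ramp(uint8_t start)
{
    vec_bytes r;

    for (int i = 0; i < 16; i++)
        r.b[i] = (uint8_t)(start + i);
    return r;
}

static int same_bytes(vec_bytes x, vec_bytes y)
{
    return memcmp(x.b, y.b, sizeof x.b) == 0;
}
```

In this model the mask reproduces the merge-low result and differs from the
merge-high result, which is the mix-up described above.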

Cheers,

-- 
Michel "Walken" LESPINASSE

* Re: altivec support in gcc - bug with vec_mergel
  2003-04-21 13:30     ` altivec support in gcc - bug with vec_mergel Michel LESPINASSE
@ 2003-04-22 15:11       ` Daniel Egger
  2003-04-22 16:43         ` Michel LESPINASSE
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Egger @ 2003-04-22 15:11 UTC (permalink / raw)
  To: Michel LESPINASSE; +Cc: Aldy Hernandez, gcc

On Mon, 2003-04-21 at 05:06, Michel LESPINASSE wrote:

> I have some code that compiles and works fine in apple's version of
> gcc 3.1 (as used in darwin) but fails to work when compiled with FSF
> gcc 3.2. Looking at the issue, I think it's due to vec_mergel being
> miscompiled into vmrghh instead of vmrglh. Basically gcc miscompiles
> vec_mergel to do what vec_mergeh should be doing !

2002-02-26  Daniel Egger  <degger@fhm.edu>
 
        * config/rs6000/rs6000.md: Swap define_insn attributes to
        fix incorrect generation of merge high instructions instead
        of merge low.

This one maybe? :) 

-- 
Servus,
       Daniel


* Re: altivec support in gcc - bug with vec_mergel
  2003-04-22 15:11       ` Daniel Egger
@ 2003-04-22 16:43         ` Michel LESPINASSE
  2003-04-22 19:55           ` Daniel Egger
  0 siblings, 1 reply; 14+ messages in thread
From: Michel LESPINASSE @ 2003-04-22 16:43 UTC (permalink / raw)
  To: Daniel Egger; +Cc: Aldy Hernandez, gcc

On Tue, Apr 22, 2003 at 02:04:11PM +0200, Daniel Egger wrote:
> 2002-02-26  Daniel Egger  <degger@fhm.edu>
>  
>         * config/rs6000/rs6000.md: Swap define_insn attributes to
>         fix incorrect generation of merge high instructions instead
>         of merge low.
> 
> This one maybe? :) 

Yes, most probably :) Yesterday I downloaded the latest 3.3 snapshot -
I intended to grep for vmrghh and find out how to fix the issue, but
it turned out it was fixed already :)

Do you know what's the status of this in the 3.2.3-frozen tree ? I
tried to figure this out by looking at config/ in cvsweb but I guess
some magic happens there at make dist time ? Well at least I did not
find where config/rs6000 is.

And, thanks a lot Daniel for fixing this. gcc's altivec support works
nicely for me now :)   (well, in 3.3 at least)

Cheers,

-- 
Michel "Walken" LESPINASSE

* Re: altivec support in gcc - bug with vec_mergel
  2003-04-22 16:43         ` Michel LESPINASSE
@ 2003-04-22 19:55           ` Daniel Egger
  2003-04-22 21:40             ` Aldy Hernandez
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Egger @ 2003-04-22 19:55 UTC (permalink / raw)
  To: Michel LESPINASSE; +Cc: Aldy Hernandez, gcc

On Tue, 2003-04-22 at 18:09, Michel LESPINASSE wrote:

> Do you know what's the status of this in the 3.2.3-frozen tree ? I
> tried to figure this out by looking at config/ in cvsweb but I guess
> some magic happens there at make dist time ? Well at least I did not
> find where config/rs6000 is.

Holy cow, it's still broken there. Aldy?

-- 
Servus,
       Daniel


* Re: altivec support in gcc - bug with vec_mergel
  2003-04-22 19:55           ` Daniel Egger
@ 2003-04-22 21:40             ` Aldy Hernandez
  0 siblings, 0 replies; 14+ messages in thread
From: Aldy Hernandez @ 2003-04-22 21:40 UTC (permalink / raw)
  To: Daniel Egger; +Cc: Michel LESPINASSE, gcc


On Tuesday, April 22, 2003, at 02:44  PM, Daniel Egger wrote:

> On Tue, 2003-04-22 at 18:09, Michel LESPINASSE wrote:
>
>> Do you know what's the status of this in the 3.2.3-frozen tree ? I
>> tried to figure this out by looking at config/ in cvsweb but I guess
>> some magic happens there at make dist time ? Well at least I did not
>> find where config/rs6000 is.
>
> Holy cow, it's still broken there. Aldy?
>

Dunno.  Haven't looked at 3.2.* in ages.  If you have a patch, it 
should go in as obvious... if Mark agrees.


* Re: AltiVec support in GCC
  2000-02-18 17:13 Mike Stump
  2000-02-18 18:04 ` Stan Shebs
@ 2000-02-20 10:56 ` Richard Henderson
  1 sibling, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2000-02-20 10:56 UTC (permalink / raw)
  To: Mike Stump; +Cc: gcc, kumar

On Fri, Feb 18, 2000 at 05:12:53PM -0800, Mike Stump wrote:
> Sounds half way reasonable.  I have here in my hand a piece of paper,
> no, a compiler for the Pentium III SSE with intrinsic support for SSE.
> It adds a vector type to the compiler also.

Actually, there's a big difference.  The SSE support adds a vector type
to the _back end_.  It does not change the C/C++ front end at all.  There
are no new language constructs.  All the SSE support is had though
builtin functions that operate on a TImode type.

I am a big fan of not doing language extensions.


r~


* Re: AltiVec support in GCC
  2000-02-18 18:47 AltiVec support in GCC Mike Stump
@ 2000-02-19 15:03 ` Geoff Keating
  0 siblings, 0 replies; 14+ messages in thread
From: Geoff Keating @ 2000-02-19 15:03 UTC (permalink / raw)
  To: Mike Stump; +Cc: gcc

Mike Stump <mrs@windriver.com> writes:

> > Date: Fri, 18 Feb 2000 18:04:50 -0800
> > From: Stan Shebs <shebs@apple.com>
> > To: Mike Stump <mrs@windriver.com>
> 
> > Since this is an extension to the dialect of C accepted by GCC, I
> > would hope there would be some effort to define the dialect before
> > putting in dueling patches. :-)
> 
> Well, people might say, this is how this vendor defined it, and we
> want gcc to be compatible with it.  Fairly reasonable, though long
> term, would be nice to do better than this, if we can.

My concern about this is that it would be very powerpc-specific.
There are similar but different language extensions for Sparc VIS, for
MMX (and for all I know even more for all the non-Intel MMX extensions).

So we would end up having a different frontend for ppc than for all
others.  Code would not be portable between ppc and other
architectures.  There would be a ppc-specific set of frontend bugs.
We would probably need to have a ppc-specific C++ frontend too.


In the end, I suspect that the right thing to do is to implement all
this properly: to do a proper loop-vectorisation pass.  Such a thing
need not be very complicated, because to be useful it only has to
recognise _one_ idiom for the affected operation; then you get
portable code that only runs fast on one architecture, but if this is
a problem you can always improve the loop vectoriser.

For instance, there is nothing especially difficult about recognising
the following code:

unsigned char a[1024];
unsigned char b[1024];
unsigned char c[2048];
int i;

for (i = 0; i < sizeof(a); i++)
{
  int t1, t2;

  t1 = a[i] + b[i];
  if (t1 > 255) t1 = 255;

  t2 = a[i] - b[i];
  if (t2 < 0) t2 = 0;
  
  c[2*i] = t1;
  c[2*i+1] = t2;
}

as the following vector instructions in a loop:

	lvx	v0,r3,r4
	lvx	v1,r3,r5
	vaddubs	v2,v0,v1
	vsububs	v3,v0,v1
	vmrghb  v4,v2,v3
	vmrglb	v5,v2,v3
	stvx	v4,r3,r6
	stvx	v5,r3,r7

and even if it turned out to be difficult to get best performance with
a long loop, it would be even easier to recognise loops the size of
the vector registers:

unsigned char a[16];
unsigned char b[16];
unsigned char c[16];
int i;

for (i = 0; i < 16; i++)
{
  int t;
  t = a[i] + b[i];
  if (t > 255) t = 255;
  c[i] = t;
}

as the equivalent 'vaddubs' instruction, with a, b, and c held in
vector registers.  You would only need to recognise one such loop for
each instruction; the user would generate such instructions by the use
of a suitable header file (or C++ class).
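
To make the claimed correspondence concrete, here is a scalar sketch that
runs the first loop on one 16-byte chunk and compares it with a
byte-by-byte model of the instruction sequence (vaddubs, vsububs, vmrghb,
vmrglb, two stores). The helper names are invented for illustration, the
second store is assumed to land 16 bytes after the first, and nothing here
is a vectoriser; it only shows that the two formulations agree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One lane of vaddubs: unsigned byte add, clamped at 255. */
static uint8_t add_sat_u8(uint8_t x, uint8_t y)
{
    int t = x + y;
    return (uint8_t)(t > 255 ? 255 : t);
}

/* One lane of vsububs: unsigned byte subtract, clamped at 0. */
static uint8_t sub_sat_u8(uint8_t x, uint8_t y)
{
    int t = x - y;
    return (uint8_t)(t < 0 ? 0 : t);
}

/* The scalar loop from the message; c must have room for 2 * n bytes. */
static void scalar_loop(const uint8_t *a, const uint8_t *b, uint8_t *c,
                        size_t n)
{
    for (size_t i = 0; i < n; i++) {
        c[2 * i]     = add_sat_u8(a[i], b[i]);
        c[2 * i + 1] = sub_sat_u8(a[i], b[i]);
    }
}

/* Model of the vector sequence on one 16-byte chunk. */
static void vector_model(const uint8_t a[16], const uint8_t b[16],
                         uint8_t c[32])
{
    uint8_t sum[16], diff[16];

    for (int i = 0; i < 16; i++) {
        sum[i]  = add_sat_u8(a[i], b[i]);   /* vaddubs */
        diff[i] = sub_sat_u8(a[i], b[i]);   /* vsububs */
    }
    for (int i = 0; i < 8; i++) {
        c[2 * i]          = sum[i];         /* vmrghb, first stvx */
        c[2 * i + 1]      = diff[i];
        c[16 + 2 * i]     = sum[8 + i];     /* vmrglb, second stvx */
        c[16 + 2 * i + 1] = diff[8 + i];
    }
}

/* Runs both versions on the same deterministic input; 1 if they match. */
static int models_agree(void)
{
    uint8_t a[16], b[16], c_loop[32], c_vec[32];

    for (int i = 0; i < 16; i++) {
        a[i] = (uint8_t)(i * 37 + 200);
        b[i] = (uint8_t)(i * 91 + 13);
    }
    scalar_loop(a, b, c_loop, 16);
    vector_model(a, b, c_vec);
    return memcmp(c_loop, c_vec, sizeof c_loop) == 0;
}
```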

-- 
- Geoffrey Keating <geoffk@cygnus.com>


* Re: AltiVec support in GCC
@ 2000-02-18 18:47 Mike Stump
  2000-02-19 15:03 ` Geoff Keating
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Stump @ 2000-02-18 18:47 UTC (permalink / raw)
  To: shebs; +Cc: gcc, kumar

> Date: Fri, 18 Feb 2000 18:04:50 -0800
> From: Stan Shebs <shebs@apple.com>
> To: Mike Stump <mrs@windriver.com>

> Since this is an extension to the dialect of C accepted by GCC, I
> would hope there would be some effort to define the dialect before
> putting in dueling patches. :-)

Well, people might say, this is how this vendor defined it, and we
want gcc to be compatible with it.  Fairly reasonable, though long
term, would be nice to do better than this, if we can.

> I don't know as much about the SSE situation, but the AltiVec
> extensions to C have been added to several other compilers

Same.

> and in real-world use for a while now, so there is an existing code
> base to consider.

Ditto (though I don't know how extensive their use is).

> Kumar has kindly provided the URLs to Motorola's spec; is there
> something similar for SSE that we can look at?

ftp://download.intel.com/design/PentiumII/manuals/24319102.PDF

and if that doesn't work for any reason, you can find it at:

http://developer.intel.com/design/PentiumII/manuals/243191.htm

See section 3.1.3 and appendix C.


* Re: AltiVec support in GCC
  2000-02-18 17:13 Mike Stump
@ 2000-02-18 18:04 ` Stan Shebs
  2000-02-20 10:56 ` Richard Henderson
  1 sibling, 0 replies; 14+ messages in thread
From: Stan Shebs @ 2000-02-18 18:04 UTC (permalink / raw)
  To: Mike Stump; +Cc: gcc, kumar

Mike Stump wrote:
> 
> > Date: Thu, 17 Feb 2000 22:30:00 -0600 (CST)
> > From: Kumar Gala <kumar@chaos.ph.utexas.edu>
> > To: gcc@gcc.gnu.org
> 
> > The major issues that I have been told about, surrounds the fact that the
> > Motorola changes add a 'vector' primitive.
> 
> Sounds half way reasonable.  I have here in my hand a piece of paper,
> no, a compiler for the Pentium III SSE with intrinsic support for SSE.
> It adds a vector type to the compiler also.
> 
> Would be nice if a person in the know took the moto patches and the
> SSE patches and unified them...  If not, then the first set in, wins,
> and the other will just have to cope, extend and transform.

Since this is an extension to the dialect of C accepted by GCC, I would
hope there would be some effort to define the dialect before putting in
dueling patches. :-)  I don't know as much about the SSE situation, but
the AltiVec extensions to C have been added to several other compilers
and in real-world use for a while now, so there is an existing code base
to consider.

Kumar has kindly provided the URLs to Motorola's spec; is there something
similar for SSE that we can look at?  Also, are there any other vector
extension specs worth looking at?  (Doesn't Mips have something?)

Stan


* Re: AltiVec support in GCC
@ 2000-02-18 17:13 Mike Stump
  2000-02-18 18:04 ` Stan Shebs
  2000-02-20 10:56 ` Richard Henderson
  0 siblings, 2 replies; 14+ messages in thread
From: Mike Stump @ 2000-02-18 17:13 UTC (permalink / raw)
  To: gcc, kumar

> Date: Thu, 17 Feb 2000 22:30:00 -0600 (CST)
> From: Kumar Gala <kumar@chaos.ph.utexas.edu>
> To: gcc@gcc.gnu.org

> The major issues that I have been told about, surrounds the fact that the
> Motorola changes add a 'vector' primitive.  

Sounds half way reasonable.  I have here in my hand a piece of paper,
no, a compiler for the Pentium III SSE with intrinsic support for SSE.
It adds a vector type to the compiler also.

Would be nice if a person in the know took the moto patches and the
SSE patches and unified them...  If not, then the first set in, wins,
and the other will just have to cope, extend and transform.

> the C API is much more like 'c' function calls

Same with the SSE stuff.

> Also, people have discussed the creation of auto-vectorizing
> compilers.  While this can be useful for the simple case, from the
> AltiVec code I think it would be very difficult to get full use of
> the AltiVec instruction set.

Yes, this is a longer range goal that shouldn't be worried about at
first.  I think we can postpone talks about it.


* AltiVec support in GCC
@ 2000-02-17 20:30 Kumar Gala
  0 siblings, 0 replies; 14+ messages in thread
From: Kumar Gala @ 2000-02-17 20:30 UTC (permalink / raw)
  To: gcc

I am new to this list and wanted to start a discussion on the issues with
getting AltiVec support into GCC.  I am aware that Motorola has worked on
a set of patches against GCC 2.95.2, binutils, and gdb, and is in the
process of assigning the copyright to the FSF/GNU.

The major issue that I have been told about is that the Motorola changes
add a 'vector' primitive.

(i.e.)

vector pixel
vector float
vector short
vector int
vector char
vector bool
...

These are the major AltiVec docs.
http://www.mot.com/SPS/PowerPC/teksupport/teklibrary/manuals/altivec_pem.pdf
http://www.mot.com/SPS/PowerPC/teksupport/teklibrary/manuals/altivecpim.pdf

The PEM documents the assembly-level instructions that AltiVec is composed
of.  The PIM documents the C/C++ extensions.  It also discusses ABI
changes and the like.  The patches that Motorola has done follow the
PIM.

The rest of the PIM documents the C API.  To clarify what some people have
said on this list, the C API is much more like 'c' function calls than
overloaded operators in C++.

For example, to add two vectors you would write the following code:

   vector int a, b, c;
   c = vec_add(a, b);  /* c = a + b */

Also, people have discussed the creation of auto-vectorizing compilers.
While this can be useful for the simple case, from the AltiVec code I
think it would be very difficult to get full use of the AltiVec
instruction set.

There are instructions like vec_perm, vec_max, vec_min, and vec_avg that
would be very difficult for any auto-vectorizing utility to use.  Some of
these operations provide the greatest benefit.  For one example, some 2-d
median code developed at Motorola has shown a 30-40x speedup with the use
of AltiVec due to these unique instructions.  Also, these operations allow
the coder to think in new/creative ways.

I also believe it is time that SIMD is taken more seriously.  I believe
that the AltiVec SIMD engine provides one of the more robust sets of
instructions to the programmer (as opposed to MMX, SSE, etc).  While SIMD
may not be meant for the general programmer, it may well be that the
general programmer will unknowingly use it.  Here's a list of ideas that
I have had for places where AltiVec could enhance performance.

Software RAID code (XOR), high bandwidth memory operations (128 bit data
paths), string manipulations, encryption (take a look at the latest
powerpc rc5 client), checksums, GIMP, Mesa.

The reason I am bringing this up now is that I think the sooner the
discussion starts about this, the faster programmers will have access to
it and the better for everyone.

I am looking into getting the patches from Motorola and have a contact for
anyone interested.  

thanks

 - kumar gala


ignorance is bliss.
