* altivec support in gcc
From: Michel LESPINASSE @ 2003-04-08 10:10 UTC
To: gcc

Hi,

I have a few questions about the altivec support in gcc...

First, is there any reason why altivec.h defines the second argument
in vec_ld as being a vector pointer instead of a const vector pointer ?
This is causing me no ends of trouble.

Second, I have not tried gcc 3.3 yet, but gcc 3.2 has a lot of trouble
compiling the following construct... it does compile it eventually,
but it requires hundreds of megabytes of virtual memory for doing
it...

(This code is meant as a shortcut for calculating (A+B+C+D+2)>>2, for
a vector of 16 unsigned char values. This is used in the motion
compensation loop of an mpeg2 decoder.)

    ones = vec_splat_u8 (1);
    avg0 = vec_avg (A, B);
    xor0 = vec_xor (A, B);
    avg1 = vec_avg (C, D);
    xor1 = vec_xor (C, D);
    tmp = vec_and (vec_and (ones, vec_or (xor0, xor1)),
                   vec_xor (avg0, avg1));
    out = vec_sub (vec_avg (avg0, avg1), tmp);

Initially I had only one out= assignment, i.e. I had put the tmp
expression in place of the tmp variable in the current out
assignment. I could not even get that code to compile, it made gcc
inflate to over 700 MB. After splitting it as shown above, the code
does compile fine, but GCC still inflates to over 300 MB compiling it.

This is on a debian/sid system; gcc -v reports:

    Reading specs from /usr/lib/gcc-lib/powerpc-linux/3.2.3/specs
    Configured with: ../src/configure -v
      --enable-languages=c,c++,java,f77,proto,objc,ada --prefix=/usr
      --mandir=/usr/share/man --infodir=/usr/share/info
      --with-gxx-include-dir=/usr/include/c++/3.2 --enable-shared
      --with-system-zlib --enable-nls --without-included-gettext
      --enable-__cxa_atexit --enable-clocale=gnu --enable-java-gc=boehm
      --enable-objc-gc powerpc-linux
    Thread model: posix
    gcc version 3.2.3 20030316 (Debian prerelease)

It's probably not a critical bug, as it can be worked around by
splitting expressions into smaller pieces, but I thought it should be
reported since it makes some code extremely slow to compile. For
information, the same code used to compile just fine with apple's old
altivec-patched gcc 2.95.x compiler.

Hope this helps,

--
Michel "Walken" LESPINASSE
"In this time of war against Osama bin Laden and the oppressive Taliban
regime, we are thankful that OUR leader isn't the spoiled son of a
powerful politician from a wealthy oil family who is supported by
religious fundamentalists, operates through clandestine organizations,
has no respect for the democratic electoral process, bombs innocents,
and uses war to deny people their civil liberties." --The Boondocks
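
A minimal scalar sketch of the averaging shortcut described in the message above, worked one element at a time. It assumes that vec_avg on unsigned char rounds up, i.e. computes (a + b + 1) >> 1 per element. The names avg_u8, avg4_reference, avg4_shortcut and count_mismatches are made up for this illustration and are not part of altivec.h.

```c
#include <assert.h>
#include <stdint.h>

/* Rounded-up average of two bytes: the per-element behaviour assumed
   here for vec_avg on unsigned char. */
static uint8_t avg_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Straightforward reference: (A+B+C+D+2)>>2 computed in a wider type. */
static uint8_t avg4_reference(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint8_t)((a + b + c + d + 2) >> 2);
}

/* The shortcut from the message, applied to a single element. */
static uint8_t avg4_shortcut(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    uint8_t avg0 = avg_u8(a, b);
    uint8_t xor0 = (uint8_t)(a ^ b);
    uint8_t avg1 = avg_u8(c, d);
    uint8_t xor1 = (uint8_t)(c ^ d);
    /* Correction bit: set when averaging the two averages rounded up
       once too often. */
    uint8_t tmp = (uint8_t)(1 & (xor0 | xor1) & (avg0 ^ avg1));
    return (uint8_t)(avg_u8(avg0, avg1) - tmp);
}

/* Count sampled inputs (every `step`-th value of each operand) for which
   the shortcut and the reference disagree. */
static unsigned long count_mismatches(int step)
{
    unsigned long bad = 0;
    assert(step > 0);
    for (int a = 0; a < 256; a += step)
        for (int b = 0; b < 256; b += step)
            for (int c = 0; c < 256; c += step)
                for (int d = 0; d < 256; d += step)
                    if (avg4_shortcut((uint8_t)a, (uint8_t)b,
                                      (uint8_t)c, (uint8_t)d) !=
                        avg4_reference((uint8_t)a, (uint8_t)b,
                                       (uint8_t)c, (uint8_t)d))
                        bad++;
    return bad;
}
```

Calling count_mismatches(1) would walk all 2^32 input combinations; an odd step such as 5 keeps the run short while still mixing odd and even operands.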
* Re: altivec support in gcc
From: Aldy Hernandez @ 2003-04-09 16:57 UTC
To: Michel LESPINASSE; +Cc: gcc

>>>>> "Michel" == Michel LESPINASSE <walken@zoy.org> writes:

 > Hi,
 > I have a few questions about the altivec support in gcc...

 > First, is there any reason why altivec.h defines the second argument
 > in vec_ld as being a vector pointer instead of a const vector pointer ?
 > This is causing me no ends of trouble.

Why is this causing you trouble?

 > Second, I have not tried gcc 3.3 yet, but gcc 3.2 has a lot of trouble
 > compiling the following construct... it does compile it eventually,
 > but it requires hundreds of megabytes of virtual memory for doing
 > it...

 > (This code is meant as a shortcut for calculating (A+B+C+D+2)>>2, for
 > a vector of 16 unsigned char values. This is used in the motion
 > compensation loop of an mpeg2 decoder.)

 >     ones = vec_splat_u8 (1);
 >     avg0 = vec_avg (A, B);
 >     xor0 = vec_xor (A, B);
 >     avg1 = vec_avg (C, D);
 >     xor1 = vec_xor (C, D);
 >     tmp = vec_and (vec_and (ones, vec_or (xor0, xor1)),
 >                    vec_xor (avg0, avg1));
 >     out = vec_sub (vec_avg (avg0, avg1), tmp);

You need to split the last two assignments as you have discovered. If
you want to see why, compile with -save-temps and look at the
preprocessed output (the .i file).

All the altivec functions in C get expanded into a disgusting set of
macros. These macros expand exponentially when you use them to call
themselves. This is not likely to change until the C front end has
overloaded functions, and we have no need for the macros.

If you want something that compiles in less than infinity minus 1 for
these constructs, I recommend you use C++.

Aldy
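
The exponential growth mentioned in the reply above can be reproduced with a toy macro. This is only a sketch: FAKE_OP is a hypothetical stand-in that repeats its arguments the way a type-dispatching macro might, not the real altivec.h text, and op_u8 / op_u16 are placeholder names that are only ever stringified, never compiled. The helper counts how many copies of the leaf token Q survive in the expansion at each nesting depth.

```c
#include <assert.h>
#include <stddef.h>

/* Two-level stringification: the argument is fully macro-expanded
   before it is turned into a string literal. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Hypothetical "overloaded" macro: x appears three times and y twice
   in the replacement text. */
#define FAKE_OP(x, y) \
    (sizeof(x) == 1 ? op_u8((x), (y)) : op_u16((x), (y)))

/* Nested uses, each level feeding the previous one into both arguments. */
#define DEPTH1 FAKE_OP(Q, Q)
#define DEPTH2 FAKE_OP(DEPTH1, DEPTH1)
#define DEPTH3 FAKE_OP(DEPTH2, DEPTH2)

/* Count occurrences of character c in string s. */
static size_t count_char(const char *s, char c)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

/* Number of leaf operands left in the preprocessed text per depth. */
static size_t leaves_depth1(void) { return count_char(STR(DEPTH1), 'Q'); }
static size_t leaves_depth2(void) { return count_char(STR(DEPTH2), 'Q'); }
static size_t leaves_depth3(void) { return count_char(STR(DEPTH3), 'Q'); }
```

With five argument copies per level the leaf count is 5, 25, 125 for depths one to three. A macro with more type branches and more copies per argument grows correspondingly faster, which is consistent with the memory use reported earlier in the thread.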
* Re: altivec support in gcc
From: Michel LESPINASSE @ 2003-04-09 22:55 UTC
To: Aldy Hernandez; +Cc: gcc

Hi Aldy,

On Wed, Apr 09, 2003 at 09:15:42AM -0700, Aldy Hernandez wrote:
> > First, is there any reason why altivec.h defines the second argument
> > in vec_ld as being a vector pointer instead of a const vector pointer ?
> > This is causing me no ends of trouble.
>
> Why is this causing you trouble?

Well, this is nothing that I could not work around, but if I passed a
const vector pointer gcc would complain that vec_ld discards the const
modifier (I don't remember the exact wording, but it's the message you
get when you pass a const pointer to a function that just takes a
regular pointer, and gcc warns that the function might write into your
supposedly const data). In the context of vec_ld, the warning is stupid
since vec_ld does NOT write into the data, and gcc should know it I
think.

Anyway - I had to add an explicit cast for the second argument of each
vec_ld, just to keep gcc happy. In that regard, gnu gcc is not
compatible with the old apple altivec-patched gcc.

In the end, because I did not want to change all my vec_ld calls, I
defined a my_vec_ld function with the cast, and then I used the
preprocessor to replace all the vec_ld's with my_vec_ld. Like this:

    static inline vector_u8_t my_vec_ld (int const A,
                                         const uint8_t * const B)
    {
        return vec_ld (A, (uint8_t *)B);
    }
    #undef vec_ld
    #define vec_ld my_vec_ld

This is ugly but it solved my problem :) I understand you can not do
this in altivec.h though, because my version is limited by the fact it
will only work on unsigned char vectors.

> You need to split the last two assignments as you have discovered. If
> you want to see why, compile with -save-temps and look at the
> preprocessed output (the .i file).

Yes. I looked at altivec.h and I understood what the issue is - it has
to do all these weird convolutions to emulate function overloading,
and as each argument gets expanded more than once this leads to
exponential explosion.

Still, from a user point of view, this does look a bit silly - I can
write expressions like

    out = a & b & c & d;

and this works with any mix of integer types for a, b, c and d, and
thank god I don't have to split up a small expression like that - the
compiler just figures it out. As a user, I would expect to be able to
do the same things with vector types.

Trying to understand what the difference is from gcc's point of view,
I see that '&' is an operator, while vec_and internally relies on a
builtin function, and that builtin functions are more limited as they
don't support overloading, while C operators do. I don't know if it
would be possible to somehow make gcc know about new operators for
altivec ? That way vec_and(a,b) could be defined as something like
((a) __altivec_operator_and (b)), and __altivec_operator_and would be
some operator, similar to &, which can take various types as input.
I'm probably talking way out of my league here though.

Once again, I used functions and preprocessor tricks to get rid of the
exponential explosion issue, by taking advantage of the fact I only
needed unsigned char vector versions of vec_and and vec_avg:

    #ifndef COFFEE_BREAK    /* Workarounds for gcc suckage */

    static inline vector_u8_t my_vec_and (vector_u8_t const A,
                                          vector_u8_t const B)
    {
        return vec_and (A, B);
    }
    #undef vec_and
    #define vec_and my_vec_and

    static inline vector_u8_t my_vec_avg (vector_u8_t const A,
                                          vector_u8_t const B)
    {
        return vec_avg (A, B);
    }
    #undef vec_avg
    #define vec_avg my_vec_avg

    #endif

This is ugly but it does the trick for me - and frankly I did not want
to break up all my expressions using temporaries if that makes them
unreadable.

I'm not sure what a good solution would be here. For starters, if
altivec.h exported some non-overloaded versions of these functions,
that might help a little - for example vec_and would be the overloaded
version and vec_and_u8, vec_and_u16, vec_and_float, ... would be the
non-overloaded versions... I'm not sure if it's doable, as apple's
altivec spec does not define these non-overloaded versions though.

> All the altivec functions in C get expanded into a disgusting set of
> macros. These macros expand exponentially when you use them to call
> themselves. This is not likely to change until the C front end has
> overloaded functions, and we have no need for the macros.

Hmmm, is there actually any plan for implementing overloaded functions
in the C front end ? Not that I'd push for this thing (I don't think we
want to make C look too much like C++) but it might be useful, at
least for builtins, so gcc can better support the altivec intrinsics
and stuff.

Thanks,

--
Michel "Walken" LESPINASSE
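
As a hedged illustration of the "one overloaded name, several non-overloaded variants" idea raised in the message above: the sketch below uses C11 _Generic selection, which postdates this thread and is not what altivec.h did at the time. Plain integers stand in for vector types, and and_u8, and_u16, and_u32 and generic_and are hypothetical names chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical non-overloaded variants; scalars stand in for vectors. */
static inline uint8_t and_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a & b);
}

static inline uint16_t and_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a & b);
}

static inline uint32_t and_u32(uint32_t a, uint32_t b)
{
    return a & b;
}

/* Overloaded front end: the type of the first argument picks the variant.
   Note that `a` still appears twice in the expansion (once unevaluated,
   inside _Generic), so deep nesting is cheaper than a chain of
   conditionals but not free. */
#define generic_and(a, b)          \
    _Generic((a),                  \
             uint8_t:  and_u8,     \
             uint16_t: and_u16,    \
             uint32_t: and_u32)((a), (b))

/* Compile-time check that byte operands really select the byte variant. */
static_assert(sizeof(generic_and((uint8_t)1, (uint8_t)1)) == sizeof(uint8_t),
              "uint8_t operands should dispatch to and_u8");
```

A nested call such as generic_and(generic_and(x, y), z) type-checks because the inner call already has the variant's return type, so no temporaries are needed to keep the dispatch working.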
* altivec support in gcc - bug with vec_mergel
From: Michel LESPINASSE @ 2003-04-21 13:30 UTC
To: Aldy Hernandez; +Cc: gcc

Hi,

I have some code that compiles and works fine in apple's version of
gcc 3.1 (as used in darwin) but fails to work when compiled with FSF
gcc 3.2. Looking at the issue, I think it's due to vec_mergel being
miscompiled into vmrghh instead of vmrglh. Basically gcc miscompiles
vec_mergel to do what vec_mergeh should be doing !

The following allows me to work around the issue by using vec_perm to
do the same work, but I think you'll agree that this should not be
necessary:

    #if 1   /* work around gcc vec_mergel bug */
    static inline vector_s16_t my_vec_mergel (vector_s16_t const A,
                                              vector_s16_t const B)
    {
        static const vector_u8_t mergel = {
            0x08, 0x09, 0x18, 0x19, 0x0a, 0x0b, 0x1a, 0x1b,
            0x0c, 0x0d, 0x1c, 0x1d, 0x0e, 0x0f, 0x1e, 0x1f
        };
        return vec_perm (A, B, mergel);
    }
    #undef vec_mergel
    #define vec_mergel my_vec_mergel
    #endif

Can you double check the issue and see if you can reproduce it locally ?
I'm guessing it's probably a cut and paste error in gcc, but I couldn't
be sure... I did look at the altivec.h file though, and I think the
error is not there.

Cheers,

--
Michel "Walken" LESPINASSE
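
A byte-level model of why the permute table in the workaround above stands in for a halfword merge-low. This is a sketch on plain arrays, not AltiVec code: model_perm, model_merge16 and perm_equals_merge are hypothetical helpers, and the model assumes the usual semantics where each vec_perm control byte selects one byte from the 32-byte concatenation of the two inputs, and merge-low interleaves 16-bit elements 4..7 of each input.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Control table copied from the workaround in the message. */
static const uint8_t mergel_ctl[VEC_BYTES] = {
    0x08, 0x09, 0x18, 0x19, 0x0a, 0x0b, 0x1a, 0x1b,
    0x0c, 0x0d, 0x1c, 0x1d, 0x0e, 0x0f, 0x1e, 0x1f
};

/* Model of vec_perm: the low 5 bits of each control byte index into
   the concatenation A:B. */
static void model_perm(uint8_t out[VEC_BYTES], const uint8_t a[VEC_BYTES],
                       const uint8_t b[VEC_BYTES],
                       const uint8_t ctl[VEC_BYTES])
{
    for (int i = 0; i < VEC_BYTES; i++) {
        unsigned sel = ctl[i] & 0x1f;
        out[i] = sel < VEC_BYTES ? a[sel] : b[sel - VEC_BYTES];
    }
}

/* Model of a halfword merge: interleave four 16-bit elements of A and B
   starting at element `first` (0 = merge high, 4 = merge low). */
static void model_merge16(uint8_t out[VEC_BYTES], const uint8_t a[VEC_BYTES],
                          const uint8_t b[VEC_BYTES], int first)
{
    assert(first == 0 || first == 4);
    for (int i = 0; i < 4; i++) {
        memcpy(out + 4 * i,     a + 2 * (first + i), 2);
        memcpy(out + 4 * i + 2, b + 2 * (first + i), 2);
    }
}

/* Returns 1 when the permute table reproduces the merge that starts at
   element `first`, 0 otherwise. */
static int perm_equals_merge(int first)
{
    uint8_t a[VEC_BYTES], b[VEC_BYTES];
    uint8_t via_perm[VEC_BYTES], via_merge[VEC_BYTES];

    for (int i = 0; i < VEC_BYTES; i++) {
        a[i] = (uint8_t)i;            /* distinct, recognisable bytes */
        b[i] = (uint8_t)(0x80 + i);
    }
    model_perm(via_perm, a, b, mergel_ctl);
    model_merge16(via_merge, a, b, first);
    return memcmp(via_perm, via_merge, VEC_BYTES) == 0;
}
```

In this model the table matches the merge starting at element 4 and not the one starting at element 0, which is the low/high distinction the bug report is about.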
* Re: altivec support in gcc - bug with vec_mergel
From: Daniel Egger @ 2003-04-22 15:11 UTC
To: Michel LESPINASSE; +Cc: Aldy Hernandez, gcc

On Mon, 2003-04-21 at 05:06, Michel LESPINASSE wrote:

> I have some code that compiles and works fine in apple's version of
> gcc 3.1 (as used in darwin) but fails to work when compiled with FSF
> gcc 3.2. Looking at the issue, I think it's due to vec_mergel being
> miscompiled into vmrghh instead of vmrglh. Basically gcc miscompiles
> vec_mergel to do what vec_mergeh should be doing !

2002-02-26  Daniel Egger  <degger@fhm.edu>

    * config/rs6000/rs6000.md: Swap define_insn attributes to
    fix incorrect generation of merge high instructions instead
    of merge low.

This one maybe? :)

--
Servus,
      Daniel
* Re: altivec support in gcc - bug with vec_mergel
From: Michel LESPINASSE @ 2003-04-22 16:43 UTC
To: Daniel Egger; +Cc: Aldy Hernandez, gcc

On Tue, Apr 22, 2003 at 02:04:11PM +0200, Daniel Egger wrote:
> 2002-02-26  Daniel Egger  <degger@fhm.edu>
>
>     * config/rs6000/rs6000.md: Swap define_insn attributes to
>     fix incorrect generation of merge high instructions instead
>     of merge low.
>
> This one maybe? :)

Yes, most probably :)

Yesterday I downloaded the latest 3.3 snapshot - I intended to grep
for vmrghh and find out how to fix the issue, but it turned out it was
fixed already :)

Do you know what's the status of this in the 3.2.3-frozen tree ? I
tried to figure this out by looking at config/ in cvsweb but I guess
some magic happens there at make dist time ? Well at least I did not
find where config/rs6000 is.

And, thanks a lot Daniel for fixing this. gcc's altivec support works
nicely for me now :) (well, in 3.3 at least)

Cheers,

--
Michel "Walken" LESPINASSE
* Re: altivec support in gcc - bug with vec_mergel
From: Daniel Egger @ 2003-04-22 19:55 UTC
To: Michel LESPINASSE; +Cc: Aldy Hernandez, gcc

On Tue, 2003-04-22 at 18:09, Michel LESPINASSE wrote:

> Do you know what's the status of this in the 3.2.3-frozen tree ? I
> tried to figure this out by looking at config/ in cvsweb but I guess
> some magic happens there at make dist time ? Well at least I did not
> find where config/rs6000 is.

Holy cow, it's still broken there. Aldy?

--
Servus,
      Daniel
* Re: altivec support in gcc - bug with vec_mergel
From: Aldy Hernandez @ 2003-04-22 21:40 UTC
To: Daniel Egger; +Cc: Michel LESPINASSE, gcc

On Tuesday, April 22, 2003, at 02:44 PM, Daniel Egger wrote:

> On Tue, 2003-04-22 at 18:09, Michel LESPINASSE wrote:
>
>> Do you know what's the status of this in the 3.2.3-frozen tree ? I
>> tried to figure this out by looking at config/ in cvsweb but I guess
>> some magic happens there at make dist time ? Well at least I did not
>> find where config/rs6000 is.
>
> Holy cow, it's still broken there. Aldy?

Dunno. Haven't looked at 3.2.* in ages. If you have a patch, it
should go in as obvious... if Mark agrees.