From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 30170 invoked by alias); 3 Dec 2004 16:37:24 -0000
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-help-owner@gcc.gnu.org
Received: (qmail 30046 invoked from network); 3 Dec 2004 16:37:16 -0000
Received: from unknown (HELO mail2.codesourcery.com) (66.160.135.55) by sourceware.org with SMTP; 3 Dec 2004 16:37:16 -0000
Received: (qmail 22676 invoked from network); 3 Dec 2004 16:37:15 -0000
Received: from admin.voldemort.codesourcery.com (HELO mail.codesourcery.com) (65.74.133.9) by mail2.codesourcery.com with SMTP; 3 Dec 2004 16:37:15 -0000
Received: (qmail 25594 invoked from network); 3 Dec 2004 16:37:14 -0000
Received: from localhost (HELO ?192.168.189.167?) (nathan@127.0.0.1) by mail.codesourcery.com with SMTP; 3 Dec 2004 16:37:14 -0000
Message-ID: <41B09632.8040005@codesourcery.com>
Date: Fri, 03 Dec 2004 16:37:00 -0000
From: Nathan Sidwell
Organization: Codesourcery LLC
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040913
MIME-Version: 1.0
To: David Palao
CC: gcc-help@gcc.gnu.org
Subject: Re: problems with gcc inline assembly using xmm registers
References: <200412031628.53453.david.palao@uv.es>
In-Reply-To: <200412031628.53453.david.palao@uv.es>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2004-12/txt/msg00042.txt.bz2

David Palao wrote:
> __asm__ __volatile__ ("movsd %0, %%xmm3 \n\t" \
>                       "movsd %1, %%xmm6 \n\t" \
>                       "movsd %2, %%xmm4 \n\t" \
>                       "movsd %3, %%xmm7 \n\t" \
>                       "movsd %4, %%xmm5 \n\t" \
>                       "unpcklpd %%xmm3, %%xmm3 \n\t" \
>                       "unpcklpd %%xmm6, %%xmm6 \n\t" \
>                       "unpcklpd %%xmm4, %%xmm4 \n\t" \
>                       "mulpd %%xmm0, %%xmm3 \n\t" \
....
>                       "addpd %%xmm6, %%xmm5 \n\t" \
>                       "addpd %%xmm7, %%xmm3 \n\t" \
>                       "movsd %7, %%xmm6 \n\t" \
>                       "movsd %8, %%xmm7 \n\t" \
>                       "unpcklpd %%xmm6, %%xmm6 \n\t" \
>                       "unpcklpd %%xmm7, %%xmm7 \n\t" \
>                       "mulpd %%xmm1, %%xmm6 \n\t" \
>                       "mulpd %%xmm2, %%xmm7 \n\t" \
>                       "addpd %%xmm6, %%xmm4 \n\t" \
>                       "addpd %%xmm7, %%xmm5" \

Don't write it this way. Use the SSE2 builtins directly (the vector
builtins that operate on xmm registers), and then the compiler can handle
all the register allocation for you. You'll have to arrange for no more
than 8 vector values to be live at one time, though, because 32-bit x86
has only eight xmm registers and anything beyond that gets spilled to the
stack. That's not too hard to achieve if you're careful. I had success
using this technique to do some 2D FFTs; it was way simpler than writing
assembly directly.

nathan

-- 
Nathan Sidwell    ::   http://www.codesourcery.com   ::     CodeSourcery LLC
nathan@codesourcery.com    ::     http://www.planetfall.pwp.blueyonder.co.uk
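
To illustrate the advice above, here is a minimal sketch using the SSE2
intrinsics from <emmintrin.h>, which are GCC's documented wrappers around
its xmm builtins. The function name mat3_mul_pairs, the memory layout, and
the 3x3 coefficient arithmetic are assumptions made for illustration. The
quoted asm is elided in places, so this is not a translation of it. It only
shows the same pattern: broadcast a scalar into both lanes (what the
movsd/unpcklpd pairs do), multiply by a packed pair, and accumulate.

```c
#include <assert.h>     /* assert, used for argument checks */
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_set1_pd, _mm_mul_pd, ... */

/*
 * Hypothetical helper (not from the original post).
 *   m:   9 scalar coefficients, row-major 3x3
 *   in:  3 pairs of doubles (x, y, z), one pair per xmm register
 *   out: 3 pairs of doubles, out[row] = m[row][0]*x + m[row][1]*y + m[row][2]*z
 */
void mat3_mul_pairs(const double m[9], const double in[6], double out[6])
{
    int row;

    assert(m && in && out);

    /* Unaligned loads, so the caller does not need 16-byte alignment. */
    __m128d x = _mm_loadu_pd(in + 0);
    __m128d y = _mm_loadu_pd(in + 2);
    __m128d z = _mm_loadu_pd(in + 4);

    for (row = 0; row < 3; ++row) {
        const double *c = m + 3 * row;

        /* _mm_set1_pd copies one scalar into both lanes of a vector. */
        __m128d acc = _mm_mul_pd(_mm_set1_pd(c[0]), x);
        acc = _mm_add_pd(acc, _mm_mul_pd(_mm_set1_pd(c[1]), y));
        acc = _mm_add_pd(acc, _mm_mul_pd(_mm_set1_pd(c[2]), z));

        _mm_storeu_pd(out + 2 * row, acc);
    }
}
```

Only x, y, z, the accumulator, and a couple of temporaries are live at any
point, so this stays well under the eight-register limit, and the compiler
chooses the registers itself. On 32-bit x86 the file has to be built with
-msse2; on x86-64 SSE2 is enabled by default.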