From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2697 invoked by alias); 30 Aug 2004 23:46:28 -0000
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-help-owner@gcc.gnu.org
Received: (qmail 2673 invoked from network); 30 Aug 2004 23:46:26 -0000
Received: from unknown (HELO nosedive.stowetel.com) (216.243.48.230) by sourceware.org with SMTP; 30 Aug 2004 23:46:26 -0000
Received: from talentg.com ([206.113.40.112]) by nosedive.stowetel.com (Stowe Telecom, LLC ) with ASMTP id JFA74625 for ; Mon, 30 Aug 2004 19:46:25 -0400
Message-ID: <4133BB9F.3010705@talentg.com>
Date: Tue, 31 Aug 2004 03:22:00 -0000
From: Mike Sharov
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7b) Gecko/20040408
MIME-Version: 1.0
To: gcc-help@gcc.gnu.org
Subject: How can I tell the compiler to not store excessively?
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2004-08/txt/msg00288.txt.bz2

Dear gcc-help,

I am trying to write several inline assembly functions to facilitate the
use of MMX instructions. I would like calls to them to be chained without
moving data in and out of registers between calls when the data is already
loaded in a register. Example:

===========================================
#include <stdio.h>

typedef int mmx_t __attribute__ ((mode(V8QI)));

inline void padd (const char* p, char* r)
{
    asm ("paddb %1, %0"
         : "=&y"(*(mmx_t*)r)
         : "y"(*(const mmx_t*)p), "0"(*(mmx_t*)r));
}

inline void psub (const char* p, char* r)
{
    asm ("psubb %1, %0"
         : "=&y"(*(mmx_t*)r)
         : "y"(*(const mmx_t*)p), "0"(*(mmx_t*)r));
}

int main (void)
{
    char v1[8], v2[8];
    padd (v1, v2);
    psub (v1, v2);
    printf ("v2[3] = %d\n", v2[3]);
    return (0);
}
===========================================

Here I would like to see v1 and v2 loaded into registers before the call
to padd and not leave them until the return from psub.

gcc can already do this for the read-only v1, generating the following
(-O3 -march=athlon-mp):

===========================================
        pxor    %mm0, %mm0
        movl    %esp, %ebp
.LCFI1:
        subl    $24, %esp
.LCFI2:
        movq    -8(%ebp), %mm1
        andl    $-16, %esp
#APP
        paddb %mm0, %mm1
#NO_APP
        movq    %mm1, -8(%ebp)
        subl    $16, %esp
#APP
        psubb %mm0, %mm1
#NO_APP
        movq    %mm1, -8(%ebp)
        movsbl  -5(%ebp),%eax
        movl    $.LC0, (%esp)
        movl    %eax, 4(%esp)
        call    printf
===========================================

Here %mm0 is not reloaded because it holds the same value for both the
padd and psub calls. However, the value in %mm1 is stored into v2 twice:
once after padd and again after psub. It is understandable that the
optimizer would do this, because, for example, several threads could be
accessing v2 concurrently.

Is it possible to add some directives to the source code above to tell the
optimizer that it is OK to keep v2 in a register until something (like the
printf) actually asks for its contents through another method?

-- 
Mike Sharov
msharov@talentg.com