From: Fredrik Olsson
To: gcc@gcc.gnu.org
Date: Fri, 08 Nov 2013 11:08:00 -0000
Subject: m68k optimisations?

I have this simple function:

#include <stdarg.h>

int sum_vec(int c, ...)
{
    va_list argptr;
    va_start(argptr, c);
    int sum = 0;
    while (c--) {
        int x = va_arg(argptr, int);
        sum += x;
    }
    va_end(argptr);
    return sum;
}

When compiling with "-fomit-frame-pointer -Os -march=68000 -c -S -mshort"
I get this assembly (I have manually added comments with the clock cycles
per instruction, and totals for a count of 0, 8 and n>0):

        .even
        .globl _sum_vec
_sum_vec:
        lea (6,%sp),%a0         | 8
        move.w 4(%sp),%d1       | 12
        clr.w %d0               | 4
        jra .L1                 | 12
.L2:
        add.w (%a0)+,%d0        | 8
.L1:
        dbra %d1,.L2            | 16,12
        rts                     | 16
| c==0: 8+12+4+12+12+16=64
| c==8: 8+12+4+12+(16+8)*8+12+16=256
| c==n: =64+24n

When instead compiling with "-fomit-frame-pointer -O3 -march=68000 -c -S
-mshort" I expected more aggressive optimisation than with -Os, or at
least code that performs just as well, but instead I get this:

        .even
        .globl _sum_vec
_sum_vec:
        move.w 4(%sp),%d0       | 12
        jeq .L2                 | 12,8
        lea (6,%sp),%a0         | 8
        subq.w #1,%d0           | 4
        and.l #65535,%d0        | 16
        add.l %d0,%d0           | 8
        lea 8(%sp,%d0.l),%a1    | 16
        clr.w %d0               | 4
.L1:
        add.w (%a0)+,%d0        | 8
        cmp.l %a0,%a1           | 8
        jne .L1                 | 12,8
        rts                     | 16
.L2:
        clr.w %d0               | 4
        rts                     | 16
| c==0: 12+12+4+16=44
| c==8: 12+8+8+4+16+8+16+4+(8+8+12)*8-4+16=312
| c==n: =88+28n

The count==0 case is better. I can see what optimisation has been
attempted for the loop (the counter is replaced by an end pointer that is
compared against %a0), but it does not pay off: both the loop
initialisation and the loop body become more costly, so for any n>0 the
-O3 code is slower than the -Os code.

As a GCC beginner, I would like a few pointers on how to go about fixing
this.

// Fredrik
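
The totals in the listing comments can be checked mechanically. What follows is a minimal sketch in C that adds up the per-instruction cycle figures exactly as quoted in the two listings; it takes those figures as given rather than deriving them from the 68000 manual. The names cycles_os, cycles_o3 and check_model are made up for illustration and have nothing to do with GCC itself.

```c
#include <assert.h>

/* Cycle model for the -Os listing, n = value of c.
   Setup:    lea 8 + move.w 12 + clr.w 4 + jra 12
   Per item: add.w 8 + dbra (taken) 16
   Exit:     dbra (falls through) 12 + rts 16 */
unsigned long cycles_os(unsigned n)
{
    unsigned long setup = 8 + 12 + 4 + 12;
    unsigned long per_item = 8 + 16;
    unsigned long exit_cost = 12 + 16;
    return setup + per_item * n + exit_cost;   /* 64 + 24n */
}

/* Cycle model for the -O3 listing, n = value of c.
   n == 0:   move.w 12 + jeq (taken) 12 + clr.w 4 + rts 16
   Setup:    move.w 12 + jeq (not taken) 8 + lea 8 + subq.w 4
             + and.l 16 + add.l 8 + lea 16 + clr.w 4
   Per item: add.w 8 + cmp.l 8 + jne (taken) 12
   Exit:     last jne is not taken (8 instead of 12), then rts 16 */
unsigned long cycles_o3(unsigned n)
{
    if (n == 0)
        return 12 + 12 + 4 + 16;               /* 44 */
    unsigned long setup = 12 + 8 + 8 + 4 + 16 + 8 + 16 + 4;
    unsigned long per_item = 8 + 8 + 12;
    return setup + per_item * n - 4 + 16;      /* 88 + 28n */
}

/* Sanity check against the c==0 totals given in the listings. */
void check_model(void)
{
    assert(cycles_os(0) == 64);
    assert(cycles_o3(0) == 44);
}
```

With these figures, cycles_os(8) evaluates to 64+24*8 = 256 and cycles_o3(8) to 88+28*8 = 312. The -O3 variant only wins for c==0; for every n>0 it costs 24+4n cycles more than the -Os variant.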