From: Maxim Kuvyrkov
To: Fredrik Olsson
Cc: gcc
Subject: Re: m68k optimisations?
Date: Thu, 05 Dec 2013 00:48:00 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm

On 9/11/2013, at 12:08 am, Fredrik Olsson wrote:

> I have this simple function:
> int sum_vec(int c, ...) {
>     va_list argptr;
>     va_start(argptr, c);
>     int sum = 0;
>     while (c--) {
>         int x = va_arg(argptr, int);
>         sum += x;
>     }
>     va_end(argptr);
>     return sum;
> }
>
>
> When compiling with "-fomit-frame-pointer -Os -march=68000 -c -S
> -mshort" I get this assembly (I have manually added comments with
> clock cycles per instruction and a total for a count of 0, 8 and n>0):
>     .even
>     .globl _sum_vec
> _sum_vec:
>     lea (6,%sp),%a0         | 8
>     move.w 4(%sp),%d1       | 12
>     clr.w %d0               | 4
>     jra .L1                 | 12
> .L2:
>     add.w (%a0)+,%d0        | 8
> .L1:
>     dbra %d1,.L2            | 16,12
>     rts                     | 16
> | c==0: 8+12+4+12+12+16=64
> | c==8: 8+12+4+12+(16+8)*8+12+16=256
> | c==n: =64+24n
>
> When instead compiling with "-fomit-frame-pointer -O3 -march=68000 -c
> -S -mshort" I expect to get more aggressive optimisation than -Os, or
> at least just as performant, but instead I get this:
>     .even
>     .globl _sum_vec
> _sum_vec:
>     move.w 4(%sp),%d0       | 12
>     jeq .L2                 | 12,8
>     lea (6,%sp),%a0         | 8
>     subq.w #1,%d0           | 4
>     and.l #65535,%d0        | 16
>     add.l %d0,%d0           | 8
>     lea 8(%sp,%d0.l),%a1    | 16
>     clr.w %d0               | 4
> .L1:
>     add.w (%a0)+,%d0        | 8
>     cmp.l %a0,%a1           | 8
>     jne .L1                 | 12,8
>     rts                     | 16
> .L2:
>     clr.w %d0               | 4
>     rts                     | 16
> | c==0: 12+12+4+16=44
> | c==8: 12+8+8+4+16+8+16+4+(8+8+12)*8-4+16=312
> | c==n: =88+28n
>
> The count==0 case is better.
> I can see what optimisation has been tried for the loop, but it is
> just not working, since both the init for the loop and the loop
> itself become more costly.
>
> Being a GCC beginner I would like a few pointers as to how I should go
> about fixing this?

You investigate such problems by comparing the intermediate debug dumps
of the two compilation scenarios; by the time the code reaches assembly
it is almost impossible to guess where the problem is coming from. Add
-fdump-tree-all and -fdump-rtl-all to the compilation flags and find
which optimization pass makes the wrong decision. Then either trace
through that optimization pass yourself or file a bug report in the hope
that someone (ideally the maintainer of that optimization) will look at
it.

Read through the GCC wiki for information on debugging and
troubleshooting GCC:

- http://gcc.gnu.org/wiki/GettingStarted
- http://gcc.gnu.org/wiki/FAQ
- http://gcc.gnu.org/wiki/

Thanks,

--
Maxim Kuvyrkov
www.kugelworks.com
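
A minimal sketch of a standalone reproducer for the dump comparison described above. The function is the one from the quoted message. The file name repro.c, the directory names os and o3, and the compiler name m68k-elf-gcc are assumptions for illustration only; substitute whatever your m68k cross compiler is called. The commands in the comment simply add the two dump flags to the two flag sets from the quoted message and run each in its own directory, so the two sets of dump files do not overwrite each other.

```c
/*
 * repro.c: standalone copy of the function from the quoted message.
 *
 * Assumed workflow (m68k-elf-gcc is a hypothetical compiler name):
 *
 *   mkdir os o3
 *   cd os
 *   m68k-elf-gcc -fomit-frame-pointer -Os -march=68000 -c -S -mshort -fdump-tree-all -fdump-rtl-all ../repro.c
 *   cd ../o3
 *   m68k-elf-gcc -fomit-frame-pointer -O3 -march=68000 -c -S -mshort -fdump-tree-all -fdump-rtl-all ../repro.c
 *
 * Then compare the dump files in os/ and o3/ pass by pass to find the
 * first pass after which the two versions of the loop diverge.
 */
#include <stdarg.h>

/* Sum the c integer arguments that follow c. */
int sum_vec(int c, ...)
{
    va_list argptr;
    int sum = 0;

    va_start(argptr, c);
    while (c--) {
        int x = va_arg(argptr, int);
        sum += x;
    }
    va_end(argptr);
    return sum;
}
```

Keeping the test case down to this single function keeps each dump short, which makes it easier to diff the two directories by hand.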