From: Zoltán Kócsi
To: gcc@gcc.gnu.org
Subject: ARM conditional instruction optimisation bug (feature?)
Date: Thu, 30 Jul 2009 12:51:00 -0000
Message-ID: <20090730231654.580cc8ae@manocska.bendor.com.au>

On the ARM every instruction can be executed conditionally.
GCC very cleverly uses this feature:

    int bar ( int x, int a, int b )
    {
        if ( x ) return a;
        else return b;
    }

compiles to:

    bar:
        cmp     r0, #0          // test x
        movne   r0, r1          // retval = 'a' if !0 ('ne')
        moveq   r0, r2          // retval = 'b' if 0 ('eq')
        bx      lr

However, the following function:

    extern unsigned array[ 128 ];

    int foo( int x )
    {
        int y;

        y = array[ x & 127 ];

        if ( x & 128 )
            y = 123456789 & ( y >> 2 );
        else
            y = 123456789 & y;

        return y;
    }

compiled with gcc 4.4.0, using -Os, generates this:

    foo:
        ldr     r3, .L8
        tst     r0, #128
        and     r0, r0, #127
        ldr     r3, [r3, r0, asl #2]
        ldrne   r0, .L8+4               ***
        ldreq   r0, .L8+4               ***
        movne   r3, r3, asr #2
        andne   r0, r3, r0              ***
        andeq   r0, r3, r0              ***
        bx      lr
    .L8:
        .word   array
        .word   123456789

The lines marked with *** come in pairs that do the same thing, one
executing if the condition is true, the other if it is false. Together
each pair performs one unconditional instruction, except that it takes
two instructions (and two clocks) instead of one.

Compiling with -O2 makes things even worse, because another issue
hits: gcc sometimes changes a "load constant" into a "generate the
constant on the fly" even when the latter is both slower and larger;
at other times it chooses to load a constant even when it could easily
(and more cheaply) generate it from already available values. In this
particular case it decides to build the constant from pieces and
combines that with the "generate an unconditional instruction using
two complementary conditional instructions" method, resulting in this:

    foo:
        ldr     r3, .L8
        tst     r0, #128
        and     r0, r0, #127
        ldr     r0, [r3, r0, asl #2]
        movne   r0, r0, asr #2
        bicne   r0, r0, #-134217728
        biceq   r0, r0, #-134217728
        bicne   r0, r0, #10747904
        biceq   r0, r0, #10747904
        bicne   r0, r0, #12992
        biceq   r0, r0, #12992
        bicne   r0, r0, #42
        biceq   r0, r0, #42
        bx      lr
    .L8:
        .word   array

Should I report a bug?

Thanks,

Zoltan
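
As a sanity check on the two listings above, here is a minimal C sketch, not part of the original report. It assumes a 32-bit unsigned type and defines its own array as a stand-in for the extern one; the names foo_factored, bic_chain and count_mismatches are hypothetical. It illustrates two things: that only the shift in foo() really depends on the condition, and that the four BIC immediates in the -O2 output together clear exactly the bits of ~123456789, so the whole chain amounts to a single AND with the constant.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the post's `extern unsigned array[128]` (assumption:
 * defined locally and filled with arbitrary data so the sketch links). */
unsigned array[128];

/* foo() as written in the post. */
int foo(int x)
{
    int y;

    y = array[x & 127];

    if (x & 128)
        y = 123456789 & (y >> 2);
    else
        y = 123456789 & y;

    return y;
}

/* Hand-factored variant (hypothetical): only the shift is conditional,
 * the AND with the constant is done once on both paths. */
int foo_factored(int x)
{
    int y = array[x & 127];

    if (x & 128)
        y >>= 2;

    return y & 123456789;
}

/* The four BIC immediates from the -O2 listing, applied in order.
 * BIC clears the bits that are set in its operand. */
uint32_t bic_chain(uint32_t v)
{
    v &= ~0xF8000000u;          /* bic #-134217728 (same bit pattern) */
    v &= ~(uint32_t)10747904;   /* bic #10747904 */
    v &= ~(uint32_t)12992;      /* bic #12992 */
    v &= ~(uint32_t)42;         /* bic #42 */
    return v;
}

/* Fills the array with pseudo-random words and counts every input for
 * which the variants disagree; 0 means they matched everywhere tested. */
int count_mismatches(void)
{
    uint32_t seed = 1;
    int bad = 0;

    for (int i = 0; i < 128; i++) {
        seed = seed * 1664525u + 1013904223u;   /* simple LCG test data */
        array[i] = seed;
    }

    for (int x = 0; x < 256; x++) {
        unsigned w = array[x & 127];

        if (foo(x) != foo_factored(x))
            bad++;
        if (bic_chain(w) != (w & 123456789u))
            bad++;
    }

    return bad;
}
```

Calling count_mismatches() from a small main() and checking that it returns 0 exercises all 256 combinations of index and condition bit. This says nothing about what gcc ought to emit; it only confirms that the source-level factoring is equivalent for the data tried, which is what makes the duplicated ne/eq pairs look redundant.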