From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 12417 invoked by alias); 30 Jul 2009 16:01:49 -0000
Received: (qmail 12011 invoked by uid 22791); 30 Jul 2009 16:01:46 -0000
X-SWARE-Spam-Status: No, hits=-1.7 required=5.0 tests=AWL,BAYES_00,SARE_MSGID_LONG40,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from fg-out-1718.google.com (HELO fg-out-1718.google.com) (72.14.220.155)
    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 30 Jul 2009 16:01:36 +0000
Received: by fg-out-1718.google.com with SMTP id l27so1186393fgb.5
    for ; Thu, 30 Jul 2009 09:01:32 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.86.84.12 with SMTP id h12mr473646fgb.21.1248969691986;
    Thu, 30 Jul 2009 09:01:31 -0700 (PDT)
In-Reply-To: <571f6b510907300859u4669adaeoef21ccdc18f25d09@mail.gmail.com>
References: <20090730231654.580cc8ae@manocska.bendor.com.au>
    <571f6b510907300859u4669adaeoef21ccdc18f25d09@mail.gmail.com>
Date: Thu, 30 Jul 2009 16:01:00 -0000
Message-ID: <571f6b510907300901o2a923176xe0acd1c340cf3043@mail.gmail.com>
Subject: Re: ARM conditional instruction optimisation bug (feature?)
From: Steven Bosscher
To: Zoltán Kócsi
Cc: gcc@gcc.gnu.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-owner@gcc.gnu.org
X-SW-Source: 2009-07/txt/msg00624.txt.bz2

On 7/30/09, Steven Bosscher wrote:
> On 7/30/09, Zoltán Kócsi wrote:
> > On the ARM every instruction can be executed conditionally.
> > GCC very cleverly uses this feature:
> >
> > int bar ( int x, int a, int b )
> > {
> >    if ( x )
> >
> >       return a;
> >    else
> >       return b;
> > }
> >
> > compiles to:
> >
> > bar:
> >    cmp    r0, #0     // test x
> >    movne  r0, r1     // retval = 'a' if !0 ('ne')
> >    moveq  r0, r2     // retval = 'b' if 0 ('eq')
> >    bx     lr
> >
> > However, the following function:
> >
> > extern unsigned array[ 128 ];
> >
> > int foo( int x )
> > {
> >    int y;
> >
> >    y = array[ x & 127 ];
> >
> >    if ( x & 128 )
> >
> >       y = 123456789 & ( y >> 2 );
> >    else
> >       y = 123456789 & y;
> >
> >    return y;
> > }
> >
> > compiled with gcc 4.4.0, using -Os generates this:
> >
> > foo:
> >
> >    ldr    r3, .L8
> >    tst    r0, #128
> >    and    r0, r0, #127
> >    ldr    r3, [r3, r0, asl #2]
> >    ldrne  r0, .L8+4          ***
> >    ldreq  r0, .L8+4          ***
> >    movne  r3, r3, asr #2
> >    andne  r0, r3, r0         ***
> >    andeq  r0, r3, r0         ***
> >    bx     lr
> > .L8:
> >    .word  array
> >    .word  123456789
> >
> > The lines marked with *** do the same thing, one executing if the
> > condition is one way, the other if the condition is the opposite.
> > That is, together they perform one unconditional instruction, except
> > that they use two instructions (and clocks) instead of one.
> >
> > Compiling with -O2 makes things even worse, because another issue hits:
> > gcc sometimes changes a "load constant" to a "generate the constant on
> > the fly" even when the latter is both slower and larger, other times it
> > chooses to load a constant even when it can easily (and more cheaply)
> > generate it from already available values.
> > In this particular case it decides to build the constant from pieces
> > and combines that with the "generate an unconditional instruction
> > using two complementary conditional instructions" method, resulting
> > in this:
> >
> > foo:
> >    ldr    r3, .L8
> >    tst    r0, #128
> >    and    r0, r0, #127
> >    ldr    r0, [r3, r0, asl #2]
> >    movne  r0, r0, asr #2
> >    bicne  r0, r0, #-134217728
> >    biceq  r0, r0, #-134217728
> >    bicne  r0, r0, #10747904
> >    biceq  r0, r0, #10747904
> >    bicne  r0, r0, #12992
> >    biceq  r0, r0, #12992
> >    bicne  r0, r0, #42
> >    biceq  r0, r0, #42
> >    bx     lr
> > .L8:
> >    .word  array
> >
> > Should I report a bug?
>
> This looks like my bug PR21803 (gcc.gnu.org/PR21803). Can you check if
> the ce3 pass creates this code? (Compile with -fdump-rtl-all and look
> at the .ce3 dump and one dump before to see if the .ce3 pass created
> your funny sequence.)
>
> If your problem is indeed caused by the ce3 pass, you should add your
> problem to PR21803, change the "Component" field to "middle-end", and
> adjust the bug summary to make it clear that this is not ia64
> specific.

Oh, and you may also want to try my patch "crossjump_abstract.diff" in
PR20070, it solves problems like yours sometimes (if the sequence is
just right) by crossjumping earlier.

Ciao!
Steven
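
For reference, a minimal C sketch of the two observations in the quoted
report. It assumes the AND with 123456789 can simply be moved out of both
arms of the if, and it checks that the four BIC immediates in the -O2
listing clear exactly the bits that are zero in 123456789. The names
foo_hoisted and bic_chain are hypothetical, and the array definition is
only a stand-in for the extern array in the report. This illustrates the
source-level equivalence only; it says nothing about what the ce3 pass
does internally.

```c
#include <stdint.h>

/* Stand-in for the extern array in the quoted report (assumption:
 * the contents do not matter for the equivalence shown here). */
unsigned array[128];

/* foo() as written in the quoted message. */
int foo(int x)
{
    int y = array[x & 127];

    if (x & 128)
        y = 123456789 & (y >> 2);
    else
        y = 123456789 & y;

    return y;
}

/* Hypothetical hand-hoisted variant: only the shift is conditional,
 * the AND is performed once for both paths. */
int foo_hoisted(int x)
{
    int y = array[x & 127];

    if (x & 128)
        y >>= 2;

    return y & 123456789;
}

/* Apply the four BIC immediates from the -O2 listing in order.
 * BIC clears the bits that are set in its operand. */
uint32_t bic_chain(uint32_t v)
{
    v &= ~UINT32_C(0xF8000000); /* bic #-134217728 */
    v &= ~UINT32_C(0x00A40000); /* bic #10747904   */
    v &= ~UINT32_C(0x000032C0); /* bic #12992      */
    v &= ~UINT32_C(0x0000002A); /* bic #42         */
    return v;
}
```

The four masks OR together to 0xF8A432EA, which is the bitwise complement
of 123456789 (0x075BCD15), so bic_chain(v) equals v & 123456789 for any
32-bit v. Each bicne/biceq pair in the listing therefore amounts to one
unconditional BIC.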