public inbox for gcc@gcc.gnu.org
* ARM conditional instruction optimisation bug (feature?)
@ 2009-07-30 12:51 Zoltán Kócsi
  2009-07-30 15:59 ` Steven Bosscher
  0 siblings, 1 reply; 3+ messages in thread
From: Zoltán Kócsi @ 2009-07-30 12:51 UTC
  To: gcc

On the ARM (almost) every instruction can be executed conditionally.
GCC uses this feature very cleverly:

int bar ( int x, int a, int b )
{
   if ( x )
      return a;
   else
      return b;
}

compiles to:

bar:
        cmp     r0, #0		// test x
        movne   r0, r1		// retval = 'a' if !0 ('ne')
        moveq   r0, r2 		// retval = 'b' if 0 ('eq')
        bx      lr

However, the following function:

extern unsigned array[ 128 ];

int     foo( int x )
{
   int     y;

   y = array[ x & 127 ];

   if ( x & 128 )
      y = 123456789 & ( y >> 2 );
   else
      y = 123456789 & y;

   return y;
}

compiled with gcc 4.4.0, using -Os generates this:

foo:
        ldr     r3, .L8
        tst     r0, #128
        and     r0, r0, #127
        ldr     r3, [r3, r0, asl #2]
        ldrne   r0, .L8+4            ***
        ldreq   r0, .L8+4            ***
        movne   r3, r3, asr #2
        andne   r0, r3, r0           ***
        andeq   r0, r3, r0           ***
        bx      lr
.L8:
        .word   array
        .word   123456789

The lines marked with *** come in pairs that do the same thing, one
executing if the condition holds, the other if it does not. That is,
each pair performs one unconditional instruction, except that it
uses two instructions (and clocks) instead of one.
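
At the source level, the expected code corresponds to doing the AND once,
after the conditional shift. The following is only a minimal sketch of that
equivalence, not anything taken from GCC: the names foo_original, foo_hoisted
and check_equivalence are hypothetical, and the definition and contents of
array are made up for the check, since the original only declares it extern.

```c
#include <assert.h>

/* Stand-in definition; the report only has `extern unsigned array[ 128 ]`. */
unsigned array[128];

/* The function from the report, behaviour unchanged. */
int foo_original(int x)
{
    int y = array[x & 127];

    if (x & 128)
        y = 123456789 & (y >> 2);
    else
        y = 123456789 & y;

    return y;
}

/* Hypothetical rewrite: the AND common to both branches is done once. */
int foo_hoisted(int x)
{
    int y = array[x & 127];

    if (x & 128)
        y >>= 2;            /* the only truly conditional operation */

    return 123456789 & y;   /* unconditional in both cases */
}

/* Fill the table with an arbitrary pattern and compare both versions.
   Returns the number of inputs checked. */
int check_equivalence(void)
{
    int checked = 0;

    for (unsigned i = 0; i < 128; i++)
        array[i] = i * 2654435761u;   /* arbitrary test data */

    for (int x = 0; x < 512; x++) {
        assert(foo_original(x) == foo_hoisted(x));
        checked++;
    }
    return checked;
}
```

In the hoisted form only the shift depends on the condition, which matches
the single movne in the listing; the load of the constant and the AND need
no condition at all.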

Compiling with -O2 makes things even worse, because another issue hits:
gcc sometimes changes a "load constant" into a "generate the constant on
the fly" even when the latter is both slower and larger; at other times
it chooses to load a constant even when it could easily (and more
cheaply) generate it from already available values. In this particular
case it decides to build the constant from pieces, and combines that
with the trick of emitting an unconditional instruction as two
complementary conditional instructions, resulting in this:

foo:
        ldr     r3, .L8
        tst     r0, #128
        and     r0, r0, #127
        ldr     r0, [r3, r0, asl #2]
        movne   r0, r0, asr #2
        bicne   r0, r0, #-134217728
        biceq   r0, r0, #-134217728
        bicne   r0, r0, #10747904
        biceq   r0, r0, #10747904
        bicne   r0, r0, #12992
        biceq   r0, r0, #12992
        bicne   r0, r0, #42
        biceq   r0, r0, #42
        bx      lr
.L8:
        .word   array

Should I report a bug?

Thanks,

Zoltan


* Re: ARM conditional instruction optimisation bug (feature?)
  2009-07-30 12:51 ARM conditional instruction optimisation bug (feature?) Zoltán Kócsi
@ 2009-07-30 15:59 ` Steven Bosscher
  2009-07-30 16:01   ` Steven Bosscher
  0 siblings, 1 reply; 3+ messages in thread
From: Steven Bosscher @ 2009-07-30 15:59 UTC
  To: Zoltán Kócsi; +Cc: gcc

On 7/30/09, Zoltán Kócsi <zoltan@bendor.com.au> wrote:
> Should I report a bug?

This looks like my bug PR21803 (gcc.gnu.org/PR21803). Can you check
whether the ce3 pass creates this code? (Compile with -fdump-rtl-all and
compare the .ce3 dump with the dump just before it to see whether the
.ce3 pass created your funny sequence.)

If your problem is indeed caused by the ce3 pass, you should add your
problem to PR21803, change the "Component" field to "middle-end", and
adjust the bug summary to make it clear that this is not ia64
specific.

Ciao!
Steven


* Re: ARM conditional instruction optimisation bug (feature?)
  2009-07-30 15:59 ` Steven Bosscher
@ 2009-07-30 16:01   ` Steven Bosscher
  0 siblings, 0 replies; 3+ messages in thread
From: Steven Bosscher @ 2009-07-30 16:01 UTC
  To: Zoltán Kócsi; +Cc: gcc

On 7/30/09, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On 7/30/09, Zoltán Kócsi <zoltan@bendor.com.au> wrote:
> > Should I report a bug?
>
> This looks like my bug PR21803 (gcc.gnu.org/PR21803). Can you check if
> the ce3 pass creates this code? (Compile with -fdump-rtl-all and look
> at the .ce3 dump and one dump before to see if the .ce3 pass created
> your funny sequence.)
>
> If your problem is indeed caused by the ce3 pass, you should add your
> problem to PR21803, change the "Component" field to "middle-end", and
> adjust the bug summary to make it clear that this is not ia64
> specific.

Oh, and you may also want to try my patch "crossjump_abstract.diff" in
PR20070, it solves problems like yours sometimes (if the sequence is
just right) by crossjumping earlier.

Ciao!
Steven
