public inbox for gcc-bugs@sourceware.org
* [Bug target/53938] New: ARM target generates sub-optimal code (extra instructions) on load from memory
@ 2012-07-12 10:58 gregpsmith at live dot co.uk
  2012-07-12 11:27 ` [Bug target/53938] " rguenth at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: gregpsmith at live dot co.uk @ 2012-07-12 10:58 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938

             Bug #: 53938
           Summary: ARM target generates sub-optimal code (extra
                    instructions) on load from memory
    Classification: Unclassified
           Product: gcc
           Version: 4.6.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: gregpsmith@live.co.uk


Created attachment 27781
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27781
Example C source code

We are targeting an embedded device and we do a lot of work accessing an FPGA
(but this applies just as well to memory access). It has annoyed me for years
that the GCC compiler emits unnecessary code, wasting memory and cycles when
reading 8 and 16-bit values.

The attached source shows opportunities to generate better code. When compiled
with:

gcc -c -O3 -mcpu=arm946e-s codegen.c

It compiles to (I have added comments):

<DeviceAccess>
    mov   r2, #0xE0000000  // base address of the device
    ldrb  r1, [r2]         // load an unsigned byte, 0 extend
    ldrb  r12, [r2]        // load signed byte - WHY NOT ldrsb?
    and   r1, r1, #0xFF    // WHAT IS THIS FOR
    ldrh  r3, [r2]         // load unsigned short
    tst   r1, #0x80        // if (i & 0x80)
    movne r1, #0           //     i = 0
    lsl   r12, r12, #24    // sign extend j (but could be avoided)
    tst   r3, #0x80        // if (k & 0x80)
    ldrh  r0, [r2]         // load signed short - WHY NOT ldrsh?
    movne r3, #0           //     k = 0
    add   r1, r1, r12, asr #24 // add sign extended
    add   r3, r1, r3
    lsl   r0, r0, #16      // sign extend l
    add   r0, r3, r0, asr #16
    bx lr
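
The attachment itself is not reproduced in this message. Going only by the
comments in the listing above, the source is presumably along the lines of the
sketch below. The function name `device_access` and the `base` pointer
parameter are assumptions, added so that the sketch can be run on a host; the
reported code reads from the fixed device address 0xE0000000 instead.

```c
#include <stdint.h>

/* Hypothetical reconstruction from the assembly comments, not the
 * attached file. `base` stands in for the fixed device address
 * 0xE0000000 so the function can be exercised on a host machine. */
int device_access(volatile void *base)
{
    int i = *(volatile uint8_t  *)base;  /* unsigned byte:  ldrb  */
    int j = *(volatile int8_t   *)base;  /* signed byte:    ideally ldrsb */
    int k = *(volatile uint16_t *)base;  /* unsigned short: ldrh  */
    int l = *(volatile int16_t  *)base;  /* signed short:   ideally ldrsh */

    if (i & 0x80)
        i = 0;
    if (k & 0x80)
        k = 0;

    return i + j + k + l;
}
```

Each access is volatile, so four separate loads are expected in the output;
the question in this report is only which load instruction is chosen and what
follows it.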

There are two issues:

1) The "and r1, r1, #0xFF" is completely redundant, since ldrb has already
zero-extended the byte. This does not occur when loading the unsigned short
(which is why I have the similar code for loading an unsigned short).
2) There is unnecessary sign extension taking place. ARM has allowed signed
loads of 8 and 16-bit values (ldrsb, ldrsh) since v4. Spotting this has to be
opportunistic as there are offset restrictions.
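
To make the two points concrete, here is a small host-side sketch of what the
extra instructions compute. All function names are hypothetical. It relies on
GCC's documented behaviour for two implementation-defined operations:
converting an out-of-range unsigned value to a signed type wraps, and right
shifting a negative signed value is an arithmetic shift.

```c
#include <stdint.h>

/* Equivalent of "lsl #24" then "asr #24" on a zero-extended byte. */
int32_t sext8_by_shifts(uint32_t zero_extended)
{
    return (int32_t)(zero_extended << 24) >> 24;
}

/* Equivalent of "lsl #16" then "asr #16" on a zero-extended halfword. */
int32_t sext16_by_shifts(uint32_t zero_extended)
{
    return (int32_t)(zero_extended << 16) >> 16;
}

/* What a sign-extending load (ldrsb / ldrsh) yields in one step. */
int32_t sext8_by_load(const volatile int8_t *p)   { return *p; }
int32_t sext16_by_load(const volatile int16_t *p) { return *p; }

/* Equivalent of "ldrb" followed by "and #0xFF": the mask cannot
 * change a value that is already in the range 0..255. */
uint32_t mask_after_byte_load(const volatile uint8_t *p)
{
    return (uint32_t)*p & 0xFFu;
}
```

For every byte value, the shift pair and the signed load agree, and the mask
returns its input unchanged, which is why both sequences can be dropped when a
sign-extending load is usable.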

Ideally the code would look like:
    mov   r2, #0xE0000000 // base address of the device
    ldrb  r1, [r2]        // load an unsigned byte, 0 extend
    ldrsb r12, [r2]       // load signed byte
    ldrh  r3, [r2]        // load unsigned short
    tst   r1, #0x80       // if (i & 0x80)
    movne r1, #0          //     i = 0
    tst   r3, #0x80       // if (k & 0x80)
    ldrsh r0, [r2]        // load signed short, extend to 32-bits
    movne r3, #0          //     k = 0
    add   r1, r1, r12     // add sign extended
    add   r3, r1, r3
    add   r0, r3, r0
    bx lr



Thread overview: 7+ messages
2012-07-12 10:58 [Bug target/53938] New: ARM target generates sub-optimal code (extra instructions) on load from memory gregpsmith at live dot co.uk
2012-07-12 11:27 ` [Bug target/53938] " rguenth at gcc dot gnu.org
2012-07-12 16:10 ` pinskia at gcc dot gnu.org
2012-07-12 19:09 ` gregpsmith at live dot co.uk
2013-08-05 21:20 ` rearnsha at gcc dot gnu.org
2024-01-16 22:13 ` pinskia at gcc dot gnu.org
2024-04-24 13:27 ` rsaxvc at gmail dot com
