public inbox for gcc-bugs@sourceware.org
* [Bug target/53938] New: ARM target generates sub-optimal code (extra instructions) on load from memory
@ 2012-07-12 10:58 gregpsmith at live dot co.uk
2012-07-12 11:27 ` [Bug target/53938] " rguenth at gcc dot gnu.org
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: gregpsmith at live dot co.uk @ 2012-07-12 10:58 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938
Bug #: 53938
Summary: ARM target generates sub-optimal code (extra
instructions) on load from memory
Classification: Unclassified
Product: gcc
Version: 4.6.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: gregpsmith@live.co.uk
Created attachment 27781
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27781
Example C source code
We are targeting an embedded device and do a lot of work accessing an FPGA
(though this applies just as well to memory access). It has annoyed me for years
that GCC emits unnecessary code, wasting memory and cycles, when
reading 8- and 16-bit values.
The attached source file shows opportunities to generate better code. When
compiled with:
gcc -c -O3 -mcpu=arm946e-s codegen.c
it compiles to (I have added comments):
<DeviceAccess>
mov r2, #0xE0000000 // base address of the device
ldrb r1, [r2] // load an unsigned byte, 0 extend
ldrb r12, [r2] // load signed byte - WHY NOT ldrsb?
and r1, r1, #0xFF // WHAT IS THIS FOR
ldrh r3, [r2] // load unsigned short
tst r1, #0x80 // if (i & 0x80)
movne r1, #0 // i = 0
lsl r12, r12, #24 // sign extend j (but could be avoided)
tst r3, #0x80 // if (k & 0x80)
ldrh r0, [r2] // load signed short - WHY NOT ldrsh?
movne r3, #0 // k = 0
add r1, r1, r12, asr #24 // add sign extended
add r3, r1, r3
lsl r0, r0, #16 // sign extend l
add r0, r3, r0, asr #16
bx lr
There are two issues:
1) There is a completely redundant and r1,r1,#0xff. This does not occur when
loading the unsigned short (which is why I have the similar code for loading an
unsigned short).
2) There is unnecessary sign extension taking place. ARM has supported signed
loads of 8- and 16-bit values since v4. Spotting this has to be opportunistic as
there are offset restrictions.
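To make issue 2 concrete, here is a minimal host-side sketch in C of why the lsl/asr pair is equivalent to a sign-extending load. The function names are hypothetical and only model the instructions. The sketch assumes two's-complement conversion and an arithmetic right shift on signed values; both are implementation-defined in ISO C, but they are what GCC provides.

```c
#include <assert.h>
#include <stdint.h>

/* Models "ldrb rX, [p]": load a byte and zero-extend it to 32 bits. */
static uint32_t load_zero_extended(const volatile uint8_t *p)
{
    return *p;
}

/* Models "lsl rX, rX, #24" followed by an "asr #24" operand shift.
 * Assumption: the unsigned-to-signed conversion wraps and >> on a
 * negative int32_t is an arithmetic shift (true for GCC). */
static int32_t sign_extend_by_shifts(uint32_t r)
{
    return (int32_t)(r << 24) >> 24;
}

/* Models "ldrsb rX, [p]": load a byte and sign-extend it in one step. */
static int32_t load_sign_extended(const volatile int8_t *p)
{
    return *p;
}

/* Checks that both sequences agree for every possible byte value. */
void check_shift_pair_matches_ldrsb(void)
{
    for (int v = 0; v < 256; v++) {
        volatile uint8_t cell = (uint8_t)v;
        uint32_t zext = load_zero_extended(&cell);
        int32_t sext = load_sign_extended((const volatile int8_t *)&cell);
        assert(sign_extend_by_shifts(zext) == sext);
    }
}
```

Since the results agree for all 256 byte values, replacing ldrb + lsl + asr with a single ldrsb would not change the value computed; the same argument applies to ldrh and ldrsh with shifts of 16.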
Ideally the code would look like:
mov r2, #0xE0000000 // base address of the device
ldrb r1, [r2] // load an unsigned byte, 0 extend
ldrsb r12, [r2] // load signed byte
ldrh r3, [r2] // load unsigned short
tst r1, #0x80 // if (i & 0x80)
movne r1, #0 // i = 0
tst r3, #0x80 // if (k & 0x80)
ldrsh r0, [r2] // load signed short, extend to 32-bits
movne r3, #0 // k = 0
add r1, r1, r12 // add sign extended
add r3, r1, r3
add r0, r3, r0
bx lr
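The attachment itself is not reproduced in this thread. The following is a hypothetical reconstruction in C, inferred only from the comments in the listings (variables i, j, k, l and the 0x80 tests) and from the `union io` member `uch` visible in the RTL dump in comment #4. The other member names, the parameterised device pointer and the self-test are assumptions added so the sketch can run on a host. It also assumes 8-bit char and 16-bit short.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical register layout: all four members overlay the same address,
 * matching the listings, where every load uses [r2] with no offset.
 * Only "uch" appears in the thread; the other names are assumptions. */
union io {
    unsigned char  uch;
    signed char    sch;
    unsigned short ush;
    signed short   ssh;
};

/* On the target this would be called with (volatile union io *)0xE0000000. */
int DeviceAccess(volatile union io *dev)
{
    int i = dev->uch;   /* ldrb                                     */
    int j = dev->sch;   /* ldrb + lsl/asr today, ideally ldrsb      */
    int k = dev->ush;   /* ldrh                                     */
    int l = dev->ssh;   /* ldrh + lsl/asr today, ideally ldrsh      */

    if (i & 0x80)
        i = 0;
    if (k & 0x80)
        k = 0;
    return i + j + k + l;
}

/* Host-side check against an ordinary object standing in for the device. */
void DeviceAccessSelfTest(void)
{
    union io fake;

    memset(&fake, 0x01, sizeof fake);   /* uch = sch = 1, ush = ssh = 0x0101 */
    assert(DeviceAccess(&fake) == 1 + 1 + 0x0101 + 0x0101);

    memset(&fake, 0xFF, sizeof fake);   /* i and k are cleared; j = l = -1 */
    assert(DeviceAccess(&fake) == -2);
}
```

The volatile qualifier on the pointer is the relevant detail: it forces one load per access, and later comments in the thread attribute the missed optimisation to it.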
* [Bug target/53938] ARM target generates sub-optimal code (extra instructions) on load from memory
From: rguenth at gcc dot gnu.org @ 2012-07-12 11:27 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938
Richard Guenther <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |arm*-*-*
Status|UNCONFIRMED |WAITING
Last reconfirmed| |2012-07-12
Ever Confirmed|0 |1
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-07-12 11:27:07 UTC ---
Can you verify if the situation improves with GCC 4.7 or current development
trunk?
* [Bug target/53938] ARM target generates sub-optimal code (extra instructions) on load from memory
From: pinskia at gcc dot gnu.org @ 2012-07-12 16:10 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> 2012-07-12 16:10:09 UTC ---
I think this is the standard volatile vs combine issue.
* [Bug target/53938] ARM target generates sub-optimal code (extra instructions) on load from memory
From: gregpsmith at live dot co.uk @ 2012-07-12 19:09 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938
--- Comment #3 from Greg Smith <gregpsmith at live dot co.uk> 2012-07-12 19:09:41 UTC ---
(In reply to comment #1)
> Can you verify if the situation improves with GCC 4.7 or current development
> trunk?
I am an end user of the Rowley CrossWorks system, and they have not moved to 4.7
yet. I am not really set up to conveniently build my own ARM cross compiler
(from Windows), though this is not impossible.
However, I see nothing in the 4.7.0 and 4.7.1 release notes to suggest that any
changes have been made in this area. I recall seeing this type of code
generation for several years...
* [Bug target/53938] ARM target generates sub-optimal code (extra instructions) on load from memory
From: rearnsha at gcc dot gnu.org @ 2013-08-05 21:20 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938
Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|WAITING |NEW
--- Comment #4 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
Trunk as of this weekend still generates:
mov r3, #-536870912
ldrb r1, [r3] @ zero_extendqisi2
ldrb ip, [r3] @ zero_extendqisi2
and r1, r1, #255
ldrh r2, [r3]
tst r1, #128
movne r1, #0
tst r2, #128
movne r2, #0
mov ip, ip, asl #24
ldrh r0, [r3]
add r1, r1, ip, asr #24
add r2, r1, r2
mov r0, r0, asl #16
add r0, r2, r0, asr #16
bx lr
The real problem is that the RTL expansion passes never generate zero- or
sign-extended values directly. They expect combine to pick this up.
Unfortunately, combine won't touch a memory access that is volatile.
What does still surprise me is that we fail to eliminate the zero-extend
operation. After expand we have:
(insn 8 7 9 (set (reg:SI 126)
(zero_extend:SI (mem/v:QI (reg/f:SI 124) [0 MEM[(union io
*)3758096384B].uch+0 S1 A64]))) test.c:30 -1
(nil))
(insn 9 8 10 (set (reg:QI 125)
(subreg:QI (reg:SI 126) 0)) test.c:30 -1
(nil))
(insn 10 9 0 (set (reg/v:SI 111 [ i ])
(and:SI (subreg:SI (reg:QI 125) 0)
(const_int 255 [0xff]))) test.c:30 -1
(nil))
I would have expected at the very least that some pass would have worked out
that regs 126 and 111 are equivalent.
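As an illustration of that equivalence, the three insns can be modelled on a host in C. This is only a sketch: the function names are hypothetical, and the model simplifies the paradoxical `subreg:SI` of a QI register (whose upper bits are undefined in RTL) by carrying the value in a `uint8_t`.

```c
#include <assert.h>
#include <stdint.h>

/* insn 8: (set (reg:SI 126) (zero_extend:SI (mem/v:QI ...))) */
static uint32_t insn8_zero_extend_load(const volatile uint8_t *mem)
{
    return (uint32_t)*mem;
}

/* insn 9: (set (reg:QI 125) (subreg:QI (reg:SI 126) 0)), keep the low byte */
static uint8_t insn9_low_byte(uint32_t reg126)
{
    return (uint8_t)reg126;
}

/* insn 10: (set (reg:SI 111) (and:SI (subreg:SI (reg:QI 125) 0) 255)) */
static uint32_t insn10_mask(uint8_t reg125)
{
    return (uint32_t)reg125 & 0xFFu;
}

/* Checks that reg 111 always ends up equal to reg 126. */
void check_reg111_equals_reg126(void)
{
    for (unsigned v = 0; v < 256; v++) {
        volatile uint8_t cell = (uint8_t)v;
        uint32_t reg126 = insn8_zero_extend_load(&cell);
        uint32_t reg111 = insn10_mask(insn9_low_byte(reg126));
        assert(reg111 == reg126);
    }
}
```

Because reg 126 already holds a zero-extended byte, truncating it and masking with 255 reproduces the same value, so the `and` contributes nothing.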
* [Bug target/53938] ARM target generates sub-optimal code (extra instructions) on load from memory
From: pinskia at gcc dot gnu.org @ 2024-01-16 22:13 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rsaxvc at gmail dot com
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 113432 has been marked as a duplicate of this bug. ***
* [Bug target/53938] ARM target generates sub-optimal code (extra instructions) on load from memory
From: rsaxvc at gmail dot com @ 2024-04-24 13:27 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53938
--- Comment #6 from rsaxvc at gmail dot com ---
This also affects Cortex-M0 and M23 on GCC 13.2.0, just with the new extension
instructions.
Oddly, loading a volatile u8 or u16 on Cortex-M3/4/7 does not generate
extra zero-extension instructions. But these cores still use a separate
ldrb/ldrh + sxtab/sxtah sign extension instead of LDRSB/LDRSH.