public inbox for gcc-bugs@sourceware.org
* [Bug target/47855] New: missed cbnz optimization
From: carrot at google dot com @ 2011-02-23 11:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
Summary: missed cbnz optimization
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: carrot@google.com
Target: arm-linux-androideabi
Created attachment 23440
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23440
testcase
When the attached source code is compiled with -march=armv7-a -mthumb -Os,
GCC 4.6 generates:
pnm_gethdr:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
push {r0, r1, r2, r4, r5, lr}
mov r5, r0
mov r4, r1
bl foo2
cmp r0, #0 // A
bne .L13 // B
adds r1, r4, #4
mov r0, r5
bl foo3
cmp r0, #0 // C
bne .L13 // D
add r1, r4, #8
mov r0, r5
bl foo1
cbnz r0, .L13 // E
ldr r0, [r4, #0]
bl pnm_type
cmp r0, #2
beq .L3
mov r0, r5
add r1, sp, #4
bl pnm_getsintstr
cbz r0, .L4
b .L13
.L3:
movs r3, #1
str r3, [sp, #4]
.L4:
ldr r3, [sp, #4]
cmp r3, #0
bge .L5
negs r3, r3
str r3, [r4, #16]
movs r3, #1
b .L14
.L5:
str r3, [r4, #16]
movs r3, #0
.L14:
strb r3, [r4, #20]
ldr r0, [r4, #0]
bl pnm_type
cmp r0, #0
beq .L8
blt .L7
cmp r0, #2
bgt .L7
movs r3, #1
movs r0, #0
str r3, [r4, #12]
b .L2
.L8:
movs r3, #3
str r3, [r4, #12]
b .L2
.L7:
bl abort
.L13:
mov r0, #-1
.L2:
pop {r1, r2, r3, r4, r5, pc}
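The branch range of cbz/cbnz is 126 bytes (forward only), and the whole
function is only 124 bytes long, so every forward branch inside it is in range.
The instruction pairs A/B and C/D can therefore each be replaced by
cbnz r0, .L13
exactly as was done for instruction E.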
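The range argument can be checked with a minimal sketch in Python. It assumes the architectural rule that cbz/cbnz encode an even, forward-only offset of 0 to 126 bytes relative to the Thumb PC (instruction address plus 4). The helper names are illustrative only, and the addresses are the ones shown in the assembler listing posted later in this thread (built with a newer compiler, so treat them as indicative).

```python
# Sketch: can a Thumb-2 CBZ/CBNZ placed at `insn_addr` reach `target_addr`?
# Assumption: the encoded offset is even, forward-only, 0..126 bytes,
# and is relative to the Thumb PC value (instruction address + 4).

CB_MAX_OFFSET = 126  # largest encodable offset in bytes


def cb_offset(insn_addr: int, target_addr: int) -> int:
    """Offset that would have to be encoded, relative to PC = insn_addr + 4."""
    return target_addr - (insn_addr + 4)


def cb_in_range(insn_addr: int, target_addr: int) -> bool:
    """True if the offset is non-negative, even, and at most 126 bytes."""
    off = cb_offset(insn_addr, target_addr)
    return 0 <= off <= CB_MAX_OFFSET and off % 2 == 0


# .L13 sits at 0x76 in the listing; the three tests of r0 are at
# 0x0a (A), 0x16 (C) and 0x24 (E, already a cbnz).
L13 = 0x76
for name, addr in (("A", 0x0A), ("C", 0x16), ("E", 0x24)):
    print(name, cb_offset(addr, L13), cb_in_range(addr, L13))
```

Replacing a cmp/bne pair with a single cbnz removes two bytes, so the targets only move closer and the check stays valid after the rewrite.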
* [Bug target/47855] missed cbnz optimization
From: carrot at google dot com @ 2011-02-25 9:49 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
--- Comment #1 from Carrot <carrot at google dot com> 2011-02-25 07:18:07 UTC ---
I printed out the address of each instruction from within function arm_reorg():
id=173 addr=0
id=2 addr=4
id=3 addr=8
id=15 addr=12
id=18 addr=16
id=199 addr=24
id=21 addr=26
id=23 addr=30
id=26 addr=34
id=28 addr=42
id=29 addr=46
id=31 addr=50
id=34 addr=54
id=36 addr=62
id=37 addr=66
id=39 addr=70
id=40 addr=74
id=43 addr=78
id=44 addr=82
id=45 addr=86
id=48 addr=90
id=201 addr=98
id=198 addr=102
id=55 addr=104
id=58 addr=108
id=59 addr=112
id=60 addr=116
id=196 addr=120
id=63 addr=122
id=197 addr=126
id=203 addr=128
id=71 addr=132
id=195 addr=136
id=74 addr=138
id=77 addr=142
id=78 addr=146
id=80 addr=150
id=81 addr=154
id=83 addr=158
id=84 addr=162
id=85 addr=166
id=193 addr=170
id=194 addr=172
id=93 addr=174
id=205 addr=178
id=192 addr=182
id=99 addr=184
id=207 addr=188
id=104 addr=192
id=6 addr=196
id=115 addr=200
id=176 addr=200
The GCC computed function length is more than 200 bytes, much larger than the
actual 126 bytes. Apparently there are a lot of length attribute errors! Take
the second instruction as an example,
(insn:TI 2 174 3 2 (set (reg/v/f:SI 5 r5 [orig:149 in ] [149])
        (reg:SI 0 r0 [ in ])) src/t08.c:19 694 {*thumb2_movsi_insn}
     (nil))
or
mov r5, r0
It should have a length of 2. The matched insn pattern is
(define_insn "*thumb2_movsi_insn"
  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*hk,m,*m")
        (match_operand:SI 1 "general_operand" "rk ,I,K,j,mi,*mi,l,*hk"))]
  "TARGET_THUMB2 && ! TARGET_IWMMXT
   && !(TARGET_HARD_FLOAT && TARGET_VFP)
   && ( register_operand (operands[0], SImode)
       || register_operand (operands[1], SImode))"
  "@
   mov%?\\t%0, %1
   mov%?\\t%0, %1
   mvn%?\\t%0, #%B1
   movw%?\\t%0, %1
   ldr%?\\t%0, %1
   ldr%?\\t%0, %1
   str%?\\t%1, %0
   str%?\\t%1, %0"
  [(set_attr "type" "*,*,*,*,load1,load1,store1,store1")
   (set_attr "predicable" "yes")
   (set_attr "pool_range" "*,*,*,*,1020,4096,*,*")
   (set_attr "neg_pool_range" "*,*,*,*,0,0,*,*")]
)
It doesn't specify length, so gcc takes the default value of 4.
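The length GCC assumed for each instruction can be read back from the address dump above by differencing consecutive addresses. The following Python sketch does that for the first few entries; the function name is illustrative and only a prefix of the dump is included.

```python
# First entries of the arm_reorg() address dump quoted above: (insn id, address).
dump = [(173, 0), (2, 4), (3, 8), (15, 12), (18, 16), (199, 24), (21, 26)]


def assumed_lengths(entries):
    """Assumed length of each insn = address of the next insn - its own address.

    The last entry has no successor, so it is not reported.
    """
    return [(uid, nxt - addr)
            for (uid, addr), (_, nxt) in zip(entries, entries[1:])]


lengths = assumed_lengths(dump)
for uid, n in lengths:
    print(f"insn {uid}: assumed length {n}")
```

Insn 2 (`mov r5, r0`) comes out as 4 bytes, which is the default length, although the instruction has a 2-byte Thumb encoding.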
From this test case I can find at least the following patterns with a wrong
length attribute:
push_multi
thumb2_movsi_insn
thumb2_cbnz
thumb2_cbz
arm_cmpsi_insn
arm_cond_branch
arm_addsi3
arm_jump
arm_movqi_insn
* [Bug target/47855] missed cbnz optimization
From: steven at gcc dot gnu.org @ 2011-03-26 16:56 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
Steven Bosscher <steven at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |steven at gcc dot gnu.org
--- Comment #2 from Steven Bosscher <steven at gcc dot gnu.org> 2011-03-26 16:06:21 UTC ---
If you compile with the -dAP option and look at the .s output of gcc, you can
see for each asm instruction which insn it came from and what length gcc
assumed for it.
* [Bug target/47855] missed cbnz optimization
From: steven at gcc dot gnu.org @ 2011-03-26 17:54 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
Steven Bosscher <steven at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target|arm-linux-androideabi |arm-*
Status|UNCONFIRMED |NEW
Last reconfirmed| |2011.03.26 17:15:33
Ever Confirmed|0 |1
--- Comment #3 from Steven Bosscher <steven at gcc dot gnu.org> 2011-03-26 17:15:33 UTC ---
Confirmed. Many of the lengths are wrong, even for simple instructions.
$ arm-eabi-gcc -c -Os -march=armv7-a -mthumb -dAp -Wa,-ahlms=t.lst t.c
$ cat t.lst
ARM GAS /tmp/ccLnp6gX.s page 1
1 .syntax unified
2 .arch armv7-a
3 .eabi_attribute 27, 3
4 .fpu neon
5 .eabi_attribute 20, 1
6 .eabi_attribute 21, 1
7 .eabi_attribute 23, 3
8 .eabi_attribute 24, 1
9 .eabi_attribute 25, 1
10 .eabi_attribute 26, 1
11 .eabi_attribute 30, 4
12 .eabi_attribute 18, 4
13 .file "t.c"
14 .text
15 .align 1
16 .global pnm_gethdr
17 .thumb
18 .thumb_func
19 .type pnm_gethdr, %function
20 pnm_gethdr:
21 @ args = 0, pretend = 0, frame = 8
22 @ frame_needed = 0, uses_anonymous_args = 0
23 @ basic block 2
24 0000 37B5 push {r0, r1, r2, r4, r5, lr} @ 173 *push_multi [length = 4]
25 0002 0546 mov r5, r0 @ 2 *thumb2_movsi_vfp/1 [length = 4]
26 0004 0C46 mov r4, r1 @ 3 *thumb2_movsi_vfp/1 [length = 4]
27 0006 FFF7FEFF bl foo2 @ 15 *call_value_symbol [length = 4]
28 000a 0028 cmp r0, #0 @ 18 *thumb2_cbnz/1 [length = 8]
29 000c 33D1 bne .L13
30 @ basic block 3
31 000e 211D adds r1, r4, #4 @ 199 *thumb2_addsi_short/1 [length = 2]
32 0010 2846 mov r0, r5 @ 21 *thumb2_movsi_vfp/1 [length = 4]
33 0012 FFF7FEFF bl foo3 @ 23 *call_value_symbol [length = 4]
34 0016 0028 cmp r0, #0 @ 26 *thumb2_cbnz/1 [length = 8]
35 0018 2DD1 bne .L13
36 @ basic block 4
37 001a 04F10801 add r1, r4, #8 @ 28 *arm_addsi3/1 [length = 4]
38 001e 2846 mov r0, r5 @ 29 *thumb2_movsi_vfp/1 [length = 4]
39 0020 FFF7FEFF bl foo1 @ 31 *call_value_symbol [length = 4]
40 0024 38BB cbnz r0, .L13 @ 34 *thumb2_cbnz/1 [length = 2]
41 @ basic block 5
42 0026 2068 ldr r0, [r4, #0] @ 36 *thumb2_movsi_vfp/5 [length = 4]
43 0028 FFF7FEFF bl pnm_type @ 37 *call_value_symbol [length = 4]
44 002c 0228 cmp r0, #2 @ 39 *arm_cmpsi_insn/1 [length = 4]
45 002e 05D0 beq .L3 @ 40 *arm_cond_branch [length = 4]
46 @ basic block 6
47 0030 2846 mov r0, r5 @ 43 *thumb2_movsi_vfp/1 [length = 4]
48 0032 01A9 add r1, sp, #4 @ 44 *arm_addsi3/1 [length = 4]
49 0034 FFF7FEFF bl pnm_getsintstr @ 45 *call_value_symbol [length = 4]
50 0038 10B1 cbz r0, .L4 @ 48 *thumb2_cbz/1 [length = 2]
51 @ basic block 7
52 003a 1CE0 b .L13 @ 201 *arm_jump [length = 4]
53 .L3:
54 @ basic block 8
55 003c 0123 movs r3, #1 @ 198 *thumb2_movsi_shortim [length = 2]
56 003e 0193 str r3, [sp, #4] @ 55 *thumb2_movsi_vfp/7 [length = 4]
57 .L4:
58 @ basic block 9
59 0040 019B ldr r3, [sp, #4] @ 58 *thumb2_movsi_vfp/5 [length = 4]
60 0042 002B cmp r3, #0 @ 59 *arm_cmpsi_insn/1 [length = 4]
61 0044 03DA bge .L5 @ 60 *arm_cond_branch [length = 4]
62 @ basic block 10
63 0046 5B42 negs r3, r3 @ 196 *thumb2_negsi2_short [length = 2]
64 0048 2361 str r3, [r4, #16] @ 63 *thumb2_movsi_vfp/7 [length = 4]
65 004a 0123 movs r3, #1 @ 197 *thumb2_movsi_shortim [length = 2]
66 004c 01E0 b .L14 @ 203 *arm_jump [length = 4]
67 .L5:
68 @ basic block 11
69 004e 2361 str r3, [r4, #16] @ 71 *thumb2_movsi_vfp/7 [length = 4]
70 0050 0023 movs r3, #0 @ 195 *thumb2_movsi_shortim [length = 2]
71 .L14:
72 @ basic block 12
73 0052 2375 strb r3, [r4, #20] @ 74 *arm_movqi_insn/4 [length = 4]
74 0054 2068 ldr r0, [r4, #0] @ 77 *thumb2_movsi_vfp/5 [length = 4]
75 0056 FFF7FEFF bl pnm_type @ 78 *call_value_symbol [length = 4]
76 005a 0028 cmp r0, #0 @ 80 *arm_cmpsi_insn/1 [length = 4]
77 005c 06D0 beq .L8 @ 81 *arm_cond_branch [length = 4]
78 @ basic block 13
79 005e 08DB blt .L7 @ 83 *arm_cond_branch [length = 4]
80 @ basic block 14
81 0060 0228 cmp r0, #2 @ 84 *arm_cmpsi_insn/1 [length = 4]
82 0062 06DC bgt .L7 @ 85 *arm_cond_branch [length = 4]
83 @ basic block 15
84 0064 0123 movs r3, #1 @ 193 *thumb2_movsi_shortim [length = 2]
85 0066 0020 movs r0, #0 @ 194 *thumb2_movsi_shortim [length = 2]
86 0068 E360 str r3, [r4, #12] @ 93 *thumb2_movsi_vfp/7 [length = 4]
87 006a 06E0 b .L2 @ 205 *arm_jump [length = 4]
88 .L8:
89 @ basic block 16
90 006c 0323 movs r3, #3 @ 192 *thumb2_movsi_shortim [length = 2]
91 006e E360 str r3, [r4, #12] @ 99 *thumb2_movsi_vfp/7 [length = 4]
92 0070 03E0 b .L2 @ 207 *arm_jump [length = 4]
93 .L7:
94 @ basic block 17
95 0072 FFF7FEFF bl abort @ 104 *call_symbol [length = 4]
96 .L13:
97 @ basic block 18
98 0076 4FF0FF30 mov r0, #-1 @ 6 *thumb2_movsi_vfp/2 [length = 4]
99 .L2:
100 @ basic block 19
101 007a 3EBD pop {r1, r2, r3, r4, r5, pc}
102 .size pnm_gethdr, .-pnm_gethdr
103 .ident "GCC: (GNU) 4.7.0 20110326 (experimental) [trunk revision 171556]"
DEFINED SYMBOLS
*ABS*:0000000000000000 t.c
/tmp/ccLnp6gX.s:20 .text:0000000000000000 pnm_gethdr
/tmp/ccLnp6gX.s:24 .text:0000000000000000 $t
UNDEFINED SYMBOLS
foo2
foo3
foo1
pnm_type
pnm_getsintstr
abort
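The mismatches in a listing like this can be picked out mechanically. The Python sketch below compares the number of encoded bytes on each listing line with the `[length = N]` annotation. It assumes each instruction and its annotation sit on one line (wrapped lines would need joining first), and the regular expression and function name are illustrative, not part of any GCC or binutils tool. Only a handful of lines from the listing are included as sample input.

```python
import re

# One listing line: <line no> <4-digit address> <hex bytes> <asm> @ <uid> <pattern> [length = N]
LINE = re.compile(
    r"^\s*\d+\s+[0-9A-Fa-f]{4}\s+(?P<code>[0-9A-Fa-f]+)\s+.*"
    r"@\s*(?P<uid>\d+)\s+(?P<pattern>\S+)\s+\[length\s*=\s*(?P<len>\d+)\]"
)


def length_mismatches(lines):
    """Return (uid, pattern, assumed, actual) where the annotation is wrong."""
    rows = []
    for line in lines:
        m = LINE.match(line)
        if m is None:
            continue  # labels, comments, directives, unannotated insns
        actual = len(m["code"]) // 2  # two hex digits per encoded byte
        assumed = int(m["len"])
        if actual != assumed:
            rows.append((int(m["uid"]), m["pattern"], assumed, actual))
    return rows


sample = [
    "24 0000 37B5 push {r0, r1, r2, r4, r5, lr} @ 173 *push_multi [length = 4]",
    "25 0002 0546 mov r5, r0 @ 2 *thumb2_movsi_vfp/1 [length = 4]",
    "27 0006 FFF7FEFF bl foo2 @ 15 *call_value_symbol [length = 4]",
    "30 @ basic block 3",
    "31 000e 211D adds r1, r4, #4 @ 199 *thumb2_addsi_short/1 [length = 2]",
    "40 0024 38BB cbnz r0, .L13 @ 34 *thumb2_cbnz/1 [length = 2]",
    "44 002c 0228 cmp r0, #2 @ 39 *arm_cmpsi_insn/1 [length = 4]",
]

mismatches = length_mismatches(sample)
for uid, pattern, assumed, actual in mismatches:
    print(f"insn {uid} {pattern}: assumed {assumed}, encoded {actual}")
```

Patterns that expand to two instructions, such as the cmp/bne form of *thumb2_cbnz, span two listing lines and are not handled by this sketch.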
* [Bug target/47855] missed cbnz optimization
From: jye2 at gcc dot gnu.org @ 2011-09-22 7:21 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
--- Comment #4 from jye2 at gcc dot gnu.org 2011-09-22 06:41:49 UTC ---
Author: jye2
Date: Thu Sep 22 06:41:44 2011
New Revision: 179077
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179077
Log:
2011-09-22 Joey Ye <joey.ye@arm.com>
Backport r178852 from mainline
2011-09-14 Julian Brown <julian@codesourcery.com>
* config/arm/arm.c (arm_override_options): Add unaligned_access
support.
(arm_file_start): Emit attribute for unaligned access as appropriate.
* config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
(UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
(insv, extzv): Add unaligned-access support.
(extv): Change to expander. Likewise.
(extzv_t1, extv_regsi): Add helpers.
(unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
(unaligned_storesi, unaligned_storehi): New.
(*extv_reg): New (previous extv implementation).
* config/arm/arm.opt (munaligned_access): Add option.
* config/arm/constraints.md (Uw): New constraint.
* expmed.c (store_bit_field_1): Adjust bitfield numbering according
to size of access, not size of unit, when BITS_BIG_ENDIAN !=
BYTES_BIG_ENDIAN. Don't use bitfield accesses for
volatile accesses when -fstrict-volatile-bitfields is in effect.
(extract_bit_field_1): Likewise.
Backport r172697 from mainline
2011-04-19 Wei Guozhi <carrot@google.com>
PR target/47855
* config/arm/arm-protos.h (thumb1_legitimate_address_p): New prototype.
* config/arm/arm.c (thumb1_legitimate_address_p): Remove the static
linkage.
* config/arm/constraints.md (Uu): New constraint.
* config/arm/arm.md (*arm_movqi_insn): Compute attr "length".
Modified:
branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
branches/ARM/embedded-4_6-branch/gcc/config/arm/arm-protos.h
branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.c
branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.md
branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.opt
branches/ARM/embedded-4_6-branch/gcc/config/arm/constraints.md
branches/ARM/embedded-4_6-branch/gcc/expmed.c