public inbox for gcc-bugs@sourceware.org
* [Bug target/47855] New: missed cbnz optimization
@ 2011-02-23 11:48 carrot at google dot com
  2011-02-25  9:49 ` [Bug target/47855] " carrot at google dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: carrot at google dot com @ 2011-02-23 11:48 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855

           Summary: missed cbnz optimization
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: carrot@google.com
            Target: arm-linux-androideabi


Created attachment 23440
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23440
testcase

When the attached source code is compiled with options -march=armv7-a -mthumb
-Os, GCC 4.6 generates:

pnm_gethdr:
    @ args = 0, pretend = 0, frame = 8
    @ frame_needed = 0, uses_anonymous_args = 0
    push    {r0, r1, r2, r4, r5, lr}
    mov    r5, r0
    mov    r4, r1
    bl    foo2
    cmp    r0, #0                // A
    bne    .L13                  // B
    adds    r1, r4, #4
    mov    r0, r5
    bl    foo3
    cmp    r0, #0                // C
    bne    .L13                  // D
    add    r1, r4, #8
    mov    r0, r5
    bl    foo1
    cbnz    r0, .L13              // E
    ldr    r0, [r4, #0]
    bl    pnm_type
    cmp    r0, #2
    beq    .L3
    mov    r0, r5
    add    r1, sp, #4
    bl    pnm_getsintstr
    cbz    r0, .L4
    b    .L13
.L3:
    movs    r3, #1
    str    r3, [sp, #4]
.L4:
    ldr    r3, [sp, #4]
    cmp    r3, #0
    bge    .L5
    negs    r3, r3
    str    r3, [r4, #16]
    movs    r3, #1
    b    .L14
.L5:
    str    r3, [r4, #16]
    movs    r3, #0
.L14:
    strb    r3, [r4, #20]
    ldr    r0, [r4, #0]
    bl    pnm_type
    cmp    r0, #0
    beq    .L8
    blt    .L7
    cmp    r0, #2
    bgt    .L7
    movs    r3, #1
    movs    r0, #0
    str    r3, [r4, #12]
    b    .L2
.L8:
    movs    r3, #3
    str    r3, [r4, #12]
    b    .L2
.L7:
    bl    abort
.L13:
    mov    r0, #-1
.L2:
    pop    {r1, r2, r3, r4, r5, pc}

cbz/cbnz can only branch forward, by at most 126 bytes. The whole function is
124 bytes long, so .L13 is in range from anywhere earlier in the function. The
instruction pairs A/B and C/D can therefore each be replaced by

     cbnz    r0,  .L13

the same as instruction E.
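
To make the range argument concrete, here is a minimal sketch of the
reachability check. It assumes the Thumb-2 encoding rule that the cbz/cbnz
offset is an even value from 0 to 126 added to the PC, and that the PC reads
as the instruction's address plus 4; neither detail is spelled out in this
report. The helper name cbz_reachable is hypothetical, and the addresses are
taken from the assembler listing quoted later in this thread.

```python
def cbz_reachable(insn_addr, target_addr):
    """Return True if a cbz/cbnz at insn_addr could branch to target_addr.

    Assumption: the offset is relative to PC = insn_addr + 4 and must be an
    even, non-negative value no larger than 126 (forward branches only).
    """
    pc = insn_addr + 4
    offset = target_addr - pc
    return 0 <= offset <= 126 and offset % 2 == 0


# Instruction E sits at 0x24 and .L13 at 0x76 in the listing.
e_ok = cbz_reachable(0x24, 0x76)
# Instruction A (the first cmp) sits at 0x0a, so a cbnz placed there would
# also reach .L13.
a_ok = cbz_reachable(0x0A, 0x76)
# A backward branch is never encodable as cbz/cbnz.
back_ok = cbz_reachable(0x76, 0x24)
```

Replacing the cmp/bne pairs shrinks the function further, which only moves
.L13 closer, so the check above is conservative for A/B and C/D.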



* [Bug target/47855] missed cbnz optimization
From: carrot at google dot com @ 2011-02-25  9:49 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855

--- Comment #1 from Carrot <carrot at google dot com> 2011-02-25 07:18:07 UTC ---
I printed out the address of each instruction from function arm_reorg():

id=173 addr=0
id=2 addr=4
id=3 addr=8
id=15 addr=12
id=18 addr=16
id=199 addr=24
id=21 addr=26
id=23 addr=30
id=26 addr=34
id=28 addr=42
id=29 addr=46
id=31 addr=50
id=34 addr=54
id=36 addr=62
id=37 addr=66
id=39 addr=70
id=40 addr=74
id=43 addr=78
id=44 addr=82
id=45 addr=86
id=48 addr=90
id=201 addr=98
id=198 addr=102
id=55 addr=104
id=58 addr=108
id=59 addr=112
id=60 addr=116
id=196 addr=120
id=63 addr=122
id=197 addr=126
id=203 addr=128
id=71 addr=132
id=195 addr=136
id=74 addr=138
id=77 addr=142
id=78 addr=146
id=80 addr=150
id=81 addr=154
id=83 addr=158
id=84 addr=162
id=85 addr=166
id=193 addr=170
id=194 addr=172
id=93 addr=174
id=205 addr=178
id=192 addr=182
id=99 addr=184
id=207 addr=188
id=104 addr=192
id=6 addr=196
id=115 addr=200
id=176 addr=200

The function length computed by GCC is more than 200 bytes, much larger than
the actual 126 bytes. Apparently there are a lot of length attribute errors!
Take the second instruction as an example:

 (insn:TI 2 174 3 2 (set (reg/v/f:SI 5 r5 [orig:149 in ] [149])
         (reg:SI 0 r0 [ in ])) src/t08.c:19 694 {*thumb2_movsi_insn}
      (nil))

or 
     mov   r5, r0

It should have a length of 2. The matched insn pattern is

 (define_insn "*thumb2_movsi_insn"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*hk,m,*m")
         (match_operand:SI 1 "general_operand"      "rk ,I,K,j,mi,*mi,l,*hk"))]
   "TARGET_THUMB2 && ! TARGET_IWMMXT
    && !(TARGET_HARD_FLOAT && TARGET_VFP)
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
   "@
    mov%?\\t%0, %1
    mov%?\\t%0, %1
    mvn%?\\t%0, #%B1
    movw%?\\t%0, %1
    ldr%?\\t%0, %1
    ldr%?\\t%0, %1
    str%?\\t%1, %0
    str%?\\t%1, %0"
   [(set_attr "type" "*,*,*,*,load1,load1,store1,store1")
    (set_attr "predicable" "yes")
    (set_attr "pool_range" "*,*,*,*,1020,4096,*,*")
    (set_attr "neg_pool_range" "*,*,*,*,0,0,*,*")]
 )

The pattern doesn't specify a length attribute, so GCC falls back to the
default value of 4.
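
As a rough illustration of how these overestimates pile up, the sketch below
sums the lengths for the first five insns of pnm_gethdr. The assumed lengths
reproduce the arm_reorg() addresses printed above; the emitted sizes are read
off the assembler listing quoted later in this thread. The helper name
start_addresses is hypothetical, and this is not how GCC itself computes
addresses.

```python
from itertools import accumulate

# (insn id, length GCC assumed, bytes actually emitted)
insns = [
    (173, 4, 2),  # push {r0, r1, r2, r4, r5, lr}
    (2,   4, 2),  # mov  r5, r0
    (3,   4, 2),  # mov  r4, r1
    (15,  4, 4),  # bl   foo2
    (18,  8, 4),  # cmp  r0, #0 ; bne .L13
]


def start_addresses(lengths):
    """Return each insn's start address, followed by the final end address."""
    return list(accumulate(lengths, initial=0))


assumed = start_addresses([a for _, a, _ in insns])
actual = start_addresses([b for _, _, b in insns])
```

After only five insns the assumed address is already 24 against an actual 14,
and every branch whose range is tested against the assumed addresses inherits
that error.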

From this test case I can find at least the following patterns with a wrong
length attribute:

push_multi
thumb2_movsi_insn
thumb2_cbnz
thumb2_cbz
arm_cmpsi_insn
arm_cond_branch
arm_addsi3
arm_jump
arm_movqi_insn



* [Bug target/47855] missed cbnz optimization
From: steven at gcc dot gnu.org @ 2011-03-26 16:56 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org

--- Comment #2 from Steven Bosscher <steven at gcc dot gnu.org> 2011-03-26 16:06:21 UTC ---
If you compile with the -dAP option and look at the .s output of gcc, you can
see for each asm instruction which insn it came from and what length gcc
assumed for it.
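
For anyone who wants to tabulate those annotations, here is a small sketch
that pulls the insn id, pattern name and assumed length out of annotated
assembly. The comment layout it expects ("@ <id> <pattern> [length = <n>]")
is an assumption based on the listing quoted later in this thread, not a
documented format, and parse_annotations is a hypothetical helper.

```python
import re

# Assumed annotation layout:
#   <asm> @ <insn id> <pattern>[/<alternative>] [length = <n>]
ANNOTATION = re.compile(r"@\s*(\d+)\s+(\S+)\s+\[length\s*=\s*(\d+)\]")


def parse_annotations(asm_text):
    """Yield (insn_id, pattern, assumed_length) for each annotated line."""
    for line in asm_text.splitlines():
        match = ANNOTATION.search(line)
        if match:
            yield int(match.group(1)), match.group(2), int(match.group(3))


# Sample lines modelled on the listing; unannotated lines are skipped.
sample = (
    "\tpush\t{r0, r1, r2, r4, r5, lr}\t@ 173\t*push_multi\t[length = 4]\n"
    "\tmov\tr5, r0\t@ 2\t*thumb2_movsi_vfp/1\t[length = 4]\n"
    "\tbne\t.L13\n"
    "\t@ basic block 3\n"
    "\tcbnz\tr0, .L13\t@ 34\t*thumb2_cbnz/1\t[length = 2]\n"
)
rows = list(parse_annotations(sample))
```

Comparing each assumed length with the number of bytes the assembler emitted
for the same line shows which patterns overestimate.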



* [Bug target/47855] missed cbnz optimization
From: steven at gcc dot gnu.org @ 2011-03-26 17:54 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|arm-linux-androideabi       |arm-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011.03.26 17:15:33
     Ever Confirmed|0                           |1

--- Comment #3 from Steven Bosscher <steven at gcc dot gnu.org> 2011-03-26 17:15:33 UTC ---
Confirmed. Many of the lengths are wrong, even for simple instructions.

$ arm-eabi-gcc -c -Os -march=armv7-a -mthumb -dAp -Wa,-ahlms=t.lst t.c
$ cat t.lst
ARM GAS  /tmp/ccLnp6gX.s             page 1


   1                      .syntax unified
   2                      .arch armv7-a
   3                      .eabi_attribute 27, 3
   4                      .fpu neon
   5                      .eabi_attribute 20, 1
   6                      .eabi_attribute 21, 1
   7                      .eabi_attribute 23, 3
   8                      .eabi_attribute 24, 1
   9                      .eabi_attribute 25, 1
  10                      .eabi_attribute 26, 1
  11                      .eabi_attribute 30, 4
  12                      .eabi_attribute 18, 4
  13                      .file    "t.c"
  14                      .text
  15                      .align    1
  16                      .global    pnm_gethdr
  17                      .thumb
  18                      .thumb_func
  19                      .type    pnm_gethdr, %function
  20                  pnm_gethdr:
  21                      @ args = 0, pretend = 0, frame = 8
  22                      @ frame_needed = 0, uses_anonymous_args = 0
  23                      @ basic block 2
  24 0000 37B5             push    {r0, r1, r2, r4, r5, lr}    @ 173    *push_multi    [length = 4]
  25 0002 0546             mov    r5, r0    @ 2    *thumb2_movsi_vfp/1    [length = 4]
  26 0004 0C46             mov    r4, r1    @ 3    *thumb2_movsi_vfp/1    [length = 4]
  27 0006 FFF7FEFF         bl    foo2    @ 15    *call_value_symbol    [length = 4]
  28 000a 0028             cmp    r0, #0    @ 18    *thumb2_cbnz/1    [length = 8]
  29 000c 33D1             bne    .L13
  30                      @ basic block 3
  31 000e 211D             adds    r1, r4, #4    @ 199    *thumb2_addsi_short/1    [length = 2]
  32 0010 2846             mov    r0, r5    @ 21    *thumb2_movsi_vfp/1    [length = 4]
  33 0012 FFF7FEFF         bl    foo3    @ 23    *call_value_symbol    [length = 4]
  34 0016 0028             cmp    r0, #0    @ 26    *thumb2_cbnz/1    [length = 8]
  35 0018 2DD1             bne    .L13
  36                      @ basic block 4
  37 001a 04F10801         add    r1, r4, #8    @ 28    *arm_addsi3/1    [length = 4]
  38 001e 2846             mov    r0, r5    @ 29    *thumb2_movsi_vfp/1    [length = 4]
  39 0020 FFF7FEFF         bl    foo1    @ 31    *call_value_symbol    [length = 4]
  40 0024 38BB             cbnz    r0, .L13    @ 34    *thumb2_cbnz/1    [length = 2]
  41                      @ basic block 5
  42 0026 2068             ldr    r0, [r4, #0]    @ 36    *thumb2_movsi_vfp/5    [length = 4]
  43 0028 FFF7FEFF         bl    pnm_type    @ 37    *call_value_symbol    [length = 4]
  44 002c 0228             cmp    r0, #2    @ 39    *arm_cmpsi_insn/1    [length = 4]
  45 002e 05D0             beq    .L3    @ 40    *arm_cond_branch    [length = 4]
  46                      @ basic block 6
  47 0030 2846             mov    r0, r5    @ 43    *thumb2_movsi_vfp/1    [length = 4]
  48 0032 01A9             add    r1, sp, #4    @ 44    *arm_addsi3/1    [length = 4]
  49 0034 FFF7FEFF         bl    pnm_getsintstr    @ 45    *call_value_symbol    [length = 4]
  50 0038 10B1             cbz    r0, .L4    @ 48    *thumb2_cbz/1    [length = 2]
  51                      @ basic block 7
  52 003a 1CE0             b    .L13    @ 201    *arm_jump    [length = 4]
  53                  .L3:
  54                      @ basic block 8
  55 003c 0123             movs    r3, #1    @ 198    *thumb2_movsi_shortim    [length = 2]
  56 003e 0193             str    r3, [sp, #4]    @ 55    *thumb2_movsi_vfp/7    [length = 4]
  57                  .L4:
  58                      @ basic block 9
  59 0040 019B             ldr    r3, [sp, #4]    @ 58    *thumb2_movsi_vfp/5    [length = 4]
  60 0042 002B             cmp    r3, #0    @ 59    *arm_cmpsi_insn/1    [length = 4]
  61 0044 03DA             bge    .L5    @ 60    *arm_cond_branch    [length = 4]
  62                      @ basic block 10
  63 0046 5B42             negs    r3, r3    @ 196    *thumb2_negsi2_short    [length = 2]
  64 0048 2361             str    r3, [r4, #16]    @ 63    *thumb2_movsi_vfp/7    [length = 4]
  65 004a 0123             movs    r3, #1    @ 197    *thumb2_movsi_shortim    [length = 2]
  66 004c 01E0             b    .L14    @ 203    *arm_jump    [length = 4]
  67                  .L5:
  68                      @ basic block 11
  69 004e 2361             str    r3, [r4, #16]    @ 71    *thumb2_movsi_vfp/7    [length = 4]
  70 0050 0023             movs    r3, #0    @ 195    *thumb2_movsi_shortim    [length = 2]
  71                  .L14:
  72                      @ basic block 12
  73 0052 2375             strb    r3, [r4, #20]    @ 74    *arm_movqi_insn/4    [length = 4]
  74 0054 2068             ldr    r0, [r4, #0]    @ 77    *thumb2_movsi_vfp/5    [length = 4]
  75 0056 FFF7FEFF         bl    pnm_type    @ 78    *call_value_symbol    [length = 4]
  76 005a 0028             cmp    r0, #0    @ 80    *arm_cmpsi_insn/1    [length = 4]
  77 005c 06D0             beq    .L8    @ 81    *arm_cond_branch    [length = 4]
  78                      @ basic block 13
  79 005e 08DB             blt    .L7    @ 83    *arm_cond_branch    [length = 4]
  80                      @ basic block 14
  81 0060 0228             cmp    r0, #2    @ 84    *arm_cmpsi_insn/1    [length = 4]
  82 0062 06DC             bgt    .L7    @ 85    *arm_cond_branch    [length = 4]
  83                      @ basic block 15
  84 0064 0123             movs    r3, #1    @ 193    *thumb2_movsi_shortim    [length = 2]
  85 0066 0020             movs    r0, #0    @ 194    *thumb2_movsi_shortim    [length = 2]
  86 0068 E360             str    r3, [r4, #12]    @ 93    *thumb2_movsi_vfp/7    [length = 4]
  87 006a 06E0             b    .L2    @ 205    *arm_jump    [length = 4]
  88                  .L8:
  89                      @ basic block 16
  90 006c 0323             movs    r3, #3    @ 192    *thumb2_movsi_shortim    [length = 2]
  91 006e E360             str    r3, [r4, #12]    @ 99    *thumb2_movsi_vfp/7    [length = 4]
  92 0070 03E0             b    .L2    @ 207    *arm_jump    [length = 4]
  93                  .L7:
  94                      @ basic block 17
  95 0072 FFF7FEFF         bl    abort    @ 104    *call_symbol    [length = 4]
  96                  .L13:
  97                      @ basic block 18
  98 0076 4FF0FF30         mov    r0, #-1    @ 6    *thumb2_movsi_vfp/2    [length = 4]
  99                  .L2:
 100                      @ basic block 19
 101 007a 3EBD             pop    {r1, r2, r3, r4, r5, pc}
 102                      .size    pnm_gethdr, .-pnm_gethdr
 103                      .ident    "GCC: (GNU) 4.7.0 20110326 (experimental) [trunk revision 171556]"

DEFINED SYMBOLS
                            *ABS*:0000000000000000 t.c
     /tmp/ccLnp6gX.s:20     .text:0000000000000000 pnm_gethdr
     /tmp/ccLnp6gX.s:24     .text:0000000000000000 $t

UNDEFINED SYMBOLS
foo2
foo3
foo1
pnm_type
pnm_getsintstr
abort



* [Bug target/47855] missed cbnz optimization
From: jye2 at gcc dot gnu.org @ 2011-09-22  7:21 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855

--- Comment #4 from jye2 at gcc dot gnu.org 2011-09-22 06:41:49 UTC ---
Author: jye2
Date: Thu Sep 22 06:41:44 2011
New Revision: 179077

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179077
Log:
2011-09-22  Joey Ye  <joey.ye@arm.com>

    Backport r178852 from mainline
    2011-09-14  Julian Brown  <julian@codesourcery.com>

    * config/arm/arm.c (arm_override_options): Add unaligned_access
    support.
    (arm_file_start): Emit attribute for unaligned access as appropriate.
    * config/arm/arm.md (UNSPEC_UNALIGNED_LOAD)
    (UNSPEC_UNALIGNED_STORE): Add constants for unspecs.
    (insv, extzv): Add unaligned-access support.
    (extv): Change to expander. Likewise.
    (extzv_t1, extv_regsi): Add helpers.
    (unaligned_loadsi, unaligned_loadhis, unaligned_loadhiu)
    (unaligned_storesi, unaligned_storehi): New.
    (*extv_reg): New (previous extv implementation).
    * config/arm/arm.opt (munaligned_access): Add option.
    * config/arm/constraints.md (Uw): New constraint.
    * expmed.c (store_bit_field_1): Adjust bitfield numbering according
    to size of access, not size of unit, when BITS_BIG_ENDIAN !=
    BYTES_BIG_ENDIAN. Don't use bitfield accesses for
    volatile accesses when -fstrict-volatile-bitfields is in effect.
    (extract_bit_field_1): Likewise.

    Backport r172697 from mainline
    2011-04-19  Wei Guozhi  <carrot@google.com>

    PR target/47855
    * config/arm/arm-protos.h (thumb1_legitimate_address_p): New prototype.
    * config/arm/arm.c (thumb1_legitimate_address_p): Remove the static
    linkage.
    * config/arm/constraints.md (Uu): New constraint.
    * config/arm/arm.md (*arm_movqi_insn): Compute attr "length".

Modified:
    branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm-protos.h
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.c
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.md
    branches/ARM/embedded-4_6-branch/gcc/config/arm/arm.opt
    branches/ARM/embedded-4_6-branch/gcc/config/arm/constraints.md
    branches/ARM/embedded-4_6-branch/gcc/expmed.c

