public inbox for gcc-bugs@sourceware.org
* [Bug target/53447] New: missed optimization of 64bit ALU operation with small constant
@ 2012-05-22  7:37 carrot at google dot com
  2012-05-22  9:57 ` [Bug target/53447] " steven at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: carrot at google dot com @ 2012-05-22  7:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53447

             Bug #: 53447
           Summary: missed optimization of 64bit ALU operation with small
                    constant
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: carrot@google.com
            Target: arm-unknown-linux-gnueabi


Compile the following code with the options -march=armv7-a -O2 -mthumb:

void t0p(long long *p)
{
  *p += 1;
}

GCC 4.8 generates:


t0p:
    ldrd    r2, [r0]
    push    {r4, r5}
    movs    r4, #1       //A
    adds    r2, r2, r4   //B
    mov    r5, #0       //C
    adc    r3, r3, r5   //D
    strd    r2, [r0]
    pop    {r4, r5}
    bx    lr

Instructions A, B, C, and D can be simplified to:

        adds   r2, r2, #1
        adc    r3, r3, #0

This sequence is smaller and faster than the original code. It also uses two
fewer registers, so the push/pop instructions can be removed as well.

ARM and Thumb modes generate similar code, at both -Os and -O2.

The same optimization can be applied to other ALU operations, such as
sub/and/or/xor/cmp.
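
The other operations named above can be exercised with functions of the same
shape as t0p. The following is only a sketch: the function names and the
constants are made up for illustration and are not taken from the GCC
testsuite. Compiling it with the same options and reading the assembly should
show whether the small constant is still loaded into a register pair first.

```c
/* Hypothetical variants of t0p, one per ALU operation named above.
   Names and constants are illustrative only. */

void t_sub(long long *p) { *p -= 1; }        /* 64-bit subtract of a small constant */
void t_and(long long *p) { *p &= 0xff; }     /* keeps only the low 8 bits */
void t_or(long long *p)  { *p |= 1; }        /* sets bit 0, high word unchanged */
void t_xor(long long *p) { *p ^= 1; }        /* flips bit 0, high word unchanged */
int  t_cmp(long long *p) { return *p == 1; } /* 64-bit compare with a small constant */
```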



* [Bug target/53447] missed optimization of 64bit ALU operation with small constant
  2012-05-22  7:37 [Bug target/53447] New: missed optimization of 64bit ALU operation with small constant carrot at google dot com
@ 2012-05-22  9:57 ` steven at gcc dot gnu.org
  2012-05-22 10:01 ` steven at gcc dot gnu.org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: steven at gcc dot gnu.org @ 2012-05-22  9:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53447

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-05-22
     Ever Confirmed|0                           |1



* [Bug target/53447] missed optimization of 64bit ALU operation with small constant
  2012-05-22  7:37 [Bug target/53447] New: missed optimization of 64bit ALU operation with small constant carrot at google dot com
  2012-05-22  9:57 ` [Bug target/53447] " steven at gcc dot gnu.org
@ 2012-05-22 10:01 ` steven at gcc dot gnu.org
  2012-05-23  7:06 ` carrot at google dot com
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: steven at gcc dot gnu.org @ 2012-05-22 10:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53447

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org

--- Comment #1 from Steven Bosscher <steven at gcc dot gnu.org> 2012-05-22 08:24:52 UTC ---
Confirmed. 

Here is the assembler output with the "-dAp -fdump-rtl-all-details" options:

t0p:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
@ BLOCK 2 freq:10000 seq:0
@ PRED: ENTRY [100.0%]  (fallthru)
    ldrd    r2, [r0]    @ 6    *arm_movdi/4    [length = 8]
    push    {r4, r5}    @ 20    *push_multi    [length = 2]
    movs    r4, #1    @ 16    *thumb2_movsi_shortim    [length = 2]
    adds    r2, r2, r4    @ 18    *addsi3_compare_op1/1    [length = 4]
    mov    r5, #0    @ 17    *thumb2_movsi_insn/2    [length = 4]
    adc    r3, r3, r5    @ 19    *addsi3_carryin_ltu    [length = 4]
    strd    r2, [r0]    @ 9    *arm_movdi/5    [length = 8]
@ SUCC: EXIT [100.0%] 
    pop    {r4, r5}
    bx    lr


The add is still a single DImode add as late as the
pr53447.c.195r.postreload dump:

(insn 6 3 13 2 (set (reg:DI 2 r2 [orig:137 D.4118 ] [137])
        (mem:DI (reg/v/f:SI 0 r0 [orig:136 p ] [136]) [2 *p_1(D)+0 S8 A64])) pr53447.c:3 182 {*arm_movdi}
     (nil))

(insn 13 6 8 2 (set (reg:DI 4 r4 [139])
        (const_int 1 [0x1])) pr53447.c:3 182 {*arm_movdi}
     (expr_list:REG_EQUIV (const_int 1 [0x1])
        (nil)))

(insn 8 13 9 2 (parallel [ 
            (set (reg:DI 2 r2 [orig:137 D.4118 ] [137])
                (plus:DI (reg:DI 2 r2 [orig:137 D.4118 ] [137])
                    (reg:DI 4 r4 [139])))
            (clobber (reg:CC 24 cc))
        ]) pr53447.c:3 1 {*arm_adddi3}
     (nil))         

(insn 9 8 12 2 (set (mem:DI (reg/v/f:SI 0 r0 [orig:136 p ] [136]) [2 *p_1(D)+0 S8 A64])
        (reg:DI 2 r2 [orig:137 D.4118 ] [137])) pr53447.c:3 182 {*arm_movdi}
     (nil))



The add is split in the pr53447.c.197r.split2 dump:

(insn 6 3 16 2 (set (reg:DI 2 r2 [orig:137 D.4118 ] [137])
        (mem:DI (reg/v/f:SI 0 r0 [orig:136 p ] [136]) [2 *p_1(D)+0 S8 A64])) pr53447.c:3 182 {*arm_movdi}
     (nil))

(insn 16 6 17 2 (set (reg:SI 4 r4 [139])
        (const_int 1 [0x1])) pr53447.c:3 718 {*thumb2_movsi_insn}
     (nil))

(insn 17 16 18 2 (set (reg:SI 5 r5 [+4 ])
        (const_int 0 [0])) pr53447.c:3 718 {*thumb2_movsi_insn}
     (nil))

(insn 18 17 19 2 (parallel [
            (set (reg:CC_C 24 cc)
                (compare:CC_C (plus:SI (reg:SI 2 r2 [orig:137 D.4118 ] [137])
                        (reg:SI 4 r4 [139]))
                    (reg:SI 2 r2 [orig:137 D.4118 ] [137])))
            (set (reg:SI 2 r2 [orig:137 D.4118 ] [137])
                (plus:SI (reg:SI 2 r2 [orig:137 D.4118 ] [137])
                    (reg:SI 4 r4 [139])))
        ]) pr53447.c:3 10 {*addsi3_compare_op1}
     (nil))

(insn 19 18 9 2 (set (reg:SI 3 r3 [ D.4118+4 ])
        (plus:SI (plus:SI (reg:SI 3 r3 [ D.4118+4 ])
                (reg:SI 5 r5 [+4 ]))
            (ltu:SI (reg:CC_C 24 cc)
                (const_int 0 [0])))) pr53447.c:3 14 {*addsi3_carryin_ltu}
     (nil))

(insn 9 19 12 2 (set (mem:DI (reg/v/f:SI 0 r0 [orig:136 p ] [136]) [2 *p_1(D)+0 S8 A64])
        (reg:DI 2 r2 [orig:137 D.4118 ] [137])) pr53447.c:3 182 {*arm_movdi}
     (nil))
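
For readers less used to RTL, the arithmetic performed by insns 18 and 19 in
the split2 dump can be modeled in plain C. This is a sketch for illustration
only: the struct and function names are hypothetical and do not appear in GCC.
The carry is derived the same way the compare:CC_C expresses it, by an
unsigned comparison of the low-word sum against the original low word.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, like the r2/r3 pair.
   Hypothetical type, for illustration only. */
struct u64_halves {
    uint32_t lo;
    uint32_t hi;
};

/* Model of the split add: add the low words and take the carry from an
   unsigned compare of the sum against the original low word (insn 18),
   then add the high words plus that carry (the ltu term in insn 19). */
static struct u64_halves add_split(struct u64_halves a, struct u64_halves b)
{
    struct u64_halves r;
    r.lo = a.lo + b.lo;            /* adds: wraps modulo 2^32 */
    uint32_t carry = r.lo < a.lo;  /* 1 exactly when the low sum wrapped */
    r.hi = a.hi + b.hi + carry;    /* adc */
    return r;
}
```

With b = {1, 0} this is the *p += 1 case: the high-word addend is always zero,
which is why it does not need a register of its own.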



* [Bug target/53447] missed optimization of 64bit ALU operation with small constant
  2012-05-22  7:37 [Bug target/53447] New: missed optimization of 64bit ALU operation with small constant carrot at google dot com
  2012-05-22  9:57 ` [Bug target/53447] " steven at gcc dot gnu.org
  2012-05-22 10:01 ` steven at gcc dot gnu.org
@ 2012-05-23  7:06 ` carrot at google dot com
  2012-07-01 15:15 ` carrot at gcc dot gnu.org
  2012-07-06  2:23 ` carrot at google dot com
  4 siblings, 0 replies; 6+ messages in thread
From: carrot at google dot com @ 2012-05-23  7:06 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53447

--- Comment #2 from Carrot <carrot at google dot com> 2012-05-23 06:54:55 UTC ---
A question about the related pattern:

(define_insn_and_split "*arm_adddi3"
  [(set (match_operand:DI          0 "s_register_operand" "=&r,&r")
        (plus:DI (match_operand:DI 1 "s_register_operand" "%0, 0")
                 (match_operand:DI 2 "s_register_operand" "r,  0")))
   (clobber (reg:CC CC_REGNUM))]
  "TARGET_32BIT && !(TARGET_HARD_FLOAT && TARGET_MAVERICK) && !TARGET_NEON"

Why must operand 1 be the same register as operand 0? Both ARM and Thumb-2
have three-register add and adc instructions.



* [Bug target/53447] missed optimization of 64bit ALU operation with small constant
  2012-05-22  7:37 [Bug target/53447] New: missed optimization of 64bit ALU operation with small constant carrot at google dot com
                   ` (2 preceding siblings ...)
  2012-05-23  7:06 ` carrot at google dot com
@ 2012-07-01 15:15 ` carrot at gcc dot gnu.org
  2012-07-06  2:23 ` carrot at google dot com
  4 siblings, 0 replies; 6+ messages in thread
From: carrot at gcc dot gnu.org @ 2012-07-01 15:15 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53447

--- Comment #3 from carrot at gcc dot gnu.org 2012-07-01 15:14:56 UTC ---
Author: carrot
Date: Sun Jul  1 15:14:52 2012
New Revision: 189102

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189102
Log:
    PR target/53447
    * config/arm/arm-protos.h (const_ok_for_dimode_op): New prototype.
    * config/arm/arm.c (const_ok_for_dimode_op): New function.
    * config/arm/constraints.md (Dd): New constraint.
    * config/arm/predicates.md (arm_adddi_operand): New predicate.
    * config/arm/arm.md (adddi3): Extend it to handle constants.
    (arm_adddi3): Likewise.
    (addsi3_carryin_<optab>): Extend it to handle sbc case.
    (addsi3_carryin_alt2_<optab>): Likewise.
    * config/arm/neon.md (adddi3_neon): Extend it to handle constants.

    * gcc.target/arm/pr53447-1.c: New testcase.
    * gcc.target/arm/pr53447-2.c: New testcase.
    * gcc.target/arm/pr53447-3.c: New testcase.
    * gcc.target/arm/pr53447-4.c: New testcase.


Added:
    trunk/gcc/testsuite/gcc.target/arm/pr53447-1.c
    trunk/gcc/testsuite/gcc.target/arm/pr53447-2.c
    trunk/gcc/testsuite/gcc.target/arm/pr53447-3.c
    trunk/gcc/testsuite/gcc.target/arm/pr53447-4.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/arm/arm-protos.h
    trunk/gcc/config/arm/arm.c
    trunk/gcc/config/arm/arm.md
    trunk/gcc/config/arm/constraints.md
    trunk/gcc/config/arm/neon.md
    trunk/gcc/config/arm/predicates.md
    trunk/gcc/testsuite/ChangeLog
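
The log above adds a function named const_ok_for_dimode_op, but its body is
not shown in this thread. The sketch below is not that function. It only
illustrates, under stated assumptions, the kind of question such a check has
to answer: can each 32-bit half of a 64-bit constant be used directly as an
immediate? It assumes the ARM-mode rule that a data-processing immediate is an
8-bit value rotated right by an even amount. Thumb-2 uses a different
immediate encoding, and negative constants are not given any special
treatment here. Both function names are hypothetical.

```c
#include <stdint.h>

/* Nonzero if v is an 8-bit value rotated right by an even amount, i.e.
   encodable as an ARM-mode data-processing immediate.
   Hypothetical helper; the Thumb-2 encoding is not modeled. */
static int arm_mode_imm_ok(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotate v left by rot, which undoes a rotate right by rot. */
        uint32_t undone = rot ? ((v << rot) | (v >> (32 - rot))) : v;
        if (undone <= 0xffu)
            return 1;
    }
    return 0;
}

/* Hypothetical check: both 32-bit halves of c fit an ARM-mode immediate. */
static int halves_imm_ok(int64_t c)
{
    uint32_t lo = (uint32_t)((uint64_t)c & 0xffffffffu);
    uint32_t hi = (uint32_t)((uint64_t)c >> 32);
    return arm_mode_imm_ok(lo) && arm_mode_imm_ok(hi);
}
```

Under these assumptions the constant 1 from the original test case passes,
since its halves are 1 and 0.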



* [Bug target/53447] missed optimization of 64bit ALU operation with small constant
  2012-05-22  7:37 [Bug target/53447] New: missed optimization of 64bit ALU operation with small constant carrot at google dot com
                   ` (3 preceding siblings ...)
  2012-07-01 15:15 ` carrot at gcc dot gnu.org
@ 2012-07-06  2:23 ` carrot at google dot com
  4 siblings, 0 replies; 6+ messages in thread
From: carrot at google dot com @ 2012-07-06  2:23 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53447

Carrot <carrot at google dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #4 from Carrot <carrot at google dot com> 2012-07-06 02:22:58 UTC ---
Fixed by http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189102.


