* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] CSE doesn't work
2009-04-23 16:16 [Bug rtl-optimization/39871] New: [4.3/4.4/4.5 regression] CSE doesn't work alexvod at google dot com
@ 2009-04-24 9:14 ` rguenth at gcc dot gnu dot org
2009-05-05 15:41 ` [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE mmitchel at gcc dot gnu dot org
` (28 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-04-24 9:14 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization
Target Milestone|--- |4.3.4
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39871
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: mmitchel at gcc dot gnu dot org @ 2009-05-05 15:41 UTC (permalink / raw)
To: gcc-bugs
--
mmitchel at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
Summary|[4.3/4.4/4.5 regression] CSE|[4.3/4.4/4.5 regression]
|doesn't work |Code size increase on ARM
| |due to inferior CSE
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: bonzini at gnu dot org @ 2009-05-06 15:07 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from bonzini at gnu dot org 2009-05-06 15:06 -------
With 4.5.0 I see:
push {lr}
sub sp, sp, #12
ldr r2, [r0]
ldr r1, [r0, #4]
mov r0, sp
str r2, [sp, #4]
bl func
add sp, sp, #12
pop {pc}
Can you bisect which revision fixed this?
--
bonzini at gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |bonzini at gnu dot org
Known to work| |4.5.0
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: ramana at gcc dot gnu dot org @ 2009-05-20 14:17 UTC (permalink / raw)
To: gcc-bugs
--
ramana at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed|0 |1
Last reconfirmed|0000-00-00 00:00:00 |2009-05-20 14:17:16
date| |
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: mikpe at it dot uu dot se @ 2009-06-14 10:24 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from mikpe at it dot uu dot se 2009-06-14 10:24 -------
(In reply to comment #1)
> With 4.5.0 I see:
>
> push {lr}
> sub sp, sp, #12
> ldr r2, [r0]
> ldr r1, [r0, #4]
> mov r0, sp
> str r2, [sp, #4]
> bl func
> add sp, sp, #12
> pop {pc}
However, gcc-4.5-20090611 generates the longer 10-instruction code:
push {lr}
ldr r2, [r0]
sub sp, sp, #20
add r3, sp, #4
ldr r1, [r0, #4]
str r2, [r3, #4]
mov r0, r3
bl func
add sp, sp, #20
pop {pc}
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: mikpe at it dot uu dot se @ 2009-06-14 14:06 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from mikpe at it dot uu dot se 2009-06-14 14:06 -------
(In reply to comment #1)
> With 4.5.0 I see:
>
> push {lr}
> sub sp, sp, #12
> ldr r2, [r0]
> ldr r1, [r0, #4]
> mov r0, sp
> str r2, [sp, #4]
> bl func
> add sp, sp, #12
> pop {pc}
I've tested every weekly gcc-4.5 snapshot and they all generate one instruction
more than this code.
How did you configure and invoke gcc-4.5 to get this 9-instruction code?
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: rguenth at gcc dot gnu dot org @ 2009-08-04 12:49 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from rguenth at gcc dot gnu dot org 2009-08-04 12:30 -------
GCC 4.3.4 is being released, adjusting target milestone.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.4 |4.3.5
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: steven at gcc dot gnu dot org @ 2010-01-02 0:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from steven at gcc dot gnu dot org 2010-01-02 00:21 -------
Re. comment #1: Paolo, which compiler did you use (svn revision number)?
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |ra
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: bonzini at gnu dot org @ 2010-01-02 10:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from bonzini at gnu dot org 2010-01-02 10:28 -------
I don't know but I found a tree that generates the 9-instruction sequence, and
it was "GCC: (GNU) 4.4.0 20090314 (experimental)".
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: bonzini at gnu dot org @ 2010-01-02 10:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from bonzini at gnu dot org 2010-01-02 10:31 -------
(That would be r144855 or r144857).
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: steven at gcc dot gnu dot org @ 2010-02-08 11:15 UTC (permalink / raw)
To: gcc-bugs
------- Comment #8 from steven at gcc dot gnu dot org 2010-02-08 11:15 -------
test:
push {lr}
sub sp, sp, #12
ldr r2, [r0]
ldr r1, [r0, #4]
mov r0, sp
str r2, [sp, #4]
bl func
add sp, sp, #12
@ sp needed for prologue
pop {pc}
.size test, .-test
.ident "GCC: (GNU) 4.5.0 20100208 (experimental) [trunk revision 156595]"
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: rguenth at gcc dot gnu dot org @ 2010-02-08 11:54 UTC (permalink / raw)
To: gcc-bugs
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.3.5 |4.5.0
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: jingyu at google dot com @ 2010-02-10 12:18 UTC (permalink / raw)
To: gcc-bugs
------- Comment #9 from jingyu at google dot com 2010-02-10 12:18 -------
I still get the 10-instruction code with trunk GCC I checked out today.
$arm-eabi-gcc -mthumb -mthumb-interwork -fpic -Os code.c -S
test:
push {lr}
sub sp, sp, #20
ldr r2, [r0]
add r3, sp, #4
ldr r1, [r0, #4]
str r2, [r3, #4]
mov r0, r3
bl func
add sp, sp, #20
@ sp needed for prologue
pop {r0}
bx r0
.size test, .-test
.ident "GCC: (GNU) 4.5.0 20100210 (experimental)"
The toolchain was built with newlib-1.17.0, mpc-0.8.1, mpfr-2.4.1, gmp-4.2.4,
binutils-2.19.
Target: arm-eabi
Configured with:
/usr/local/home/projects/newlib_armtoolchain/gcc-trunk-read/configure
--prefix=/usr/local/home/projects/toolchain_build/newlib_build_trunk/install
--target=arm-eabi --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--with-gmp=/usr/local/home/projects/toolchain_build/newlib_build_trunk/install
--with-mpfr=/usr/local/home/projects/toolchain_build/newlib_build_trunk/install
--with-mpc=/usr/local/home/projects/toolchain_build/newlib_build_trunk/install
--enable-multilib --with-newlib --with-gnu-as --with-gnu-ld
--enable-languages=c,c++
Thread model: single
gcc version 4.5.0 20100210 (experimental) (GCC)
Steven, could you please share how you build the toolchain which generates the
9-instruction code?
Thanks!
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: steven at gcc dot gnu dot org @ 2010-02-10 13:01 UTC (permalink / raw)
To: gcc-bugs
------- Comment #10 from steven at gcc dot gnu dot org 2010-02-10 13:00 -------
My compiler is configured for arm-elf, I guess that's the difference...
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: steven at gcc dot gnu dot org @ 2010-02-10 13:04 UTC (permalink / raw)
To: gcc-bugs
------- Comment #11 from steven at gcc dot gnu dot org 2010-02-10 13:04 -------
Closed for wrong ABI.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |REOPENED
Resolution|FIXED |
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to inferior CSE
From: steven at gcc dot gnu dot org @ 2010-02-10 16:28 UTC (permalink / raw)
To: gcc-bugs
------- Comment #12 from steven at gcc dot gnu dot org 2010-02-10 16:27 -------
Trying with r156650, I get this before regalloc (in the .184r.asmcons dump):
 1 NOTE_INSN_DELETED
 4 NOTE_INSN_BASIC_BLOCK
 2 r135:SI=r0:SI
      REG_DEAD: r0:SI
 3 NOTE_INSN_FUNCTION_BEG
 6 r136:SI=sfp:SI-0xc
 7 r137:SI=[r135:SI]
 8 [r136:SI+0x4]=r137:SI
      REG_DEAD: r137:SI
10 r139:SI=[r135:SI+0x4]
      REG_DEAD: r135:SI
11 r0:SI=r136:SI
      REG_DEAD: r136:SI
      REG_EQUAL: sfp:SI-0xc
12 r1:SI=r139:SI
      REG_DEAD: r139:SI
13 call [`func'] argc:0x0
      REG_DEAD: r1:SI
      REG_DEAD: r0:SI
To get the same code as gcc 4.2.1 of comment #0, r0 should be assigned to r136.
There is no reason why that should not happen, because there are no conflicts:
+++Allocating 32 bytes for conflict table (uncompressed size 32)
;; a0(r139,l0) conflicts: a1(r136,l0)
;; total conflict hard regs: 0
;; conflict hard regs: 0
;; a1(r136,l0) conflicts: a0(r139,l0) a2(r135,l0) a3(r137,l0)
;; total conflict hard regs:
;; conflict hard regs:
;; a2(r135,l0) conflicts: a1(r136,l0) a3(r137,l0)
;; total conflict hard regs:
;; conflict hard regs:
;; a3(r137,l0) conflicts: a1(r136,l0) a2(r135,l0)
;; total conflict hard regs:
;; conflict hard regs:
There are also no indications in the IRA dumps (none that I can find, anyway)
that suggest IRA notices that a tie between r0-r136 and r1-r139 may be
beneficial (thanks to insns 11 and 12 in the pre-regalloc dump).
IRA has done the following:
Popping a0(r139,l0) -- assign reg 1
Popping a1(r136,l0) -- assign reg 3
Popping a2(r135,l0) -- assign reg 0
Popping a3(r137,l0) -- assign reg 2
With this assignment, insn 2 and insn 12 become no-op moves. It looks like r139
ends up in r1 by pure luck, or there would have been another extra move.
After regalloc (in the .187r.ira dump) it looks like this:
 1 NOTE_INSN_DELETED
 4 NOTE_INSN_BASIC_BLOCK
 3 NOTE_INSN_FUNCTION_BEG
 6 r3:SI=sp:SI+0x4
      REG_EQUIV: sp:SI+0x4
 7 r2:SI=[r0:SI]
      REG_EQUIV: [r0:SI]
 8 [r3:SI+0x4]=r2:SI
10 r1:SI=[r0:SI+0x4]
11 r0:SI=r3:SI
      REG_EQUAL: sfp:SI-0xc
13 call [`func'] argc:0x0
16 NOTE_INSN_DELETED
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
GCC build triplet|x86_64-unknown-linux-gnu |
GCC host triplet|x86_64-unknown-linux-gnu |
Known to fail| |4.4.0 4.5.0
Known to work|4.5.0 |4.2.1
Last reconfirmed|2009-05-20 14:17:16 |2010-02-10 16:27:56
date| |
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-02-10 17:23 UTC (permalink / raw)
To: gcc-bugs
------- Comment #13 from steven at gcc dot gnu dot org 2010-02-10 17:23 -------
As comment #12 shows, CSE can't do much about this -- there is no common
subexpression before register allocation.
Vlad, this is another one that you probably should have a look at, please.
I will have a look at the difference between the pre-regalloc RTL in r118474
and r118475. Perhaps there is something that can be recovered. But fundamentally
this is something the register allocator should be able to handle.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|unassigned at gcc dot gnu |steven at gcc dot gnu dot
|dot org |org
Status|REOPENED |ASSIGNED
Last reconfirmed|2010-02-10 16:27:56 |2010-02-10 17:23:41
date| |
Summary|[4.3/4.4/4.5 regression] |[4.3/4.4/4.5 regression]
|Code size increase on ARM |Code size increase on ARM
|due to inferior CSE |due to poor register
| |allocation
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-02-10 17:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #14 from steven at gcc dot gnu dot org 2010-02-10 17:50 -------
Vlad, this is another one that you probably should have a look at, please.
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vmakarov at gcc dot gnu dot
| |org
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-02-10 19:25 UTC (permalink / raw)
To: gcc-bugs
------- Comment #15 from steven at gcc dot gnu dot org 2010-02-10 19:24 -------
The difference between r118474 (left) and r118475 (right) just before register
allocation (in the .life2 dumps) is this:
 2 NOTE_INSN_DELETED                  2 NOTE_INSN_DELETED
 8 NOTE_INSN_BASIC_BLOCK              8 NOTE_INSN_BASIC_BLOCK
 6 r101:SI=r0:SI                      6 r101:SI=r0:SI
      REG_DEAD: r0:SI                      REG_DEAD: r0:SI
 7 NOTE_INSN_FUNCTION_BEG             7 NOTE_INSN_FUNCTION_BEG
11 r102:SI=sfp:SI-0xc                11 r102:SI=sfp:SI-0xc
12 r103:SI=[r101:SI]                 12 r103:SI=[r101:SI]
13 [sfp:SI-0x8]=r103:SI            | 13 [r102:SI+0x4]=r103:SI
      REG_DEAD: r103:SI                    REG_DEAD: r103:SI
16 r105:SI=[r101:SI+0x4]             16 r105:SI=[r101:SI+0x4]
      REG_DEAD: r101:SI                    REG_DEAD: r101:SI
17 r0:SI=r102:SI                     17 r0:SI=r102:SI
      REG_DEAD: r102:SI                    REG_DEAD: r102:SI
      REG_EQUAL: sfp:SI-0xc                REG_EQUAL: sfp:SI-0xc
18 r1:SI=r105:SI                     18 r1:SI=r105:SI
      REG_DEAD: r105:SI                    REG_DEAD: r105:SI
19 call [`func'] argc:0x0            19 call [`func'] argc:0x0
      REG_DEAD: r0:SI                      REG_DEAD: r0:SI
      REG_DEAD: r1:SI                      REG_DEAD: r1:SI
      REG_UNUSED: lr:SI                    REG_UNUSED: lr:SI
20 NOTE_INSN_FUNCTION_END            20 NOTE_INSN_FUNCTION_END
In r118474 the cse1 pass transforms the code on the left (.jump dump) to the
code on the right (.cse1 dump):
11 r102:SI=sfp:SI-0xc                11 r102:SI=sfp:SI-0xc
12 r103:SI=[r101:SI]                 12 r103:SI=[r101:SI]
13 [r102:SI+0x4]=r103:SI           | 13 [sfp:SI-0x8]=r103:SI
apparently noticing that "sfp-0xc+0x4" == "sfp-0x8". It would be interesting to
know why fwprop is not doing this transformation.
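For readers following along, the folding step that cse1 performs here can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: `struct address`, `struct frame_def`, and `fold_frame_address` are hypothetical names, not GCC's RTL types or functions. The sketch shows nothing more than the arithmetic of substituting a definition like "r102 = sfp - 0xc" into an address like "r102 + 0x4".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified address form: a base plus a constant offset.
   These are illustrative types, not GCC's RTL representation.  */
enum base_kind { BASE_SFP, BASE_PSEUDO };

struct address {
    enum base_kind base;
    int regno;          /* pseudo number when base == BASE_PSEUDO, else -1 */
    long offset;
};

/* A definition of the form "rN = sfp + offset", e.g. r102 = sfp - 0xc.  */
struct frame_def {
    int regno;
    long offset;
};

/* If ADDR is based on the pseudo defined by DEF, rewrite it in terms of
   sfp by adding the two constant offsets.  Returns true on success and
   leaves *OUT untouched otherwise.  */
static bool fold_frame_address(const struct frame_def *def,
                               const struct address *addr,
                               struct address *out)
{
    assert(def != NULL && addr != NULL && out != NULL);
    if (addr->base != BASE_PSEUDO || addr->regno != def->regno)
        return false;
    out->base = BASE_SFP;
    out->regno = -1;
    out->offset = def->offset + addr->offset;
    return true;
}
```

With the definition r102 = sfp - 0xc and the address r102 + 0x4, the folded result is sfp - 0x8, matching insn 13 in the .cse1 dump above.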
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-02-10 22:50 UTC (permalink / raw)
To: gcc-bugs
------- Comment #16 from steven at gcc dot gnu dot org 2010-02-10 22:50 -------
In fwprop.c of r118475, we get to propagate_rtx_1 (fwprop.c:334):
  /* Copy propagations are always ok. Otherwise check the costs. */
  if (!(REG_P (old) && REG_P (new))
      && !should_replace_address (op0, new_op0, GET_MODE (x)))
    return true;
At this point the simplified address has been found, but fwprop decides not to
substitute the new address:
(gdb) p debug_rtx(op0)
(plus:SI (reg/f:SI 102)
(const_int 4 [0x4]))
$58 = void
(gdb) p debug_rtx(new_op0)
(plus:SI (reg/f:SI 25 sfp)
(const_int -8 [0xfffffffffffffff8]))
$59 = void
(gdb) p should_replace_address(op0,new_op0,SImode)
$60 = 0 '\000'
The replacement isn't done because fwprop sees no benefit in doing the
transformation. Stepping through should_replace_address we get:
202 gain = address_cost (old, mode) - address_cost (new, mode);
(gdb) next
208 if (gain == 0)
(gdb) p gain
$64 = 0
(gdb) next
209 gain = rtx_cost (new, SET) - rtx_cost (old, SET);
(gdb)
211 return (gain > 0);
(gdb) p gain
$65 = 0
Perhaps we should prefer addresses based on the frame pointer over other
addresses?
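As a rough illustration of that suggestion, the decision could gain one more tie-breaker. The sketch below is not the fwprop.c code and not a proposed patch: `struct addr_info`, its fields, and `should_replace_sketch` are hypothetical stand-ins, and the final frame-pointer preference is an assumption about what such a change might look like. The first two steps mirror the gain computations seen in the gdb session above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cost summary of an address; not a GCC data structure.  */
struct addr_info {
    int address_cost;   /* stand-in for address_cost (x, mode) */
    int rtx_cost;       /* stand-in for rtx_cost (x, SET) */
    bool frame_based;   /* true if the base is sfp/fp, not a pseudo */
};

/* Decide whether NEW_ADDR should replace OLD_ADDR.  */
static bool should_replace_sketch(const struct addr_info *old_addr,
                                  const struct addr_info *new_addr)
{
    assert(old_addr != NULL && new_addr != NULL);

    /* Step 1: prefer the address that is cheaper to use.  */
    int gain = old_addr->address_cost - new_addr->address_cost;

    /* Step 2: on a tie, compare rtx_cost, as in the session above.  */
    if (gain == 0)
        gain = new_addr->rtx_cost - old_addr->rtx_cost;

    /* Step 3 (assumed addition): on a further tie, prefer the address
       based on the frame pointer, since it needs no extra pseudo.  */
    if (gain == 0)
        gain = (int) new_addr->frame_based - (int) old_addr->frame_based;

    return gain > 0;
}
```

With equal costs on both sides, as in the session above, this sketch would accept the sfp-based address and reject the reverse substitution.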
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: bonzini at gnu dot org @ 2010-02-10 23:11 UTC (permalink / raw)
To: gcc-bugs
------- Comment #17 from bonzini at gnu dot org 2010-02-10 23:11 -------
Subject: Re: [4.3/4.4/4.5 regression] Code size
increase on ARM due to poor register allocation
> Perhaps we should prefer addresses based on the frame pointer over other
> addresses?
Yes, that's definitely better from a register pressure point of view.
(That's however not why CSE was doing that---probably that was just
hashing luck).
Paolo
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: ramana at gcc dot gnu dot org @ 2010-02-10 23:45 UTC (permalink / raw)
To: gcc-bugs
------- Comment #18 from ramana at gcc dot gnu dot org 2010-02-10 23:45 -------
(In reply to comment #16)
> In fwprop.c of r118475, we get to propagate_rtx_1 (fwprop.c:334):
>
> /* Copy propagations are always ok. Otherwise check the costs. */
> if (!(REG_P (old) && REG_P (new))
> && !should_replace_address (op0, new_op0, GET_MODE (x)))
> return true;
>
> At this point the simplified address has been found, but fwprop decides not to
> substitute the new address:
>
> (gdb) p debug_rtx(op0)
> (plus:SI (reg/f:SI 102)
> (const_int 4 [0x4]))
> $58 = void
> (gdb) p debug_rtx(new_op0)
> (plus:SI (reg/f:SI 25 sfp)
> (const_int -8 [0xfffffffffffffff8]))
> $59 = void
> (gdb) p should_replace_address(op0,new_op0,SImode)
> $60 = 0 '\000'
>
> The replacement isn't done because fwprop sees no benefit in doing the
> transformation. Stepping through should_replace_address we get:
>
> 202 gain = address_cost (old, mode) - address_cost (new, mode);
> (gdb) next
> 208 if (gain == 0)
> (gdb) p gain
> $64 = 0
> (gdb) next
> 209 gain = rtx_cost (new, SET) - rtx_cost (old, SET);
> (gdb)
> 211 return (gain > 0);
> (gdb) p gain
> $65 = 0
>
> Perhaps we should prefer addresses based on the frame pointer over other
> addresses?
>
Why does this sound like some bit of the discussion in this thread here?
http://gcc.gnu.org/ml/gcc/2009-12/msg00347.html
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-02-10 23:47 UTC (permalink / raw)
To: gcc-bugs
------- Comment #19 from steven at gcc dot gnu dot org 2010-02-10 23:47 -------
In r118474, cse.c:find_best_addr makes the replacement here:
  if ((addr_folded_cost < addr_cost
       || (addr_folded_cost == addr_cost
           /* ??? The rtx_cost comparison is left over from an older
              version of this code. It is probably no longer
              helpful.*/
           && (rtx_cost (folded, MEM) > rtx_cost (addr, MEM)
               || approx_reg_cost (folded) < approx_reg_cost (addr))))
      && validate_change (insn, loc, folded, 0))
    addr = folded;
All the costs are the same, except the approx_reg_cost tests:
(gdb) p debug_rtx(addr)
(plus:SI (reg/f:SI 102)
(const_int 4 [0x4]))
$35 = void
(gdb) p debug_rtx(folded)
(plus:SI (reg/f:SI 25 sfp)
(const_int -8 [0xfffffffffffffff8]))
$36 = void
(gdb) p approx_reg_cost(addr)
$37 = 1
(gdb) p approx_reg_cost(folded)
$38 = 0
(gdb)
The cost difference comes from approx_reg_cost_1, which uses CHEAP_REGNO for
regs. CHEAP_REGNO prefers the frame pointer reg over normal pseudos:
/* Compute cost of X, as stored in the `cost' field of a table_elt. Fixed
   hard registers and pointers into the frame are the cheapest with a cost
   of 0. Next come pseudos with a cost of one and other hard registers with
   a cost of 2. Aside from these special cases, call `rtx_cost'. */
#define CHEAP_REGNO(N) \
  (REGNO_PTR_FRAME_P(N) \
   || (HARD_REGISTER_NUM_P (N) \
       && FIXED_REGNO_P (N) && REGNO_REG_CLASS (N) != NO_REGS))
This shouldn't be very difficult to teach fwprop about, too.
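A minimal sketch of the register-cost idea, following the cost scale given in the comment quoted above (0 for frame pointers and fixed hard registers, 1 for pseudos, 2 for other hard registers). The `enum reg_kind` classification and the function names are hypothetical simplifications, not the cse.c implementation, which works on real register numbers and target macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of the registers used by an address.  */
enum reg_kind {
    KIND_FRAME_POINTER,   /* sfp/fp and other pointers into the frame */
    KIND_FIXED_HARD,      /* fixed hard registers */
    KIND_PSEUDO,          /* ordinary pseudo registers */
    KIND_OTHER_HARD       /* allocatable hard registers */
};

/* Cost of one register, per the scale in the quoted comment.  */
static int reg_cost_sketch(enum reg_kind kind)
{
    switch (kind) {
    case KIND_FRAME_POINTER:
    case KIND_FIXED_HARD:
        return 0;
    case KIND_PSEUDO:
        return 1;
    case KIND_OTHER_HARD:
        return 2;
    }
    return 2;   /* be conservative for unknown kinds */
}

/* Sum the costs of the N registers mentioned in an address.  */
static int approx_reg_cost_sketch(const enum reg_kind *regs, size_t n)
{
    assert(regs != NULL || n == 0);
    int cost = 0;
    for (size_t i = 0; i < n; i++)
        cost += reg_cost_sketch(regs[i]);
    return cost;
}
```

Under this model the address based on pseudo 102 costs 1 and the sfp-based address costs 0, which agrees with the two approx_reg_cost values printed in the gdb session above.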
--
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-02-10 23:53 UTC (permalink / raw)
To: gcc-bugs
------- Comment #20 from steven at gcc dot gnu dot org 2010-02-10 23:53 -------
I'll leave it to someone else to implement and test the details...
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |steven at gcc dot gnu dot
| |org
AssignedTo|steven at gcc dot gnu dot |unassigned at gcc dot gnu
|org |dot org
Status|ASSIGNED |NEW
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-03-18 8:29 UTC (permalink / raw)
To: gcc-bugs
------- Comment #21 from steven at gcc dot gnu dot org 2010-03-18 08:29 -------
*** Bug 43286 has been marked as a duplicate of this bug. ***
--
steven at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |carrot at google dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39871
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: steven at gcc dot gnu dot org @ 2010-03-18 8:31 UTC (permalink / raw)
To: gcc-bugs
------- Comment #22 from steven at gcc dot gnu dot org 2010-03-18 08:31 -------
In the test case from bug 43286, should_replace_address does not perform the
following replacement because the address cost is the same, and on a tie the
replacement is only done if new_rtx is more expensive than old_rtx.
old_rtx
(plus:SI (reg/v/f:SI 133 [ saveArea ])
(const_int 8 [0x8]))
new_rtx
(plus:SI (reg/v/f:SI 140 [ fp ])
(const_int -8 [0xfffffffffffffff8]))
This is the same situation as described in comment #16.
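A rough sketch of the decision described above, for illustration only. This is not the fwprop source: struct toy_cost and toy_should_replace are hypothetical names, the cost numbers used with them are made up, and the first rule (prefer the new address when it is strictly cheaper) is an assumption about the surrounding logic rather than something stated in this comment.

```c
#include <stdbool.h>

/* Hypothetical pair of costs for one address expression.  */
struct toy_cost {
    int address_cost;  /* cost of using the expression as an address */
    int rtx_cost;      /* cost of computing the expression itself */
};

/* Return true if old should be replaced by new.
   Assumed primary rule: replace when the new address is cheaper.
   Tie-break from the comment: with equal address cost, replace only
   when the new expression is the more expensive one.  */
static bool toy_should_replace(struct toy_cost old_c, struct toy_cost new_c)
{
    int gain = old_c.address_cost - new_c.address_cost;

    if (gain == 0)
        gain = new_c.rtx_cost - old_c.rtx_cost;

    return gain > 0;
}
```

With both costs equal for the saveArea+8 and fp-8 forms, gain stays 0 and the sketch declines the replacement, matching the behaviour reported here.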
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39871
* [Bug rtl-optimization/39871] [4.3/4.4/4.5 regression] Code size increase on ARM due to poor register allocation
From: rguenth at gcc dot gnu dot org @ 2010-04-06 11:38 UTC (permalink / raw)
To: gcc-bugs
------- Comment #23 from rguenth at gcc dot gnu dot org 2010-04-06 11:19 -------
GCC 4.5.0 is being released. Deferring to 4.5.1.
--
rguenth at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.5.0 |4.5.1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39871
* [Bug rtl-optimization/39871] [4.3/4.4/4.5/4.6 regression] Code size increase on ARM due to poor register allocation
From: bernds at gcc dot gnu dot org @ 2010-06-04 12:44 UTC (permalink / raw)
To: gcc-bugs
------- Comment #24 from bernds at gcc dot gnu dot org 2010-06-04 12:44 -------
Subject: Bug 39871
Author: bernds
Date: Fri Jun 4 12:44:01 2010
New Revision: 160260
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=160260
Log:
PR rtl-optimization/39871
PR rtl-optimization/40615
PR rtl-optimization/42500
PR rtl-optimization/42502
* ira.c (init_reg_equiv_memory_loc): New function.
(ira): Call it twice.
* reload.h (calculate_elim_costs_all_insns): Declare.
* ira-costs.c: Include "reload.h".
(regno_equiv_gains): New static variable.
(init_costs): Allocate it.
(finish_costs): Free it.
(ira_costs): Call calculate_elim_costs_all_insns.
(find_costs_and_classes): Take estimated elimination costs
into account.
(ira_adjust_equiv_reg_cost): New function.
* ira.h (ira_adjust_equiv_reg_cost): Declare it.
* reload1.c (init_eliminable_invariants, free_reg_equiv,
elimination_costs_in_insn, note_reg_elim_costly): New static
functions.
(elim_bb): New static variable.
(reload): Move code out of here into init_eliminable_invariants and
free_reg_equiv. Call them.
(calculate_elim_costs_all_insns): New function.
(eliminate_regs_1): Declare. Add extra arg FOR_COSTS;
all callers changed. If FOR_COSTS is true, don't call alter_reg,
but call note_reg_elim_costly if we turned a valid memory address
into an invalid one.
* Makefile.in (ira-costs.o): Depend on reload.h.
testsuite/
PR rtl-optimization/39871
PR rtl-optimization/40615
PR rtl-optimization/42500
PR rtl-optimization/42502
* gcc.target/arm/eliminate.c: New test.
Added:
trunk/gcc/testsuite/gcc.target/arm/eliminate.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/Makefile.in
trunk/gcc/ira-costs.c
trunk/gcc/ira.c
trunk/gcc/ira.h
trunk/gcc/reload.h
trunk/gcc/reload1.c
trunk/gcc/testsuite/ChangeLog
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39871
* [Bug rtl-optimization/39871] [4.3/4.4/4.5/4.6 regression] Code size increase on ARM due to poor register allocation
From: bernds at gcc dot gnu dot org @ 2010-06-17 21:52 UTC (permalink / raw)
To: gcc-bugs
------- Comment #25 from bernds at gcc dot gnu dot org 2010-06-17 21:52 -------
Subject: Bug 39871
Author: bernds
Date: Thu Jun 17 21:51:55 2010
New Revision: 160947
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=160947
Log:
PR rtl-optimization/39871
* reload1.c (init_eliminable_invariants): For flag_pic, disable
equivalences only for constants that aren't LEGITIMATE_PIC_OPERAND_P.
(function_invariant_p): Rule out a plus of frame or arg pointer with
a SYMBOL_REF.
* ira.c (find_reg_equiv_invariant_const): Likewise.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ira.c
trunk/gcc/reload1.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39871
* [Bug rtl-optimization/39871] [4.3/4.4/4.5/4.6 regression] Code size increase on ARM due to poor register allocation
From: bernds at gcc dot gnu dot org @ 2010-06-17 21:55 UTC (permalink / raw)
To: gcc-bugs
------- Comment #26 from bernds at gcc dot gnu dot org 2010-06-17 21:54 -------
Fixed.
--
bernds at gcc dot gnu dot org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39871