public inbox for gcc-bugs@sourceware.org
* [Bug target/41868] New: cell microcode instruction is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: siarhei dot siamashka at gmail dot com @ 2009-10-29 15:16 UTC (permalink / raw)
To: gcc-bugs
/***************************************/
void __attribute__((noinline)) y()
{
    asm volatile ("# nop\n");
}
void __attribute__((noinline)) x(long c)
{
    while (c--)
        y();
}
int main()
{
    /* Run total 3.2G iterations */
    x(1600000000);
    x(1600000000);
    return 0;
}
/***************************************/
$ gcc -O2 -mcpu=cell -mtune=cell -mwarn-cell-microcode -o test-O2 test.c
test.c: In function x:
test.c:9: warning: emitting microcode insn {ai.|addic.} %0,%1,%2
[*adddi3_internal3] #38
$ time ./test-O2
real 0m56.385s
user 0m56.232s
sys 0m0.138s
$ gcc -Os -mcpu=cell -mtune=cell -mwarn-cell-microcode -o test-Os test.c
$ time ./test-Os
real 0m24.149s
user 0m24.086s
sys 0m0.060s
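The two timings are easier to compare per loop iteration. The sketch below is an editorial illustration, not part of the original report: it divides the reported user times by the 3.2G iterations of the test case. The helper names are hypothetical, and the clock frequency passed to `cycles_per_iteration` is an assumption, since the report does not state the clock speed of the machine.

```c
#include <assert.h>
#include <stdio.h>

/* Total loop iterations in the test case: two calls of x(1600000000). */
#define TOTAL_ITERATIONS 3200000000.0

/* Average nanoseconds per iteration for a given "user" time in seconds. */
double ns_per_iteration(double user_seconds)
{
    assert(user_seconds >= 0.0);
    return user_seconds * 1e9 / TOTAL_ITERATIONS;
}

/* Average cycles per iteration. clock_ghz is an assumed value supplied by
   the caller; the report itself does not give the clock frequency. */
double cycles_per_iteration(double user_seconds, double clock_ghz)
{
    return ns_per_iteration(user_seconds) * clock_ghz;
}

/* Print one summary line for a build (label is free-form text). */
void report(const char *label, double user_seconds, double clock_ghz)
{
    printf("%s: %.2f ns/iteration, about %.1f cycles at %.1f GHz\n",
           label,
           ns_per_iteration(user_seconds),
           cycles_per_iteration(user_seconds, clock_ghz),
           clock_ghz);
}
```

Calling `report("-O2", 56.232, 3.2)` and `report("-Os", 24.086, 3.2)` gives roughly 17.6 ns versus 7.5 ns per iteration. If the machine really runs at 3.2 GHz (an assumption), that is on the order of 56 versus 24 cycles for a loop body of a handful of instructions plus a call.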
--
Summary: cell microcode instruction is generated for a trivial
loop with -O2 optimizations, hurting performance badly
Product: gcc
Version: 4.4.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: siarhei dot siamashka at gmail dot com
GCC build triplet: powerpc64-unknown-linux-gnu
GCC host triplet: powerpc64-unknown-linux-gnu
GCC target triplet: powerpc64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41868
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: siarhei dot siamashka at gmail dot com @ 2009-10-29 15:21 UTC (permalink / raw)
To: gcc-bugs
------- Comment #1 from siarhei dot siamashka at gmail dot com 2009-10-29 15:21 -------
-O2:
0000000000000010 <.x>:
  10:   2c 23 00 00     cmpdi   r3,0
  14:   7c 08 02 a6     mflr    r0
  18:   f8 01 00 10     std     r0,16(r1)
  1c:   f8 21 ff 81     stdu    r1,-128(r1)
  20:   41 82 00 1c     beq-    3c <.x+0x2c>
  24:   f8 61 00 70     std     r3,112(r1)
  28:   48 00 00 01     bl      28 <.x+0x18>
  2c:   e8 01 00 70     ld      r0,112(r1)
  30:   35 20 ff ff     addic.  r9,r0,-1
  34:   f9 21 00 70     std     r9,112(r1)
  38:   40 82 ff f0     bne+    28 <.x+0x18>
  3c:   38 21 00 80     addi    r1,r1,128
  40:   e8 01 00 10     ld      r0,16(r1)
  44:   7c 08 03 a6     mtlr    r0
  48:   4e 80 00 20     blr
  4c:   00 00 00 00     .long 0x0
  50:   00 00 00 01     .long 0x1
  54:   80 00 00 00     lwz     r0,0(0)
-Os:
0000000000000010 <.x>:
  10:   fb e1 ff f8     std     r31,-8(r1)
  14:   7c 08 02 a6     mflr    r0
  18:   f8 01 00 10     std     r0,16(r1)
  1c:   7c 7f 1b 78     mr      r31,r3
  20:   f8 21 ff 81     stdu    r1,-128(r1)
  24:   48 00 00 08     b       2c <.x+0x1c>
  28:   48 00 00 01     bl      28 <.x+0x18>
  2c:   2f bf 00 00     cmpdi   cr7,r31,0
  30:   3b ff ff ff     addi    r31,r31,-1
  34:   40 9e ff f4     bne+    cr7,28 <.x+0x18>
  38:   38 21 00 80     addi    r1,r1,128
  3c:   e8 01 00 10     ld      r0,16(r1)
  40:   eb e1 ff f8     ld      r31,-8(r1)
  44:   7c 08 03 a6     mtlr    r0
  48:   4e 80 00 20     blr
  4c:   00 00 00 00     .long 0x0
  50:   00 00 00 01     .long 0x1
  54:   80 01 00 00     lwz     r0,0(r1)
--
siarhei dot siamashka at gmail dot com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |siarhei dot siamashka at
                   |                            |gmail dot com
           Keywords|                            |missed-optimization
            Summary|cell microcode instruction  |cell microcode instruction
                   |is generated for a trivial  |(addic.) is generated for a
                   |loop with -O2 optimizations,|trivial loop with -O2
                   |hurting performance badly   |optimizations, hurting
                   |                            |performance badly
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: pinskia at gcc dot gnu dot org @ 2009-11-02 16:51 UTC (permalink / raw)
To: gcc-bugs
------- Comment #2 from pinskia at gcc dot gnu dot org 2009-11-02 16:51 -------
Simple patch which I am testing right now:
Index: gcc/gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/gcc/config/rs6000/rs6000.md (revision 153680)
+++ gcc/gcc/config/rs6000/rs6000.md (working copy)
@@ -1627,7 +1627,7 @@ (define_insn "*add<mode>3_internal3"
(set_attr "length" "4,4,8,8")])
(define_split
- [(set (match_operand:CC 3 "cc_reg_not_cr0_operand" "")
+ [(set (match_operand:CC 3 "cc_reg_not_micro_cr0_operand" "")
(compare:CC (plus:P (match_operand:P 1 "gpc_reg_operand" "")
(match_operand:P 2 "reg_or_short_operand" ""))
--
pinskia at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |pinskia at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-11-02 16:51:40
               date|                            |
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: pinskia at gcc dot gnu dot org @ 2009-11-02 16:56 UTC (permalink / raw)
To: gcc-bugs
------- Comment #3 from pinskia at gcc dot gnu dot org 2009-11-02 16:56 -------
Actually, the warning is incorrect, at least according to the PPU book 4.
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: pinskia at gcc dot gnu dot org @ 2009-11-02 17:05 UTC (permalink / raw)
To: gcc-bugs
------- Comment #4 from pinskia at gcc dot gnu dot org 2009-11-02 17:05 -------
In fact, changing the addic. into addic/cmpwi does not improve the speed of
the code:
With the change:
[apinski@dhcp-10-98-10-216 local]$ time ./a.out
56.316u 0.084s 0:57.09 98.7% 0+0k 0+0io 0pf+0w
Without:
56.276u 0.088s 0:57.08 98.7% 0+0k 0+0io 0pf+0w
So the only problem here is that the warning is invalid.
With -Os on the trunk:
24.144u 0.032s 0:24.45 98.8% 0+0k 0+0io 0pf+0w
Offhand, I don't know why -Os is faster than -O2.
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: pinskia at gcc dot gnu dot org @ 2009-11-02 17:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #5 from pinskia at gcc dot gnu dot org 2009-11-02 17:08 -------
In fact doing the following diff to the -Os assembly:
--- t5.Os.s 2009-11-02 23:18:52.000000000 +0900
+++ t5.Os.dot.s 2009-11-02 23:20:19.000000000 +0900
@@ -29,9 +29,9 @@ x:
.L4:
bl y
.L3:
- cmpwi 7,31,0
- addi 31,31,-1
- bne 7,.L4
+# cmpwi 7,31,0
+ addic. 31,31,-1
+ bne .L4
addi 11,1,16
b _restgpr_31_x
.size x,.-x
produces the same result as -Os on the trunk.
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: pinskia at gcc dot gnu dot org @ 2009-11-02 17:10 UTC (permalink / raw)
To: gcc-bugs
------- Comment #6 from pinskia at gcc dot gnu dot org 2009-11-02 17:10 -------
So in conclusion, addic. is not microcoded and the warning is incorrect, but
-Os is still faster than -O2.
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: siarhei dot siamashka at gmail dot com @ 2009-11-03 20:09 UTC (permalink / raw)
To: gcc-bugs
------- Comment #7 from siarhei dot siamashka at gmail dot com 2009-11-03 20:09 -------
Thanks a lot for checking this, and sorry about the confusion caused by
attributing the slowness of the testcase to microcode (which turned out
not to be the case) without properly checking it first.
So should this bug be split into two? One about the incorrect warning, and
another about generating suboptimal code at -O2 (extra load and store
operations, which are probably penalized by something like a RAW hazard in
such a short loop)?
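The difference between the two loop shapes in the disassembly from comment #1 can be mimicked at the source level. The sketch below is an editorial illustration under stated assumptions, not code from the thread: `volatile` is used only to force the counter through memory on every iteration, similar to the `ld`/`std` of `112(r1)` in the -O2 output, while the second function leaves the compiler free to keep the counter in a register, as -Os did with `r31`. The function names and the returned call count are hypothetical additions for checking the behaviour. It does not confirm the RAW-hazard explanation, which remains a guess in the thread.

```c
/* Opaque call, in the spirit of the thread's y(); GCC/Clang inline asm
   syntax is assumed. */
void __attribute__((noinline)) y(void)
{
    __asm__ volatile ("# nop\n");
}

/* Counter forced through memory: every iteration loads, decrements and
   stores it back, like the stack slot in the -O2 code. Returns the number
   of calls made to y(). */
long loop_counter_in_memory(long c)
{
    volatile long counter = c;
    long calls = 0;

    while (counter != 0) {
        counter = counter - 1;
        y();
        calls++;
    }
    return calls;
}

/* Counter left to the register allocator, as in the original test case.
   Returns the number of calls made to y(). */
long loop_counter_in_register(long c)
{
    long calls = 0;

    while (c--) {
        y();
        calls++;
    }
    return calls;
}
```

Timing both functions with the same iteration count on the target machine would show whether the memory round trip alone accounts for a gap of the size reported above.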
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: pinskia at gcc dot gnu.org @ 2011-11-29 23:28 UTC (permalink / raw)
To: gcc-bugs
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-11-29 23:18:32 UTC ---
No longer working on this.
* [Bug target/41868] cell microcode instruction (addic.) is generated for a trivial loop with -O2 optimizations, hurting performance badly
From: pinskia at gcc dot gnu.org @ 2011-11-29 23:21 UTC (permalink / raw)
To: gcc-bugs
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
         AssignedTo|pinskia at gcc dot gnu.org  |unassigned at gcc dot
                   |                            |gnu.org
--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-11-29 23:18:46 UTC ---
No longer working on this.