public inbox for gcc-bugs@sourceware.org
* [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
@ 2004-09-18 11:36 miguel55angel at hotmail dot com
2004-09-18 11:39 ` [Bug c/17549] " miguel55angel at hotmail dot com
` (29 more replies)
0 siblings, 30 replies; 31+ messages in thread
From: miguel55angel at hotmail dot com @ 2004-09-18 11:36 UTC (permalink / raw)
To: gcc-bugs
Hello,
I've observed a 25% increase in codesize between 3.3.4 and gcc version 4.0.0
20040917 (experimental)!
When compiled with "gcc -c -Os susan2l.c"
size says:
text data bss dec hex filename
17921 0 0 17921 4601 susan2l.o
for 3.3.4
and
text data bss dec hex filename
22341 0 0 22341 5745 susan2l.o
for 4.0.0 20040917 (experimental)
See http://www.fmrib.ox.ac.uk/~steve/susan/ for more info about susan.
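The 25% in the subject can be checked from the two text sizes above. A minimal sketch in C; the helper name growth_percent is made up for this illustration and is not part of any GCC tooling:

```c
#include <stdio.h>

/* Relative growth of `after` over `before`, in percent.
   Hypothetical helper, named here for illustration only. */
double growth_percent(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}

/* Print the growth for the two text sizes reported above. */
void report_susan2l(void)
{
    long text_334 = 17921;  /* gcc 3.3.4, -Os */
    long text_400 = 22341;  /* gcc 4.0.0 20040917, -Os */
    printf("text growth: %.1f%%\n", growth_percent(text_334, text_400));
}
```

With these inputs the growth comes out a little under 25%, which matches the rounded figure in the subject line.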
--
Summary: 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: miguel55angel at hotmail dot com
CC: gcc-bugs at gcc dot gnu dot org
GCC host triplet: i386-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug c/17549] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
@ 2004-09-18 11:39 ` miguel55angel at hotmail dot com
2004-09-18 17:05 ` [Bug c/17549] [4.0 Regression] " giovannibajo at libero dot it
` (28 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: miguel55angel at hotmail dot com @ 2004-09-18 11:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From miguel55angel at hotmail dot com 2004-09-18 11:39 -------
Created an attachment (id=7166)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=7166&action=view)
susan2l.c
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug c/17549] [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
2004-09-18 11:39 ` [Bug c/17549] " miguel55angel at hotmail dot com
@ 2004-09-18 17:05 ` giovannibajo at libero dot it
2004-09-19 12:00 ` miguel55angel at hotmail dot com
` (27 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: giovannibajo at libero dot it @ 2004-09-18 17:05 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-09-18 17:05 -------
Confirmed on x86. Text sizes for various versions:
4.0.0: 22324 (20040914)
4.0.0: 20130 (20040901)
3.4.2: 18064
3.3.4: 17952
3.2.2: 18413
3.0.4: 19681
2.95: 19490
The increase over the last two weeks is caused by ivopts, and can in fact be
reverted with -fno-ivopts. Zdenek, maybe ivopts should be tuned or disabled
specifically for -Os?
This leaves us with 2k of code increase though, which should still be
investigated.
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |giovannibajo at libero dot
| |it, rakdver at atrey dot
| |karlin dot mff dot cuni dot
| |cz
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Keywords| |missed-optimization
Known to fail| |4.0.0
Known to work| |3.3.4 3.4.2
Last reconfirmed|0000-00-00 00:00:00 |2004-09-18 17:05:39
date| |
Summary|25% increase in codesize |[4.0 Regression] 25%
|(3.3.4 -> 4.0.0 20040917) |increase in codesize (3.3.4
| |-> 4.0.0 20040917)
Target Milestone|--- |4.0.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug c/17549] [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
2004-09-18 11:39 ` [Bug c/17549] " miguel55angel at hotmail dot com
2004-09-18 17:05 ` [Bug c/17549] [4.0 Regression] " giovannibajo at libero dot it
@ 2004-09-19 12:00 ` miguel55angel at hotmail dot com
2004-09-25 18:10 ` [Bug middle-end/17549] " roger at eyesopen dot com
` (26 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: miguel55angel at hotmail dot com @ 2004-09-19 12:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From miguel55angel at hotmail dot com 2004-09-19 12:00 -------
Hi,
the function "susan_edges" shows a 77% increase in code size even
when compiled with -fno-ivopts:
text data bss dec hex filename
2772 0 0 2772 ad4 susan_edges_334.o
text data bss dec hex filename
2905 0 0 2905 b59 susan_edges_342.o
text data bss dec hex filename
4930 0 0 4930 1342 susan_edges_400.o
I'm seeing a block of MOVs in the 4.0.0 disassembly code
if (z > (0.9*(float)n)) /* 0.5 */
910: db 85 e8 fd ff ff fildl 0xfffffde8(%ebp)
916: dd 05 00 00 00 00 fldl 0x0
91c: de c9 fmulp %st,%st(1)
91e: d9 c9 fxch %st(1)
920: da e9 fucompp
922: df e0 fnstsw %ax
924: 9e sahf
925: 0f 87 9d 01 00 00 ja ac8 <susan_edges+0xac8>
92b: 8b 95 ec fd ff ff mov 0xfffffdec(%ebp),%edx
931: 8b 9d d0 fd ff ff mov 0xfffffdd0(%ebp),%ebx
937: 8b b5 a0 fe ff ff mov 0xfffffea0(%ebp),%esi
93d: 8b 85 a4 fe ff ff mov 0xfffffea4(%ebp),%eax
943: 89 95 04 fe ff ff mov %edx,0xfffffe04(%ebp)
949: 8b 95 a8 fe ff ff mov 0xfffffea8(%ebp),%edx
94f: 89 9d 0c fe ff ff mov %ebx,0xfffffe0c(%ebp)
955: 8b 9d b0 fe ff ff mov 0xfffffeb0(%ebp),%ebx
95b: 89 b5 10 fe ff ff mov %esi,0xfffffe10(%ebp)
961: 8b b5 b4 fe ff ff mov 0xfffffeb4(%ebp),%esi
967: 8b 8d 9c fe ff ff mov 0xfffffe9c(%ebp),%ecx
96d: 89 85 14 fe ff ff mov %eax,0xfffffe14(%ebp)
973: 8b 85 b8 fe ff ff mov 0xfffffeb8(%ebp),%eax
979: 89 95 18 fe ff ff mov %edx,0xfffffe18(%ebp)
97f: 8b 95 bc fe ff ff mov 0xfffffebc(%ebp),%edx
985: 89 9d 28 fe ff ff mov %ebx,0xfffffe28(%ebp)
98b: 8b 9d c0 fe ff ff mov 0xfffffec0(%ebp),%ebx
991: 89 b5 2c fe ff ff mov %esi,0xfffffe2c(%ebp)
997: 8b b5 c4 fe ff ff mov 0xfffffec4(%ebp),%esi
99d: 89 8d 08 fe ff ff mov %ecx,0xfffffe08(%ebp)
9a3: 8b 8d ac fe ff ff mov 0xfffffeac(%ebp),%ecx
9a9: 89 85 30 fe ff ff mov %eax,0xfffffe30(%ebp)
9af: 8b 85 18 ff ff ff mov 0xffffff18(%ebp),%eax
9b5: 89 95 34 fe ff ff mov %edx,0xfffffe34(%ebp)
9bb: 8b 95 c8 fe ff ff mov 0xfffffec8(%ebp),%edx
9c1: 89 9d 38 fe ff ff mov %ebx,0xfffffe38(%ebp)
9c7: 8b 9d cc fe ff ff mov 0xfffffecc(%ebp),%ebx
9cd: 89 b5 40 fe ff ff mov %esi,0xfffffe40(%ebp)
9d3: 8b b5 d0 fe ff ff mov 0xfffffed0(%ebp),%esi
9d9: 89 8d 20 fe ff ff mov %ecx,0xfffffe20(%ebp)
9df: 8b 8d 14 ff ff ff mov 0xffffff14(%ebp),%ecx
9e5: 89 85 48 fe ff ff mov %eax,0xfffffe48(%ebp)
9eb: 89 95 4c fe ff ff mov %edx,0xfffffe4c(%ebp)
9f1: 89 9d 50 fe ff ff mov %ebx,0xfffffe50(%ebp)
9f7: 89 b5 54 fe ff ff mov %esi,0xfffffe54(%ebp)
9fd: 8b 85 d4 fe ff ff mov 0xfffffed4(%ebp),%eax
a03: 8b 9d dc fe ff ff mov 0xfffffedc(%ebp),%ebx
a09: 8b b5 e0 fe ff ff mov 0xfffffee0(%ebp),%esi
a0f: 8b 95 d8 fe ff ff mov 0xfffffed8(%ebp),%edx
a15: 89 85 58 fe ff ff mov %eax,0xfffffe58(%ebp)
a1b: 8b 85 e4 fe ff ff mov 0xfffffee4(%ebp),%eax
a21: 89 9d 8c fd ff ff mov %ebx,0xfffffd8c(%ebp)
a27: 8b 9d e8 fe ff ff mov 0xfffffee8(%ebp),%ebx
a2d: 89 b5 64 fe ff ff mov %esi,0xfffffe64(%ebp)
a33: 8b b5 ec fe ff ff mov 0xfffffeec(%ebp),%esi
a39: 89 95 5c fe ff ff mov %edx,0xfffffe5c(%ebp)
a3f: 8b 95 1c ff ff ff mov 0xffffff1c(%ebp),%edx
a45: 89 85 68 fe ff ff mov %eax,0xfffffe68(%ebp)
a4b: 8b 85 f0 fe ff ff mov 0xfffffef0(%ebp),%eax
a51: 89 9d 6c fe ff ff mov %ebx,0xfffffe6c(%ebp)
a57: 8b 9d f4 fe ff ff mov 0xfffffef4(%ebp),%ebx
a5d: 89 b5 70 fe ff ff mov %esi,0xfffffe70(%ebp)
a63: 8b b5 f8 fe ff ff mov 0xfffffef8(%ebp),%esi
a69: 89 85 74 fe ff ff mov %eax,0xfffffe74(%ebp)
a6f: 8b 85 fc fe ff ff mov 0xfffffefc(%ebp),%eax
a75: 89 9d 78 fe ff ff mov %ebx,0xfffffe78(%ebp)
a7b: 8b 9d 00 ff ff ff mov 0xffffff00(%ebp),%ebx
a81: 89 b5 7c fe ff ff mov %esi,0xfffffe7c(%ebp)
a87: 8b b5 04 ff ff ff mov 0xffffff04(%ebp),%esi
a8d: 89 85 80 fe ff ff mov %eax,0xfffffe80(%ebp)
a93: 8b 85 08 ff ff ff mov 0xffffff08(%ebp),%eax
a99: 89 9d 84 fe ff ff mov %ebx,0xfffffe84(%ebp)
a9f: 8b 9d 0c ff ff ff mov 0xffffff0c(%ebp),%ebx
aa5: 89 b5 8c fe ff ff mov %esi,0xfffffe8c(%ebp)
aab: 8b b5 10 ff ff ff mov 0xffffff10(%ebp),%esi
ab1: 89 85 90 fe ff ff mov %eax,0xfffffe90(%ebp)
ab7: 89 9d 94 fe ff ff mov %ebx,0xfffffe94(%ebp)
abd: 89 b5 98 fe ff ff mov %esi,0xfffffe98(%ebp)
ac3: e9 8c 02 00 00 jmp d54 <susan_edges+0xd54>
{
do_symmetry=0;
if (x==0)
ac8: 83 bd 48 ff ff ff 00 cmpl $0x0,0xffffff48(%ebp)
in the function "susan_edges" when compiled with
gcc -c -Os susan_edges.c -g -o susan_edges.o -fno-ivopts
and disassembled with
objdump -DS susan_edges.o
HTH,
Miguel
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (2 preceding siblings ...)
2004-09-19 12:00 ` miguel55angel at hotmail dot com
@ 2004-09-25 18:10 ` roger at eyesopen dot com
2004-09-28 15:15 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
` (25 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: roger at eyesopen dot com @ 2004-09-25 18:10 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From roger at eyesopen dot com 2004-09-25 18:10 -------
A quick analysis of just the first doubly nested loop of susan_edges, comparing
gcc 3.3.3 to 3.5.0 20040828 (i.e. before ivopts), shows that even here the cause
of problems is induction variables (rather than unrolling or basic block
duplication). The number of assembly lines in the basic block in the middle
of the loop nest increases in size by nearly 15% (197 lines -> 224 lines).
The size of the stack frame doubles from 24 bytes to 48 bytes.
Growth of the stack frame for the whole function susan_edges is even more
dramatic increasing from 28 bytes to an impressive 620 bytes. That's a lot
of new pseudos! The section of code in comment #3 shows us shuffling these
extra temporaries. [I was shocked to discover that it wasn't an inadvertently
inlined memcpy.]
The code that's freaking out the induction variable heuristics looks like:
for (i=3;i<y_size-3;i++)
for (j=3;j<x_size-3;j++)
{
n=100;
p=in + (i-3)*x_size + j - 1;
cp=bp + in[i*x_size+j];
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
p+=x_size-3;
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
p+=x_size-5;
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
p+=x_size-6;
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
p+=2;
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
p+=x_size-6;
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
p+=x_size-5;
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
p+=x_size-3;
n+=*(cp-*p++);
n+=*(cp-*p++);
n+=*(cp-*p);
if (n<=max_no)
r[i*x_size+j] = max_no - n;
}
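To make "induction variables" concrete: the sketch below is a hand-written illustration, not GCC output, and both function names are invented for it. It contrasts an indexed access like in[i*x_size+j] with a strength-reduced form in which a pointer is stepped instead. Every extra stepped variable needs a register or a stack slot, which is roughly where the additional pseudos come from.

```c
/* Index form: the address of each element is recomputed from i and j,
   much like in[i*x_size+j] in the loop quoted above. */
long sum_indexed(const unsigned char *in, int x_size, int y_size)
{
    long n = 0;
    for (int i = 0; i < y_size; i++)
        for (int j = 0; j < x_size; j++)
            n += in[i * x_size + j];
    return n;
}

/* Hand strength-reduced form: the multiplication is replaced by pointers
   (induction variables) that are advanced by a fixed step. Each such
   variable needs a register or a stack slot of its own. */
long sum_strength_reduced(const unsigned char *in, int x_size, int y_size)
{
    long n = 0;
    const unsigned char *row = in;          /* tracks in + i*x_size */
    for (int i = 0; i < y_size; i++, row += x_size) {
        const unsigned char *p = row;       /* tracks row + j */
        for (int j = 0; j < x_size; j++, p++)
            n += *p;
    }
    return n;
}
```

Both functions return the same sum; the second trades a multiply per access for two more live variables. With the many distinct offsets in the susan_edges loop, that trade can be made many times over.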
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (3 preceding siblings ...)
2004-09-25 18:10 ` [Bug middle-end/17549] " roger at eyesopen dot com
@ 2004-09-28 15:15 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2004-09-29 1:25 ` giovannibajo at libero dot it
` (24 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2004-09-28 15:15 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2004-09-28 15:15 -------
Subject: Re: [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
This patch (for a quite stupid bug in ivopts) could improve the situation
a bit. I will post it once it passes regtesting.
Index: tree-ssa-loop-ivopts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
retrieving revision 2.15
diff -c -3 -p -r2.15 tree-ssa-loop-ivopts.c
*** tree-ssa-loop-ivopts.c 28 Sep 2004 07:59:52 -0000 2.15
--- tree-ssa-loop-ivopts.c 28 Sep 2004 15:08:58 -0000
*************** struct ivopts_data
*** 218,223 ****
--- 218,226 ----
/* The candidates. */
varray_type iv_candidates;
+ /* A bitmap of important candidates. */
+ bitmap important_candidates;
+
/* Whether to consider just related and important candidates when replacing a
use. */
bool consider_all_candidates;
*************** find_best_candidate (struct ivopts_data
*** 3431,3437 ****
else
{
asol = BITMAP_XMALLOC ();
! bitmap_a_and_b (asol, sol, use->related_cands);
}
EXECUTE_IF_SET_IN_BITMAP (asol, 0, c, bi)
--- 3434,3442 ----
else
{
asol = BITMAP_XMALLOC ();
!
! bitmap_a_or_b (asol, data->important_candidates, use->related_cands);
! bitmap_a_and_b (asol, asol, sol);
}
EXECUTE_IF_SET_IN_BITMAP (asol, 0, c, bi)
*************** find_optimal_iv_set (struct ivopts_data
*** 3698,3703 ****
--- 3703,3717 ----
bitmap inv = BITMAP_XMALLOC ();
struct iv_use *use;
+ data->important_candidates = BITMAP_XMALLOC ();
+ for (i = 0; i < n_iv_cands (data); i++)
+ {
+ struct iv_cand *cand = iv_cand (data, i);
+
+ if (cand->important)
+ bitmap_set_bit (data->important_candidates, i);
+ }
+
/* Set the upper bound. */
cost = get_initial_solution (data, set, inv);
if (cost == INFTY)
*************** find_optimal_iv_set (struct ivopts_data
*** 3740,3745 ****
--- 3754,3760 ----
}
BITMAP_XFREE (inv);
+ BITMAP_XFREE (data->important_candidates);
return set;
}
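The set logic changed in find_best_candidate can be sketched with plain bit masks. This is an illustration under assumptions: GCC uses its own bitmap type, and cand_set, allowed_before and allowed_after are names invented here.

```c
#include <stdint.h>

/* Candidate sets modelled as 32-bit masks; bit i set means candidate i is
   in the set. A stand-in for GCC's bitmap type, for illustration only. */
typedef uint32_t cand_set;

/* Before the patch: only candidates related to the use are considered. */
cand_set allowed_before(cand_set sol, cand_set related)
{
    return sol & related;
}

/* After the patch: important candidates are considered as well, and the
   result is still restricted to the current solution. */
cand_set allowed_after(cand_set sol, cand_set related, cand_set important)
{
    return (important | related) & sol;
}
```

For example, with sol = 0x0B, related = 0x02 and important = 0x09, the old form yields 0x02 while the new one yields 0x0B, so important candidates already in the solution are no longer skipped.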
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (4 preceding siblings ...)
2004-09-28 15:15 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2004-09-29 1:25 ` giovannibajo at libero dot it
2004-10-17 19:55 ` pinskia at gcc dot gnu dot org
` (23 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: giovannibajo at libero dot it @ 2004-09-29 1:25 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-09-29 01:25 -------
Subject: Re: [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> This patch (for quite stupid bug in ivopts) could improve the
> situation a bit. I will post it once it passes regtesting.
Do you have some numbers about what this patch does? Does it fix this ivopts
regression, or will we need further tuning?
Giovanni Bajo
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 25% increase in codesize (3.3.4 -> 4.0.0 20040917)
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (5 preceding siblings ...)
2004-09-29 1:25 ` giovannibajo at libero dot it
@ 2004-10-17 19:55 ` pinskia at gcc dot gnu dot org
2004-10-18 3:57 ` [Bug middle-end/17549] [4.0 Regression] 15% increase in codesize with C code giovannibajo at libero dot it
` (22 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-17 19:55 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-10-17 19:55 -------
The IV-OPT part is fixed now, but there was another size regression here IIRC.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 15% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (6 preceding siblings ...)
2004-10-17 19:55 ` pinskia at gcc dot gnu dot org
@ 2004-10-18 3:57 ` giovannibajo at libero dot it
2004-10-18 13:35 ` pinskia at gcc dot gnu dot org
` (21 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: giovannibajo at libero dot it @ 2004-10-18 3:57 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-10-18 03:57 -------
With today's mainline:
3.4.1 -Os:
text data bss dec hex filename
17971 0 0 17971 4633 susan.o
4.0.0 -Os:
text data bss dec hex filename
21213 0 0 21213 52dd susan.o
4.0.0 -Os -fno-ivopts:
text data bss dec hex filename
20172 0 0 20172 4ecc susan.o
So there is still some bloating caused by ivopts (about 1k), and then another
2k elsewhere.
--
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.0 Regression] 25% |[4.0 Regression] 15%
|increase in codesize (3.3.4 |increase in codesize with C
|-> 4.0.0 20040917) |code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 15% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (7 preceding siblings ...)
2004-10-18 3:57 ` [Bug middle-end/17549] [4.0 Regression] 15% increase in codesize with C code giovannibajo at libero dot it
@ 2004-10-18 13:35 ` pinskia at gcc dot gnu dot org
2004-10-18 13:55 ` giovannibajo at libero dot it
` (20 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-18 13:35 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-10-18 13:35 -------
Turning off GVN-PRE at least on PPC gets back to what it was for 3.3:
5704 temp.s
5796 temp1.s
11500 total
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 15% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (8 preceding siblings ...)
2004-10-18 13:35 ` pinskia at gcc dot gnu dot org
@ 2004-10-18 13:55 ` giovannibajo at libero dot it
2004-10-18 14:06 ` dberlin at dberlin dot org
` (19 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: giovannibajo at libero dot it @ 2004-10-18 13:55 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-10-18 13:55 -------
Yes, also on x86, we save approx 2k:
21213 4.0.0 -Os
19158 4.0.0 -Os -fno-tree-pre
18466 4.0.0 -Os -fno-tree-pre -fno-ivopts
20172 4.0.0 -Os -fno-ivopts
17971 3.4.1 -Os
Daniel?
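The "approx 2k" follows from the figures above. A small C sketch of the arithmetic; the enum and function names are invented for this illustration:

```c
/* Text sizes in bytes, as reported above for x86. */
enum {
    SIZE_400_OS               = 21213, /* 4.0.0 -Os */
    SIZE_400_NO_PRE           = 19158, /* 4.0.0 -Os -fno-tree-pre */
    SIZE_400_NO_PRE_NO_IVOPTS = 18466, /* 4.0.0 -Os -fno-tree-pre -fno-ivopts */
    SIZE_400_NO_IVOPTS        = 20172, /* 4.0.0 -Os -fno-ivopts */
    SIZE_341_OS               = 17971  /* 3.4.1 -Os */
};

/* Bytes saved by disabling tree PRE alone. */
int saved_by_no_pre(void)    { return SIZE_400_OS - SIZE_400_NO_PRE; }

/* Bytes saved by disabling ivopts alone. */
int saved_by_no_ivopts(void) { return SIZE_400_OS - SIZE_400_NO_IVOPTS; }

/* Bytes saved by disabling both passes. */
int saved_by_both(void)      { return SIZE_400_OS - SIZE_400_NO_PRE_NO_IVOPTS; }

/* Gap to 3.4.1 that remains with both passes disabled. */
int remaining_gap(void)      { return SIZE_400_NO_PRE_NO_IVOPTS - SIZE_341_OS; }
```

Disabling PRE alone saves 2055 bytes and ivopts alone 1041, but both together save only 2747, so the two effects are not simply additive. About 495 bytes of growth over 3.4.1 remain with both passes off.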
--
What |Removed |Added
----------------------------------------------------------------------------
CC| |dberlin at gcc dot gnu dot
| |org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 15% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (9 preceding siblings ...)
2004-10-18 13:55 ` giovannibajo at libero dot it
@ 2004-10-18 14:06 ` dberlin at dberlin dot org
2004-10-28 18:34 ` pinskia at gcc dot gnu dot org
` (18 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: dberlin at dberlin dot org @ 2004-10-18 14:06 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From dberlin at dberlin dot org 2004-10-18 14:06 -------
Subject: Re: [4.0 Regression] 15% increase in codesize with C code
On Oct 18, 2004, at 9:55 AM, giovannibajo at libero dot it wrote:
>
> ------- Additional Comments From giovannibajo at libero dot it
> 2004-10-18 13:55 -------
> Yes, also on x86, we save approx 2k:
>
> 21213 4.0.0 -Os
> 19158 4.0.0 -Os -fno-tree-pre
> 18466 4.0.0 -Os -fno-tree-pre -fno-ivopts
> 20172 4.0.0 -Os -fno-ivopts
> 17971 3.4.1 -Os
>
> Daniel?
Tree PRE (but not FRE) should probably be disabled at -Os, since it
will replicate code for speed reasons.
I'm happy to accept a patch to do this, it should be a simple setting
of flag_tree_pre to 0 when -Os is turned on.
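Why PRE replicates code can be shown by hand. The sketch below is an illustration of the general idea of partial redundancy elimination, not the output of GCC's tree PRE pass; the function names are invented for it.

```c
/* As written: a + b is computed on the c != 0 path and again at the
   return, so it is redundant on one path only (partially redundant). */
int pre_before(int a, int b, int c)
{
    int x = 0;
    if (c)
        x = a + b;
    return x + (a + b);
}

/* A PRE-style transformation written out by hand: a copy of a + b is
   inserted on the path that lacked it, so the final use can reuse the
   temporary t on both paths. */
int pre_after(int a, int b, int c)
{
    int x = 0;
    int t;
    if (c) {
        t = a + b;
        x = t;
    } else {
        t = a + b;   /* inserted copy: more code in the function */
    }
    return x + t;
}
```

The transformed version executes one addition fewer when c is non-zero, but it contains an extra copy of the expression, which is the wrong trade when optimizing for size.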
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 15% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (10 preceding siblings ...)
2004-10-18 14:06 ` dberlin at dberlin dot org
@ 2004-10-28 18:34 ` pinskia at gcc dot gnu dot org
2004-10-29 2:04 ` [Bug middle-end/17549] [4.0 Regression] 35% " giovannibajo at libero dot it
` (17 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-10-28 18:34 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2004-10-28 18:34 -------
PRE has now been disabled at -Os; does someone want to try again?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 35% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (11 preceding siblings ...)
2004-10-28 18:34 ` pinskia at gcc dot gnu dot org
@ 2004-10-29 2:04 ` giovannibajo at libero dot it
2004-11-27 21:17 ` neroden at gcc dot gnu dot org
` (16 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: giovannibajo at libero dot it @ 2004-10-29 2:04 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From giovannibajo at libero dot it 2004-10-29 02:04 -------
Well, nothing changes from our last tests with -Os -fno-tree-pre. Comment #10
and Comment #12 reflect the current situation. We still have a big regression
with the new attachment Miguel posted (susan_edges_mod_1.c), and a somewhat
smaller regression caused by ivopts.
Zdenek, can you please check ivopts behaviour on these testcases again? Maybe
the heuristics you used still need to be adjusted?
--
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.0 Regression] 15% |[4.0 Regression] 35%
|increase in codesize with C |increase in codesize with C
|code |code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 35% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (12 preceding siblings ...)
2004-10-29 2:04 ` [Bug middle-end/17549] [4.0 Regression] 35% " giovannibajo at libero dot it
@ 2004-11-27 21:17 ` neroden at gcc dot gnu dot org
2005-01-21 13:47 ` steven at gcc dot gnu dot org
` (15 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: neroden at gcc dot gnu dot org @ 2004-11-27 21:17 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
OtherBugsDependingO| |18693
nThis| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 35% increase in codesize with C code
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (13 preceding siblings ...)
2004-11-27 21:17 ` neroden at gcc dot gnu dot org
@ 2005-01-21 13:47 ` steven at gcc dot gnu dot org
2005-01-21 14:00 ` [Bug middle-end/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3 steven at gcc dot gnu dot org
` (14 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-21 13:47 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-21 13:47 -------
(From update of attachment 7380)
already applied
--
What |Removed |Added
----------------------------------------------------------------------------
Attachment #7380 is|0 |1
obsolete| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (14 preceding siblings ...)
2005-01-21 13:47 ` steven at gcc dot gnu dot org
@ 2005-01-21 14:00 ` steven at gcc dot gnu dot org
2005-01-21 14:03 ` steven at gcc dot gnu dot org
` (13 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-21 14:00 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-21 14:00 -------
For the test case from comment #1 I get the following for AMD64:
GCC 4.0 (20050121):
text data bss dec hex filename
24689 0 0 24689 6071 t.O2.o
20728 0 0 20728 50f8 t.Os.o
GCC 3.3-SUSE (pre 3.3.5 20040809 hammer-branch)
22682 0 0 22682 589a t.O2.o
21281 0 0 21281 5321 t.Os.o
and for i686:
GCC 4.0 (20050121):
24064 0 0 24064 5e00 t.O2.o
19479 0 0 19479 4c17 t.Os.o
GCC 3.3-SUSE (pre 3.3.5 20040809 hammer-branch) -m32
19646 0 0 19646 4cbe t.O2.o
17713 0 0 17713 4531 t.Os.o
So I am seeing a 10% code size increase at -O2 for GCC 4.0 compared to the
hammer-branch based GCC 3.3. GCC 3.3 was the best score we had for this so
far. The 35% from the subject is quite exaggerated, so I have adjusted it.
(FWIW, the GCC 4.0 I tested has my patch for PR19454 applied, which makes
quite a difference for -m32 -O2, but not for -Os...).
--
What |Removed |Added
----------------------------------------------------------------------------
Summary|[4.0 Regression] 35% |[4.0 Regression] 10%
|increase in codesize with C |increase in codesize with C
|code |code compared to GCC 3.3
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug middle-end/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (15 preceding siblings ...)
2005-01-21 14:00 ` [Bug middle-end/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3 steven at gcc dot gnu dot org
@ 2005-01-21 14:03 ` steven at gcc dot gnu dot org
2005-02-08 13:23 ` [Bug tree-optimization/17549] " steven at gcc dot gnu dot org
` (12 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-21 14:03 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-01-21 14:02 -------
> (FWIW, the GCC 4.0 I tested has my patch for PR19454 applied, which makes
> quite a difference for -m32 -O2, but not for -Os...).
That'd be PR19464 ;-)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (16 preceding siblings ...)
2005-01-21 14:03 ` steven at gcc dot gnu dot org
@ 2005-02-08 13:23 ` steven at gcc dot gnu dot org
2005-02-08 13:37 ` steven at gcc dot gnu dot org
` (11 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-02-08 13:23 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-07 23:13 -------
Using var_to_partition does not help. The reason is that the SSA names with
the same root var are not in the same partition, e.g.
int
foo (int x, int a, int b)
{
x = a + b;
x = x * a;
x = x * b;
return x;
}
-->
Sorted Coalesce list:
Partition map
Partition 0 (x_3 - 3 )
Partition 1 (x_4 - 4 )
Partition 2 (x_5 - 5 )
After Coalescing:
Partition map
Partition 0 (a_1 - 1 )
Partition 1 (b_2 - 2 )
Partition 2 (x_3 - 3 )
Partition 3 (x_4 - 4 )
Partition 4 (x_5 - 5 )
Partition 5 (<retval>_7 - 7 )
Replacing Expressions
x_3 replace with --> a_1 + b_2
x_4 replace with --> a_1 * x_3
x_5 replace with --> b_2 * x_4
<retval>_7 --> <retval>
x_3 --> x
x_4 not coalesced with x --> New temp: 'x.0'
x_5 not coalesced with x.0 --> New temp: 'x.1'
b_2 --> b
a_1 --> a
After Root variable replacement:
Partition map
Partition 0 (a - 1 )
Partition 1 (b - 2 )
Partition 2 (x - 3 )
Partition 3 (x.0 - 4 )
Partition 4 (x.1 - 5 )
Partition 5 (<retval> - 7 )
So if you replace the root var comparison in my hack with a check to make sure
def and def2 are not in the same partition, that whole check will always be
false and you still get crap code.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (17 preceding siblings ...)
2005-02-08 13:23 ` [Bug tree-optimization/17549] " steven at gcc dot gnu dot org
@ 2005-02-08 13:37 ` steven at gcc dot gnu dot org
2005-02-08 13:43 ` rth at gcc dot gnu dot org
` (10 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-02-08 13:37 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-07 23:16 -------
Note the following:
x_4 not coalesced with x --> New temp: 'x.0'
x_5 not coalesced with x.0 --> New temp: 'x.1'
Not very useful, because x_4 and x_5 have no uses left. So you start with
this:
foo (xD.1447, aD.1448, bD.1449)
{
intD.0 D.1452;
# BLOCK 0
# PRED: ENTRY [100.0%] (fallthru,exec)
xD.1447_3 = aD.1448_1 + bD.1449_2;
xD.1447_4 = aD.1448_1 * xD.1447_3;
xD.1447_5 = bD.1449_2 * xD.1447_4;
return xD.1447_5;
# SUCC: EXIT [100.0%]
}
and you end with this:
foo (xD.1447, aD.1448, bD.1449)
{
intD.0 x.1D.1456;
intD.0 x.0D.1455;
intD.0 D.1452;
# BLOCK 0
# PRED: ENTRY [100.0%] (fallthru,exec)
return bD.1449 * aD.1448 * (aD.1448 + bD.1449);
# SUCC: EXIT [100.0%]
}
Note the redundant temporaries.
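In source terms, the uncoalesced result corresponds roughly to giving every SSA name its own local. A hand-written sketch, assuming the mapping x_3 -> x, x_4 -> x.0, x_5 -> x.1 from the dump; foo_uncoalesced and the locals x_0 and x_1 are illustrative names only:

```c
/* The test case as written: all three assignments share the variable x. */
int foo(int x, int a, int b)
{
    x = a + b;
    x = x * a;
    x = x * b;
    return x;
}

/* One local per SSA name, mirroring x_3, x_4 and x_5 ending up in
   separate partitions. x_0 and x_1 stand in for x.0 and x.1. */
int foo_uncoalesced(int x, int a, int b)
{
    int x_0, x_1;
    x = a + b;       /* x_3 */
    x_0 = a * x;     /* x_4 */
    x_1 = b * x_0;   /* x_5 */
    return x_1;
}
```

Both versions compute the same value. In this tiny case the extra locals end up unused once the expressions are substituted, but in a large function such as susan_edges every uncoalesced name is a potential extra stack slot.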
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
2004-09-18 11:36 [Bug c/17549] New: 25% increase in codesize (3.3.4 -> 4.0.0 20040917) miguel55angel at hotmail dot com
` (18 preceding siblings ...)
2005-02-08 13:37 ` steven at gcc dot gnu dot org
@ 2005-02-08 13:43 ` rth at gcc dot gnu dot org
2005-02-08 13:50 ` steven at gcc dot gnu dot org
` (9 subsequent siblings)
29 siblings, 0 replies; 31+ messages in thread
From: rth at gcc dot gnu dot org @ 2005-02-08 13:43 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From rth at gcc dot gnu dot org 2005-02-07 23:36 -------
Subject: Re: [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
On Mon, Feb 07, 2005 at 11:13:27PM -0000, steven at gcc dot gnu dot org wrote:
> x = a + b;
> x = x * a;
> x = x * b;
...
> After Coalescing:
...
> Partition 2 (x_3 - 3 )
> Partition 3 (x_4 - 4 )
> Partition 4 (x_5 - 5 )
That is curious. Certainly not the way I'd have expected things to work.
Why are we not coalescing here? Do we think that x_4 as an input to the
same insn that creates x_5 means that the two conflict? Unless someone
can convince me otherwise, I'd call this a bug.
r~
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: steven at gcc dot gnu dot org @ 2005-02-08 13:50 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-08 00:15 -------
Might as well make it mine while I'm looking at it.
--
What |Removed |Added
----------------------------------------------------------------------------
AssignedTo|unassigned at gcc dot gnu |steven at gcc dot gnu dot
|dot org |org
Status|NEW |ASSIGNED
Last reconfirmed|2005-02-07 22:13:59 |2005-02-08 00:15:16
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: amacleod at redhat dot com @ 2005-02-08 19:21 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From amacleod at redhat dot com 2005-02-08 14:02 -------
(In reply to comment #30)
> Subject: Re: [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
>
> On Mon, Feb 07, 2005 at 11:13:27PM -0000, steven at gcc dot gnu dot org wrote:
> > x = a + b;
> > x = x * a;
> > x = x * b;
> ...
> > After Coalescing:
> ...
> > Partition 2 (x_3 - 3 )
> > Partition 3 (x_4 - 4 )
> > Partition 4 (x_5 - 5 )
>
> That is curious. Certainly not the way I'd have expected things to work.
> Why are we not coalescing here? Do we think that x_4 as an input to the
> same insn that creates x_5 means that the two conflict? Unless someone
> can convince me otherwise, I'd call this a bug.
>
No, we don't think that, but they are disjoint live ranges, so we want to keep
them separate. We do live range splitting on the way out of SSA. There is no
conflict, just a reason NOT to coalesce them. If there were a copy between them,
then we would consider coalescing them.
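A toy model of that policy, to make the partition listing above concrete: SSA names start out in their own partitions and are only merged when a copy connects them. This is a sketch, not the out-of-SSA code; the union-find layout, `struct copy` and `coalesce_by_copies` are made-up names, and interference checks are left out entirely.

```c
#include <stddef.h>

#define NUM_NAMES 3  /* stands for x_3, x_4, x_5 (indices 0, 1, 2) */

/* Toy union-find: partition[i] is the parent of SSA name i. */
static int partition[NUM_NAMES];

static int find_partition(int n)
{
    while (partition[n] != n)
        n = partition[n];
    return n;
}

/* Hypothetical copy record: "dst = src" between two SSA names. */
struct copy { int dst, src; };

/* Coalesce only names connected by a copy; return the partition count. */
int coalesce_by_copies(const struct copy *copies, size_t ncopies)
{
    int count = NUM_NAMES;
    for (int i = 0; i < NUM_NAMES; i++)
        partition[i] = i;
    for (size_t i = 0; i < ncopies; i++) {
        int a = find_partition(copies[i].dst);
        int b = find_partition(copies[i].src);
        if (a != b) {
            partition[a] = b;
            count--;
        }
    }
    return count;
}
```

With no copies between the versions of x, as in the example, all three names stay in separate partitions, which matches Partition 2, 3 and 4 in the dump.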
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: amacleod at redhat dot com @ 2005-02-08 19:36 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From amacleod at redhat dot com 2005-02-08 14:26 -------
(In reply to comment #28)
> Using var_to_partition does not help. The reason is that the SSA names with
> the same root var are not in the same partition, e.g.
>
> <retval>_7 --> <retval>
> x_3 --> x
> x_4 not coalesced with x --> New temp: 'x.0'
> x_5 not coalesced with x.0 --> New temp: 'x.1'
<...>
>
> Partition 0 (a - 1 )
> Partition 1 (b - 2 )
> Partition 2 (x - 3 )
> Partition 3 (x.0 - 4 )
> Partition 4 (x.1 - 5 )
> Partition 5 (<retval> - 7 )
>
> So if you replace the root var comparison in my hack with a check to make sure
> def and def2 are not in the same partition, that whole check will always be
> false and you still get crap code.
>
Of course. Doh. Accumulation will result in live range splitting, so they will
all be different variables. Stick with checking the root variable; it's probably
our simplest measure, I guess. :-)
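Roughly, the root variable check amounts to the comparison sketched below. This is a simplified model, not the code in tree-outof-ssa.c: `struct ssa_name`, `same_root_var` and `ter_may_replace` are hypothetical names, and a real SSA name would carry a pointer to its variable rather than a string.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an SSA name: root variable plus version. */
struct ssa_name {
    const char *root;   /* e.g. "x" for x_3 */
    int version;        /* e.g. 3 for x_3 */
};

/* True when both SSA names are versions of the same root variable. */
bool same_root_var(const struct ssa_name *def, const struct ssa_name *def2)
{
    return strcmp(def->root, def2->root) == 0;
}

/* Toy predicate: allow substituting def's expression into the statement
   that defines def2 only when the root variables differ. */
bool ter_may_replace(const struct ssa_name *def, const struct ssa_name *def2)
{
    return !same_root_var(def, def2);
}
```

For x_3 and x_4 the predicate says no, because both have root x, while a pair with different roots would still be eligible. Comparing partitions instead would not work here, since x_3, x_4 and x_5 already sit in different partitions.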
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: steven at gcc dot gnu dot org @ 2005-02-10 1:08 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-09 22:00 -------
My TER hack does fix most of the problems, but it also causes a significant
regression in the SPEC twolf benchmark. All other benchmarks are roughly the
same. I'll try to figure out what is causing the regression.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: steven at gcc dot gnu dot org @ 2005-02-10 1:53 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-09 23:35 -------
The entire diff of .optimized dumps and .s output for twolf on AMD64 is really
small, in fact the asm output is different for only one file:
config1.c.t65.optimized | 120 ++++++++++++++++++++++++++++++----------------
configure.c.t65.optimized | 78 +++++++++++++++++++----------
outpins.c.t65.optimized | 6 +-
outpins.s | 36 ++++++-------
qsorte.c.t65.optimized | 3 -
qsortg.c.t65.optimized | 3 -
qsortgdx.c.t65.optimized | 3 -
qsortx.c.t65.optimized | 3 -
readcell.c.t65.optimized | 3 -
readseg.c.t65.optimized | 6 +-
ucgxp.c.t65.optimized | 3 -
uloop.c.t65.optimized | 6 +-
12 files changed, 174 insertions(+), 96 deletions(-)
The file with the assembler difference is outpins.c. The relevant diff is
below. There is nothing in the diff that explains the ~4% slowdown I see in
my SPEC benchmarks (3 runs, so the slowdown is consistent). The same
instructions are there, just ordered differently and using different
registers. So I'm not sure how to proceed...
diff -u base/outpins.c.t65.optimized hacked/outpins.c.t65.optimized
--- base/outpins.c.t65.optimized 2005-02-10 00:19:20.950581229 +0100
+++ patched/outpins.c.t65.optimized 2005-02-10 00:16:19.436444879 +0100
@@ -99,8 +99,9 @@
pairArray.39 = pairArray;
carray.40 = carray;
D.3698 = *((struct cellbox * *) ((long unsigned int) *(*((int * *) D.3712 + pairArray.39 - 8B) + 4B) * 8) + carray.40);
+ end.81 = D.3698->cxcenter + (int) D.3698->tileptr->left;
temp.59 = *(carray.40 + (struct cellbox * *) ((long unsigned int) *(*(pairArray.39 + (int * *) D.3712) + 4B) * 8));
- end = MAX_EXPR <D.3698->cxcenter + (int) D.3698->tileptr->left, temp.59->cxcenter + (int) temp.59->tileptr->left>;
+ end = MAX_EXPR <end.81, temp.59->cxcenter + (int) temp.59->tileptr->left>;
<L4>:;
return end;
@@ -228,9 +229,10 @@
D.3668 = *((int * *) D.3664 + pairArray.36 - 8B);
carray.37 = carray;
D.3646 = *((struct cellbox * *) ((long unsigned int) *(D.3668 + (int *) ((long unsigned int) *D.3668 * 4)) * 8) + carray.37);
+ end.121 = D.3646->cxcenter + (int) D.3646->tileptr->right;
D.3676 = *(pairArray.36 + (int * *) D.3664);
temp.99 = *(carray.37 + (struct cellbox * *) ((long unsigned int) *(D.3676 + (int *) ((long unsigned int) *D.3676 * 4)) * 8));
- end = MIN_EXPR <D.3646->cxcenter + (int) D.3646->tileptr->right, temp.99->cxcenter + (int) temp.99->tileptr->right>;
+ end = MIN_EXPR <end.121, temp.99->cxcenter + (int) temp.99->tileptr->right>;
<L4>:;
return end;
diff -u base/outpins.s hacked/outpins.s
--- base/outpins.s 2005-02-10 00:19:21.064543028 +0100
+++ patched/outpins.s 2005-02-10 00:16:19.551406289 +0100
@@ -18,18 +18,18 @@
movq -8(%rdx,%rcx), %rax
movslq 4(%rax),%rax
movq (%rsi,%rax,8), %rdi
+ movq 40(%rdi), %rax
+ movswl (%rax),%r8d
movq (%rcx,%rdx), %rax
+ addl 12(%rdi), %r8d
movslq 4(%rax),%rax
movq (%rsi,%rax,8), %rdx
- movq 40(%rdi), %rax
- movswl (%rax),%ecx
movq 40(%rdx), %rax
- addl 12(%rdi), %ecx
movswl (%rax),%eax
addl 12(%rdx), %eax
- cmpl %eax, %ecx
- cmovl %eax, %ecx
- movl %ecx, %eax
+ cmpl %eax, %r8d
+ cmovl %eax, %r8d
+ movl %r8d, %eax
ret
.p2align 4,,7
.L11:
@@ -40,9 +40,9 @@
movq carray(%rip), %rax
movq (%rax,%rdx,8), %rdx
movq 40(%rdx), %rax
- movswl (%rax),%ecx
- addl 12(%rdx), %ecx
- movl %ecx, %eax
+ movswl (%rax),%r8d
+ addl 12(%rdx), %r8d
+ movl %r8d, %eax
ret
.p2align 4,,7
.L12:
@@ -72,18 +72,18 @@
movslq (%rcx),%rax
movslq (%rcx,%rax,4),%rax
movq (%rdi,%rax,8), %rcx
+ movq 40(%rcx), %rax
+ movswl 2(%rax),%r8d
movslq (%rdx),%rax
+ addl 12(%rcx), %r8d
movslq (%rdx,%rax,4),%rax
movq (%rdi,%rax,8), %rdx
- movq 40(%rcx), %rax
- movswl 2(%rax),%esi
movq 40(%rdx), %rax
- addl 12(%rcx), %esi
movswl 2(%rax),%eax
addl 12(%rdx), %eax
- cmpl %eax, %esi
- cmovg %eax, %esi
- movl %esi, %eax
+ cmpl %eax, %r8d
+ cmovg %eax, %r8d
+ movl %r8d, %eax
ret
.p2align 4,,7
.L22:
@@ -95,9 +95,9 @@
movq carray(%rip), %rax
movq (%rax,%rdx,8), %rdx
movq 40(%rdx), %rax
- movswl 2(%rax),%esi
- addl 12(%rdx), %esi
- movl %esi, %eax
+ movswl 2(%rax),%r8d
+ addl 12(%rdx), %r8d
+ movl %r8d, %eax
ret
.p2align 4,,7
.L23:
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: steven at gcc dot gnu dot org @ 2005-02-10 14:50 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-10 09:08 -------
The slowdown is probably some unfortunate icache effect. It could be anything
from alignment to the slightly larger instructions due to using r8 instead of
rcx. I guess we should not care too much about such random effects that we
cannot do anything about anyway. I'm going to see if it doesn't hurt on i686,
and submit the patch if things look good.
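To illustrate the "slightly larger instructions" part: on x86-64, r8 through r15 can only be named with a REX prefix, which costs one extra byte per instruction. The byte sequences below are hand-assembled for two of the instruction pairs in the outpins.s diff and have not been checked against the actual object files, so treat them as an illustration; `rex_overhead` is a made-up helper name.

```c
#include <stddef.h>

/* Hand-assembled encodings for two instruction pairs from the diff. */
static const unsigned char cmp_eax_ecx[] = { 0x39, 0xc1 };        /* cmpl %eax, %ecx */
static const unsigned char cmp_eax_r8d[] = { 0x41, 0x39, 0xc0 };  /* cmpl %eax, %r8d */
static const unsigned char mov_ecx_eax[] = { 0x89, 0xc8 };        /* movl %ecx, %eax */
static const unsigned char mov_r8d_eax[] = { 0x44, 0x89, 0xc0 };  /* movl %r8d, %eax */

/* Extra bytes the r8d forms cost over the ecx forms. */
size_t rex_overhead(void)
{
    return (sizeof cmp_eax_r8d - sizeof cmp_eax_ecx)
         + (sizeof mov_r8d_eax - sizeof mov_ecx_eax);
}
```

Every instruction that touches r8d pays the same one-byte penalty, so each changed sequence grows by a few bytes. Whether that is enough to shift alignment or icache behaviour in twolf is a guess.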
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: steven at gcc dot gnu dot org @ 2005-02-10 15:37 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-10 10:06 -------
'size' for susan_edged_mod_1 .o files
33 = pre 3.3.3-suse (hammer branch)
40 = CVS head 20050209
patched = CVS head 20050209 with the 'TER hack' patch applied.
i686:
text data bss dec hex filename
2133 0 0 2133 855 33.o
3003 0 0 3003 bbb 40.o
2237 0 0 2237 8bd patched.o
amd64:
text data bss dec hex filename
2710 0 0 2710 a96 33.o
3414 0 0 3414 d56 40.o
2421 0 0 2421 975 patched.o
ppc32:
text data bss dec hex filename
2780 0 0 2780 adc 33.o
3348 0 0 3348 d14 40.o
3140 0 0 3140 c44 patched.o
So for ppc this bug is still not fixed even with my patch. Interesting data
point is the ppc32 size with -Os -fno-ivopts:
2820 0 0 2820 b04 no-ivopts.o
So perhaps the pending IVopts patches will also help for this problem.
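To put the tables in relative terms, a small helper like the one below can be used. The sizes are copied from the tables above; `pct_change`, `struct sizes` and `text_sizes` are illustrative names only.

```c
/* Percentage change of `size` relative to `base` (both in bytes). */
double pct_change(int base, int size)
{
    return 100.0 * (size - base) / base;
}

/* Text sizes copied from the tables above. */
struct sizes { const char *target; int gcc33, gcc40, patched; };

static const struct sizes text_sizes[] = {
    { "i686",  2133, 3003, 2237 },
    { "amd64", 2710, 3414, 2421 },
    { "ppc32", 2780, 3348, 3140 },
};
```

On i686 the unpatched head is roughly 41% larger than 3.3 and the patched compiler roughly 5% larger. On amd64 the patched object is about 11% smaller than 3.3. On ppc32 it is still about 13% larger, and -fno-ivopts brings that down to between 1% and 2%.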
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: pinskia at gcc dot gnu dot org @ 2005-02-10 23:53 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-02-10 21:11 -------
(In reply to comment #37)
> So for ppc this bug is still not fixed even with my patch. Interesting data
> point is the ppc32 size with -Os -fno-ivopts:
> 2820 0 0 2820 b04 no-ivopts.o
>
> So perhaps the pending IVopts patches will also help for this problem.
I bet the problem here for PPC is the same as 18219 where we generate more than one IV.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: cvs-commit at gcc dot gnu dot org @ 2005-02-11 2:16 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From cvs-commit at gcc dot gnu dot org 2005-02-10 22:57 -------
Subject: Bug 17549
CVSROOT: /cvs/gcc
Module name: gcc
Changes by: steven@gcc.gnu.org 2005-02-10 22:57:31
Modified files:
gcc : ChangeLog tree-outof-ssa.c
Log message:
PR tree-optimization/17549
* tree-outof-ssa.c (find_replaceable_in_bb): Do not allow
TER to replace a DEF with its expression if the DEF and the
rhs of the expression we replace into have the same root
variable.
Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7438&r2=2.7439
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-outof-ssa.c.diff?cvsroot=gcc&r1=2.43&r2=2.44
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549
* [Bug tree-optimization/17549] [4.0 Regression] 10% increase in codesize with C code compared to GCC 3.3
From: steven at gcc dot gnu dot org @ 2005-02-11 2:52 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From steven at gcc dot gnu dot org 2005-02-10 22:59 -------
.
--
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution| |FIXED
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17549