* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
[not found] <bug-21182-4@http.gcc.gnu.org/bugzilla/>
@ 2013-01-18 0:48 ` vda.linux at googlemail dot com
2013-01-18 0:51 ` vda.linux at googlemail dot com
` (17 subsequent siblings)
18 siblings, 0 replies; 25+ messages in thread
From: vda.linux at googlemail dot com @ 2013-01-18 0:48 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
--- Comment #6 from Denis Vlasenko <vda.linux at googlemail dot com> 2013-01-18 00:48:23 UTC ---
Created attachment 29200
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29200
Updated testcase, build helper, and results of testing with different gcc
versions
Tarball contains:
serpent.c:
the original testcase, only with "#ifdef NAIL_REGS" instead of "#if 0" which
allows test compiles w/o editing it. Basically, "gcc -DNAIL_REGS serpent.c"
will try to force gcc to use only registers instead of stack.
gencode.sh:
builds serpent.c with -O2 and -O3, with and without -DNAIL_REGS. The object
file names contain gcc version and used options. Then they are objdump'ed and
output saved. Tweakable with setting $PREFIX and/or $CC.
No -fomit-frame-pointer used: the testcase can be compiled so that stack is not
used even without that option.
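The build matrix described above can be sketched roughly as follows. This is not the gencode.sh from the tarball, only a guess at its shape: the function names out_name and build_all, the OBJDUMP variable, and the reading of $PREFIX as a tool-name prefix are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch; the real gencode.sh ships in attachment 29200.
# Assumption: PREFIX is prepended to the tool names (e.g. a cross-toolchain
# prefix) and CC overrides the compiler outright.
CC=${CC:-${PREFIX}gcc}
OBJDUMP=${OBJDUMP:-${PREFIX}objdump}

# Build the base name used in the file lists of this comment,
# e.g. "serpent-O2-DNAIL_REGS-4.6.3".
# $1 = optimization level (O2 or O3), $2 = "DNAIL_REGS" or empty, $3 = gcc version
out_name() {
    if [ -n "$2" ]; then
        echo "serpent-$1-$2-$3"
    else
        echo "serpent-$1-$3"
    fi
}

# Compile serpent.c in all four configurations and disassemble each object.
build_all() {
    # $CC is left unquoted on purpose so it may carry flags such as "gcc -m32".
    ver=$($CC -dumpversion) || return 1
    for opt in O2 O3; do
        for nail in "" DNAIL_REGS; do
            base=$(out_name "$opt" "$nail" "$ver")
            # ${nail:+...} expands to -DNAIL_REGS only when nail is non-empty.
            $CC "-$opt" ${nail:+"-$nail"} -c serpent.c -o "$base.o" || return 1
            "$OBJDUMP" -d "$base.o" > "$base.asm" || return 1
        done
    done
}
```

Calling build_all in a directory that contains serpent.c would produce the .o and .asm files for whichever compiler $CC names. On an x86-64 host, something like CC="gcc -m32" is probably needed to get 32-bit code comparable to the listings here.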
Disassembly:
serpent-O2-3.4.3.asm
serpent-O2-4.2.1.asm
serpent-O2-4.6.3.asm
serpent-O2-DNAIL_REGS-3.4.3.asm
serpent-O2-DNAIL_REGS-4.2.1.asm
serpent-O2-DNAIL_REGS-4.6.3.asm
serpent-O3-3.4.3.asm
serpent-O3-4.2.1.asm
serpent-O3-4.6.3.asm
serpent-O3-DNAIL_REGS-3.4.3.asm
serpent-O3-DNAIL_REGS-4.2.1.asm
serpent-O3-DNAIL_REGS-4.6.3.asm
Object files:
   text    data     bss     dec     hex filename
   3260       0       0    3260     cbc serpent-O2-DNAIL_REGS-3.4.3.o
   3260       0       0    3260     cbc serpent-O3-DNAIL_REGS-3.4.3.o
   3292       0       0    3292     cdc serpent-O3-3.4.3.o
   3536       0       0    3536     dd0 serpent-O2-4.6.3.o
   3536       0       0    3536     dd0 serpent-O3-4.6.3.o
   3845       0       0    3845     f05 serpent-O2-DNAIL_REGS-4.6.3.o
   3845       0       0    3845     f05 serpent-O3-DNAIL_REGS-4.6.3.o
   3877       0       0    3877     f25 serpent-O2-4.2.1.o
   3877       0       0    3877     f25 serpent-O3-4.2.1.o
   4302       0       0    4302    10ce serpent-O2-3.4.3.o
   4641       0       0    4641    1221 serpent-O2-DNAIL_REGS-4.2.1.o
   4641       0       0    4641    1221 serpent-O3-DNAIL_REGS-4.2.1.o
Take a look inside serpent-O2-DNAIL_REGS-3.4.3.asm file.
This is what I want to get without asm hacks: the smallest code, uses no stack.
gcc-3.4.3 -O3 comes close: it does spill a few words to stack (search for
(%ebp)), but is generally good code (close to ideal?).
All other attempts fare worse:
gcc-3.4.3 -O2: code is significantly worse than -O3.
gcc-4.2.1 -O2/-O3: code is better than gcc-3.4.3 -O2, worse than gcc-4.6.3
gcc-4.6.3 -O2/-O3: six instances of spills to stack. Code is still not as good
as gcc-3.4.3 -O3. (-DNAIL_REGS only confuses it more, unlike 3.4.3).
Stack usage summary:
$ grep 'sub.*,%esp' *.asm | grep -v DNAIL_REGS
serpent-O2-3.4.3.asm: 6: 81 ec 00 01 00 00 sub $0x100,%esp
serpent-O2-4.2.1.asm: 6: 83 ec 78 sub $0x78,%esp
serpent-O2-4.6.3.asm: 4: 83 ec 04 sub $0x4,%esp
serpent-O3-4.2.1.asm: 6: 83 ec 78 sub $0x78,%esp
serpent-O3-4.6.3.asm: 4: 83 ec 04 sub $0x4,%esp
(serpent-O3-3.4.3.asm is not listed, but it allocates and uses one word on
stack by push insn).
Modules with best (= minimal) stack usage:
$ grep -F -e '(%esp)' -e '(%ebp)' serpent-O2-DNAIL_REGS-3.4.3.asm
6: 8b 75 08 mov 0x8(%ebp),%esi
9: 8b 7d 10 mov 0x10(%ebp),%edi
ca9: 8b 75 0c mov 0xc(%ebp),%esi
$ grep -F -e '(%esp)' -e '(%ebp)' serpent-O3-3.4.3.asm
7: 8b 7d 08 mov 0x8(%ebp),%edi
a: 8b 4d 10 mov 0x10(%ebp),%ecx
18c: 89 7d f0 mov %edi,-0x10(%ebp)
1dd: 8b 45 f0 mov -0x10(%ebp),%eax
23b: 8b 75 f0 mov -0x10(%ebp),%esi
299: 8b 7d f0 mov -0x10(%ebp),%edi
432: 8b 55 f0 mov -0x10(%ebp),%edx
4a0: 8b 4d f0 mov -0x10(%ebp),%ecx
50e: 8b 7d f0 mov -0x10(%ebp),%edi
84f: 8b 45 f0 mov -0x10(%ebp),%eax
8b9: 8b 75 f0 mov -0x10(%ebp),%esi
923: 8b 7d f0 mov -0x10(%ebp),%edi
cb6: 8b 55 0c mov 0xc(%ebp),%edx
$ grep -F -e '(%esp)' -e '(%ebp)' serpent-O3-4.6.3.asm
7: 8b 4c 24 20 mov 0x20(%esp),%ecx
b: 8b 44 24 18 mov 0x18(%esp),%eax
22e: 89 0c 24 mov %ecx,(%esp)
239: 23 3c 24 and (%esp),%edi
588: 89 0c 24 mov %ecx,(%esp)
58f: 23 3c 24 and (%esp),%edi
8f4: 89 0c 24 mov %ecx,(%esp)
8fd: 23 3c 24 and (%esp),%edi
c60: 89 0c 24 mov %ecx,(%esp)
c6b: 23 3c 24 and (%esp),%edi
d37: 89 14 24 mov %edx,(%esp)
d5a: 8b 44 24 1c mov 0x1c(%esp),%eax
d5e: 33 14 24 xor (%esp),%edx
Conclusion:
gcc-4.6.3 -O3 was close to ideal.
gcc-4.2.1 is worse.
gcc-4.6.3 got better a bit, still not as good as gcc-4.6.3 -O3.
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda.linux at googlemail dot com @ 2013-01-18 0:51 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Denis Vlasenko <vda.linux at googlemail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vda.linux at googlemail dot
| |com
--- Comment #7 from Denis Vlasenko <vda.linux at googlemail dot com> 2013-01-18 00:51:01 UTC ---
"gcc-4.6.3 got better a bit, still not as good as gcc-4.6.3 -O3."
I meant:
gcc-4.6.3 got better a bit, still not as good as gcc-3.4.3 -O3 used to be.
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda.linux at googlemail dot com @ 2013-01-18 0:55 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
--- Comment #8 from Denis Vlasenko <vda.linux at googlemail dot com> 2013-01-18 00:55:37 UTC ---
Grrr, another mistake. Correcting again:
Conclusion:
gcc-3.4.3 -O3 was close to ideal.
^^^^^^^^^
gcc-4.2.1 is worse.
gcc-4.6.3 got better a bit, still not as good as gcc-3.4.3 -O3 used to be.
^^^^^^^^^^^^^^^^^^^^^^^^^
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: pinskia at gcc dot gnu.org @ 2013-01-18 0:57 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> 2013-01-18 00:57:00 UTC ---
It would be interesting to try the trunk which has a newer register allocator
than even 4.6.x/4.7.x.
* [Bug rtl-optimization/21182] [4.6/4.7/4.8 Regression] gcc can use registers but uses stack instead
From: rguenth at gcc dot gnu.org @ 2013-01-18 10:39 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vmakarov at gcc dot gnu.org
Target Milestone|--- |4.6.4
Summary|gcc can use registers but |[4.6/4.7/4.8 Regression]
|uses stack instead |gcc can use registers but
| |uses stack instead
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> 2013-01-18 10:38:54 UTC ---
4.4.7 and 4.5.4 generate the same code (no stack use) for -D/-UNAIL_REGS.
With 4.6.3, the -DNAIL_REGS code regresses very much (IRA ...), the
-UNAIL_REGS code is nearly perfect but less good than 4.4/4.5 (if you
only consider grep esp serpent.s | wc -l). Same behavior with 4.7.2.
Trunk got somewhat worse with -UNAIL_REGS but better with -DNAIL_REGS (at -O2):
        -UNAIL_REGS   -DNAIL_REGS
4.5.4        3             3
4.6.3       15           101
4.7.2       15            93
4.8.0       23            70
The most important thing to fix is the -UNAIL_REGS case of course.
A regression for that from 4.4/4.5.
* [Bug rtl-optimization/21182] [4.6/4.7/4.8 Regression] gcc can use registers but uses stack instead
From: vda.linux at googlemail dot com @ 2013-01-20 14:40 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
--- Comment #11 from Denis Vlasenko <vda.linux at googlemail dot com> 2013-01-20 14:39:42 UTC ---
(In reply to comment #10)
> 4.4.7 and 4.5.4 generate the same code (no stack use) for -D/-UNAIL_REGS.
> With 4.6.3, the -DNAIL_REGS code regresses very much (IRA ...), the
> -UNAIL_REGS code is nearly perfect but less good than 4.4/4.5 (if you
> only consider grep esp serpent.s | wc -l). Same behavior with 4.7.2.
>
> Trunk got somewhat worse with -UNAIL_REGS but better with -DNAIL_REGS (at -O2):
>
>         -UNAIL_REGS   -DNAIL_REGS
> 4.5.4        3             3
> 4.6.3       15           101
This matches what I see with 4.6.3 - 15 insns with %esp (and no %ebp):
$ grep '%esp' serpent-4.6.3-O2.asm
4: 83 ec 04 sub $0x4,%esp
7: 8b 4c 24 20 mov 0x20(%esp),%ecx
b: 8b 44 24 18 mov 0x18(%esp),%eax
22e: 89 0c 24 mov %ecx,(%esp)
239: 23 3c 24 and (%esp),%edi
588: 89 0c 24 mov %ecx,(%esp)
58f: 23 3c 24 and (%esp),%edi
8f4: 89 0c 24 mov %ecx,(%esp)
8fd: 23 3c 24 and (%esp),%edi
c60: 89 0c 24 mov %ecx,(%esp)
c6b: 23 3c 24 and (%esp),%edi
d37: 89 14 24 mov %edx,(%esp)
d5a: 8b 44 24 1c mov 0x1c(%esp),%eax
d5e: 33 14 24 xor (%esp),%edx
d70: 83 c4 04 add $0x4,%esp
> The most important thing to fix is the -UNAIL_REGS case of course.
Sure. NAIL_REGS is only a hack meant to demonstrate that regs *can* be
allocated optimally.
* [Bug rtl-optimization/21182] [4.6/4.7/4.8 Regression] gcc can use registers but uses stack instead
From: steven at gcc dot gnu.org @ 2013-03-13 20:38 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
--- Comment #12 from Steven Bosscher <steven at gcc dot gnu.org> 2013-03-13 20:37:23 UTC ---
Curious to hear whether "-fschedule-insns -fsched-pressure" helps.
At least from the %esp and %ebp counts, it looks hopeful:
$ ./cc1 -quiet -m32 -O2 t.c -DNAIL_REGS -o t.s.NAIL
$ ./cc1 -quiet -m32 -O2 t.c -UNAIL_REGS -o t.s
$ ./cc1 -quiet -m32 -O2 t.c -UNAIL_REGS -o t.s.sched_pres \
-fschedule-insns -fsched-pressure
$ egrep -c '%ebp|%esp' t.s*
t.s:366
t.s.NAIL:305
t.s.sched_pres:277
$ grep ident t.s
.ident "GCC: (GNU) 4.8.0 20130313 (experimental) [trunk revision 196638]"
It is unfortunate that nobody has put in the resources yet to make options
like "-fschedule-insns -fsched-pressure" the default for x86.
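The comparison above can be repeated for other flag sets with a small helper. The following is only a sketch: the function names stack_refs and compare_flags are made up for illustration, and defaulting to a gcc found in $PATH (rather than ./cc1 from a build tree) is an assumption.

```shell
# Count lines of an assembly file that mention %ebp or %esp, the same rough
# metric as "egrep -c '%ebp|%esp'" above. It also counts prologue/epilogue
# and argument loads, so it only approximates the number of spills.
stack_refs() {
    grep -E -c '%ebp|%esp' "$1"
}

# Compile one source file to 32-bit assembly under several flag sets and
# print the count for each. CC defaulting to "gcc" is an assumption.
compare_flags() {
    src=$1
    for flags in "" "-fschedule-insns" "-fschedule-insns -fsched-pressure"; do
        out=$(mktemp) || return 1
        # $flags is intentionally unquoted so it splits into separate options.
        ${CC:-gcc} -m32 -O2 $flags -S "$src" -o "$out" || { rm -f "$out"; return 1; }
        printf '%6s  -O2 %s\n' "$(stack_refs "$out")" "$flags"
        rm -f "$out"
    done
}
```

For example, "compare_flags t.c" would print one count per flag set. Lower is better by this metric, but the disassembly still has to be read to tell real spills from argument loads.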
* [Bug rtl-optimization/21182] [4.7/4.8/4.9 Regression] gcc can use registers but uses stack instead
From: jakub at gcc dot gnu.org @ 2013-04-12 15:18 UTC (permalink / raw)
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.6.4 |4.7.4
--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> 2013-04-12 15:17:13 UTC ---
GCC 4.6.4 has been released and the branch has been closed.
* [Bug rtl-optimization/21182] [4.7/4.8/4.9/4.10 Regression] gcc can use registers but uses stack instead
From: rguenth at gcc dot gnu.org @ 2014-06-12 13:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.7.4 |4.8.4
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
The 4.7 branch is being closed, moving target milestone to 4.8.4.
* [Bug rtl-optimization/21182] [4.8/4.9/5 Regression] gcc can use registers but uses stack instead
From: jakub at gcc dot gnu.org @ 2014-12-19 13:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.8.4 |4.8.5
--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.8.4 has been released.
* [Bug rtl-optimization/21182] [4.8/4.9/5/6 Regression] gcc can use registers but uses stack instead
From: rguenth at gcc dot gnu.org @ 2015-06-23 8:35 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.8.5 |4.9.3
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
* [Bug rtl-optimization/21182] [4.9/5/6 Regression] gcc can use registers but uses stack instead
From: jakub at gcc dot gnu.org @ 2015-06-26 20:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
--- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.9.3 has been released.
* [Bug rtl-optimization/21182] [4.9/5/6 Regression] gcc can use registers but uses stack instead
From: jakub at gcc dot gnu.org @ 2015-06-26 20:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|4.9.3 |4.9.4
* [Bug rtl-optimization/21182] [8/9/10/11 Regression] gcc can use registers but uses stack instead
From: rguenth at gcc dot gnu.org @ 2021-01-26 13:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
--- Comment #31 from Richard Biener <rguenth at gcc dot gnu.org> ---
-fno-tree-ter improves things quite a bit. With -DNAIL_REGS gimple doesn't
do much because we treat registers as memory here.
For trunk
-O2 has 52 spills
-O2 -fno-tree-ter has 35 spills
-O2 -fno-tree-ter -fschedule-insns has 74 spills
-O2 -fno-tree-ter -fschedule-insns -fsched-pressure has 18 spills
-O2 -fschedule-insns -fsched-pressure has 17 spills
to me this really hints at out-of-SSA producing a very bad initial
schedule, by TER but also likely due to folding & friends doing
random stmt placing (it's all a single BB). I think we'd benefit
quite a bit with killing TER (doing all interesting bits pre-RTL
or via SSA RTL forwprop) and ordering SSA def expansion for optimal
register pressure.
* [Bug rtl-optimization/21182] [8/9/10/11/12 Regression] gcc can use registers but uses stack instead
From: jakub at gcc dot gnu.org @ 2021-04-27 11:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|11.0 |11.2
--- Comment #32 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.1 has been released, retargeting bugs to GCC 11.2.
* [Bug rtl-optimization/21182] [9/10/11/12 Regression] gcc can use registers but uses stack instead
From: rguenth at gcc dot gnu.org @ 2021-07-28 7:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|11.2 |11.3
--- Comment #33 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.2 is being released, retargeting bugs to GCC 11.3
* [Bug rtl-optimization/21182] [9/10/11/12 Regression] gcc can use registers but uses stack instead
From: rguenth at gcc dot gnu.org @ 2022-04-21 7:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|11.3 |11.4
--- Comment #34 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.3 is being released, retargeting bugs to GCC 11.4.
* [Bug rtl-optimization/21182] [10/11/12/13/14 Regression] gcc can use registers but uses stack instead
From: jakub at gcc dot gnu.org @ 2023-05-29 10:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|11.4 |11.5
--- Comment #35 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 11.4 is being released, retargeting bugs to GCC 11.5.
* [Bug rtl-optimization/21182] [11/12/13/14 Regression] gcc can use registers but uses stack instead
From: pinskia at gcc dot gnu.org @ 2023-07-15 7:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2006-01-15 20:37:58 |2023-7-15
--- Comment #36 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
turning off TER still helps.
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
2005-04-23 22:30 [Bug rtl-optimization/21182] New: " vda at port dot imtp dot ilyichevsk dot odessa dot ua
@ 2005-04-23 22:32 ` vda at port dot imtp dot ilyichevsk dot odessa dot ua
2005-04-23 22:39 ` pinskia at gcc dot gnu dot org
` (4 subsequent siblings)
5 siblings, 0 replies; 25+ messages in thread
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-23 22:32 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua 2005-04-23 22:32 -------
Created an attachment (id=8719)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8719&action=view)
testcase. change #if 0 into #if 1 and compare resulting asm
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: pinskia at gcc dot gnu dot org @ 2005-04-23 22:39 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From pinskia at gcc dot gnu dot org 2005-04-23 22:39 -------
Hmm, on the mainline, we get for wc -l:
1613 t.s
1459 t1.s
t1 is the normal #if 0.
Note I used "-O2 -fomit-frame-pointer".
--
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |missed-optimization, ra
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-23 22:49 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua 2005-04-23 22:49 -------
Aha!
I found out that gcc will use registers with -O3, but not with -O2.
# gcc -O3 serpent.c -S -o serpent-O3.s
# gcc -O2 serpent.c -S -o serpent-O2.s
# ls -l
-rw-r--r-- 1 root root 27975 Apr 24 01:47 serpent-O2.s
-rw-r--r-- 1 root root 21566 Apr 24 01:47 serpent-O3.s
# wc -l serpent-O2.s serpent-O3.s
1558 serpent-O2.s
1265 serpent-O3.s
2823 total
I don't have 4.0.0 here yet...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-23 22:54 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua 2005-04-23 22:54 -------
This is a comparison of the -O2 and -O3 code (the -O2 listing comes first,
the -O3 listing second). The -O3 code keeps all modified variables in
registers and thus is smaller and most likely faster.
serpent_encrypt:
pushl %ebp
movl %esp, %ebp
pushl %edi
pushl %esi
pushl %ebx
subl $256, %esp
movl 8(%ebp), %edx
movl 16(%ebp), %eax
movl 12(%eax), %ebx
movl 12(%edx), %ecx
xorl %ebx, %ecx
movl (%edx), %edi
movl %ecx, -20(%ebp)
xorl (%eax), %edi
movl 8(%edx), %ecx
movl 4(%edx), %ebx
movl -20(%ebp), %esi
xorl 8(%eax), %ecx
orl %edi, -20(%ebp)
xorl 4(%eax), %ebx
xorl %ebx, -20(%ebp)
xorl %esi, %edi
xorl %ecx, %esi
andl %edi, %ebx
xorl %edi, %ecx
notl %esi
xorl -20(%ebp), %edi
movl %edx, -16(%ebp)
serpent_encrypt:
pushl %ebp
movl %esp, %ebp
pushl %edi
pushl %esi
pushl %ebx
pushl %edx
movl 8(%ebp), %edi
movl 16(%ebp), %ecx
movl 12(%edi), %eax
xorl 12(%ecx), %eax
movl 8(%edi), %esi
movl 4(%edi), %edx
movl (%edi), %ebx
xorl 8(%ecx), %esi
xorl 4(%ecx), %edx
xorl (%ecx), %ebx
movl %eax, %ecx
orl %ebx, %ecx
xorl %eax, %ebx
xorl %esi, %eax
xorl %edx, %ecx
notl %eax
andl %ebx, %edx
xorl %eax, %edx
xorl %ebx, %esi
xorl %ecx, %ebx
orl %ebx, %eax
xorl %esi, %ebx
andl %edx, %esi
xorl %esi, %eax
notl %edx
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-24 13:05 UTC (permalink / raw)
To: gcc-bugs
------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua 2005-04-24 13:05 -------
With 4.0.0: gcc -O2 gives the same result as gcc -O3,
which is better than gcc 3.4.3 -O2 but worse than 3.4.3 -O3.
For example:
movl %edx, -20(%ebp)
orl %ecx, %edi
movl %ebx, %esi
xorl %ecx, %esi
andl %eax, %ebx
xorl %edi, %ebx
movl %eax, %ecx
notl %ecx
xorl %ebx, %ecx
orl %edi, %eax
xorl %eax, %esi
rorl $19, %esi
rorl $29, -20(%ebp)
xorl %esi, %ebx
xorl -20(%ebp), %ecx
xorl -20(%ebp), %ebx
rorl $31, %ebx
leal 0(,%esi,8), %edx
1) Why was %edx stored to -20(%ebp)? %edx is not used in the
following insns, so its value could stay in the register
and we could continue to work on it there.
2) rorl $31, %ebx == roll $1, %ebx, but the 1-bit roll insn is
smaller.
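The equivalence in point 2 holds because a 32-bit rotate right by 31 and a rotate left by 1 move every bit to the same position. A quick check with shell arithmetic, as a sketch only (the helper names rol32, ror32 and same_rotate are made up, and the shell's arithmetic is assumed to be at least 64 bits wide):

```shell
# 32-bit rotates emulated with shell arithmetic (assumes the shell's
# arithmetic is at least 64 bits wide, as in bash and dash on common systems).
# $1 = value in 0..0xFFFFFFFF, $2 = rotate count in 1..31
rol32() { echo $(( (($1 << $2) | ($1 >> (32 - $2))) & 0xFFFFFFFF )); }
ror32() { echo $(( (($1 >> $2) | ($1 << (32 - $2))) & 0xFFFFFFFF )); }

# Succeeds if rotating right by 31 gives the same value as rotating left by 1.
same_rotate() {
    [ "$(ror32 "$1" 31)" = "$(rol32 "$1" 1)" ]
}
```

For example, "rol32 0x80000001 1" and "ror32 0x80000001 31" both print 3. The size difference comes from the encoding: on x86 the rotate-by-1 form has its own opcode and needs no immediate byte.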
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: steven at gcc dot gnu dot org @ 2005-05-07 15:24 UTC (permalink / raw)
To: gcc-bugs
--
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Ever Confirmed| |1
Last reconfirmed|0000-00-00 00:00:00 |2005-05-07 15:23:12
date| |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182