public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/21182] New: gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-23 22:30 UTC
To: gcc-bugs

In this long but relatively simple function, gcc could keep all frequently
used local variables in registers, but it fails to do so.

gcc can be forced to do this optimization by asm("reg") modifiers.
The resulting code is ~1k smaller.

# gcc -v
Reading specs from /.share/usr/app/gcc-3.4.3/bin/../lib/gcc/i386-pc-linux-gnu/3.4.3/specs
Configured with: ../gcc-3.4.3/configure --prefix=/usr/app/gcc-3.4.3 --exec-prefix=/usr/app/gcc-3.4.3 --bindir=/usr/bin --sbindir=/usr/sbin --libexecdir=/usr/app/gcc-3.4.3/libexec --datadir=/usr/app/gcc-3.4.3/share --sysconfdir=/etc --sharedstatedir=/usr/app/gcc-3.4.3/var/com --localstatedir=/usr/app/gcc-3.4.3/var --libdir=/usr/lib --includedir=/usr/include --infodir=/usr/info --mandir=/usr/man --with-slibdir=/usr/app/gcc-3.4.3/lib --with-local-prefix=/usr/local --with-gxx-include-dir=/usr/app/gcc-3.4.3/include/g++-v3 --enable-languages=c,c++ --with-system-zlib --disable-nls --enable-threads=posix i386-pc-linux-gnu
Thread model: posix
gcc version 3.4.3

--
           Summary: gcc can use registers but uses stack instead
           Product: gcc
           Version: 3.4.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: vda at port dot imtp dot ilyichevsk dot odessa dot ua
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: i386-pc-linux-gnu
  GCC host triplet: i386-pc-linux-gnu
GCC target triplet: i386-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-23 22:32 UTC
To: gcc-bugs

------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua  2005-04-23 22:32 -------
Created an attachment (id=8719)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8719&action=view)
Testcase. Change #if 0 into #if 1 and compare the resulting asm.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: pinskia at gcc dot gnu dot org @ 2005-04-23 22:39 UTC
To: gcc-bugs

------- Additional Comments From pinskia at gcc dot gnu dot org  2005-04-23 22:39 -------
Hmm, on the mainline, we get for wc -l:
1613 t.s
1459 t1.s

t1 is the normal #if 0. Note I used "-O2 -fomit-frame-pointer".

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization, ra

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-23 22:49 UTC
To: gcc-bugs

------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua  2005-04-23 22:49 -------
Aha! I found out that gcc will use registers with -O3, but not with -O2.

# gcc -O3 serpent.c -S -o serpent-O3.s
# gcc -O2 serpent.c -S -o serpent-O2.s
# ls -l
-rw-r--r--  1 root root 27975 Apr 24 01:47 serpent-O2.s
-rw-r--r--  1 root root 21566 Apr 24 01:47 serpent-O3.s
# wc -l serpent-O2.s serpent-O3.s
 1558 serpent-O2.s
 1265 serpent-O3.s
 2823 total

I don't have 4.0.0 here yet...

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-23 22:54 UTC
To: gcc-bugs

------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua  2005-04-23 22:54 -------
This is a comparison of the -O2 and -O3 code. The -O3 code has all modified
variables in registers and thus is smaller and most likely faster.

serpent_encrypt:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        subl    $256, %esp
        movl    8(%ebp), %edx
        movl    16(%ebp), %eax
        movl    12(%eax), %ebx
        movl    12(%edx), %ecx
        xorl    %ebx, %ecx
        movl    (%edx), %edi
        movl    %ecx, -20(%ebp)
        xorl    (%eax), %edi
        movl    8(%edx), %ecx
        movl    4(%edx), %ebx
        movl    -20(%ebp), %esi
        xorl    8(%eax), %ecx
        orl     %edi, -20(%ebp)
        xorl    4(%eax), %ebx
        xorl    %ebx, -20(%ebp)
        xorl    %esi, %edi
        xorl    %ecx, %esi
        andl    %edi, %ebx
        xorl    %edi, %ecx
        notl    %esi
        xorl    -20(%ebp), %edi
        movl    %edx, -16(%ebp)

serpent_encrypt:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        pushl   %edx
        movl    8(%ebp), %edi
        movl    16(%ebp), %ecx
        movl    12(%edi), %eax
        xorl    12(%ecx), %eax
        movl    8(%edi), %esi
        movl    4(%edi), %edx
        movl    (%edi), %ebx
        xorl    8(%ecx), %esi
        xorl    4(%ecx), %edx
        xorl    (%ecx), %ebx
        movl    %eax, %ecx
        orl     %ebx, %ecx
        xorl    %eax, %ebx
        xorl    %esi, %eax
        xorl    %edx, %ecx
        notl    %eax
        andl    %ebx, %edx
        xorl    %eax, %edx
        xorl    %ebx, %esi
        xorl    %ecx, %ebx
        orl     %ebx, %eax
        xorl    %esi, %ebx
        andl    %edx, %esi
        xorl    %esi, %eax
        notl    %edx

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda at port dot imtp dot ilyichevsk dot odessa dot ua @ 2005-04-24 13:05 UTC
To: gcc-bugs

------- Additional Comments From vda at port dot imtp dot ilyichevsk dot odessa dot ua  2005-04-24 13:05 -------
With 4.0.0: gcc -O2 gives the same result as gcc -O3, which is better than
gcc 3.4.3 -O2 but worse than 3.4.3 -O3. For example:

        movl    %edx, -20(%ebp)
        orl     %ecx, %edi
        movl    %ebx, %esi
        xorl    %ecx, %esi
        andl    %eax, %ebx
        xorl    %edi, %ebx
        movl    %eax, %ecx
        notl    %ecx
        xorl    %ebx, %ecx
        orl     %edi, %eax
        xorl    %eax, %esi
        rorl    $19, %esi
        rorl    $29, -20(%ebp)
        xorl    %esi, %ebx
        xorl    -20(%ebp), %ecx
        xorl    -20(%ebp), %ebx
        rorl    $31, %ebx
        leal    0(,%esi,8), %edx

1) Why was %edx stored to -20(%ebp)? The following insns do not use %edx,
   so its value could stay in the register and we could keep working on it
   there.

2) rorl $31, %ebx == roll $1, %ebx, but the 1-bit roll insn is smaller.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182
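
Point 2 in the message above rests on a simple identity: for a 32-bit value, rotating right by 31 bits gives the same result as rotating left by 1 bit. The following is a minimal C sketch of that identity, not code from the testcase. The helper names rotl32, rotr32 and rotate_selfcheck are made up for illustration, and the sketch says nothing about which instruction a given gcc version actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left by n bits (n taken mod 32).
 * The "& 31" on the opposite shift avoids an undefined shift by 32 when n == 0. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Hypothetical helper: rotate a 32-bit value right by n bits (n taken mod 32). */
uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Checks the identity from the report on a few sample values:
 * a right rotate by 31 equals a left rotate by 1. */
void rotate_selfcheck(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x80000000u, 0x80000001u, 0xdeadbeefu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(rotr32(samples[i], 31) == rotl32(samples[i], 1));
}
```

Calling rotate_selfcheck() from a test program is enough to exercise the identity. More generally, a right rotate by n equals a left rotate by 32 - n for any n between 1 and 31.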

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: steven at gcc dot gnu dot org @ 2005-05-07 15:24 UTC
To: gcc-bugs

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2005-05-07 15:23:12
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

[parent not found: <bug-21182-4@http.gcc.gnu.org/bugzilla/>]
* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda.linux at googlemail dot com @ 2013-01-18 0:48 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

--- Comment #6 from Denis Vlasenko <vda.linux at googlemail dot com> 2013-01-18 00:48:23 UTC ---
Created attachment 29200
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29200
Updated testcase, build helper, and results of testing with different gcc versions

Tarball contains:

serpent.c: the original testcase, only with "#ifdef NAIL_REGS" instead of
"#if 0", which allows test compiles without editing it. Basically,
"gcc -DNAIL_REGS serpent.c" will try to force gcc to use only registers
instead of the stack.

gencode.sh: builds serpent.c with -O2 and -O3, with and without -DNAIL_REGS.
The object file names contain the gcc version and the options used. Then they
are objdump'ed and the output is saved. Tweakable by setting $PREFIX and/or $CC.
No -fomit-frame-pointer is used: the testcase can be compiled so that the
stack is not used even without that option.

Disassembly:
serpent-O2-3.4.3.asm
serpent-O2-4.2.1.asm
serpent-O2-4.6.3.asm
serpent-O2-DNAIL_REGS-3.4.3.asm
serpent-O2-DNAIL_REGS-4.2.1.asm
serpent-O2-DNAIL_REGS-4.6.3.asm
serpent-O3-3.4.3.asm
serpent-O3-4.2.1.asm
serpent-O3-4.6.3.asm
serpent-O3-DNAIL_REGS-3.4.3.asm
serpent-O3-DNAIL_REGS-4.2.1.asm
serpent-O3-DNAIL_REGS-4.6.3.asm

Object files:
   text    data     bss     dec     hex filename
   3260       0       0    3260     cbc serpent-O2-DNAIL_REGS-3.4.3.o
   3260       0       0    3260     cbc serpent-O3-DNAIL_REGS-3.4.3.o
   3292       0       0    3292     cdc serpent-O3-3.4.3.o
   3536       0       0    3536     dd0 serpent-O2-4.6.3.o
   3536       0       0    3536     dd0 serpent-O3-4.6.3.o
   3845       0       0    3845     f05 serpent-O2-DNAIL_REGS-4.6.3.o
   3845       0       0    3845     f05 serpent-O3-DNAIL_REGS-4.6.3.o
   3877       0       0    3877     f25 serpent-O2-4.2.1.o
   3877       0       0    3877     f25 serpent-O3-4.2.1.o
   4302       0       0    4302    10ce serpent-O2-3.4.3.o
   4641       0       0    4641    1221 serpent-O2-DNAIL_REGS-4.2.1.o
   4641       0       0    4641    1221 serpent-O3-DNAIL_REGS-4.2.1.o

Take a look inside the serpent-O2-DNAIL_REGS-3.4.3.asm file.
This is what I want to get without asm hacks: the smallest code, which uses
no stack.

gcc-3.4.3 -O3 comes close: it does spill a few words to the stack (search
for (%ebp)), but it is generally good code (close to ideal?).

All other attempts fare worse:

gcc-3.4.3 -O2: code is significantly worse than -O3.

gcc-4.2.1 -O2/-O3: code is better than gcc-3.4.3 -O2, worse than gcc-4.6.3.

gcc-4.6.3 -O2/-O3: six instances of spills to the stack. Code is still not as
good as gcc-3.4.3 -O3. (-DNAIL_REGS only confuses it more, unlike 3.4.3).

Stack usage summary:

$ grep 'sub.*,%esp' *.asm | grep -v DNAIL_REGS
serpent-O2-3.4.3.asm:   6:    81 ec 00 01 00 00       sub    $0x100,%esp
serpent-O2-4.2.1.asm:   6:    83 ec 78                sub    $0x78,%esp
serpent-O2-4.6.3.asm:   4:    83 ec 04                sub    $0x4,%esp
serpent-O3-4.2.1.asm:   6:    83 ec 78                sub    $0x78,%esp
serpent-O3-4.6.3.asm:   4:    83 ec 04                sub    $0x4,%esp

(serpent-O3-3.4.3.asm is not listed, but it allocates and uses one word on
the stack with a push insn).

Modules with best (= minimal) stack usage:

$ grep -F -e '(%esp)' -e '(%ebp)' serpent-O2-DNAIL_REGS-3.4.3.asm
   6:    8b 75 08                mov    0x8(%ebp),%esi
   9:    8b 7d 10                mov    0x10(%ebp),%edi
 ca9:    8b 75 0c                mov    0xc(%ebp),%esi

$ grep -F -e '(%esp)' -e '(%ebp)' serpent-O3-3.4.3.asm
   7:    8b 7d 08                mov    0x8(%ebp),%edi
   a:    8b 4d 10                mov    0x10(%ebp),%ecx
 18c:    89 7d f0                mov    %edi,-0x10(%ebp)
 1dd:    8b 45 f0                mov    -0x10(%ebp),%eax
 23b:    8b 75 f0                mov    -0x10(%ebp),%esi
 299:    8b 7d f0                mov    -0x10(%ebp),%edi
 432:    8b 55 f0                mov    -0x10(%ebp),%edx
 4a0:    8b 4d f0                mov    -0x10(%ebp),%ecx
 50e:    8b 7d f0                mov    -0x10(%ebp),%edi
 84f:    8b 45 f0                mov    -0x10(%ebp),%eax
 8b9:    8b 75 f0                mov    -0x10(%ebp),%esi
 923:    8b 7d f0                mov    -0x10(%ebp),%edi
 cb6:    8b 55 0c                mov    0xc(%ebp),%edx

$ grep -F -e '(%esp)' -e '(%ebp)' serpent-O3-4.6.3.asm
   7:    8b 4c 24 20             mov    0x20(%esp),%ecx
   b:    8b 44 24 18             mov    0x18(%esp),%eax
 22e:    89 0c 24                mov    %ecx,(%esp)
 239:    23 3c 24                and    (%esp),%edi
 588:    89 0c 24                mov    %ecx,(%esp)
 58f:    23 3c 24                and    (%esp),%edi
 8f4:    89 0c 24                mov    %ecx,(%esp)
 8fd:    23 3c 24                and    (%esp),%edi
 c60:    89 0c 24                mov    %ecx,(%esp)
 c6b:    23 3c 24                and    (%esp),%edi
 d37:    89 14 24                mov    %edx,(%esp)
 d5a:    8b 44 24 1c             mov    0x1c(%esp),%eax
 d5e:    33 14 24                xor    (%esp),%edx

Conclusion:
gcc-4.6.3 -O3 was close to ideal.
gcc-4.2.1 is worse.
gcc-4.6.3 got better a bit, still not as good as gcc-4.6.3 -O3.
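
The -DNAIL_REGS switch described in the message above pins local variables to named registers. The attachment itself is not quoted in this thread, so the following is only a hypothetical sketch of how such a switch could be written, not the testcase's code. The names IN_REG, mix_round and mix_round_selfcheck, the choice of %esi/%edi, and the bit operations are all assumptions made for illustration. Note also that GCC's documentation only guarantees the named register for a local register variable when it is used as an operand of extended asm, so the pinning here is a strong hint rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: NAIL_REGS plays the role it has in the report. When it is
 * defined on an x86 GNU C compiler, locals get an explicit register name;
 * otherwise IN_REG() expands to nothing and the allocator decides. */
#if defined(NAIL_REGS) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
#define IN_REG(name) __asm__(name)
#else
#define IN_REG(name)
#endif

/* Illustrative stand-in for one step of a cipher round: a handful of
 * logical operations on two working variables. It is NOT Serpent. */
uint32_t mix_round(uint32_t a, uint32_t b, uint32_t k)
{
    register uint32_t r0 IN_REG("esi") = a ^ k;
    register uint32_t r1 IN_REG("edi") = b ^ k;

    r0 |= r1;
    r1 ^= r0;
    r0 = ~r0;
    return r0 ^ r1;
}

/* The result must not depend on whether NAIL_REGS is defined. */
void mix_round_selfcheck(void)
{
    assert(mix_round(0x0F0F0F0Fu, 0x00FF00FFu, 0xFFFFFFFFu) == 0x00FF00FFu);
    assert(mix_round(0u, 0u, 0u) == 0xFFFFFFFFu);
}
```

Building the same file once with and once without -DNAIL_REGS, then comparing the disassembly, mirrors the comparison the report performs on serpent.c. Only the generated code should differ, never the computed values.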

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda.linux at googlemail dot com @ 2013-01-18 0:51 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

Denis Vlasenko <vda.linux at googlemail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vda.linux at googlemail dot
                   |                            |com

--- Comment #7 from Denis Vlasenko <vda.linux at googlemail dot com> 2013-01-18 00:51:01 UTC ---
"gcc-4.6.3 got better a bit, still not as good as gcc-4.6.3 -O3."

I meant: gcc-4.6.3 got better a bit, still not as good as gcc-3.4.3 -O3 used to
be.

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: vda.linux at googlemail dot com @ 2013-01-18 0:55 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

--- Comment #8 from Denis Vlasenko <vda.linux at googlemail dot com> 2013-01-18 00:55:37 UTC ---
Grrr, another mistake. Correcting again:

Conclusion:
gcc-3.4.3 -O3 was close to ideal.
^^^^^^^^^
gcc-4.2.1 is worse.
gcc-4.6.3 got better a bit, still not as good as gcc-3.4.3 -O3 used to be.
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^

* [Bug rtl-optimization/21182] gcc can use registers but uses stack instead
From: pinskia at gcc dot gnu.org @ 2013-01-18 0:57 UTC
To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21182

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> 2013-01-18 00:57:00 UTC ---
It would be interesting to try the trunk, which has a newer register allocator
than even 4.6.x/4.7.x.